Serverless computing for ML workloads: The convergence of on-demand resources and model deployment

Boorugula, Ramya (2025) Serverless computing for ML workloads: The convergence of on-demand resources and model deployment. World Journal of Advanced Engineering Technology and Sciences, 15 (2). pp. 918-924. ISSN 2582-8266

WJAETS-2025-0637.pdf - Published Version
Available under License Creative Commons Attribution Non-commercial Share Alike.

Abstract

Serverless computing represents a transformative approach for machine learning deployments, offering event-driven execution, automatic scaling, and pay-per-use billing that address longstanding operational challenges. This article explores the convergence of serverless architectures with machine learning workloads, examining how this integration reshapes deployment practices and operational economics. The global serverless architecture market continues to expand rapidly, with ML deployments representing an increasingly significant segment. The evolution from traditional server-based deployments through containerization to serverless paradigms reveals quantifiable benefits in resource utilization, operational overhead reduction, and cost efficiency for intermittent workloads. Current serverless ML solutions demonstrate substantial improvements over earlier implementations in cold start latency, memory limits, and access to specialized hardware. Performance analysis reveals nuanced tradeoffs between dedicated and serverless infrastructures across latency, throughput, cost efficiency, resource utilization, and operational overhead. Implementation strategies including hybrid architectures, model optimization techniques, effective resource provisioning, and targeted cost management collectively enable organizations to maximize benefits while mitigating limitations. The article provides ML practitioners and architects with actionable insights to navigate the evolving serverless ML landscape and make informed decisions about where serverless approaches offer the most value in deployment strategies.

Item Type: Article
Official URL: https://doi.org/10.30574/wjaets.2025.15.2.0637
Uncontrolled Keywords: Serverless Computing; Machine Learning Deployment; Automatic Scaling; Cost Optimization; Hybrid Architectures
Depositing User: Editor Engineering Section
Date Deposited: 04 Aug 2025 16:34
URI: https://eprint.scholarsrepository.com/id/eprint/3629