SageMaker Serverless Inference illustrates Amazon’s philosophy for ML workloads

Amazon just unveiled Serverless Inference, a new option for SageMaker, its fully managed machine learning (ML) service. The goal for Amazon SageMaker Serverless Inference is to serve use cases with intermittent or infrequent traffic patterns, lowering total cost of ownership (TCO) and making the service easier to use.

VentureBeat connected with Bratin Saha, AWS VP of Machine Learning, to discuss where Amazon SageMaker Serverless fits into the big picture of Amazon’s machine learning offering and how it affects ease of use and TCO, as well as Amazon’s philosophy and process in developing its machine learning portfolio.

Inference is the productive phase of ML-powered applications. After a machine learning model has been created and fine-tuned using historical data, it is deployed for use in production. Inference refers to taking new data as input and producing results based on that data. For production ML applications, Amazon notes, inference accounts for up to 90% of total compute costs.

According to Saha, Serverless Inference has been an oft-requested feature. SageMaker Serverless Inference was introduced in preview in December 2021 and, as of today, is generally available.

Serverless Inference enables SageMaker users to deploy machine learning models for inference without having to configure or manage the underlying infrastructure. The service automatically provisions and scales compute capacity based on the volume of inference requests. During idle time, it turns off compute capacity completely, so users are not charged while no requests arrive.
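To make the "no infrastructure to configure" point concrete, here is a minimal Python sketch of what requesting a serverless endpoint might look like with the AWS SDK (boto3). It is an illustration under stated assumptions, not code from Amazon or from the interview: the helper names, the "AllTraffic" variant name, and the default values are hypothetical, and the sketch assumes a SageMaker model has already been registered under the name you pass in. The key detail is that the endpoint configuration carries a ServerlessConfig (memory size and maximum concurrency) in place of an instance type and instance count.

```python
from typing import Any, Dict

# Memory sizes (in MB) that a serverless endpoint can be configured with.
ALLOWED_MEMORY_MB = (1024, 2048, 3072, 4096, 5120, 6144)


def build_serverless_endpoint_config(
    config_name: str,
    model_name: str,
    memory_mb: int = 2048,       # hypothetical default, chosen for illustration
    max_concurrency: int = 5,    # hypothetical default, chosen for illustration
) -> Dict[str, Any]:
    """Build the keyword arguments for SageMaker's create_endpoint_config call.

    Note that no instance type or instance count appears anywhere:
    the ServerlessConfig block replaces them.
    """
    if memory_mb not in ALLOWED_MEMORY_MB:
        raise ValueError(f"memory_mb must be one of {ALLOWED_MEMORY_MB}")
    if max_concurrency < 1:
        raise ValueError("max_concurrency must be at least 1")
    return {
        "EndpointConfigName": config_name,
        "ProductionVariants": [
            {
                "VariantName": "AllTraffic",  # assumed variant name
                "ModelName": model_name,      # assumes this model already exists
                "ServerlessConfig": {
                    "MemorySizeInMB": memory_mb,
                    "MaxConcurrency": max_concurrency,
                },
            }
        ],
    }


def deploy(endpoint_name: str, request: Dict[str, Any]) -> None:
    """Send the request to AWS. Requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder above works without the SDK

    sagemaker = boto3.client("sagemaker")
    sagemaker.create_endpoint_config(**request)
    sagemaker.create_endpoint(
        EndpointName=endpoint_name,
        EndpointConfigName=request["EndpointConfigName"],
    )


if __name__ == "__main__":
    # Only builds and prints the request; nothing is sent to AWS here.
    print(build_serverless_endpoint_config("demo-config", "demo-model"))
```

Calling deploy("demo-endpoint", request) would create the endpoint itself; from that point on, scaling up with traffic and scaling down to zero when idle is handled by the service rather than by the user.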

Read the full article on VentureBeat
