The fraud detection model is a large model and needs to be integrated into serverless applications to minimize infrastructure management.
Which of the following deployment targets should you choose for the different machine learning models, given their specific requirements? (Select two)
A . Choose Amazon Elastic Container Service (Amazon ECS) for the recommendation model, as it provides container orchestration for large-scale, batch processing workloads with tight integration into other AWS services
B . Use AWS Lambda to deploy the fraud detection model, which requires rapid scaling and integration into an existing serverless architecture, minimizing infrastructure management
C . Deploy the real-time recommendation model using Amazon SageMaker endpoints to ensure low-latency, high-availability, and managed infrastructure for real-time inference
D . Deploy the generative AI model using Amazon Elastic Kubernetes Service (Amazon EKS) to leverage containerized microservices for high scalability and control over the deployment environment
E . Deploy all models using Amazon SageMaker endpoints for consistency and ease of management, regardless of their individual requirements for scalability, latency, or integration
Answer: C, D
Explanation:
Correct options:
Deploy the generative AI model using Amazon Elastic Kubernetes Service (Amazon EKS) to leverage containerized microservices for high scalability and control over the deployment environment

Amazon EKS is designed for containerized applications that need high scalability and flexibility. It is suitable for the generative AI model, which may require complex orchestration and scaling in response to varying demand, while giving you full control over the deployment environment.

via – https://aws.amazon.com/blogs/containers/deploy-generative-ai-models-on-amazon-eks/

Deploy the real-time recommendation model using Amazon SageMaker endpoints to ensure low-latency, high-availability, and managed infrastructure for real-time inference

Real-time inference is ideal for workloads with real-time, interactive, low-latency requirements. You can deploy your model to SageMaker hosting services and get an endpoint that can be used for inference. These endpoints are fully managed and support autoscaling. This makes SageMaker an ideal choice for the recommendation model, which must provide fast responses to user interactions with minimal downtime.
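To make the SageMaker endpoint path concrete, here is a minimal sketch of invoking a real-time endpoint with boto3. The endpoint name, payload fields, and JSON content type are illustrative assumptions, not details from the question; the `invoke_endpoint` call itself is the standard SageMaker runtime API.

```python
import json

# Hypothetical endpoint name for illustration only.
ENDPOINT_NAME = "recommender-endpoint"

def build_payload(user_id, candidate_items):
    """Serialize an inference request as JSON (a common SageMaker content type)."""
    return json.dumps({"user_id": user_id, "items": candidate_items})

def invoke(payload):
    """Call the managed SageMaker real-time endpoint (requires AWS credentials)."""
    import boto3  # imported lazily so the sketch can be read without AWS access
    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        Body=payload,
    )
    return json.loads(response["Body"].read())
```

Because the endpoint is fully managed, the client code stays this small: autoscaling, health checks, and instance management are handled by SageMaker hosting.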
Incorrect options:
Use AWS Lambda to deploy the fraud detection model, which requires rapid scaling and integration into an existing serverless architecture, minimizing infrastructure management – While AWS Lambda is excellent for serverless applications, it is a poor fit for a large fraud detection model: Lambda imposes deployment package size and memory limits, and it is not designed for continuous, low-latency inference at very high throughput. Lambda is better suited for lightweight, event-driven tasks than for hosting large models or long-running, complex inference jobs.
Choose Amazon Elastic Container Service (Amazon ECS) for the recommendation model, as it provides container orchestration for large-scale, batch processing workloads with tight integration into other AWS services – Amazon ECS is a good choice for containerized workloads but is generally more appropriate for batch processing or large-scale, stateless applications. It might not provide the low-latency and real-time capabilities needed for the recommendation model.
Deploy all models using Amazon SageMaker endpoints for consistency and ease of management, regardless of their individual requirements for scalability, latency, or integration – Deploying all models using Amazon SageMaker endpoints without considering their specific requirements for latency, scalability, and integration would be suboptimal. While SageMaker endpoints are highly versatile, they may not be the best fit for every use case, especially for models requiring serverless architecture or advanced container orchestration.
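To illustrate the kind of lightweight, event-driven work Lambda is suited for, here is a minimal handler sketch. The threshold rule standing in for a model, and the field names, are placeholders invented for this example; a real fraud model would far exceed this, which is exactly why Lambda is the wrong target for it.

```python
import json

def lambda_handler(event, context):
    """Minimal event-driven scoring handler.

    The "model" below is a placeholder threshold rule, not an actual
    fraud model; it only shows the shape of a Lambda inference handler.
    """
    # Accept either an API Gateway-style event (JSON string in "body")
    # or a direct invocation payload.
    record = json.loads(event["body"]) if "body" in event else event

    # Placeholder rule: flag transactions above a fixed amount.
    score = 1.0 if record.get("amount", 0) > 10_000 else 0.1

    return {"statusCode": 200, "body": json.dumps({"fraud_score": score})}
```

A handler like this cold-starts quickly and scales per request, but loading a multi-gigabyte model inside it would run into Lambda's package and memory limits — the reason the question's large fraud model belongs elsewhere.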
References:
https://docs.aws.amazon.com/sagemaker/latest/dg/realtime-endpoints.html
https://aws.amazon.com/blogs/containers/deploy-generative-ai-models-on-amazon-eks/