Which scaling policy is the MOST SUITABLE for this scenario, and why?

Exam: MLA-C01

You are an ML engineer at a retail company that uses a SageMaker model to generate product recommendations for customers in real time. During peak shopping periods, traffic to the recommendation engine increases dramatically. The company needs the model endpoint to handle these spikes in demand without compromising response time or customer experience. At the same time, you want to optimize costs by scaling down resources during periods of low demand. You are evaluating different scaling policies to manage this dynamic workload effectively.

Which scaling policy is the MOST SUITABLE for this scenario, and why?
A. Use a manual scaling policy where you adjust the number of instances based on real-time monitoring of traffic, allowing you to fine-tune resource allocation as needed during high-demand periods
B. Use scheduled scaling to preemptively add or remove instances based on anticipated traffic patterns, such as known peak times during Black Friday, to ensure sufficient capacity is available when needed
C. Use a target tracking scaling policy that automatically adjusts the number of instances based on a predefined target metric, such as CPU utilization or invocations per instance, to maintain a steady level of performance during traffic spikes
D. Use a step scaling policy that adjusts the number of instances based on the size of the traffic spike, adding a set number of instances for each level of increased demand

Answer: C

Explanation:

Correct option:

Use a target tracking scaling policy that automatically adjusts the number of instances based on a predefined target metric, such as CPU utilization or invocations per instance, to maintain a steady level of performance during traffic spikes

A target tracking scaling policy suits dynamic, unpredictable traffic. You choose a metric and a target value (e.g., average CPU utilization or invocations per instance), and auto scaling continuously adds or removes instances to keep the metric near that target. Performance therefore remains consistent during high-demand periods like Black Friday, and the endpoint scales back in during quieter times to save costs.
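
As a rough illustration of the policy described above, the following sketch builds the two requests that Application Auto Scaling expects for a SageMaker endpoint variant: one to register the variant as a scalable target and one to attach a target tracking policy on the predefined `SageMakerVariantInvocationsPerInstance` metric. The endpoint name, variant name, policy name, capacity bounds, target value, and cooldowns are all assumptions made for the example, not values from the question.

```python
# Hypothetical endpoint and variant names; replace with your own.
ENDPOINT_NAME = "recs-endpoint"
VARIANT_NAME = "AllTraffic"
RESOURCE_ID = f"endpoint/{ENDPOINT_NAME}/variant/{VARIANT_NAME}"
DIMENSION = "sagemaker:variant:DesiredInstanceCount"


def scalable_target_request(min_instances, max_instances):
    """Build keyword arguments for register_scalable_target."""
    return {
        "ServiceNamespace": "sagemaker",
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": DIMENSION,
        "MinCapacity": min_instances,
        "MaxCapacity": max_instances,
    }


def target_tracking_request(target_invocations_per_instance):
    """Build keyword arguments for put_scaling_policy (target tracking)."""
    return {
        "PolicyName": "recs-invocations-target",  # hypothetical name
        "ServiceNamespace": "sagemaker",
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": DIMENSION,
        "PolicyType": "TargetTrackingScaling",
        "TargetTrackingScalingPolicyConfiguration": {
            "TargetValue": float(target_invocations_per_instance),
            "PredefinedMetricSpecification": {
                "PredefinedMetricType": "SageMakerVariantInvocationsPerInstance"
            },
            # Assumed cooldowns: scale out quickly, scale in cautiously.
            "ScaleOutCooldown": 60,
            "ScaleInCooldown": 300,
        },
    }


def apply_policy(min_instances=2, max_instances=20, target=1000):
    """Send both requests; needs boto3 and AWS credentials."""
    import boto3  # imported here so the builders above run without AWS access

    client = boto3.client("application-autoscaling")
    client.register_scalable_target(
        **scalable_target_request(min_instances, max_instances)
    )
    client.put_scaling_policy(**target_tracking_request(target))
```

Calling `apply_policy()` would register the variant and attach the policy. A suitable target value is usually derived from load testing the endpoint rather than chosen up front.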

Incorrect options:

Use a step scaling policy that adjusts the number of instances based on the size of the traffic spike, adding a set number of instances for each level of increased demand – Step scaling is useful for handling sudden, sharp increases in demand, because larger alarm breaches can trigger larger capacity changes. However, you must define the thresholds and step sizes yourself, and it may not respond as smoothly to varying levels of traffic. Steps that do not match the actual traffic can lead to over- or under-provisioning when patterns are unpredictable, whereas target tracking needs only a target value.
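
To make the "set number of instances for each level" idea concrete, here is a simplified model of how a scale-out step policy picks an adjustment. It mirrors the lower bound, upper bound, and adjustment fields of a step adjustment, but it is only a sketch: the step table and threshold are hypothetical, and it covers only the scale-out direction with a change-in-capacity adjustment.

```python
# Each step is (lower_bound, upper_bound, instances_to_add).
# Bounds are offsets above the alarm threshold; None means no upper limit.
# These values are hypothetical and would need tuning for a real workload.
STEPS = [
    (0, 500, 1),
    (500, 1500, 2),
    (1500, None, 4),
]


def step_adjustment(metric_value, alarm_threshold, steps=STEPS):
    """Return how many instances this simplified step policy would add."""
    breach = metric_value - alarm_threshold
    if breach < 0:
        return 0  # alarm threshold not breached, so no scale-out
    for lower, upper, add in steps:
        # Lower bound is inclusive, upper bound is exclusive.
        if breach >= lower and (upper is None or breach < upper):
            return add
    return 0
```

With a threshold of 1000 invocations per instance, a reading of 1200 falls in the first step and a reading of 3000 falls in the last. Traffic that hovers around a step boundary can flip between adjustments, which is one way fixed steps end up mismatched to unpredictable load.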

Use scheduled scaling to preemptively add or remove instances based on anticipated traffic patterns, such as known peak times during Black Friday, to ensure sufficient capacity is available when needed – Scheduled scaling works well when traffic patterns are predictable, such as daily or weekly cycles. However, it’s less effective for managing unexpected traffic spikes, as it relies on predefined schedules rather than real-time data.
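
For comparison, a scheduled action fixes the capacity bounds at a known time instead of reacting to a metric. The sketch below builds a request for the Application Auto Scaling `put_scheduled_action` call. The endpoint name, action name, date, and capacities are hypothetical placeholders.

```python
# Hypothetical endpoint and variant names; replace with your own.
RESOURCE_ID = "endpoint/recs-endpoint/variant/AllTraffic"
DIMENSION = "sagemaker:variant:DesiredInstanceCount"


def scheduled_action_request(name, schedule, min_instances, max_instances):
    """Build keyword arguments for put_scheduled_action."""
    return {
        "ServiceNamespace": "sagemaker",
        "ScheduledActionName": name,
        "ResourceId": RESOURCE_ID,
        "ScalableDimension": DIMENSION,
        "Schedule": schedule,  # an at(...) or cron(...) expression
        "ScalableTargetAction": {
            "MinCapacity": min_instances,
            "MaxCapacity": max_instances,
        },
    }


# Raise the capacity floor ahead of an anticipated peak (assumed date, UTC).
peak = scheduled_action_request(
    "pre-peak-scale-up", "at(2025-11-28T00:00:00)", 10, 40
)


def apply_schedule(request):
    """Send the request; needs boto3 and AWS credentials."""
    import boto3  # imported here so the builder above runs without AWS access

    boto3.client("application-autoscaling").put_scheduled_action(**request)
```

Because the action only changes the minimum and maximum capacity at the scheduled time, it does nothing for a spike that arrives outside the schedule.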

Use a manual scaling policy where you adjust the number of instances based on real-time monitoring of traffic, allowing you to fine-tune resource allocation as needed during high-demand periods – Manual scaling provides the most control but requires constant monitoring and intervention, which is impractical during high-traffic events like Black Friday. It also risks delays in scaling, which could lead to performance issues during traffic surges.

Reference: https://docs.aws.amazon.com/sagemaker/latest/dg/endpoint-auto-scaling-prerequisites.html