You support a production service that runs on a single Compute Engine instance. You regularly spend time recreating the service by deleting the crashing instance and creating a new instance from the relevant image. You want to reduce the time spent on manual operations while following Site Reliability Engineering principles.

What should you do?
A. File a bug with the development team so they can find the root cause of the crashing instance.
B. Create a Managed Instance Group with a single instance and use health checks to determine the system status.
C. Add a Load Balancer in front of the Compute Engine instance and use health checks to determine the system status.
D. Create a Stackdriver Monitoring dashboard with SMS alerts to be able to start recreating the crashed instance promptly after it has crashed.

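Answer: B

A Managed Instance Group with a health check can autoheal: it recreates an instance that fails the check from the instance template, which removes the manual toil. Filing a bug (A) may be worthwhile but does not reduce the manual work, a load balancer alone (C) only stops routing traffic to an unhealthy instance without recreating it, and alerts (D) still leave the recreation to a human.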
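
The mechanism behind option B is health-check-driven recreation of a failed instance. It can be sketched as a small control loop. This is only a simulation under stated assumptions, not the Compute Engine API: `Instance`, `InstanceGroup`, `reconcile`, `unhealthy_threshold`, and the image name are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from itertools import count
from typing import Callable, List


@dataclass
class Instance:
    """Hypothetical stand-in for a VM created from an image."""
    name: str
    image: str
    failures: int = 0  # consecutive failed health checks


class InstanceGroup:
    """Toy model of a single-instance group that recreates unhealthy instances."""

    def __init__(self, image: str, unhealthy_threshold: int = 3) -> None:
        self.image = image
        self.unhealthy_threshold = unhealthy_threshold
        self.recreated: List[str] = []  # names of instances that were replaced
        self._ids = count(1)
        self.instance = self._create()

    def _create(self) -> Instance:
        # Every replacement is built from the same image, like the manual procedure.
        return Instance(name=f"svc-{next(self._ids)}", image=self.image)

    def reconcile(self, is_healthy: Callable[[Instance], bool]) -> None:
        """Run one health check and replace the instance after repeated failures."""
        if is_healthy(self.instance):
            self.instance.failures = 0
            return
        self.instance.failures += 1
        if self.instance.failures >= self.unhealthy_threshold:
            self.recreated.append(self.instance.name)
            self.instance = self._create()


# Usage: two consecutive failed probes trigger one automatic recreation.
group = InstanceGroup(image="service-image-v1", unhealthy_threshold=2)
probe_results = iter([True, False, False, True])
for _ in range(4):
    group.reconcile(lambda inst: next(probe_results))
print(group.instance.name, group.recreated)
```

The threshold on consecutive failures is a design choice in this sketch: it keeps a single failed probe from triggering a needless recreation.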
