A company is running an Apache Hadoop cluster on Amazon EC2 instances. The Hadoop cluster stores approximately 100 TB of data for weekly operational reports and allows occasional access for data scientists to retrieve data. The company needs to reduce the cost and operational complexity for storing and serving this data.

Which solution meets these requirements in the MOST cost-effective manner?
A. Move the Hadoop cluster from EC2 instances to Amazon EMR. Allow data access patterns to remain the same.
B. Write a script that resizes the EC2 instances to a smaller instance type during downtime and resizes the instances to a larger instance type before the reports are created.
C. Move the data to Amazon S3 and use Amazon Athena to query the data for reports.
Allow the data scientists to access the data directly in Amazon S3.
D. Migrate the data to Amazon DynamoDB and modify the reports to fetch data from DynamoDB. Allow the data scientists to access the data directly in DynamoDB.

Answer: C

Explanation:

The requirement is to "reduce the cost and operational complexity for storing and serving this data" in the MOST cost-effective manner. EMR instance storage is ephemeral; since the 100 TB of data must persist, the company would have to use EMRFS to back it up to S3 anyway.

https://docs.aws.amazon.com/emr/latest/ManagementGuide/emr-plan-storage.html

Approximate monthly storage cost for 100 TB:

EBS – $8,109

S3 – $2,355

That is a saving of roughly $5,754 per month.
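As a rough sketch of the comparison, the per-GB prices below are illustrative assumptions (EBS gp3 at about $0.08/GB-month and S3 Standard at about $0.023/GB-month; actual prices vary by region and storage class, so the results will not match the quoted figures exactly):

```python
# Back-of-the-envelope monthly storage cost for 100 TB.
# Per-GB prices are assumed, not authoritative.
TB = 100
GB = TB * 1024  # 102,400 GB

ebs_per_gb = 0.08   # assumed EBS gp3 $/GB-month
s3_per_gb = 0.023   # assumed S3 Standard $/GB-month

ebs_monthly = GB * ebs_per_gb       # ~$8,192
s3_monthly = GB * s3_per_gb         # ~$2,355
savings = ebs_monthly - s3_monthly  # ~$5,837

print(f"EBS: ${ebs_monthly:,.0f}  S3: ${s3_monthly:,.0f}  "
      f"saved: ${savings:,.0f}/month")
```

Whatever the exact regional prices, S3 comes out several times cheaper per GB-month, which is the core of why option C wins on cost.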

That saving can go toward Athena query costs. We don't know the data layout or how much data each query would scan, but we do know the workload: "occasional access for data scientists to retrieve data".
