Which solution meets these requirements?

An insurance company has raw data in JSON format that is sent without a predefined schedule through an Amazon Kinesis Data Firehose delivery stream to an Amazon S3 bucket. An AWS Glue crawler is scheduled to run every 8 hours to update the schema in the data catalog of the...

October 11, 2023

Which action would MOST likely increase the performance of accessing log data in Amazon S3?

A media company has been performing analytics on log data generated by its applications. There has been a recent increase in the number of concurrent analytics jobs running, and the overall performance of existing jobs is decreasing as the number of new jobs is increasing. The partitioned data is stored...

October 10, 2023

Which solution meets these requirements?

A company is planning to do a proof of concept for a machine learning (ML) project using Amazon SageMaker with a subset of existing on-premises data hosted in the company’s 3 TB data warehouse. For part of the project, AWS Direct Connect is established and tested. To prepare the data...

October 10, 2023

Which solution meets these requirements?

A company analyzes its data in an Amazon Redshift data warehouse, which currently has a cluster of three dense storage nodes. Due to a recent business acquisition, the company needs to load an additional 4 TB of user data into Amazon Redshift. The engineering team will combine all the user...

October 10, 2023

Which program modification will accelerate the COPY process?

A large company receives files from external parties in Amazon EC2 throughout the day. At the end of the day, the files are combined into a single file, compressed into a gzip file, and uploaded to Amazon S3. The total size of all the files is close to 100 GB...

October 10, 2023

What is the most cost-effective solution?

Once a month, a company receives a 100 MB .csv file compressed with gzip. The file contains 50,000 property listing records and is stored in Amazon S3 Glacier. The company needs its data analyst to query a subset of the data for a specific vendor. What is the most cost-effective...

October 10, 2023

Which solution meets these requirements?

A regional energy company collects voltage data from sensors attached to buildings. To address any known dangerous conditions, the company wants to be alerted when a sequence of two voltage drops is detected within 10 minutes of a voltage spike at the same building. It is important to ensure that...

October 9, 2023

How should this data be stored for optimal performance?

A company that produces network devices has millions of users. Data is collected from the devices on an hourly basis and stored in an Amazon S3 data lake. The company runs analyses on the last 24 hours of data flow logs for abnormality detection and to troubleshoot and resolve user...

October 9, 2023

Which approach should the data analytics team take to allow product owners to view only their products in the dashboard?

A retail company’s data analytics team recently created multiple product sales analysis dashboards for the average selling price per product using Amazon QuickSight. The dashboards were created from .csv files uploaded to Amazon S3. The team is now planning to share the dashboards with the respective external product owners by...

October 9, 2023

What table design provides optimal query performance?

A large ride-sharing company has thousands of drivers globally serving millions of unique customers every day. The company has decided to migrate an existing data mart to Amazon Redshift. The existing schema includes the following tables: a trips fact table for information on completed rides; a drivers dimension table for...

October 9, 2023