You have configured Auto Loader to process incoming IoT data from cloud object storage every 15 minutes. Recently, the notebook code was changed to update the processing logic, but the team later realized that the notebook had been failing for the last 24 hours. What steps does the team need to take to reprocess the data that was not loaded once the notebook is corrected?
Which of the following approaches can the data engineer use to obtain a version-controllable configuration of the Job’s schedule and configuration?
A. They can link the Job to notebooks that are a part of a Databricks Repo.
B. They can submit the Job once on a Job cluster.
C. They...
Which of the following describes how Databricks Repos can help facilitate CI/CD workflows on the Databricks Lakehouse Platform?
A. Databricks Repos can facilitate the pull request, review, and approval process before merging branches
B. Databricks Repos can merge changes from a secondary Git branch into a main Git branch
C. ...
When you drop an external Delta table using the SQL command DROP TABLE table_name, how does it affect the metadata (Delta log, history) and the data in storage?
A. Drops table from metastore, metadata (delta log, history) and data in storage
B. Drops table from metastore, data but keeps metadata...
Which of the following approaches can enable the data engineering team to be notified if the ELT job has not been run in an hour?
A data engineer is using a Databricks SQL query to monitor the performance of an ELT job. The ELT job is triggered by a specific number of input records being ready to process. The Databricks SQL query returns the number of minutes since the job’s most recent runtime. Which of...
What is the purpose of the gold layer in a multi-hop architecture?
A. Optimizes ETL throughput and analytic query performance
B. Eliminate duplicate records
C. Preserves grain of original data, without any aggregations
D. Data quality checks and schema enforcement
E. Powers ML applications, reporting, dashboards and adhoc reports.
Answer: E...
Which of the following SQL commands can be used to insert, update, or delete rows based on a condition that checks whether a row exists?
A. MERGE INTO table_name
B. COPY INTO table_name
C. UPDATE table_name
D. INSERT INTO OVERWRITE table_name
E. INSERT IF EXISTS table_name
Answer: A Explanation:...
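To make the behaviour behind answer A concrete, here is a minimal sketch that emulates the matched/not-matched logic of a merge in plain Python. It is not the Delta Lake API: the `merge_into` helper, the `_delete` flag, and the sample rows are all hypothetical and exist only for illustration.

```python
# Plain-Python emulation of MERGE INTO semantics (illustrative only,
# not the Delta Lake or Databricks API).
def merge_into(target, source, key):
    """Apply source rows to target rows, matching on `key`."""
    merged = {row[key]: dict(row) for row in target}
    for row in source:
        k = row[key]
        if k in merged:
            if row.get("_delete"):
                # Analogous to: WHEN MATCHED AND <condition> THEN DELETE
                del merged[k]
            else:
                # Analogous to: WHEN MATCHED THEN UPDATE
                merged[k].update(row)
        elif not row.get("_delete"):
            # Analogous to: WHEN NOT MATCHED THEN INSERT
            merged[k] = dict(row)
    return list(merged.values())


# Hypothetical sample data: row 1 is updated, row 2 deleted, row 3 inserted.
target = [{"id": 1, "qty": 5}, {"id": 2, "qty": 7}]
source = [{"id": 1, "qty": 9}, {"id": 2, "_delete": True}, {"id": 3, "qty": 4}]
result = merge_into(target, source, "id")
```

A single pass over the source decides per row whether to update, delete, or insert, which is what distinguishes a merge from a plain UPDATE or INSERT.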
You have been asked to build a data pipeline and have noticed that you are working on a very large-scale ETL with many data dependencies. Which of the following tools can be used to address this problem?
A. AUTO LOADER
B. JOBS and TASKS
C. SQL Endpoints...
You notice that a job cluster is taking 6 to 8 minutes to start, which is delaying your job from finishing on time. What steps can you take to reduce the cluster startup time?
A. Set up a second job ahead of the first job to start the cluster, so...
table("uncleanedSales")
Answer: B Explanation: The answer is