Holding all other variables constant and assuming records need to be processed in less than 10 seconds, which adjustment will meet the requirement?
A Structured Streaming job deployed to production has been experiencing delays during peak hours of the day. At present, during normal execution, each microbatch of data is processed in less than 3 seconds. During peak hours of the day, execution time for each microbatch becomes very inconsistent, sometimes exceeding 30...
Which statement correctly describes the outcome of executing these command cells in order in an interactive notebook?
A junior member of the data engineering team is exploring the language interoperability of Databricks notebooks. The intended outcome of the below code is to register a view of all sales that occurred in countries on the continent of Africa that appear in the geo_lookup table. Before executing the code,...
Which solution meets these requirements?
An upstream system is emitting change data capture (CDC) logs that are being written to a cloud object storage directory. Each record in the log indicates the change type (insert, update, or delete) and the values for each field after the change. The source table has a primary key identified...
When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?
When scheduling Structured Streaming jobs for production, which configuration automatically recovers from query failures and keeps costs low?
A. Cluster: New Job Cluster; Retries: Unlimited; Maximum Concurrent Runs: Unlimited
B. Cluster: New Job Cluster; Retries: None; Maximum Concurrent Runs: 1
C. Cluster: Existing All-Purpose Cluster; Retries: Unlimited; Maximum Concurrent Runs:...
Which statement describes how the Delta engine identifies which files to load?
A Delta table of weather records is partitioned by date and has the below schema: date DATE, device_id INT, temp FLOAT, latitude FLOAT, longitude FLOAT. To find all the records from within the Arctic Circle, you execute a query with the below filter: latitude > 66.3. Which statement describes how...
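As background for this question: the filter is on latitude, which is not the partition column, so partition pruning on date does not narrow the scan. Delta Lake also records per-file column statistics (such as minimum and maximum values) in its transaction log and can use them to skip files. The plain-Python sketch below only illustrates that idea and is not Delta's actual implementation; `FileStats`, `files_to_scan`, the file paths, and the statistics values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    """Hypothetical stand-in for the per-file statistics kept in a table log."""
    path: str
    min_latitude: float
    max_latitude: float


def files_to_scan(files, threshold):
    """Return paths of files that could contain rows with latitude > threshold.

    A file can be skipped when its recorded maximum latitude is not above
    the threshold, because no row in it can satisfy the filter.
    """
    return [f.path for f in files if f.max_latitude > threshold]


# Made-up statistics for four files spread over two date partitions.
files = [
    FileStats("date=2024-01-01/part-0.parquet", -10.0, 40.5),
    FileStats("date=2024-01-01/part-1.parquet", 60.0, 71.2),
    FileStats("date=2024-01-02/part-0.parquet", 66.3, 66.3),
    FileStats("date=2024-01-02/part-1.parquet", 67.0, 89.9),
]

selected = files_to_scan(files, 66.3)
print(selected)
```

Note that files from both date partitions remain candidates: the decision is made per file from the latitude statistics, not from the partition value.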
If the upstream system is known to occasionally produce duplicate entries for a single order hours apart, which statement is correct?
An upstream source writes Parquet data as hourly batches to directories named with the current date. A nightly batch job runs the following code to ingest all data from the previous day as indicated by the date variable: Assume that the fields customer_id and order_id serve as a composite key...
Which approach will ensure that this requirement is met?
The data architect has mandated that all tables in the Lakehouse should be configured as external Delta Lake tables. Which approach will ensure that this requirement is met?
A. Whenever a database is being created, make sure that the location keyword is used
B. When configuring an external data warehouse...
Assuming that this code produces logically correct results and the data in the source tables has been de-duplicated and validated, which statement describes what will occur when this code is executed?
The data engineering team maintains the following code: Assuming that this code produces logically correct results and the data in the source tables has been de-duplicated and validated, which statement describes what will occur when this code is executed?
A. A batch job will update the enriched_itemized_orders_by_account table, replacing only...
Which approach will allow this developer to review the current logic for this notebook?
A junior developer complains that the code in their notebook isn't producing the correct results in the development environment. A shared screenshot reveals that while they're using a notebook versioned with Databricks Repos, they're using a personal branch that contains old logic. The desired branch named dev-2.3.9 is not available...
Which statement describes the results of querying recent_orders?
A table is registered with the following code: Both users and orders are Delta Lake tables. Which statement describes the results of querying recent_orders?
A. All logic will execute at query time and return the result of joining the valid versions of the source tables at the time the query...