What steps need to be taken to set up a Delta Live Tables pipeline as a job using the workspace UI?
A. Delta Live Tables do not support job clusters
B. Select the Workflows UI and the Delta Live Tables tab; under task type, select Delta Live Tables pipeline and select the notebook
C. Select the Workflows UI and the Delta Live Tables tab; under task type, select Delta Live Tables pipeline and select the pipeline JSON file
D. Use the pipeline creation UI and select a new pipeline and job cluster

Answer: B

Explanation:

The answer is B: select the Workflows UI and the Delta Live Tables tab; under task type, select Delta Live Tables pipeline and select the notebook.

Create a pipeline

To create a new pipeline using a Delta Live Tables notebook, open Workflows in the workspace UI, switch to the Delta Live Tables tab, click Create Pipeline, give the pipeline a name, and select the notebook that contains the pipeline source code.

Which of the following techniques can be used to implement fine-grained access control to rows and columns of a Delta table based on the user’s access?
A. Use Unity Catalog to grant access to rows and columns
B. Row and column access control lists
C. Use dynamic view functions
D. Data access control lists
E. Dynamic Access control lists with Unity Catalog

Answer: C

Explanation:

The answer is C: use dynamic view functions.

Here is an example that limits access to rows based on whether the user belongs to the managers group: in the view below, a user who is not a member of the managers group can only see rows where the total amount is <= 1000000.

Dynamic view function to filter rows:
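A minimal sketch of such a dynamic view, assuming a hypothetical sales table with a total_amount column and a workspace group named managers; is_member() is the built-in function that checks whether the current user belongs to a group.

spark.sql("""
CREATE OR REPLACE VIEW sales_restricted AS
SELECT *
FROM sales
WHERE
  CASE
    -- members of the managers group see every row;
    -- everyone else only sees rows with total_amount <= 1000000
    WHEN is_member('managers') THEN TRUE
    ELSE total_amount <= 1000000
  END
""")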

A data engineer is using a Databricks SQL query to monitor the performance of an ELT job. The ELT job is triggered by a specific number of input records being ready to process. The Databricks SQL query returns the number of minutes since the job’s most recent runtime.

Which of the following approaches can enable the data engineering team to be notified if the ELT job has not been run in an hour?
A. They can set up an Alert for the accompanying dashboard to notify them if the returned value is greater than 60.
B. They can set up an Alert for the query to notify when the ELT job fails.
C. They can set up an Alert for the accompanying dashboard to notify when it has not refreshed in 60 minutes.
D. They can set up an Alert for the query to notify them if the returned value is greater than 60.
E. This type of alert is not possible in Databricks

Answer: D

Explanation:

The answer is D: they can set up an Alert for the query to notify them if the returned value is greater than 60.

The important thing to note here is that an Alert can only be set up on a query, not on a dashboard; the query returns a value, and that value is evaluated against the Alert’s trigger condition to decide whether the Alert fires.
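A minimal sketch of such a monitoring query, assuming a hypothetical elt_job_runs table that records an end_time for each run; in practice this SQL would be saved as a Databricks SQL query and an Alert configured on it with the condition minutes_since_last_run > 60.

# Hypothetical monitoring query: minutes elapsed since the most recent ELT run
spark.sql("""
SELECT timestampdiff(MINUTE, MAX(end_time), current_timestamp()) AS minutes_since_last_run
FROM elt_job_runs
""").show()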

You noticed that a colleague is manually copying notebooks with a _bkp suffix to store previous versions. Which of the following features would you recommend instead?
A. Databricks notebooks support change tracking and versioning
B. Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
C. Databricks notebooks can be exported into dbc archive files and stored in data lake
D. Databricks notebook can be exported as HTML and imported at a later time

Answer: A

Explanation:

The answer is A: Databricks notebooks support automatic change tracking and versioning. While editing a notebook, open the version history panel on the right side to view all changes; every change you make is captured and saved automatically.

Which of the following statements can be used to test that the number of rows in the table equals 10 in Python?

row_count = spark.sql("select count(*) from table").collect()[0][0]
A. assert (row_count = 10, "Row count did not match")
B. assert if (row_count = 10, "Row count did not match")
C. assert row_count == 10, "Row count did not match"
D. assert if row_count == 10, "Row count did not match"
E. assert row_count = 10, "Row count did not match"

Answer: C

Explanation:

The answer is C: assert row_count == 10, "Row count did not match". In Python, == tests for equality, while a single = is assignment and is not valid inside an assert expression.

Review the Python documentation on the assert statement for more detail.
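A minimal runnable sketch, assuming a hypothetical table named row_count_demo:

# Create a hypothetical 10-row table, then verify the row count with assert
spark.range(10).write.mode("overwrite").saveAsTable("row_count_demo")
row_count = spark.sql("select count(*) from row_count_demo").collect()[0][0]
assert row_count == 10, "Row count did not match"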

Which of the following SQL commands can be used to insert, update, or delete rows based on a condition that checks whether a row (or rows) already exists?
A. MERGE INTO table_name
B. COPY INTO table_name
C. UPDATE table_name
D. INSERT INTO OVERWRITE table_name
E. INSERT IF EXISTS table_name

Answer: A

Explanation:

The answer is A: MERGE INTO. Here is the additional documentation for your review: https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-merge-into.html
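A minimal sketch of a MERGE statement, assuming hypothetical target_table and source_updates tables keyed on an id column, with value and is_deleted columns on the source:

spark.sql("""
MERGE INTO target_table AS t
USING source_updates AS s
  ON t.id = s.id
WHEN MATCHED AND s.is_deleted = true THEN
  DELETE                                    -- delete rows flagged in the source
WHEN MATCHED THEN
  UPDATE SET t.value = s.value              -- update existing rows
WHEN NOT MATCHED THEN
  INSERT (id, value) VALUES (s.id, s.value) -- insert new rows
""")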

When writing streaming data, which of the below output (write) modes does Spark Structured Streaming support?
A. Append, Delta, Complete
B. Delta, Complete, Continuous
C. Append, Complete, Update
D. Complete, Incremental, Update
E. Append, overwrite, Continuous

Answer: C

Explanation:

The answer is C: Append, Complete, Update (a minimal example follows the list below).

• Append mode (default) – Only the new rows added to the Result Table since the last trigger are written to the sink. This is supported only for queries where rows added to the Result Table will never change, so this mode guarantees that each row is output only once (assuming a fault-tolerant sink). For example, queries with only select, where, map, flatMap, filter, join, etc. support Append mode.

• Complete mode – The whole Result Table is written to the sink after every trigger. This is supported for aggregation queries.

• Update mode – (Available since Spark 2.1.1) Only the rows in the Result Table that were updated since the last trigger are written to the sink.
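A minimal runnable sketch of an aggregation stream written in Complete mode; the rate source and console sink are built-in test connectors, and "append" and "update" are the other accepted values for outputMode.

from pyspark.sql import functions as F

# The rate source generates test rows with (timestamp, value) columns
events = spark.readStream.format("rate").option("rowsPerSecond", 10).load()

# Aggregation: count events per 1-minute window
counts = events.groupBy(F.window("timestamp", "1 minute")).count()

# Complete mode rewrites the whole result table to the sink on every trigger;
# outputMode("append") and outputMode("update") are the other supported modes
query = (counts.writeStream
         .outputMode("complete")
         .format("console")
         .start())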