Exam4Training

Databricks Certified Data Engineer Professional Exam Online Training

Question #1

You were asked to set up a new all-purpose cluster, but the cluster is unable to start. Which of the following steps do you need to take to identify the root cause of the issue and the reason why the cluster was unable to start?

  • A . Check the cluster driver logs
  • B . Check the cluster event logs (Correct)
  • C . Workspace logs
  • D . Storage account
  • E . Data plane

Correct Answer: B

Explanation:

Cluster event logs are very useful for identifying issues pertaining to cluster availability. A cluster may fail to start due to resource limitations or issues with the cloud provider.

Some of the common issues include a subnet for compute VM reaching its limits or exceeding the subscription or cloud account CPU quota limit.

Here is an example where the cluster did not start because the subscription reached the quota limit on a certain type of CPU cores for a VM type.


Click on event logs


Click on the message to see the detailed error message on why the cluster did not start.



Question #2

A SQL dashboard was built for the supply chain team to monitor inventory and product orders, but all of the timestamps displayed on the dashboard are showing in UTC format, so they requested to change the time zone to New York local time.

How would you approach resolving this issue?

  • A . Move the workspace from Central US zone to East US Zone
  • B . Change the timestamp on the delta tables to America/New_York format
  • C . Change the spark configuration of SQL endpoint to format the timestamp to America/New_York
  • D . Under SQL Admin Console, set the SQL configuration parameter time zone to America/New_York
  • E . Add SET Timezone = America/New_York to every one of the SQL queries in the dashboard.

Correct Answer: D

Explanation:

The answer is, Under SQL Admin Console, set the SQL configuration parameter time zone to America/New_York

Here are the steps you can take to configure this, so the entire dashboard is changed without changing individual queries.

Configure SQL parameters

To configure all warehouses with SQL parameters:
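
The admin console steps themselves were not captured here. As a minimal, hedged sketch, the equivalent session-level setting can be checked in a notebook before applying it to all warehouses (the column alias is illustrative only):

# Session-level check of the timezone setting; the SQL Admin Console change applies it to all warehouses
spark.conf.set("spark.sql.session.timeZone", "America/New_York")
spark.sql("SELECT current_timestamp() AS local_ts").show(truncate=False)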

Question #6

You are currently asked to work on building a data pipeline. You have noticed that you are working on a very large-scale ETL with many data dependencies. Which of the following tools can be used to address this problem?

  • A . AUTO LOADER
  • B . JOBS and TASKS
  • C . SQL Endpoints
  • D . DELTA LIVE TABLES
  • E . STRUCTURED STREAMING with MULTI HOP

Correct Answer: D

Explanation:

The answer is, DELTA LIVE TABLES

DLT simplifies data dependencies by building DAG-based joins between live tables. Here is a view of how the DAG looks with data dependencies, without additional metadata.
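
As a minimal, hedged sketch of how DLT infers those dependencies in a Python pipeline (the table names and source path below are assumptions, not from the original question):

import dlt
from pyspark.sql import functions as F

@dlt.table(comment="Raw orders ingested as-is; source path is assumed")
def orders_bronze():
    return spark.read.format("json").load("/mnt/raw/orders")

@dlt.table(comment="Cleaned orders; DLT infers the dependency on orders_bronze and adds it to the DAG")
def orders_silver():
    return dlt.read("orders_bronze").where(F.col("order_id").isNotNull())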

Question #26

When you drop a managed table using the SQL syntax DROP TABLE table_name, how does it impact the metadata, history, and data stored in the table?

  • A . Drops table from meta store, drops metadata, history, and data in storage.
  • B . Drops table from meta store and data from storage but keeps metadata and history in storage
  • C . Drops table from meta store, meta data and history but keeps the data in storage
  • D . Drops table but keeps meta data, history and data in storage
  • E . Drops table and history but keeps meta data and data in storage

Correct Answer: A

Explanation:

For a managed table, a drop command will drop everything from the metastore and storage.

See the below image to understand the difference compared to dropping an external table.

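As a hedged sketch of the difference (the table names and external location are assumptions):

# Managed table: DROP removes the metastore entry, history, and the underlying data files
spark.sql("CREATE TABLE managed_demo (id INT)")
spark.sql("DROP TABLE managed_demo")

# External table: DROP removes only the metastore entry; the files at the external path remain
spark.sql("CREATE TABLE external_demo (id INT) LOCATION '/mnt/demo/external_demo'")
spark.sql("DROP TABLE external_demo")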


Question #27

Which of the following approaches can the data engineer use to obtain a version-controllable configuration of the Job’s schedule and configuration?

  • A . They can link the Job to notebooks that are a part of a Databricks Repo.
  • B . They can submit the Job once on a Job cluster.
  • C . They can download the JSON equivalent of the job from the Job’s page.
  • D . They can submit the Job once on an all-purpose cluster.
  • E . They can download the XML description of the Job from the Job’s page


Correct Answer: D
Question #28

What is the underlying technology that makes the Auto Loader work?

  • A . Loader
  • B . Delta Live Tables
  • C . Structured Streaming
  • D . DataFrames
  • E . Live DataFrames


Correct Answer: C
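
For context on why the answer is Structured Streaming, here is a minimal, hedged Auto Loader sketch (the paths and table name are assumptions): Auto Loader is exposed as the cloudFiles source of a Structured Streaming read, and the checkpoint tracks which files have already been ingested.

df = (spark.readStream
        .format("cloudFiles")
        .option("cloudFiles.format", "json")
        .load("/mnt/raw/iot"))

(df.writeStream
   .option("checkpointLocation", "/mnt/checkpoints/iot")
   .trigger(availableNow=True)
   .toTable("iot_bronze"))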
Question #29

You are currently looking at a table that contains data from an e-commerce platform. Each row contains a list of items (item numbers) that were present in the cart; when the customer makes a change to the cart, the entire list is saved as a separate list and appended to an existing list for the duration of the customer session. To identify all the items the customer bought, you have to make a unique list of items. You were asked to create a unique list of items that were added to the cart by the user; fill in the blanks of the below query by choosing the appropriate higher-order functions.

Note: See below sample data and expected output.

Schema: cartId INT, items Array<INT>

Fill in the blanks:

SELECT cartId, _(_(items)) FROM carts

  • A . ARRAY_UNION, ARRAY_DISTINCT
  • B . ARRAY_DISTINCT, ARRAY_UNION
  • C . ARRAY_DISTINCT, FLATTEN
  • D . FLATTEN, ARRAY_DISTINCT
  • E . ARRAY_DISTINCT, ARRAY_FLATTEN

Correct Answer: C

Explanation:

FLATTEN -> Transforms an array of arrays into a single array.

ARRAY_DISTINCT -> The function returns an array of the same type as the input argument where all duplicate values have been removed.

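A hedged illustration with made-up rows (the items column is modeled as an array of arrays so that FLATTEN applies; none of this data comes from the original question):

from pyspark.sql import functions as F

carts = spark.createDataFrame(
    [(1, [[1, 2], [2, 3]]), (2, [[4], [4, 5]])],
    "cartId INT, items ARRAY<ARRAY<INT>>")

# FLATTEN merges the nested lists, ARRAY_DISTINCT removes the duplicates
carts.select("cartId",
             F.array_distinct(F.flatten("items")).alias("unique_items")).show()
# expected: cartId 1 -> [1, 2, 3], cartId 2 -> [4, 5]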


Question #30

When building a DLT pipeline you have two options to create live tables. What is the main difference between CREATE STREAMING LIVE TABLE and CREATE LIVE TABLE?

  • A . CREATE STREAMING LIVE table is used in MULTI HOP Architecture
  • B . CREATE LIVE TABLE is used when working with Streaming data sources and Incremental data
  • C . CREATE STREAMING LIVE TABLE is used when working with Streaming data sources and Incremental data
  • D . There is no difference, both are the same; CREATE STREAMING LIVE will be deprecated soon
  • E . CREATE LIVE TABLE is used in DELTA LIVE TABLES, CREATE STREAMING LIVE can only be used in Structured Streaming applications

Correct Answer: C

Explanation:

The answer is, CREATE STREAMING LIVE TABLE is used when working with Streaming data sources and Incremental data
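
In a Python DLT pipeline the same distinction shows up in whether the defining query is a batch or a streaming read; a minimal, hedged sketch (table names are assumptions):

import dlt

@dlt.table   # equivalent of CREATE LIVE TABLE: recomputed from its full input on each update
def orders_live():
    return spark.read.table("raw_orders")

@dlt.table   # equivalent of CREATE STREAMING LIVE TABLE: processes only new, incremental data
def orders_streaming_live():
    return spark.readStream.table("raw_orders")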

Question #31

You are tasked to set up a notebook as a job for six departments, and each department can run the task in parallel. The notebook takes an input parameter, dept number, to process the data by department. How do you go about setting this up in a job?

  • A . Use a single notebook as a task in the job and use dbutils.notebook.run to run each notebook with a parameter in a different cell
  • B . A task in the job cannot take an input parameter; create six notebooks with a hardcoded dept number and set up six tasks with linear dependency in the job
  • C . A task accepts key-value pair parameters; create six tasks and pass the department number as a parameter for each task, with no dependency in the job as they can all run in parallel. (Correct)
  • D . A parameter can only be passed at the job level, create six jobs pass department number to each job with linear job dependency
  • E . A parameter can only be passed at the job level, create six jobs pass department number to each job with no job dependency

Correct Answer: C

Explanation:

Here is how you set it up: create a single job and six tasks with the same notebook, and assign a different parameter to each task.


All tasks are added in a single job and can run in parallel, either using a single shared cluster or with individual clusters.

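As a hedged sketch of the notebook side (the parameter name, table, and column are assumptions): each task passes its own dept_number value, and the shared notebook reads it with a widget.

from pyspark.sql import functions as F

# Read the per-task parameter inside the shared notebook
dbutils.widgets.text("dept_number", "0")
dept_number = dbutils.widgets.get("dept_number")

# Process only that department's data
df = spark.read.table("sales").where(F.col("dept") == dept_number)
df.write.mode("overwrite").saveAsTable(f"sales_dept_{dept_number}")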


Question #32

Which of the following commands can be used to run one notebook from another notebook?

  • A . notebook.utils.run("full notebook path")
  • B . execute.utils.run("full notebook path")
  • C . dbutils.notebook.run("full notebook path")
  • D . only job clusters can run notebook
  • E . spark.notebook.run("full notebook path")

Correct Answer: C

Explanation:

The answer is dbutils.notebook.run("full notebook path")

Here is the full command with additional options.

run(path: String, timeout_seconds: int, arguments: Map): String
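
A hedged usage sketch (the notebook path, timeout, and argument are assumptions):

# Run a child notebook with a 60-second timeout and one argument
result = dbutils.notebook.run("/Repos/project/child_notebook", 60, {"dept_number": "3"})
print(result)   # whatever the child notebook returns via dbutils.notebook.exit(...)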

Question #34

You have configured AUTO LOADER to process incoming IoT data from cloud object storage every 15 mins. Recently a change was made to the notebook code to update the processing logic, but the team later realized that the notebook had been failing for the last 24 hours. What steps does the team need to take to reprocess the data that was not loaded after the notebook was corrected?

  • A . Move the files that were not processed to another location and manually copy the files into the ingestion path to reprocess them
  • B . Enable back_fill = TRUE to reprocess the data
  • C . Delete the checkpoint folder and run the autoloader again
  • D . Autoloader automatically re-processes data that was not loaded
  • E . Manually re-load the data

Correct Answer: D

Explanation:

The answer is, Auto Loader automatically re-processes data that was not loaded, using the checkpoint.

Question #35

John Smith is a newly joined team member in the Marketing team who currently has read access to sales tables but does not have access to delete rows from the table. Which of the following commands helps you accomplish this?

  • A . GRANT USAGE ON TABLE table_name TO john.smith@marketing.com
  • B . GRANT DELETE ON TABLE table_name TO john.smith@marketing.com
  • C . GRANT DELETE TO TABLE table_name ON john.smith@marketing.com
  • D . GRANT MODIFY TO TABLE table_name ON john.smith@marketing.com
  • E . GRANT MODIFY ON TABLE table_name TO john.smith@marketing.com

Correct Answer: E

Explanation:

The answer is GRANT MODIFY ON TABLE table_name TO john.smith@marketing.com. Please note INSERT, UPDATE, and DELETE are combined into one privilege called MODIFY.

Below is the list of privileges that can be granted to a user or a group:

  • SELECT: gives read access to an object.
  • CREATE: gives the ability to create an object (for example, a table in a schema).
  • MODIFY: gives the ability to add, delete, and modify data to or from an object.
  • USAGE: does not give any abilities, but is an additional requirement to perform any action on a schema object.
  • READ_METADATA: gives the ability to view an object and its metadata.
  • CREATE_NAMED_FUNCTION: gives the ability to create a named UDF in an existing catalog or schema.
  • MODIFY_CLASSPATH: gives the ability to add files to the Spark classpath.
  • ALL PRIVILEGES: gives all privileges (is translated into all the above privileges).
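
A hedged sketch of applying and verifying the grant (the table name is an assumption):

# Grant MODIFY (covers INSERT, UPDATE, DELETE) and verify it
spark.sql("GRANT MODIFY ON TABLE sales TO `john.smith@marketing.com`")
spark.sql("SHOW GRANTS ON TABLE sales").show(truncate=False)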

Question #36

Which of the following SQL commands can be used to insert, update, or delete rows based on a condition that checks if a row exists?

  • A . MERGE INTO table_name
  • B . COPY INTO table_name
  • C . UPDATE table_name
  • D . INSERT INTO OVERWRITE table_name
  • E . INSERT IF EXISTS table_name

Correct Answer: A

Explanation:

Here is the additional documentation for your review: https://docs.databricks.com/spark/latest/spark-sql/language-manual/delta-merge-into.html
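
A hedged MERGE INTO sketch (the table and column names are assumptions):

spark.sql("""
  MERGE INTO customers AS t
  USING customer_updates AS s
  ON t.customer_id = s.customer_id
  WHEN MATCHED AND s.is_deleted = true THEN DELETE
  WHEN MATCHED THEN UPDATE SET t.email = s.email
  WHEN NOT MATCHED THEN INSERT (customer_id, email) VALUES (s.customer_id, s.email)
""")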

Question #59

return x

check_input(1,3)

  • A . 1
  • B . 2
  • C . 3 (Correct)
  • D . 4
  • E . 5


Correct Answer: C
Question #60

Which of the following is true when building a Databricks SQL dashboard?

  • A . A dashboard can only use results from one query
  • B . Only one visualization can be developed with one query result
  • C . A dashboard can only connect to one schema/Database
  • D . More than one visualization can be developed using a single query result
  • E . A dashboard can only have one refresh schedule

Correct Answer: D

Explanation:

The answer is, More than one visualization can be developed using a single query result. In the query editor pane, the + Add visualization tab can be used to create many visualizations for a single query result.



Question #61

What is the main difference between the bronze layer and silver layer in a medallion architecture?

  • A . Duplicates are removed in bronze, schema is applied in silver
  • B . Silver may contain aggregated data
  • C . Bronze is raw copy of ingested data, silver contains data with production schema and optimized for ELT/ETL throughput
  • D . Bad data is filtered in Bronze, silver is a copy of bronze data

Correct Answer: C

Explanation:

Medallion Architecture – Databricks

Exam focus: Please review the below image and understand the role of each layer (bronze, silver, gold) in the medallion architecture; you will see varying questions targeting each layer and its purpose.




Question #62

You noticed that a team member started using an all-purpose cluster to develop a notebook and used the same all-purpose cluster to set up a job that can run every 30 mins so they can update underlying tables which are used in a dashboard.

What would you recommend for reducing the overall cost of this approach?

  • A . Reduce the size of the cluster
  • B . Reduce the number of nodes and enable auto scale
  • C . Enable auto termination after 30 mins
  • D . Change the cluster all-purpose to job cluster when scheduling the job
  • E . Change the cluster mode from all-purpose to single-mode

Correct Answer: D

Explanation:

While using an all-purpose cluster is fine during development, anytime you don’t need to interact with a notebook, especially for a scheduled job, it is less expensive to use a job cluster. Using an all-purpose cluster can be twice as expensive as a job cluster.

Please note: the compute cost you pay the cloud provider for the same cluster type and size is identical between an all-purpose cluster and a job cluster; the only difference is the DBU cost.

The total cost of a cluster = total cost of VM compute (Azure, AWS, or GCP) + DBU cost

The per-DBU cost varies between all-purpose and job clusters.

Here is a recent cost estimate from AWS comparing a jobs cluster and an all-purpose cluster: jobs compute is $0.15 per DBU vs. $0.55 per DBU for all-purpose.

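As a hedged arithmetic sketch using the rates above (the cluster size is an assumption):

dbus_per_hour = 10                          # assumed cluster consumption
jobs_rate, all_purpose_rate = 0.15, 0.55    # $/DBU from the AWS estimate above
print(dbus_per_hour * jobs_rate)            # 1.5  -> $1.50/hour in DBU charges as a jobs cluster
print(dbus_per_hour * all_purpose_rate)     # 5.5  -> $5.50/hour in DBU charges as an all-purpose cluster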

How do I check the DBU cost for my cluster?

When you click on an existing cluster or look at the cluster details, you will see this in the top right corner.



Question #63

Which of the following locations hosts the driver and worker nodes of a Databricks-managed cluster?

  • A . Data plane
  • B . Control plane
  • C . Databricks Filesystem
  • D . JDBC data source
  • E . Databricks web application

Correct Answer: A

Explanation:

The answer is Data plane, which is where compute (all-purpose clusters, job clusters, DLT) is hosted; this is generally the customer’s cloud account. There is one exception: SQL warehouses. Currently there are 3 types of SQL warehouse compute available (classic, pro, serverless); in classic and pro, compute is located in the customer’s cloud account, but serverless compute is located in the Databricks cloud account.



Question #64

Which of the following data workloads will utilize a Bronze table as its destination?

  • A . A job that aggregates cleaned data to create standard summary statistics
  • B . A job that queries aggregated data to publish key insights into a dashboard
  • C . A job that ingests raw data from a streaming source into the Lakehouse
  • D . A job that develops a feature set for a machine learning application
  • E . A job that enriches data by parsing its timestamps into a human-readable format

Correct Answer: C

Explanation:

The answer is, A job that ingests raw data from a streaming source into the Lakehouse.

The data ingested from a raw streaming source like Kafka is first stored in the Bronze layer as its first destination, before it is further optimized and stored in Silver.

Medallion Architecture – Databricks

Question #69

What could be the expected output of the query SELECT COUNT (DISTINCT *) FROM user on this table?

  • A . 3
  • B . 2 (Correct)
  • C . 1
  • D . 0
  • E . NULL

Correct Answer: B

Explanation:

The answer is 2.

COUNT(DISTINCT *) removes rows where any column has a NULL value.
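
A hedged illustration with made-up rows (this is not the question’s table, which appears only as an image in the original):

spark.createDataFrame(
    [(1, "a"), (1, "a"), (2, None), (3, "c")],
    "id INT, name STRING"
).createOrReplaceTempView("user_demo")

spark.sql("SELECT COUNT(DISTINCT *) FROM user_demo").show()
# expected 2: the duplicate (1, 'a') counts once and the row containing a NULL is ignored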

Question #70

A DELTA LIVE TABLES pipeline can be scheduled to run in two different modes. What are these two modes?

  • A . Triggered, Incremental
  • B . Once, Continuous
  • C . Triggered, Continuous
  • D . Once, Incremental
  • E . Continuous, Incremental

Correct Answer: C

Explanation:

The answer is Triggered, Continuous

https://docs.microsoft.com/en-us/azure/databricks/data-engineering/delta-live-tables/delta-live-tables-concepts#–continuous-and-triggered-pipelines

• Triggered pipelines update each table with whatever data is currently available and then stop the cluster running the pipeline. Delta Live Tables automatically analyzes the dependencies between your tables and starts by computing those that read from external sources. Tables within the pipeline are updated after their dependent data sources have been updated.

• Continuous pipelines update tables continuously as input data changes. Once an update is started, it continues to run until manually stopped. Continuous pipelines require an always-running cluster but ensure that downstream consumers have the most up-to-date data.

Question #71

Which of the following developer operations in a CI/CD flow can be implemented in Databricks Repos?

  • A . Merge when code is committed
  • B . Pull request and review process
  • C . Trigger Databricks Repos API to pull the latest version of code into production folder
  • D . Resolve merge conflicts
  • E . Delete a branch

Correct Answer: C

Explanation:

See the below diagram to understand the roles Databricks Repos and the Git provider play when building a CI/CD workflow.

All the steps highlighted in yellow can be done in Databricks Repos; all the steps highlighted in gray are done in a Git provider like GitHub or Azure DevOps.
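
As a hedged sketch of this operation, a CD pipeline can call the Repos API to pull the latest version of code into a production folder (the workspace URL, token, and repo id below are placeholders, not real values):

import requests

resp = requests.patch(
    "https://<workspace-url>/api/2.0/repos/<repo_id>",
    headers={"Authorization": "Bearer <personal-access-token>"},
    json={"branch": "main"},   # update the production repo to the latest commit of main
)
resp.raise_for_status()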



Question #72

Identify which of the below statements can query a Delta table using the PySpark DataFrame API.

  • A . Spark.read.mode("delta").table("table_name")
  • B . Spark.read.table.delta("table_name")
  • C . Spark.read.table("table_name")
  • D . Spark.read.format("delta").LoadTableAs("table_name")
  • E . Spark.read.format("delta").TableAs("table_name")


Correct Answer: C
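
A hedged sketch of the answer alongside the path-based alternative (the table name and path are assumptions):

df = spark.read.table("table_name")                         # query a Delta table by name (the answer)
df2 = spark.read.format("delta").load("/path/to/table")     # or load it by storage path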
Question #73

How can the VACUUM and OPTIMIZE commands be used to manage a Delta Lake?

  • A . VACUUM command can be used to compact small parquet files, and the OPTIMIZE command can be used to delete parquet files that are marked for deletion/unused.
  • B . VACUUM command can be used to delete empty/blank parquet files in a delta table.
    OPTIMIZE command can be used to update stale statistics on a delta table.
  • C . VACUUM command can be used to compress the parquet files to reduce the size of the table, OPTIMIZE command can be used to cache frequently used delta tables for better performance.
  • D . VACUUM command can be used to delete empty/blank parquet files in a delta table, OPTIMIZE command can be used to cache frequently used delta tables for better performance.
  • E . OPTIMIZE command can be used to compact small parquet files, and the VACUUM command can be used to delete parquet files that are marked for deletion/unused. (Correct)

Correct Answer: E

Explanation:

VACUUM:

You can remove files no longer referenced by a Delta table that are older than the retention threshold by running the vacuum command on the table. vacuum is not triggered automatically. The default retention threshold for the files is 7 days. To change this behavior, see Configure data retention for time travel.

OPTIMIZE:

Using OPTIMIZE you can compact data files on Delta Lake; this can improve the speed of read queries on the table. Too many small files can significantly degrade the performance of the query.
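
A hedged sketch on an assumed table name:

spark.sql("OPTIMIZE sales")                    # compact small files to speed up reads
spark.sql("VACUUM sales")                      # delete unreferenced files older than the 7-day default
spark.sql("VACUUM sales RETAIN 240 HOURS")     # the retention window can also be set explicitly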

Question #74

Which of the following statements is correct on how Delta Lake implements a lakehouse?

  • A . Delta lake uses a proprietary format to write data, optimized for cloud storage
  • B . Using Apache Hadoop on cloud object storage
  • C . Delta lake always stores meta data in memory vs storage
  • D . Delta lake uses open source, open format, optimized cloud storage and scalable meta data
  • E . Delta lake stores data and meta data in computes memory

Correct Answer: D

Explanation:

Delta lake is

• Open source

• Builds up on standard data format

• Optimized for cloud object storage

• Built for scalable metadata handling

Delta lake is not

• Proprietary technology

• Storage format

• Storage medium

• Database service or data warehouse

Question #75

What are the different ways you can schedule a job in Databricks workspace?

  • A . Continuous, Incremental
  • B . On-Demand runs, File notification from Cloud object storage
  • C . Cron, On Demand runs
  • D . Cron, File notification from Cloud object storage
  • E . Once, Continuous

Correct Answer: C

Explanation:

The answer is, Cron, On-Demand runs.

A job can be run immediately (on demand) or scheduled using CRON syntax.

Question #76

Which of the following types of tasks cannot be set up through a job?

  • A . Notebook
  • B . DELTA LIVE PIPELINE
  • C . Spark Submit
  • D . Python
  • E . Databricks SQL Dashboard refresh


Correct Answer: E
Question #77

Which of the following describes how Databricks Repos can help facilitate CI/CD workflows on the Databricks Lakehouse Platform?

  • A . Databricks Repos can facilitate the pull request, review, and approval process before merging branches
  • B . Databricks Repos can merge changes from a secondary Git branch into a main Git branch
  • C . Databricks Repos can be used to design, develop, and trigger Git automation pipelines
  • D . Databricks Repos can store the single-source-of-truth Git repository
  • E . Databricks Repos can commit or push code changes to trigger a CI/CD process

Correct Answer: E

Explanation:

The answer is, Databricks Repos can commit or push code changes to trigger a CI/CD process. See the below diagram to understand the roles Databricks Repos and the Git provider play when building a CI/CD workflow.

All the steps highlighted in yellow can be done in Databricks Repos; all the steps highlighted in gray are done in a Git provider like GitHub or Azure DevOps.



Question #109

table("uncleanedSales")

Reveal Solution Hide Solution

Correct Answer: B

Explanation:

The answer is

Question #131

table(table_name))

  • A . format, checkpointlocation, schemalocation, overwrite
  • B . cloudfiles.format, checkpointlocation, cloudfiles.schemalocation, overwrite
  • C . cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema
  • D . cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, append
  • E . cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, overwrite

Reveal Solution Hide Solution

Correct Answer: C
Question #132

Which of the following scenarios is the best fit for AUTO LOADER?

  • A . Efficiently process new data incrementally from cloud object storage
  • B . Efficiently move data incrementally from one delta table to another delta table
  • C . Incrementally process new data from streaming data sources like Kafka into delta lake
  • D . Incrementally process new data from relational databases like MySQL
  • E . Efficiently copy data from one data lake location to another data lake location

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

The answer is: Efficiently process new data incrementally from cloud object storage. AUTO LOADER only supports ingesting files stored in cloud object storage. Auto Loader cannot process streaming data sources like Kafka or Delta streams; use Structured Streaming for those data sources.

[Diagram: Auto Loader and Cloud Storage Integration]

Auto Loader supports a couple of ways to ingest data incrementally
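For illustration only (not part of the original explanation), a minimal Auto Loader stream might look like the sketch below. The paths and target table name are hypothetical, and it assumes a Databricks runtime where the cloudFiles source is available.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()  # in a Databricks notebook this returns the existing session

# Hypothetical locations -- replace with real cloud storage paths.
source_path = "s3://bucket/raw/sales/"
schema_path = "s3://bucket/_schemas/sales/"
checkpoint_path = "s3://bucket/_checkpoints/sales/"

(spark.readStream
    .format("cloudFiles")                          # Auto Loader source
    .option("cloudFiles.format", "json")           # format of the incoming files
    .option("cloudFiles.schemaLocation", schema_path)
    .load(source_path)
    .writeStream
    .option("checkpointLocation", checkpoint_path)
    .trigger(availableNow=True)                    # process all new files, then stop (recent runtimes)
    .toTable("bronze_sales"))
```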


Question #135

The team has decided to take advantage of table properties to identify a business owner for each table. Which of the following table DDL syntaxes allows you to populate a table property identifying the business owner of a table?

  • A . CREATE TABLE inventory (id INT, units FLOAT)
    SET TBLPROPERTIES business_owner = ‘supply chain’
  • B . CREATE TABLE inventory (id INT, units FLOAT)
    TBLPROPERTIES (business_owner = ‘supply chain’)
  • C . CREATE TABLE inventory (id INT, units FLOAT)
    SET (business_owner = ‘supply chain’)
  • D . CREATE TABLE inventory (id INT, units FLOAT)
    SET PROPERTY (business_owner = ‘supply chain’)
  • E . CREATE TABLE inventory (id INT, units FLOAT)
    SET TAG (business_owner = ‘supply chain’)

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

CREATE TABLE inventory (id INT, units FLOAT) TBLPROPERTIES (business_owner = ‘supply chain’)

Table properties and table options (Databricks SQL) | Databricks on AWS

The ALTER TABLE command can be used to update the TBLPROPERTIES:

ALTER TABLE inventory SET TBLPROPERTIES (business_owner = ‘operations’)
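A short, illustrative sketch of the full round trip (table and property names follow the question and explanation above; it assumes a Databricks notebook session):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Create the table with a business_owner property at creation time.
spark.sql("""
    CREATE TABLE IF NOT EXISTS inventory (id INT, units FLOAT)
    TBLPROPERTIES (business_owner = 'supply chain')
""")

# Update the property later with ALTER TABLE.
spark.sql("ALTER TABLE inventory SET TBLPROPERTIES (business_owner = 'operations')")

# Verify the property is set.
spark.sql("SHOW TBLPROPERTIES inventory").show(truncate=False)
```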

Question #136

While investigating a data issue, you wanted to review yesterday’s version of the table using the command below. While querying the previous version of the table using time travel, you realized that you are no longer able to view the historical data in the table, even though you could see that the table was updated yesterday based on the table history (DESCRIBE HISTORY table_name) command. What could be the reason why you cannot access this data?

SELECT * FROM table_name TIMESTAMP AS OF date_sub(current_date(), 1)

  • A . You currently do not have access to view historical data
  • B . By default, historical data is cleaned every 180 days in DELTA
  • C . A command VACUUM table_name RETAIN 0 was run on the table
  • D . Time travel is disabled
  • E . Time travel must be enabled before you query previous data

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is: VACUUM table_name RETAIN 0 was run on the table.

The VACUUM command recursively vacuums directories associated with the Delta table and removes data files that are no longer in the latest state of the transaction log for the table and are older than a retention threshold. The default retention threshold is 7 days.

When VACUUM table_name RETAIN 0 is run, all of the historical versions of the data files are removed, so time travel can only return the current state of the table.
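For illustration only (not from the original question), the sketch below shows how such a destructive vacuum is typically issued. Delta’s retention safety check normally blocks a 0-hour retention, so it has to be disabled explicitly; the table name is the placeholder from the question, and a Databricks/Delta session is assumed.

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Delta refuses retention windows shorter than 7 days unless this check is disabled.
spark.conf.set("spark.databricks.delta.retentionDurationCheck.enabled", "false")

# Remove every data file not referenced by the current table version.
# After this, time travel to earlier versions is no longer possible.
spark.sql("VACUUM table_name RETAIN 0 HOURS")

# Time-travel queries like the one below will now fail for the removed versions.
# spark.sql("SELECT * FROM table_name TIMESTAMP AS OF date_sub(current_date(), 1)")
```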

Question #137

Which of the following table constraints can be enforced on Delta Lake tables?

  • A . Primary key, foreign key, Not Null, Check Constraints
  • B . Primary key, Not Null, Check Constraints
  • C . Default, Not Null, Check Constraints
  • D . Not Null, Check Constraints
  • E . Unique, Not Null, Check Constraints

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is Not Null, Check Constraints.

https://docs.microsoft.com/en-us/azure/databricks/delta/delta-constraints

CREATE TABLE events(
  id LONG,
  date STRING,
  location STRING,
  description STRING
) USING DELTA;

ALTER TABLE events CHANGE COLUMN id SET NOT NULL;

ALTER TABLE events ADD CONSTRAINT dateWithinRange CHECK (date > ‘1900-01-01’);

Note: As of DBR 11.1, Databricks added support for Primary Key and Foreign Key constraints when Unity Catalog is enabled, but these are informational only and are not actually enforced. Why define them if they are not enforced? These informational constraints are very helpful if you have a BI or data modeling tool that can benefit from knowing the relationships between tables, making it easier to create reports and dashboards or to understand the data model. Primary and Foreign Key:


[Screenshot: primary and foreign key informational constraints]
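For illustration only, a sketch of how the two enforced constraint types behave on write (it assumes a Databricks/Delta session and reuses the events table from the explanation above; the exception type is caught generically):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE TABLE IF NOT EXISTS events (id LONG, date STRING, location STRING, description STRING)
    USING DELTA
""")
spark.sql("ALTER TABLE events CHANGE COLUMN id SET NOT NULL")
spark.sql("ALTER TABLE events ADD CONSTRAINT dateWithinRange CHECK (date > '1900-01-01')")

# A row that violates the CHECK constraint is rejected at write time.
try:
    spark.sql("INSERT INTO events VALUES (1, '1800-01-01', 'NYC', 'too old')")
except Exception as e:
    print("Insert rejected by constraint:", e)
```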


Question #138

Which of the following commands results in the successful creation of a view on top of the delta stream (stream on delta table)?

  • A . Spark.read.format("delta").table("sales").createOrReplaceTempView("streaming_vw")
  • B . Spark.readStream.format("delta").table("sales").createOrReplaceTempView("streaming_vw ")
  • C . Spark.read.format("delta").table("sales").mode("stream").createOrReplaceTempView("strea ming_vw")
  • D . Spark.read.format("delta").table("sales").trigger("stream").createOrReplaceTempView("stre aming_vw")
  • E . Spark.read.format("delta").stream("sales").createOrReplaceTempView("streaming_vw")
  • F . You can not create a view on streaming data source.

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is

spark.readStream.format("delta").table("sales").createOrReplaceTempView("streaming_vw")

When you load a Delta table as a stream source and use it in a streaming query, the query processes all of the data present in the table as well as any new data that arrives after the stream is started.

You can load both paths and tables as a stream; you also have the ability to ignore deletes and changes (updates, merges, overwrites) on the Delta table. Here is more information:

https://docs.databricks.com/delta/delta-streaming.html#delta-table-as-a-source
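A minimal illustrative sketch (it assumes a Databricks notebook where a Delta table named sales already exists):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Load the Delta table as a streaming source and expose it as a temp view.
(spark.readStream
    .format("delta")
    .table("sales")
    .createOrReplaceTempView("streaming_vw"))

# Queries against the view are themselves streaming DataFrames; writing the result
# out with writeStream (or calling display() in a notebook) is what starts the stream.
summary = spark.sql("SELECT count(*) AS order_count FROM streaming_vw")
```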

Question #139

What is the purpose of a gold layer in Multi-hop architecture?

  • A . Optimizes ETL throughput and analytic query performance
  • B . Eliminate duplicate records
  • C . Preserves grain of original data, without any aggregations
  • D . Data quality checks and schema enforcement
  • E . Powers ML applications, reporting, dashboards and adhoc reports.

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is: Powers ML applications, reporting, dashboards and ad-hoc reports.

Review the below link for more info:

Medallion Architecture – Databricks

Gold Layer:
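As an illustration only (not taken from the linked article), a gold table is typically a business-level aggregate built from curated silver data and consumed by dashboards, reports, and ML. A minimal sketch, assuming a hypothetical silver_sales table exists:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Aggregate curated silver data into a reporting-friendly gold table.
spark.sql("""
    CREATE OR REPLACE TABLE gold_daily_sales AS
    SELECT order_date,
           region,
           SUM(amount)     AS total_sales,
           COUNT(order_id) AS order_count
    FROM silver_sales
    GROUP BY order_date, region
""")
```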

Question #144

Which of the following programming languages can be used to build a Databricks SQL dashboard?

  • A . Python
  • B . Scala
  • C . SQL
  • D . R
  • E . All of the above

Reveal Solution Hide Solution

Correct Answer: C
Question #145

You are working on a table called orders which contains data for 2021, and you have a second table called orders_archive which contains data for 2020. You need to combine the data from the two tables, and there could be duplicate rows between them. You are looking to combine the results from both tables and eliminate the duplicate rows. Which of the following SQL statements helps you accomplish this?

  • A . SELECT * FROM orders UNION SELECT * FROM orders_archive (Correct)
  • B . SELECT * FROM orders INTERSECT SELECT * FROM orders_archive
  • C . SELECT * FROM orders UNION ALL SELECT * FROM orders_archive
  • D . SELECT * FROM orders_archive MINUS SELECT * FROM orders
  • E . SELECT distinct * FROM orders JOIN orders_archive on orders.id = orders_archive.id

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

The answer is SELECT * FROM orders UNION SELECT * FROM orders_archive.

UNION and UNION ALL are set operators. UNION combines the output from both queries and also eliminates the duplicates.

UNION ALL combines the output from both queries without removing duplicates.
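A short illustrative sketch of the difference, using two hypothetical mini tables (not the actual orders tables from the question):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("CREATE OR REPLACE TEMP VIEW orders AS SELECT * FROM VALUES (1, 'A'), (2, 'B') AS t(id, item)")
spark.sql("CREATE OR REPLACE TEMP VIEW orders_archive AS SELECT * FROM VALUES (2, 'B'), (3, 'C') AS t(id, item)")

# UNION removes the duplicate row (2, 'B') -> 3 rows.
spark.sql("SELECT * FROM orders UNION SELECT * FROM orders_archive").show()

# UNION ALL keeps the duplicate -> 4 rows.
spark.sql("SELECT * FROM orders UNION ALL SELECT * FROM orders_archive").show()
```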

Question #146

Once a cluster is deleted, which of the following additional actions need to be performed by the administrator?

  • A . Remove virtual machines but storage and networking are automatically dropped
  • B . Drop storage disks but Virtual machines and networking are automatically dropped
  • C . Remove networking but Virtual machines and storage disks are automatically dropped
  • D . Remove logs
  • E . No action needs to be performed. All resources are automatically removed.

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is E: no action needs to be performed. When a cluster is deleted, the underlying virtual machines, storage, and networking resources are released automatically.

What is Delta?

Delta lake is

• Open source

• Builds up on standard data format

• Optimized for cloud object storage

• Built for scalable metadata handling

Delta lake is not

• Proprietary technology

• Storage format

• Storage medium

• Database service or data warehouse

Question #147

Which of the following statements is correct about a Lakehouse?

  • A . Lakehouse only supports Machine learning workloads and Data warehouses support BI workloads
  • B . Lakehouse only supports end-to-end streaming workloads and Data warehouses support Batch workloads
  • C . Lakehouse does not support ACID
  • D . In Lakehouse Storage and compute are coupled
  • E . Lakehouse supports schema enforcement and evolution

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is: Lakehouse supports schema enforcement and evolution.

A Lakehouse built on Delta Lake can enforce a schema on write, in contrast to traditional big data systems that can only enforce a schema on read; it also supports evolving the schema over time, with the ability to control that evolution.

For example, the DataFrame writer API supports three modes of enforcement and evolution:

Default: Only enforcement, no changes are allowed and any schema drift/evolution will result in failure.

Merge: Flexible, supports enforcement and evolution

• New columns are added

• Evolves nested columns

• Supports evolving data types, like Byte to Short to Integer to Bigint

How to enable:

DF.write.format("delta").option("mergeSchema", "true").saveAsTable("table_name")

or

spark.databricks.delta.schema.autoMerge = True ## Spark session

Overwrite: No enforcement

• Dropping columns

• Change string to integer

• Rename columns

How to enable:

DF.write.format("delta").option("overwriteSchema", "True").saveAsTable("table_name")

What Is a Lakehouse? – The Databricks Blog
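For illustration only (hypothetical table name; a Databricks/Delta session is assumed), appending a DataFrame that carries an extra column succeeds only when mergeSchema is enabled:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Initial table with two columns.
spark.createDataFrame([(1, "widget")], "id INT, name STRING") \
     .write.format("delta").mode("overwrite").saveAsTable("demo_products")

# A new batch arrives with an additional column; a plain append would fail
# schema enforcement, but mergeSchema evolves the table to add the column.
new_batch = spark.createDataFrame([(2, "gadget", 9.99)], "id INT, name STRING, price DOUBLE")
(new_batch.write.format("delta")
    .mode("append")
    .option("mergeSchema", "true")
    .saveAsTable("demo_products"))
```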




Question #148

You are working on IoT data where each device has 5 readings in an array collected in Celsius. You were asked to convert each individual reading from Celsius to Fahrenheit. Fill in the blank with an appropriate function that can be used in this scenario.

Schema: deviceId INT, deviceTemp ARRAY<double>

SELECT deviceId, __(deviceTempC,i-> (i * 9/5) + 32) as deviceTempF

FROM sensors

  • A . APPLY
  • B . MULTIPLY
  • C . ARRAYEXPR
  • D . TRANSFORM
  • E . FORALL

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

TRANSFORM transforms the elements in an array expr using the function func. Syntax: transform(expr, func)
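A small illustrative sketch using hypothetical sample data (a Databricks/Spark SQL session is assumed):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("""
    CREATE OR REPLACE TEMP VIEW sensors AS
    SELECT * FROM VALUES (1, array(0.0, 25.0, 100.0)) AS t(deviceId, deviceTemp)
""")

# TRANSFORM applies the lambda to every element of the array.
spark.sql("""
    SELECT deviceId,
           TRANSFORM(deviceTemp, i -> (i * 9/5) + 32) AS deviceTempF
    FROM sensors
""").show(truncate=False)
```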

Question #149

What is the type of table created when you issue the SQL DDL command CREATE TABLE sales (id int, units int)?

  • A . Query fails due to missing location
  • B . Query fails due to missing format
  • C . Managed Delta table
  • D . External Table
  • E . Managed Parquet table

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is Managed Delta table.

Anytime a table is created without the LOCATION keyword it is considered a managed table; by default, all managed tables are created as DELTA tables.

Syntax

CREATE TABLE table_name (column column_data_type…)
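An illustrative way to confirm the table type and format after creation (a Databricks session is assumed; sales is the table from the question):

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

spark.sql("CREATE TABLE IF NOT EXISTS sales (id INT, units INT)")

# On Databricks the extended description includes Provider = delta and Type = MANAGED,
# plus the default location under the metastore-managed path.
spark.sql("DESCRIBE TABLE EXTENDED sales").show(truncate=False)
```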

Question #150

You are currently working on a production job failure, with the job set up on a job cluster, due to a data issue. What type of cluster do you need to start to investigate and analyze the data?

  • A . A Job cluster can be used to analyze the problem
  • B . All-purpose cluster/ interactive cluster is the recommended way to run commands and view the data.
  • C . Existing job cluster can be used to investigate the issue
  • D . Databricks SQL Endpoint can be used to investigate the issue

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

Answer is All-purpose cluster/ interactive cluster is the recommended way to run commands and view the data.

A job cluster cannot provide a way for a user to interact with a notebook once the job is submitted, but an interactive cluster allows you to display data, view visualizations, and write or edit queries, which makes it a perfect fit to investigate and analyze the data.

Question #151

The data engineering team has provided 10 queries and asked the data analyst team to build a dashboard and refresh the data every day at 8 AM. Identify the best approach to set up the data refresh for this dashboard.

  • A . Each query requires a separate task; set up 10 tasks under a single job to run at 8 AM to refresh the dashboard
  • B . The entire dashboard with 10 queries can be refreshed at once, single schedule needs to be set up to refresh at 8 AM.
  • C . Set up a JOB with linear dependency to load all 10 queries into a table so the dashboard can be refreshed at once.
  • D . A dashboard can only refresh one query at a time, 10 schedules to set up the refresh.
  • E . Use Incremental refresh to run at 8 AM every day.

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is,

The entire dashboard with 10 queries can be refreshed at once; a single schedule needs to be set up to refresh at 8 AM.

Automatically refresh a dashboard

A dashboard’s owner and users with the Can Edit permission can configure a dashboard to automatically refresh on a schedule. To automatically refresh a dashboard:


Question #152

What is the best way to query external csv files located on DBFS Storage to inspect the data using SQL?

  • A . SELECT * FROM ‘dbfs:/location/csv_files/’ FORMAT = ‘CSV’
  • B . SELECT CSV. * from ‘dbfs:/location/csv_files/’
  • C . SELECT * FROM CSV. ‘dbfs:/location/csv_files/’
  • D . You cannot query external files directly; use COPY INTO to load the data into a table first
  • E . SELECT * FROM ‘dbfs:/location/csv_files/’ USING CSV

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is SELECT * FROM CSV. ‘dbfs:/location/csv_files/’

You can query external files stored on the storage directly using the syntax SELECT * FROM format.`/location`, where format is one of CSV, JSON, PARQUET, or TEXT.
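A minimal illustrative sketch (the DBFS path is the placeholder from the question); the same pattern works from a notebook SQL cell or through spark.sql:

```python
from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Query the raw CSV files in place; the same pattern works for json, parquet, and text.
df = spark.sql("SELECT * FROM csv.`dbfs:/location/csv_files/`")
df.show(5)
```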

Question #153

You have written a notebook to generate a summary data set for reporting. The notebook was scheduled using a job cluster, but you realized it takes 8 minutes to start the cluster. What feature can be used to start the cluster in a timely fashion so your job can run immediately?

  • A . Set up an additional job to run ahead of the actual job so the cluster is already running when the second job starts
  • B . Use the Databricks cluster pools feature to reduce the startup time
  • C . Use Databricks Premium edition instead of Databricks standard edition
  • D . Pin the cluster in the cluster UI page so it is always available to the jobs
  • E . Disable auto termination so the cluster is always running

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

Cluster pools allow us to reserve VMs ahead of time; when a new job cluster is created, VMs are grabbed from the pool. Note: while the VMs sit idle in the pool waiting to be used by a cluster, the only cost incurred is the cloud provider (e.g., Azure) infrastructure cost. Databricks runtime cost is only billed once a VM is allocated to a cluster.

Here is a demo of how to set up a pool and follow some best practices:

[Screenshot: creating and configuring a cluster pool in the Databricks UI]
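For illustration only (not part of the original explanation), once a pool exists its ID can be referenced in a job's new_cluster specification so job clusters draw pre-warmed VMs from the pool. A sketch of that fragment of a Jobs API payload, with hypothetical IDs and runtime version:

```python
# Fragment of a Jobs API payload: the job cluster pulls its driver and workers
# from a pre-created instance pool instead of provisioning fresh VMs.
new_cluster = {
    "spark_version": "13.3.x-scala2.12",          # hypothetical runtime version
    "num_workers": 2,
    "instance_pool_id": "pool-1234567890abcdef",         # hypothetical pool ID
    "driver_instance_pool_id": "pool-1234567890abcdef",  # hypothetical pool ID
}
```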


Question #154

Which of the following developer operations in a CI/CD flow can be implemented in Databricks Repos?

  • A . Delete branch
  • B . Trigger Databricks CICD pipeline
  • C . Commit and push code
  • D . Create a pull request
  • E . Approve the pull request

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is Commit and push code.

See the below diagram to understand the role Databricks Repos and Git provider plays when building a CI/CD workflow.

All the steps highlighted in yellow can be done in Databricks Repos; all the steps highlighted in gray are done in a Git provider such as GitHub or Azure DevOps.

Exam focus: Please study the image below carefully to understand all of the steps in the CI/CD flow and which tasks are implemented in Databricks Repos vs. the Git provider; the exam may ask different types of questions based on this flow.

[Diagram: CI/CD workflow showing which steps happen in Databricks Repos and which happen in the Git provider]


Question #155

You noticed that a colleague is manually copying the notebook with a _bkp suffix to store previous versions. Which of the following features would you recommend instead?

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

The answer is: Databricks notebooks support automatic change tracking and versioning. When you are editing a notebook, check the version history on the right side to view all the changes; every change you make is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #155

You noticed that colleague is manually copying the notebook with _bkp to store the previous versions, which of the following feature would you recommend instead.

  • A . Databricks notebooks support change tracking and versioning
  • B . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • C . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • D . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

Answer is Databricks notebooks support automatic change tracking and versioning. When you are editing the notebook on the right side check version history to view all the changes, every change you are making is captured and saved.



Question #171

pass

Reveal Solution Hide Solution

Correct Answer: A

Explanation:

The answer is,


Question #176

You are asked to set up Auto Loader to process incoming data. The data arrives in JSON format and gets dropped into cloud object storage, and you are required to process the data as soon as it arrives in cloud storage. Which of the following statements is correct?

  • A . AUTO LOADER is native to DELTA lake it cannot support external cloud object storage
  • B . AUTO LOADER has to be triggered from an external process when the file arrives in the cloud storage
  • C . AUTO LOADER needs to be converted to a Structured stream process
  • D . AUTO LOADER can only process continuous data when stored in DELTA lake
  • E . AUTO LOADER can support file notification method so it can process data as it arrives

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

Auto Loader supports two modes for ingesting new files from cloud object storage:

Directory listing: Auto Loader identifies new files by listing the input directory, using a directory polling approach.

File notification: Auto Loader can automatically set up a notification service and queue service that subscribe to file events from the input directory.


File notification is more efficient and can be used to process the data in real-time as data arrives in cloud object storage.

Choosing between file notification and directory listing modes | Databricks on AWS
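
As a hedged illustration, below is a minimal PySpark sketch of an Auto Loader stream running in file notification mode; the input, schema, and checkpoint paths and the target table name are hypothetical placeholders, and file notification mode typically needs extra cloud permissions so Auto Loader can create the queue and notification resources:

# Read JSON files as they arrive, using file notification instead of directory listing
stream_df = (spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.useNotifications", "true")             # file notification mode
    .option("cloudFiles.schemaLocation", "dbfs:/mnt/schemas/orders")
    .load("dbfs:/mnt/landing/orders"))

# Continuously append the ingested records to a Delta table
(stream_df.writeStream
    .option("checkpointLocation", "dbfs:/mnt/checkpoints/orders")
    .toTable("bronze_orders"))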


Question #177

Which of the following techniques does Structured Streaming use to ensure recovery from failures during stream processing?

  • A . Checkpointing and Watermarking
  • B . Write ahead logging and watermarking
  • C . Checkpointing and write-ahead logging
  • D . Delta time travel
  • E . The stream will failover to available nodes in the cluster
  • F . Checkpointing and Idempotent sinks

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is Checkpointing and write-ahead logging.

Structured Streaming uses checkpointing and write-ahead logs to record the offset range of data being processed during each trigger interval.
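
To make the role of the checkpoint concrete, here is a minimal sketch of a streaming write with hypothetical source and target table names; the checkpoint location is where the write-ahead log and offset/commit records live, so the query can resume from exactly where it failed:

# Restarting this query with the same checkpointLocation resumes from the last committed offsets
(spark.readStream
    .table("raw_events")                                        # hypothetical source table
    .writeStream
    .option("checkpointLocation", "dbfs:/mnt/checkpoints/events_clean")
    .outputMode("append")
    .toTable("events_clean"))                                   # hypothetical target table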

Question #178

Kevin is the owner of both the sales table and the regional_sales_vw view, which uses the sales table as its underlying data source. Kevin wants to grant the SELECT privilege on the view regional_sales_vw to a newly joined team member, Steven.

Which of the following is a true statement?

  • A . Kevin cannot grant access to Steven since he does not have the security admin privilege
  • B . Kevin, although he is the owner, does not have the ALL PRIVILEGES permission
  • C . Kevin can grant access to the view, because he is the owner of the view and the underlying table
  • D . Kevin cannot grant access to Steven since he does not have the workspace admin privilege
  • E . Steven will also require SELECT access on the underlying table

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is: Kevin can grant access to the view, because he is the owner of the view and the underlying table.

Ownership determines whether or not you can grant privileges on derived objects to other users. A user who creates a schema, table, view, or function becomes its owner; the owner is granted all privileges and can grant privileges to other users.
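
As a minimal sketch of the grant itself (Steven's account name below is a hypothetical placeholder), Kevin could run:

# As the owner of the view, Kevin can grant SELECT on it directly
spark.sql("GRANT SELECT ON VIEW regional_sales_vw TO `steven@example.com`")

# Because Kevin also owns the underlying sales table, Steven does not need
# a separate grant on it to query the view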

Question #179

Which of the following SQL statements can be used to update a transactions table to set the active flag on the table from ‘Y’ to ‘N’?

  • A . MODIFY transactions SET active_flag = ‘N’ WHERE active_flag = ‘Y’
  • B . MERGE transactions SET active_flag = ‘N’ WHERE active_flag = ‘Y’
  • C . UPDATE transactions SET active_flag = ‘N’ WHERE active_flag = ‘Y’
  • D . REPLACE transactions SET active_flag = ‘N’ WHERE active_flag = ‘Y’

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is

UPDATE transactions SET active_flag = ‘N’ WHERE active_flag = ‘Y’

Delta Lake supports UPDATE statements on Delta tables; all of the changes made as part of the update are ACID compliant.
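
For illustration, a minimal sketch of running the statement from a notebook and confirming the change was committed as a new version of the Delta table (table and column names as in the question):

# Run the ACID-compliant update on the Delta table
spark.sql("UPDATE transactions SET active_flag = 'N' WHERE active_flag = 'Y'")

# Each UPDATE is recorded as a new version in the Delta transaction log
spark.sql("DESCRIBE HISTORY transactions").select("version", "operation").show()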


Question #193

unitsSold int)

Reveal Solution Hide Solution

Correct Answer: D

Explanation:

The answer is


Question #198

If you create a database sample_db with the statement CREATE DATABASE sample_db, what will be the default location of the database in DBFS?

  • A . Default location, DBFS:/user/
  • B . Default location, /user/db/
  • C . Default Storage account
  • D . Statement fails “Unable to create database without location”
  • E . Default Location, dbfs:/user/hive/warehouse

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is dbfs:/user/hive/warehouse. This is the default location where Spark stores user databases; the default can be changed with the spark.sql.warehouse.dir parameter, set in the cluster Spark config or the session config. You can also provide a custom location for a database using the LOCATION keyword.
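
A minimal sketch for verifying the default location and overriding it (the database names and custom path are hypothetical):

# Created without a LOCATION clause, the database lands under spark.sql.warehouse.dir,
# i.e. dbfs:/user/hive/warehouse/sample_db.db by default
spark.sql("CREATE DATABASE IF NOT EXISTS sample_db")
spark.sql("DESCRIBE DATABASE EXTENDED sample_db").show(truncate=False)

# A custom location can be supplied explicitly instead
spark.sql("CREATE DATABASE IF NOT EXISTS sample_db_custom LOCATION 'dbfs:/custom/path/sample_db_custom'")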



Question #204

AS SELECT * FROM table_name

Reveal Solution Hide Solution

Correct Answer: C

Explanation:


Question #207

The default retention threshold for VACUUM is 7 days. The internal audit team has asked that certain tables retain deleted files for at least 365 days as part of a compliance requirement. Which of the settings below is needed to implement this?

  • A . ALTER TABLE table_name set TBLPROPERTIES (delta.deletedFileRetentionDuration= ‘interval 365 days’)
  • B . MODIFY TABLE table_name set TBLPROPERTY (delta.maxRetentionDays = ‘interval 365 days’)
  • C . ALTER TABLE table_name set EXTENDED TBLPROPERTIES (delta.deletedFileRetentionDuration= ‘interval 365 days’)
  • D . ALTER TABLE table_name set EXTENDED TBLPROPERTIES (delta.vaccum.duration= ‘interval 365 days’)

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:
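
A minimal sketch of applying the setting from answer A, assuming a Delta table named table_name as in the question:

# Keep deleted data files for at least 365 days so VACUUM cannot remove them earlier
spark.sql("""
    ALTER TABLE table_name
    SET TBLPROPERTIES (delta.deletedFileRetentionDuration = 'interval 365 days')
""")

# Verify the property was applied
spark.sql("SHOW TBLPROPERTIES table_name").show(truncate=False)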


Question #209

You are currently working to ingest millions of files that get uploaded to cloud object storage for consumption, and you are asked to build a process to ingest this data. The schema of the files is expected to change over time, and the ingestion process should be able to handle these changes automatically.

Which of the following method can be used to ingest the data incrementally?

  • A . AUTO APPEND
  • B . AUTO LOADER
  • C . COPY INTO
  • D . Structured Streaming
  • E . Checkpoint

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is AUTO LOADER,

Use Auto Loader instead of the COPY INTO SQL command when:

• You want to load data from a file location that contains files in the order of millions or higher. Auto Loader can discover files more efficiently than the COPY INTO SQL command and can split file processing into multiple batches.

• COPY INTO supports only directory listing, but Auto Loader also supports the file notification method, where Auto Loader continues to ingest files as they arrive in cloud object storage, leveraging cloud provider services (queues and triggers) and Spark’s Structured Streaming.

• Your data schema evolves frequently. Auto Loader provides better support for schema inference and evolution; a minimal example is sketched below. See Configuring schema inference and evolution in Auto Loader.
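
A minimal sketch of Auto Loader with schema inference and evolution enabled; the paths and target table name are hypothetical placeholders:

# Auto Loader infers the schema, stores it at schemaLocation, and adds new columns
# as they appear in the incoming JSON files (addNewColumns is the default mode)
(spark.readStream
    .format("cloudFiles")
    .option("cloudFiles.format", "json")
    .option("cloudFiles.schemaLocation", "dbfs:/mnt/schemas/uploads")
    .option("cloudFiles.schemaEvolutionMode", "addNewColumns")
    .load("dbfs:/mnt/landing/uploads")
    .writeStream
    .option("checkpointLocation", "dbfs:/mnt/checkpoints/uploads")
    .option("mergeSchema", "true")   # let the target Delta table pick up new columns
    .toTable("bronze_uploads"))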

Question #210

You are noticing that the job cluster is taking 6 to 8 minutes to start, which is delaying your job from finishing on time. What steps can you take to reduce the cluster startup time?

  • A . Set up a second job ahead of the first job to start the cluster, so the cluster is ready with resources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VMs ahead of time; when a new job cluster is created, VMs are grabbed from the pool. Note: while the VMs sit idle in the pool waiting to be used by a cluster, only the Azure infrastructure cost is incurred. Databricks runtime cost is only billed once a VM is allocated to a cluster.

Here is a demo of how to set up cluster pools and follow some best practices: https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #210

You are noticing job cluster is taking 6 to 8 mins to start which is delaying your job to finish on time, what steps you can take to reduce the amount of time cluster startup time

  • A . Setup a second job ahead of first job to start the cluster, so the cluster is ready with re-sources when the job starts
  • B . Use All purpose cluster instead to reduce cluster start up time
  • C . Reduce the size of the cluster, smaller the cluster size shorter it takes to start the cluster
  • D . Use cluster pools to reduce the startup time of the jobs
  • E . Use SQL endpoints to reduce the startup time

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, Use cluster pools to reduce the startup time of the jobs.

Cluster pools allow us to reserve VM’s ahead of time, when a new job cluster is created VM are grabbed from the pool. Note: when the VM’s are waiting to be used by the cluster only cost incurred is Azure. Databricks run time cost is only billed once VM is allocated to a cluster.

Here is a demo of how to setup and follow some best practices, https://www.youtube.com/watch?v=FVtITxOabxg&ab_channel=DatabricksAcademy

Question #232

as total_sales from sales

Reveal Solution Hide Solution

Correct Answer: C

Explanation:

The answer is

Question #250

You are currently working with the application team to set up a SQL endpoint. Once the team started consuming the SQL endpoint, you noticed that during peak hours, as the number of concurrent users increases, query performance degrades and the same queries take longer to run. Which of the following steps can be taken to resolve the issue?

  • A . They can turn on the Serverless feature for the SQL endpoint.
  • B . They can increase the maximum bound of the SQL endpoint’s scaling range.
  • C . They can increase the cluster size (2X-Small to 4X-Large) of the SQL endpoint.
  • D . They can turn on the Auto Stop feature for the SQL endpoint.
  • E . They can turn on the Serverless feature for the SQL endpoint and change the Spot Instance Policy from “Cost optimized” to “Reliability Optimized.”

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is, They can increase the maximum bound of the SQL endpoint’s scaling range. When you increase the max scaling range, more clusters are added, so queries can start running on the available clusters instead of waiting in the queue; see below for more explanation.

The question is testing your ability to scale a SQL Endpoint (SQL Warehouse), and you have to look for cue words to understand whether the queries are running sequentially or concurrently. If the queries are running sequentially, then scale up (increase the size of the cluster from 2X-Small toward 4X-Large); if the queries are running concurrently or the number of users is growing, then scale out (add more clusters).

SQL Endpoint (SQL Warehouse) overview: a SQL warehouse is scaled up by increasing its cluster size, and scaled out by raising the maximum number of clusters in its scaling range.
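As a rough sketch of the scale-out change (this uses the SQL Warehouses REST API from Python; the workspace URL, token, and warehouse ID are assumed placeholders), raising the upper bound of the scaling range lets the warehouse add clusters when concurrency grows:

    import requests

    HOST = "https://<your-workspace>.azuredatabricks.net"    # assumed placeholder
    TOKEN = "<personal-access-token>"                         # assumed placeholder
    WAREHOUSE_ID = "<warehouse-id>"                           # assumed placeholder

    # Scale out: keep the same cluster size but allow more clusters, so concurrent
    # queries run on additional clusters instead of queueing behind each other.
    requests.post(
        f"{HOST}/api/2.0/sql/warehouses/{WAREHOUSE_ID}/edit",
        headers={"Authorization": f"Bearer {TOKEN}"},
        json={
            "cluster_size": "Small",      # unchanged here; scaling up would change this
            "min_num_clusters": 1,
            "max_num_clusters": 5,        # raised maximum bound of the scaling range
        },
    )

Scaling up, by contrast, means increasing cluster_size (for example from 2X-Small toward 4X-Large) and helps individual long-running queries rather than concurrency.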

Question #258

Which of the following locations in the Databricks product architecture hosts the notebooks and jobs?

  • A . Data plane
  • B . Control plane
  • C . Databricks Filesystem
  • D . JDBC data source
  • E . Databricks web application

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is Control Plane.

Databricks operates most of its services out of a control plane and a data plane. Please note that serverless features like SQL endpoints and DLT compute use shared compute in the control plane.

Control Plane: Stored in Databricks Cloud Account

•The control plane includes the backend services that Databricks manages in its own Azure account. Notebook commands and many other workspace configurations are stored in the control plane and encrypted at rest.

Data Plane: Stored in Customer Cloud Account

•The data plane is managed by your Azure account and is where your data resides. This is also where data is processed. You can use Azure Databricks connectors so that your clusters can connect to external data sources outside of your Azure account to ingest data or for storage.


Question #259

A data engineer is using a Databricks SQL query to monitor the performance of an ELT job. The ELT job is triggered by a specific number of input records being ready to process. The Databricks SQL query returns the number of minutes since the job’s most recent runtime.

Which of the following approaches can enable the data engineering team to be notified if the ELT job has not been run in an hour?

  • A . They can set up an Alert for the accompanying dashboard to notify them if the returned value is greater than 60.
  • B . They can set up an Alert for the query to notify when the ELT job fails.
  • C . They can set up an Alert for the accompanying dashboard to notify when it has not refreshed in 60 minutes.
  • D . They can set up an Alert for the query to notify them if the returned value is greater than 60.
  • E . This type of alert is not possible in Databricks

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is, They can set up an Alert for the query to notify them if the returned value is greater than 60.

The important thing to note here is that an alert can only be set up on a query, not on a dashboard; the query returns a value, and the alert is triggered when that value meets the configured condition.
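As a hypothetical sketch of the kind of query such an alert could be built on (the job_run_audit table and run_end_time column are assumed names, not part of the question), the query returns a single numeric value that the alert compares against the threshold of 60:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()   # already available as `spark` in a Databricks notebook

    # Hypothetical monitoring query: minutes elapsed since the most recent job run.
    monitoring_query = """
    SELECT (unix_timestamp(current_timestamp()) - unix_timestamp(MAX(run_end_time))) / 60
           AS minutes_since_last_run
    FROM job_run_audit
    """

    # Sanity-check the value the alert would evaluate.
    minutes = spark.sql(monitoring_query).collect()[0]["minutes_since_last_run"]
    print(f"Minutes since the last ELT run: {minutes}")

In Databricks SQL, the alert would then be configured on this saved query with a trigger condition of minutes_since_last_run > 60.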

Question #277

ELSE (temp - 33) * 5/9 END

Reveal Solution Hide Solution

Correct Answer: D

Explanation:

The answer is
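For context on the expression fragment shown above, here is a self-contained, purely hypothetical sketch of a complete CASE expression of the same shape; the readings table, the temp column, and the WHEN branch are illustrative assumptions only:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # Hypothetical readings table.
    spark.createDataFrame([(1, 100.0), (2, 33.0)], "id INT, temp DOUBLE") \
         .createOrReplaceTempView("readings")

    # A full CASE expression whose ELSE branch matches the fragment above.
    spark.sql("""
        SELECT id,
               CASE
                 WHEN temp <= 33 THEN 0
                 ELSE (temp - 33) * 5 / 9
               END AS converted_temp
        FROM readings
    """).show()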

Question #283

What steps need to be taken to set up a DELTA LIVE PIPELINE as a job using the workspace UI?

  • A . DELTA LIVE TABLES do not support job cluster
  • B . Select Workflows UI and Delta live tables tab, under task type select Delta live tables pipeline and select the notebook
  • C . Select Workflows UI and Delta live tables tab, under task type select Delta live tables pipeline and select the pipeline JSON file
  • D . Use Pipeline creation UI, select a new pipeline and job cluster

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is,

Select Workflows UI and Delta live tables tab, under task type select Delta live tables pipeline and select the notebook.

Create a pipeline: in the workspace UI, open Workflows, select the Delta Live Tables tab, create a new pipeline, and choose the notebook that contains the Delta Live Tables code.
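A minimal sketch of what such a Delta Live Tables notebook might contain (the source path and table names are assumed for illustration); the pipeline you create in the Delta Live Tables tab then points at this notebook:

    import dlt
    from pyspark.sql import functions as F

    # `spark` is provided automatically inside a Delta Live Tables pipeline.

    # Bronze: ingest raw JSON files with Auto Loader (the source path is an assumed placeholder).
    @dlt.table(comment="Raw orders ingested as-is")
    def orders_bronze():
        return (
            spark.readStream.format("cloudFiles")
                 .option("cloudFiles.format", "json")
                 .load("/mnt/raw/orders")
        )

    # Silver: basic cleanup on top of the bronze table.
    @dlt.table(comment="Cleaned, deduplicated orders")
    def orders_silver():
        return (
            dlt.read_stream("orders_bronze")
               .where(F.col("order_id").isNotNull())
               .dropDuplicates(["order_id"])
        )

In the Workflows UI, a job task of type Delta Live Tables pipeline can then run this pipeline on a schedule alongside other tasks.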

Question #289

What is the purpose of the silver layer in a multi-hop architecture?

  • A . Replaces a traditional data lake
  • B . Efficient storage and querying of full, unprocessed history of data
  • C . Eliminates duplicate data, quarantines bad data
  • D . Refined views with aggregated data
  • E . Optimized query performance for business-critical data

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

Medallion Architecture - Databricks

Silver Layer: in the silver layer, data from the bronze layer is cleansed and conformed; duplicate records are eliminated and bad records are quarantined, while enough detail is retained for further aggregation in the gold layer.
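A small illustrative sketch of that bronze-to-silver step (the bronze_orders, silver_orders, and quarantine_orders table names and the validity rule are assumptions, not from the question):

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    bronze = spark.read.table("bronze_orders")          # assumed bronze table

    # Eliminate duplicate records.
    deduped = bronze.dropDuplicates(["order_id"])

    # Quarantine bad records instead of dropping them silently.
    is_valid = F.col("order_id").isNotNull() & (F.col("amount") >= 0)
    deduped.filter(is_valid).write.mode("append").saveAsTable("silver_orders")
    deduped.filter(~is_valid).write.mode("append").saveAsTable("quarantine_orders")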

Question #296

The data engineering team is required to share data with the data science team, and both teams use different workspaces in the same organization. Which of the following techniques can be used to simplify sharing data across workspaces?

*Please note the question is asking how data is shared within an organization across multiple workspaces.

  • A . Data Sharing
  • B . Unity Catalog
  • C . DELTA lake
  • D . Use a single storage location
  • E . DELTA LIVE Pipelines

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is Unity Catalog.


Unity Catalog works at the account level: it allows you to create a metastore and attach that metastore to many workspaces, so a single metastore can now be shared by both workspaces. Prior to Unity Catalog, the option was to use a single cloud object storage location manually mounted in the second Databricks workspace; Unity Catalog really simplifies that.


For more details on Unity Catalog product features, see https://databricks.com/product/unity-catalog
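As a brief sketch of what this looks like once both workspaces are attached to the same metastore (the catalog, schema, table, and group names below are assumed for illustration), data registered in one workspace can be granted to and queried from the other using the three-level namespace:

    from pyspark.sql import SparkSession

    spark = SparkSession.builder.getOrCreate()

    # In the data engineering workspace: register a governed table in the shared metastore.
    spark.sql("CREATE CATALOG IF NOT EXISTS shared_analytics")
    spark.sql("CREATE SCHEMA IF NOT EXISTS shared_analytics.sales")
    spark.sql("""
        CREATE TABLE IF NOT EXISTS shared_analytics.sales.orders (
            order_id BIGINT,
            amount   DOUBLE
        )
    """)

    # Grant the data science team's account-level group access to the data.
    spark.sql("GRANT USE CATALOG ON CATALOG shared_analytics TO `data-science-team`")
    spark.sql("GRANT USE SCHEMA ON SCHEMA shared_analytics.sales TO `data-science-team`")
    spark.sql("GRANT SELECT ON TABLE shared_analytics.sales.orders TO `data-science-team`")

    # In the data science workspace (different workspace, same metastore):
    df = spark.sql("SELECT * FROM shared_analytics.sales.orders")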


Question #297

What is the main difference between the silver layer and the gold layer in medallion architecture?

  • A . Silver may contain aggregated data
  • B . Gold may contain aggregated data
  • C . Data quality checks are applied in gold
  • D . Silver is a copy of bronze data
  • E . Gold is a copy of silver data

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

Medallion Architecture - Databricks

Exam focus: review and understand the role of each layer (bronze, silver, gold) in the medallion architecture; you will see varying questions targeting each layer and its purpose.

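A short illustrative sketch of the silver-to-gold step (the silver_orders table and its columns are assumptions), where the gold table holds aggregated, business-ready data:

    from pyspark.sql import SparkSession, functions as F

    spark = SparkSession.builder.getOrCreate()

    silver = spark.read.table("silver_orders")          # assumed cleaned silver table

    # Gold: aggregate the cleansed data into a business-level summary.
    daily_sales = (
        silver.groupBy(F.to_date("order_ts").alias("order_date"))
              .agg(F.sum("amount").alias("total_sales"),
                   F.countDistinct("order_id").alias("order_count"))
    )

    daily_sales.write.mode("overwrite").saveAsTable("gold_daily_sales")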


Question #307

table(table_name))

  • A . format, checkpointlocation, schemalocation, overwrite
  • B . cloudfiles.format, checkpointlocation, cloudfiles.schemalocation, overwrite
  • C . cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema
  • D . cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, overwrite
  • E . cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, append

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is cloudfiles.format, cloudfiles.schemalocation, checkpointlocation, mergeSchema.

The link below contains the complete list of Auto Loader options: Auto Loader options | Databricks on AWS
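
As a rough illustration (not from the original explanation), a minimal PySpark sketch of an Auto Loader stream that uses these options might look like the following; the paths and table name are hypothetical placeholders:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()   # in a Databricks notebook, spark is already defined

# Incrementally ingest JSON files with Auto Loader (the cloudFiles source)
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")                         # format of the source files
      .option("cloudFiles.schemaLocation", "/tmp/schemas/orders")  # where the inferred schema is tracked
      .load("/tmp/landing/orders"))

# Write to a Delta table, allowing new columns to be merged into the target schema
(df.writeStream
   .option("checkpointLocation", "/tmp/checkpoints/orders")
   .option("mergeSchema", "true")
   .trigger(availableNow=True)
   .table("orders_bronze"))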

Question #317

A dataset has been defined using Delta Live Tables and includes an expectations clause: CONSTRAINT valid_timestamp EXPECT (timestamp > ‘2020-01-01’) ON VIOLATION FAIL

What is the expected behavior when a batch of data containing data that violates these constraints is processed?

  • A . Records that violate the expectation are added to the target dataset and recorded as invalid in the event log.
  • B . Records that violate the expectation are dropped from the target dataset and recorded as invalid in the event log.
  • C . Records that violate the expectation cause the job to fail
  • D . Records that violate the expectation are added to the target dataset and flagged as invalid in a field added to the target dataset.
  • E . Records that violate the expectation are dropped from the target dataset and loaded into a quarantine table.

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is Records that violate the expectation cause the job to fail.

Delta Live Tables supports three types of expectations for handling bad data in DLT pipelines. The example constraints below show how each behaves.

Invalid records:

Use the expect operator when you want to keep records that violate the expectation.

Records that violate the expectation are added to the target dataset along with valid records:

SQL

CONSTRAINT valid_timestamp EXPECT (timestamp > ‘2020-01-01’)

Drop invalid records:

Use the expect or drop operator to prevent the processing of invalid records. Records that violate the expectation are dropped from the target dataset:

SQL

CONSTRAINT valid_timestamp EXPECT (timestamp > ‘2020-01-01’) ON VIOLATION DROP ROW

Fail on invalid records:

When invalid records are unacceptable, use the expect or fail operator to halt execution immediately when a record fails validation.

If the operation is a table update, the system atomically rolls back the transaction:

SQL

CONSTRAINT valid_timestamp EXPECT (timestamp > ‘2020-01-01’) ON VIOLATION FAIL UPDATE
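
For reference, a minimal Python sketch of the same three expectation types using the DLT decorators (the table and source names here are hypothetical) could look like this:

import dlt

@dlt.table
@dlt.expect("valid_timestamp", "timestamp > '2020-01-01'")   # keep violating rows, record them in the event log
@dlt.expect_or_drop("valid_id", "id IS NOT NULL")            # drop violating rows
@dlt.expect_or_fail("valid_amount", "amount >= 0")           # fail the update if any row violates
def orders_silver():
    # read from an upstream DLT dataset (hypothetical name)
    return dlt.read_stream("orders_bronze")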


Question #318

The current ELT pipeline receives data from the operations team once a day, so you set up an AUTO LOADER process using trigger(Once=True) and scheduled a job to run once a day. The operations team recently rolled out a new feature that lets them send data every 1 minute. What changes do you need to make to AUTO LOADER to process the data every 1 minute?

  • A . Convert AUTO LOADER to structured streaming
  • B . Change AUTO LOADER trigger to .trigger(ProcessingTime = "1 minute")
  • C . Setup a job cluster run the notebook once a minute
  • D . Enable stream processing
  • E . Change AUTO LOADER trigger to ("1 minute")

Reveal Solution Hide Solution

Correct Answer: B
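
A rough sketch of this change in PySpark (note that the trigger keyword is processingTime; the paths and table name below are hypothetical):

# 'spark' is the active SparkSession (predefined in Databricks notebooks)
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/tmp/schemas/ops")
      .load("/tmp/landing/ops"))

# Before: process whatever is available once per scheduled job run
# (df.writeStream.option("checkpointLocation", "/tmp/checkpoints/ops").trigger(once=True).table("ops_bronze"))

# After: run continuously, processing a micro-batch every 1 minute
(df.writeStream
   .option("checkpointLocation", "/tmp/checkpoints/ops")
   .trigger(processingTime="1 minute")
   .table("ops_bronze"))
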
Question #319

When you drop an external DELTA table using the SQL Command DROP TABLE table_name, how does it impact metadata (delta log, history), and data stored in the storage?

  • A . Drops table from metastore, metadata (delta log, history) and data in storage
  • B . Drops table from metastore, data but keeps metadata (delta log, history) in storage
  • C . Drops table from metastore, metadata (delta log, history) but keeps the data in storage
  • D . Drops table from metastore, but keeps metadata (delta log, history) and data in storage
  • E . Drops table from metastore and data in storage but keeps metadata (delta log, history)

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is: drops the table from the metastore, but keeps metadata and data in storage. When an external table is dropped, only the table definition is removed from the metastore; everything else, including the data and the metadata (Delta transaction log, time-travel history), remains in storage. The Delta log is considered part of the metadata because if you drop a column in a Delta table (managed or external), the column is not physically removed from the Parquet files; the change is only recorded in the Delta log. The Delta log is therefore a key metadata layer for a Delta table to work.

For comparison, dropping a managed table removes the table definition from the metastore and also deletes the underlying data and metadata, whereas dropping an external table removes only the metastore entry.


Question #320

How do you access or use tables in the unity catalog?

  • A . schema_name.table_name
  • B . schema_name.catalog_name.table_name
  • C . catalog_name.table_name
  • D . catalog_name.database_name.schema_name.table_name
  • E . catalog_name.schema_name.table_name

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is catalog_name.schema_name.table_name


Note: database and schema are analogous terms and are used interchangeably in Unity Catalog.

FYI: a catalog is registered under a metastore. By default, every workspace has a default metastore called hive_metastore; with Unity Catalog you can create metastores and share them across multiple workspaces.
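
A quick illustration of the three-level namespace (the catalog, schema, and table names below are hypothetical):

# Three-level namespace: catalog.schema.table
df = spark.table("main.sales.transactions")
df.show()

# You can also set a default catalog and schema for the session
spark.sql("USE CATALOG main")
spark.sql("USE SCHEMA sales")
spark.sql("SELECT * FROM transactions").show()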



Question #321

When defining external tables using the formats CSV, JSON, TEXT, or BINARY, any query on the external tables caches the data and file locations for performance reasons, so within a given Spark session any new files that may have arrived will not be available after the initial query.

How can we address this limitation?

  • A . UNCACHE TABLE table_name
  • B . CACHE TABLE table_name
  • C . REFRESH TABLE table_name
  • D . BROADCAST TABLE table_name
  • E . CLEAR CACH table_name

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is REFRESH TABLE table_name

REFRESH TABLE table_name will force Spark to refresh the availability of external files and any changes.

When Spark queries an external table it caches the list of files associated with it, so that if the table is queried again it can reuse the cached files instead of retrieving them again from cloud object storage. The drawback is that if new files arrive, Spark does not know about them until the REFRESH command is run.
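
A minimal sketch (the table name is hypothetical):

# First query caches the file listing for the external table
spark.sql("SELECT COUNT(*) FROM sales_csv").show()

# ... new files land in the table's external location ...

# Invalidate the cached entries so the new files become visible
spark.sql("REFRESH TABLE sales_csv")
spark.sql("SELECT COUNT(*) FROM sales_csv").show()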

Question #322

Which of the following developer operations in the CI/CD process can only be implemented through a Git provider when using Databricks Repos?

  • A . Trigger Databricks Repos pull API to update the latest version
  • B . Commit and push code
  • C . Create and edit code
  • D . Create a new branch
  • E . Pull request and review process

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is the pull request and review process. Please note: the question asks for the steps that are implemented in the Git provider, not in Databricks Repos.

Operations such as creating and editing code, creating branches, committing and pushing code, and triggering the Repos pull API to update to the latest version can all be done from Databricks Repos; the pull request and review process happens in a Git provider such as GitHub or Azure DevOps.



Question #330

print(f"query failed")

  • A . try: failure:
  • B . try: catch:
  • C . try: except: (Correct)
  • D . try: fail:
  • E . try: error:

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is try: and except:
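
A minimal sketch of wrapping a query in a try/except block (the table name is hypothetical):

try:
    df = spark.sql("SELECT COUNT(*) FROM sales")
    df.show()
except Exception as e:
    # the except block runs if the query raises an error
    print(f"query failed: {e}")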

Question #331

How does Lakehouse replace the dependency on using Data lakes and Data warehouses in a Data and Analytics solution?

  • A . Open, direct access to data stored in standard data formats.
  • B . Supports ACID transactions.
  • C . Supports BI and Machine learning workloads
  • D . Support for end-to-end streaming and batch workloads
  • E . All the above

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

A lakehouse combines the benefits of a data warehouse and a data lake: Lakehouse = Data Lake + Data Warehouse.

The major benefits of a lakehouse include open, direct access to data stored in standard data formats, support for ACID transactions, support for both BI and machine learning workloads, and support for end-to-end streaming as well as batch workloads.


Question #332

Which of the following techniques can be used to implement fine-grained access control to the rows and columns of a Delta table based on the user’s access?

  • A . Use Unity catalog to grant access to rows and columns
  • B . Row and column access control lists
  • C . Use dynamic view functions
  • D . Data access control lists
  • E . Dynamic Access control lists with Unity Catalog

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is, Use dynamic view functions.

Here is an example that limits row access based on whether the user is part of the managers group: in the view below, a user who is not a member of the managers group can only see rows where the total amount is <= 1000000.

Dynamic view function to filter rows
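
A sketch of such a dynamic view (the table, view, column, and group names are hypothetical), using the built-in is_member() group-membership function:

spark.sql("""
  CREATE OR REPLACE VIEW sales_secure_vw AS
  SELECT *
  FROM sales
  WHERE
    is_member('managers')          -- managers see every row
    OR total_amount <= 1000000     -- everyone else only sees rows at or below 1,000,000
""")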

Question #344

Which of the following statements is incorrect when choosing between a lakehouse and a data warehouse?

  • A . Lakehouse can have special indexes and caching which are optimized for Machine learning
  • B . Lakehouse cannot serve low query latency with high reliability for BI workloads, only suitable for batch workloads.
  • C . Lakehouse can be accessed through various API’s including but not limited to Python/R/SQL
  • D . Traditional data warehouses have storage and compute coupled.
  • E . Lakehouse uses standard data formats like Parquet.

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The answer is Lakehouse cannot serve low query latency with high reliability for BI workloads, only suitable for batch workloads.

A lakehouse can replace a traditional warehouse by leveraging storage and compute optimizations such as caching to serve BI workloads with low query latency and high reliability.

Focus on the comparison between Spark cache and Delta cache.

https://docs.databricks.com/delta/optimizations/delta-cache.html

What Is a Lakehouse? – The Databricks Blog



Question #345

Which of the following statements are true about a lakehouse?

  • A . Lakehouse only supports Machine learning workloads and Data warehouses support BI workloads
  • B . Lakehouse only supports end-to-end streaming workloads and Data warehouses support Batch workloads
  • C . Lakehouse does not support ACID
  • D . Lakehouse do not support SQL
  • E . Lakehouse supports Transactions

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is: Lakehouse supports transactions. A lakehouse provides ACID transaction support (through Delta Lake), along with SQL support and support for both BI and machine learning workloads.

What Is a Lakehouse? – The Databricks Blog


Question #346

Where are Interactive notebook results stored in Databricks product architecture?

  • A . Data plane
  • B . Control plane
  • C . Data and Control plane
  • D . JDBC data source
  • E . Databricks web application

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is Data and Control plane,

Only job results are stored in the data plane (your storage); interactive notebook results are stored in a combination of the control plane (partial results for presentation in the UI) and customer storage.

https://docs.microsoft.com/en-us/azure/databricks/getting-started/overview#–high-level-architecture


How to change this behavior?

You can change this behavior using the Admin Console settings for that workspace. Once enabled, all interactive results are stored in the customer account (data plane), except for the new notebook visualization feature Databricks has recently introduced, which still stores some metadata in the control plane irrespective of this setting. Please refer to the documentation for more details.

Why is this important to know?

I recently worked on a project where we had to deal with sensitive customer information, and we had a security requirement that all of the data, including notebook results, needed to be stored in the data plane.


Question #347

Which of the following SQL statements can be used to query a table while eliminating duplicate rows from the query results?

  • A . SELECT DISTINCT * FROM table_name
  • B . SELECT DISTINCT * FROM table_name HAVING COUNT(*) > 1
  • C . SELECT DISTINCT_ROWS (*) FROM table_name
  • D . SELECT * FROM table_name GROUP BY * HAVING COUNT(*) < 1
  • E . SELECT * FROM table_name GROUP BY * HAVING COUNT(*) > 1

Reveal Solution Hide Solution

Correct Answer: A
A

Explanation:

The answer is SELECT DISTINCT * FROM table_name

Question #348

You are looking to process the data based on two variables: one to check if the department is supply chain, and a second to check if the process flag is set to True.

  • A . if department = “supply chain” & process:
  • B . if department == “supply chain” && process:
  • C . if department == “supply chain” & process == TRUE:
  • D . if department == “supply chain” & if process == TRUE:
  • E . if department == "supply chain" and process:

Reveal Solution Hide Solution

Correct Answer: E
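
A minimal sketch of the chosen condition (the variable values are hypothetical); Python uses the keyword and rather than &&, and a boolean flag can be tested directly:

department = "supply chain"
process = True

if department == "supply chain" and process:
    print("processing supply chain data")
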
Question #349

When writing streaming data, Spark Structured Streaming supports the below write modes:

  • A . Append, Delta, Complete
  • B . Delta, Complete, Continuous
  • C . Append, Complete, Update
  • D . Complete, Incremental, Update
  • E . Append, overwrite, Continuous

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is Append, Complete, Update

• Append mode (default) – This is the default mode, where only the new rows added to the Result Table since the last trigger will be outputted to the sink. This is supported only for those queries where rows added to the Result Table are never going to change. Hence, this mode guarantees that each row will be output only once (assuming a fault-tolerant sink). For example, queries with only select, where, map, flatMap, filter, join, etc. will support Append mode.

• Complete mode – The whole Result Table will be outputted to the sink after every trigger.

This is supported for aggregation queries.

• Update mode – (Available since Spark 2.1.1) Only the rows in the Result Table that were updated since the last trigger will be outputted to the sink. More information to be added in future releases.
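
A short sketch showing where the output mode is specified (the source table, sink table, and checkpoint path are hypothetical):

from pyspark.sql import functions as F

counts = (spark.readStream
          .table("events_bronze")
          .groupBy("event_type")
          .agg(F.count(F.lit(1)).alias("cnt")))

# Aggregation queries support "complete" (rewrite the full result every trigger) and "update";
# non-aggregate queries typically use the default "append" mode.
(counts.writeStream
 .outputMode("complete")
 .option("checkpointLocation", "/tmp/checkpoints/event_counts")
 .table("event_counts_gold"))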

Question #350

Which of the following is correct for the global temporary view?

  • A . global temporary views cannot be accessed once the notebook is detached and attached
  • B . global temporary views can be accessed across many clusters
  • C . global temporary views can be still accessed even if the notebook is detached and attached
  • D . global temporary views can be still accessed even if the cluster is restarted
  • E . global temporary views are created in a database called temp database

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is global temporary views can be still accessed even if the notebook is detached and attached

There are two types of temporary views that can be created Local and Global

• A local temporary view is only available within a Spark session, so another notebook in the same cluster cannot access it. If the notebook is detached and reattached, the local temporary view is lost.

• A global temporary view is available to all notebooks in the cluster; even if the notebook is detached and reattached it is still accessible, but if the cluster is restarted the global temporary view is lost.
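
A minimal sketch (the table and view names are hypothetical); note that global temporary views live in the global_temp database:

df = spark.table("sales")
df.createOrReplaceGlobalTempView("sales_gv")

# Accessible from any notebook attached to the same cluster, and it survives
# detaching/reattaching this notebook (but not a cluster restart).
spark.sql("SELECT COUNT(*) FROM global_temp.sales_gv").show()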

Question #351

Which of the following statements can be used to test that the number of rows in the table equals 10 in Python?

row_count = spark.sql("select count(*) from table").collect()[0][0]

  • A . assert (row_count = 10, "Row count did not match")
  • B . assert if (row_count = 10, "Row count did not match")
  • C . assert row_count == 10, "Row count did not match"
  • D . assert if row_count == 10, "Row count did not match"
  • E . assert row_count = 10, "Row count did not match"

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is assert row_count == 10, "Row count did not match"

Python's assert statement takes a condition followed by an optional message; the condition must use the equality operator ==, not the assignment operator =.
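
A runnable sketch (the table name is hypothetical):

row_count = spark.sql("SELECT COUNT(*) FROM sales").collect()[0][0]

# Raises AssertionError with the given message if the condition is False
assert row_count == 10, "Row count did not match"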


Question #358

SELECT * FROM ____________________

  • A . SELECT * FROM f{schema_name.table_name}
  • B . SELECT * FROM {schem_name.table_name}
  • C . SELECT * FROM ${schema_name}.${table_name}
  • D . SELECT * FROM schema_name.table_name

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is, SELECT * FROM ${schema_name}.${table_name}

%python

table_name = "sales"

schema_name = "bronze"

%sql

SELECT * FROM ${schema_name}.${table_name}

${python variable} -> Python variables in Databricks SQL code

Question #359

When using the complete mode to write stream data, how does it impact the target table?

  • A . Entire stream waits for complete data to write
  • B . Stream must complete to write the data
  • C . Target table cannot be updated while stream is pending
  • D . Target table is overwritten for each batch
  • E . Delta commits transaction once the stream is stopped

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is Target table is overwritten for each batch

Complete mode – The whole Result Table will be outputted to the sink after every trigger. This is supported for aggregation queries


Question #362

FROM carts GROUP BY cartId

Expected result:

cartId | items
1      | [1, 100, 200, 300, 250]

  • A . FLATTEN, COLLECT_UNION
  • B . ARRAY_UNION, FLATTEN
  • C . ARRAY_UNION, ARRAY_DISTINCT
  • D . ARRAY_UNION, COLLECT_SET
  • E . ARRAY_DISTINCT, ARRAY_UNION

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

COLLECT_SET is an aggregate function that combines the values of a column from all rows in a group into a unique list; ARRAY_UNION combines two arrays and removes any duplicates.
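
To illustrate the two functions in isolation (this is not the exact query from the question, which is truncated above; the sample data is made up):

from pyspark.sql import functions as F

df = spark.createDataFrame([(1, 100), (1, 200), (1, 100)], "cartId INT, item INT")

# collect_set: aggregates the values of a column across rows into one array
# with duplicates removed (element order is not guaranteed)
df.groupBy("cartId").agg(F.collect_set("item").alias("items")).show(truncate=False)

# array_union: merges two arrays and removes duplicates
spark.sql("SELECT array_union(array(1, 100, 200), array(200, 300, 250)) AS items").show(truncate=False)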



Question #363

While investigating a data issue in a Delta table, you want to review the logs to see when and by whom the table was updated. What is the best way to review this data?

  • A . Review event logs in the Workspace
  • B . Run SQL SHOW HISTORY table_name
  • C . Check Databricks SQL Audit logs
  • D . Run SQL command DESCRIBE HISTORY table_name
  • E . Review workspace audit logs

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is Run SQL command DESCRIBE HISTORY table_name.

Here is sample output of what DESCRIBE HISTORY table_name looks like (abridged; userId, userName, job, notebook, and clusterId were all null in this sample):

version | timestamp           | operation | operationParameters | readVersion | isolationLevel | isBlindAppend | operationMetrics
5       | 2019-07-29 14:07:47 | DELETE    | [predicate -> ["(…  | 4           | Serializable   | false         | [numTotalRows -> …
4       | 2019-07-29 14:07:41 | UPDATE    | [predicate -> (id … | 3           | Serializable   | false         | [numTotalRows -> …
3       | 2019-07-29 14:07:29 | DELETE    | [predicate -> ["(…  | 2           | Serializable   | false         | [numTotalRows -> …
2       | 2019-07-29 14:06:56 | UPDATE    | [predicate -> (id … | 1           | Serializable   | false         | [numTotalRows -> …
1       | 2019-07-29 14:04:31 | DELETE    | [predicate -> ["(…  | 0           | Serializable   | false         | [numTotalRows -> …
0       | 2019-07-29 14:01:40 | WRITE     | [mode -> ErrorIfE…  | null        | Serializable   | true          | [numFiles -> 2, n…

Question #364

What is the purpose of a silver layer in Multi hop architecture?

  • A . Replaces a traditional data lake
  • B . Efficient storage and querying of full and unprocessed history of data
  • C . A schema is enforced, with data quality checks.
  • D . Refined views with aggregated data
  • E . Optimized query performance for business-critical data

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is, A schema is enforced, with data quality checks.

Medallion Architecture – Databricks

Silver layer: data from the bronze layer is cleansed and conformed, a schema is enforced, and data quality checks are applied so that downstream consumers can rely on it.

Question #371

Why does AUTO LOADER require schema location?

  • A . Schema location is used to store user provided schema
  • B . Schema location is used to identify the schema of target table
  • C . AUTO LOADER does not require schema location, because its supports Schema evolution
  • D . Schema location is used to store schema inferred by AUTO LOADER
  • E . Schema location is used to identify the schema of target table and source table

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is: the schema location is used to store the schema inferred by AUTO LOADER. With it, the next AUTO LOADER run is faster because it does not need to infer the schema every single time; it can start from the last known schema.

Auto Loader samples the first 50 GB or 1000 files that it discovers, whichever limit is crossed first. To avoid incurring this inference cost at every stream start up, and to be able to provide a stable schema across stream restarts, you must set the option cloudFiles.schemaLocation. Auto Loader creates a hidden directory _schemas at this location to track schema changes to the input data over time.

Detailed documentation on the available options: Auto Loader options | Databricks on AWS.
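For reference, a minimal Auto Loader sketch that sets cloudFiles.schemaLocation so the inferred schema is persisted between runs. The paths, file format, and target table are hypothetical, and the built-in spark session of a Databricks notebook is assumed:

```python
# Incrementally ingest files with Auto Loader; the schema inferred on the first run
# is stored under cloudFiles.schemaLocation and reused on restarts.
df = (spark.readStream
      .format("cloudFiles")
      .option("cloudFiles.format", "json")
      .option("cloudFiles.schemaLocation", "/mnt/checkpoints/orders/_schema")
      .load("/mnt/landing/orders"))

(df.writeStream
   .option("checkpointLocation", "/mnt/checkpoints/orders")
   .trigger(availableNow=True)
   .toTable("bronze.orders"))
```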

Question #372

How do you determine whether a table is a managed table or an external table?

  • A . Run IS_MANAGED(‘table_name’) function
  • B . All external tables are stored in data lake, managed tables are stored in DELTA lake
  • C . All managed tables are stored in unity catalog
  • D . Run SQL command DESCRIBE EXTENDED table_name and check type
  • E . Run SQL command SHOW TABLES to see the type of the table

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

The answer is: run the SQL command DESCRIBE EXTENDED table_name and check the Type field.

(The original page showed screenshots of DESCRIBE EXTENDED output for an external table and for a managed table.)
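A minimal sketch of the check from a notebook. The table name is hypothetical, and the built-in spark session is assumed:

```python
# DESCRIBE EXTENDED returns one row per property; the "Type" row shows MANAGED or EXTERNAL.
info = spark.sql("DESCRIBE EXTENDED sales.regional_sales")
info.filter("col_name = 'Type'").show(truncate=False)
```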


Question #373

In order to use Unity Catalog features, which of the following steps needs to be taken on managed/external tables in the Databricks workspace?

  • A . Enable unity catalog feature in workspace settings
  • B . Migrate/upgrade objects in workspace managed/external tables/view to unity catalog
  • C . Upgrade to DBR version 15.0
  • D . Copy data from workspace to unity catalog
  • E . Upgrade workspace to Unity catalog

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

Upgrade tables and views to Unity Catalog – Azure Databricks | Microsoft Docs

Managed table: Upgrade a managed table to Unity Catalog

External table: Upgrade an external table to Unity Catalog
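A hedged sketch of the two upgrade paths described in the linked documentation. The catalog, schema, and table names are hypothetical, and the built-in spark session is assumed:

```python
# Managed table: copy the data into a Unity Catalog managed table with CTAS.
spark.sql("""
    CREATE TABLE main.sales.regional_sales
    AS SELECT * FROM hive_metastore.sales.regional_sales
""")

# External table: SYNC registers the existing external table in Unity Catalog without
# copying data (assumes the storage path is covered by a Unity Catalog external location).
spark.sql("""
    SYNC TABLE main.sales.regional_sales_ext
    FROM hive_metastore.sales.regional_sales_ext
""")
```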

Question #374

Data science team members are using a single cluster to perform data analysis. Although the cluster size was chosen to handle multiple users and auto-scaling was enabled, the team realized queries are still running slow. What would be the suggested fix for this?

  • A . Setup multiple clusters so each team member has their own cluster
  • B . Disable the auto-scaling feature
  • C . Use High concurrency mode instead of the standard mode
  • D . Increase the size of the driver node

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is: use High Concurrency mode instead of Standard mode. See https://docs.databricks.com/clusters/cluster-config-best-practices.html#cluster-mode. High Concurrency clusters are ideal for groups of users who need to share resources or run ad-hoc jobs. Databricks recommends enabling autoscaling for High Concurrency clusters.
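As a rough illustration only, a hedged sketch of requesting a High Concurrency (shared) cluster through the Clusters API. The workspace URL, token, runtime, and node type are placeholders, and the spark_conf keys and ResourceClass tag used to request High Concurrency mode are assumptions based on older Databricks documentation, so verify them against your platform version:

```python
import requests

workspace_url = "https://<your-workspace>.cloud.databricks.com"  # placeholder
token = "<personal-access-token>"                                 # placeholder

payload = {
    "cluster_name": "shared-analysis",
    "spark_version": "11.3.x-scala2.12",                # placeholder runtime
    "node_type_id": "i3.xlarge",                         # placeholder node type
    "autoscale": {"min_workers": 2, "max_workers": 8},   # autoscaling stays enabled
    "spark_conf": {
        "spark.databricks.cluster.profile": "serverless",         # assumed key for High Concurrency
        "spark.databricks.repl.allowedLanguages": "sql,python,r"
    },
    "custom_tags": {"ResourceClass": "Serverless"}                 # assumed tag
}

resp = requests.post(
    f"{workspace_url}/api/2.0/clusters/create",
    headers={"Authorization": f"Bearer {token}"},
    json=payload,
)
print(resp.json())
```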

Question #375

Kevin is the owner of the schema sales. Steve wanted to create a new table in the sales schema called regional_sales, so Kevin granted the CREATE TABLE permission to Steve. Steve created the new table regional_sales in the sales schema. Who is the owner of the table regional_sales?

  • A . Kevin is the owner of sales schema, all the tables in the schema will be owned by Kevin
  • B . Steve is the owner of the table
  • C . By default ownership is assigned DBO
  • D . By default ownership is assigned to DEFAULT_OWNER
  • E . Kevin and Steve are both owners of the table

Reveal Solution Hide Solution

Correct Answer: B
B

Explanation:

The user who creates the object becomes its owner, regardless of who owns the parent object.
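A hedged sketch of the scenario, assuming Unity Catalog privilege names. The user email and column definitions are hypothetical, and the built-in spark session is assumed:

```python
# Kevin, as schema owner, grants Steve permission to create tables in the schema.
spark.sql("GRANT CREATE TABLE ON SCHEMA sales TO `steve@example.com`")

# Steve then creates the table and automatically becomes its owner.
spark.sql("CREATE TABLE sales.regional_sales (region STRING, amount DOUBLE)")

# The owner can be verified from the table metadata.
(spark.sql("DESCRIBE TABLE EXTENDED sales.regional_sales")
     .filter("col_name = 'Owner'")
     .show(truncate=False))
```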

Question #376

You are currently working with a second team, and both teams are looking to modify the same notebook. You noticed that a member of the second team is copying the notebook to a personal folder to edit it and then replacing the collaboration notebook. Which notebook feature do you recommend to make collaboration easier?

  • A . Databricks notebooks should be copied to a local machine and setup source control locally to version the notebooks
  • B . Databricks notebooks support automatic change tracking and versioning
  • C . Databricks Notebooks support real-time coauthoring on a single notebook
  • D . Databricks notebooks can be exported into dbc archive files and stored in data lake
  • E . Databricks notebook can be exported as HTML and imported at a later time

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

The answer is: Databricks notebooks support real-time co-authoring on a single notebook. Every change is saved automatically, and a notebook can be edited by multiple users at the same time.



Question #377

Your colleague was walking you through how a job was set up, but you noticed a warning message that said, "Jobs running on all-purpose cluster are considered all purpose compute". The colleague was not sure why he was getting the warning message. How do you best explain this warning message?

  • A . All-purpose clusters cannot be used for Job clusters, due to performance issues.
  • B . All-purpose clusters take longer to start the cluster vs a job cluster
  • C . All-purpose clusters are less expensive than the job clusters
  • D . All-purpose clusters are more expensive than the job clusters
  • E . All-purpose clusters provide interactive messages that cannot be viewed in a job

Reveal Solution Hide Solution

Correct Answer: D
D

Explanation:

Warning message: jobs that run on an all-purpose cluster are billed as all-purpose compute.

Pricing for all-purpose clusters is higher than for job clusters. (The original page showed screenshots of the warning message and of AWS pricing as of Aug 15th, 2022.)


Question #378

Which of the following functions can be used to convert JSON string to Struct data type?

  • A . TO_STRUCT (json value)
  • B . FROM_JSON (json value)
  • C . FROM_JSON (json value, schema of json)
  • D . CONVERT (json value, schema of json)
  • E . CAST (json value as STRUCT)

Reveal Solution Hide Solution

Correct Answer: C
C

Explanation:

Syntax: from_json(jsonStr, schema [, options]) parses a JSON string into a struct using the supplied schema, which is why the schema argument is required.
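For illustration, a minimal PySpark sketch of the correct option. The column name and schema are hypothetical, and the built-in spark session is assumed:

```python
from pyspark.sql.functions import from_json, col

raw = spark.createDataFrame([('{"id": 1, "name": "widget"}',)], ["json_value"])

# FROM_JSON needs both the JSON value and the schema of the JSON.
parsed = raw.select(from_json(col("json_value"), "id INT, name STRING").alias("item"))
parsed.select("item.id", "item.name").show()
```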


Question #380

You are asked to write a Python function that can read data from a Delta table and return the DataFrame. Which of the following is correct?

  • A . Python function cannot return a DataFrame
  • B . Write SQL UDF to return a DataFrame
  • C . Write SQL UDF that can return tabular data
  • D . Python function will result in out of memory error due to data volume
  • E . Python function can return a DataFrame

Reveal Solution Hide Solution

Correct Answer: E
E

Explanation:

The answer is: a Python function can return a DataFrame.

The function would look something like this:
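A minimal sketch, assuming the table name is passed in as a parameter. The table name shown is hypothetical, and the built-in spark session of a Databricks notebook is assumed:

```python
from pyspark.sql import DataFrame

def read_delta_table(table_name: str) -> DataFrame:
    """Read a Delta table and return it as a Spark DataFrame."""
    return spark.read.table(table_name)

df = read_delta_table("sales.regional_sales")
df.show()
```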

