The code block displayed below contains an error. The code block is intended to write DataFrame transactionsDf to disk as a parquet file in location /FileStore/transactions_split, using column storeId as key for partitioning. Find the error.

Code block:

transactionsDf.write.format("parquet").partitionOn("storeId").save("/FileStore/transactions_split")

A. The format("parquet") expression is inappropriate to use here, "parquet" should be passed as first argument to the save() operator and "/FileStore/transactions_split" as the second argument.

B. Partitioning data by storeId is possible with the partitionBy expression, so partitionOn should be replaced by partitionBy.

C. Partitioning data by storeId is possible with the bucketBy expression, so partitionOn should be replaced by bucketBy.

D. partitionOn("storeId") should be called before the write operation.

E. The format("parquet") expression should be removed and instead, the information should be added to the write expression like so: write("parquet").

Answer: B

Explanation:

The DataFrameWriter returned by transactionsDf.write has no partitionOn method; the operator for partitioning output into one directory per key value is partitionBy. bucketBy is a real DataFrameWriter method, but it distributes data into a fixed number of buckets and only works with saveAsTable, not with save. Calling the partitioning operator before the write expression is not possible either, since partitioning on write is configured on the DataFrameWriter, not on the DataFrame itself.

Correct code block:

transactionsDf.write.format("parquet").partitionBy("storeId").save("/FileStore/transactions_split")

More info: Reading files which are written using PartitionBy or BucketBy in Spark (Stack Overflow)
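To make the effect of partitionBy("storeId") concrete, the following plain-Python sketch (no Spark required; the sample rows and the write_partitioned helper are invented for illustration) mimics the directory layout Spark produces on disk: one storeId=&lt;value&gt; subdirectory per distinct key value, with the partition column dropped from the stored rows.

```python
from collections import defaultdict

def write_partitioned(rows, key):
    """Mimic Spark's partitionBy: group rows into one 'directory' per key value.

    The partition column itself is dropped from the stored rows, just as
    Spark omits it from the parquet files and encodes it in the path instead.
    """
    partitions = defaultdict(list)
    for row in rows:
        value = row[key]
        stored = {k: v for k, v in row.items() if k != key}
        partitions[f"{key}={value}"].append(stored)
    return dict(partitions)

# Hypothetical sample of transactionsDf rows
transactions = [
    {"transactionId": 1, "storeId": 25, "value": 7.99},
    {"transactionId": 2, "storeId": 2,  "value": 3.49},
    {"transactionId": 3, "storeId": 25, "value": 1.25},
]

layout = write_partitioned(transactions, "storeId")
for directory in sorted(layout):
    print(f"/FileStore/transactions_split/{directory}/part-00000.parquet")
```

Reading such a layout back with spark.read.parquet("/FileStore/transactions_split") restores storeId as a regular column, inferred from the directory names.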

