The code block displayed below contains an error. The code block is intended to write DataFrame transactionsDf to disk as a parquet file in location /FileStore/transactions_split, using column
storeId as key for partitioning. Find the error.
Code block:
transactionsDf.write.format("parquet").partitionOn("storeId").save("/FileStore/transactions_split")
A. The format("parquet") expression is inappropriate to use here, "parquet" should be passed as first argument to the save() operator and "/FileStore/transactions_split" as the second argument.
B. Partitioning data by storeId is possible with the partitionBy expression, so partitionOn should be replaced by partitionBy.
C. Partitioning data by storeId is possible with the bucketBy expression, so partitionOn should be replaced by bucketBy.
D. partitionOn("storeId") should be called before the write operation.
E. The format("parquet") expression should be removed and instead, the information should be added to the write expression like so: write("parquet").
Answer: B
Explanation:
The DataFrameWriter API has no partitionOn method, so the original code block fails with an AttributeError. The correct method for partitioning output files by a column is partitionBy. Option C is wrong because bucketBy distributes rows into a fixed number of buckets by hashing the column and only works with saveAsTable, not save. Option D is wrong because partitioning is configured on the writer, after the write operation begins the builder chain.
Correct code block:
transactionsDf.write.format("parquet").partitionBy("storeId").save("/FileStore/transactions_split")
More info: Reading files which are written using partitionBy or bucketBy in Spark – Stack Overflow