What should you do?

Your company’s on-premises Apache Hadoop servers are approaching end-of-life, and IT has decided to migrate the cluster to Google Cloud Dataproc. A like-for-like migration of the cluster would require 50 TB of Google Persistent Disk per node. The CIO is concerned about the cost of using that much block storage....

December 15, 2023

You want to use a BigQuery table as a data sink. In which writing mode(s) can you use BigQuery as a sink?

A. Both batch and streaming
B. BigQuery cannot be used as a sink
C. Only batch
D. Only streaming
Answer: A
Explanation: When you apply a BigQueryIO.Write transform...

December 15, 2023

Which learning algorithm should you use?

You are creating a model to predict housing prices. Due to budget constraints, you must run it on a single resource-constrained virtual machine. Which learning algorithm should you use?
A. Linear regression
B. Logistic classification
C. Recurrent neural network
D. Feedforward neural network
Answer: A
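Linear regression suits this scenario because price is a continuous target and the model is cheap to train and evaluate. Below is a minimal sketch in plain Python, assuming a single hypothetical feature (floor area) and made-up illustrative prices. The function names and the data are assumptions for illustration and are not part of the original question.

```python
# Minimal ordinary least-squares fit for one feature, standard library only.
# All data below is hypothetical and chosen so that price = 3 * area exactly.

def fit_linear_regression(xs, ys):
    """Return (slope, intercept) minimizing squared error for y ~ slope*x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of squared deviations of x, and co-deviation of x with y.
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def predict(model, x):
    """Apply a fitted (slope, intercept) pair to a new input."""
    slope, intercept = model
    return slope * x + intercept


# Hypothetical training data: floor area (m^2) and price (thousands).
areas = [50.0, 80.0, 100.0, 120.0]
prices = [150.0, 240.0, 300.0, 360.0]

model = fit_linear_regression(areas, prices)
estimate = predict(model, 90.0)
```

Training here is a single pass over the data with constant memory, which is the kind of footprint a resource-constrained virtual machine can handle. A neural network would need iterative optimization over many more parameters.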

December 15, 2023

Which schema should you use?

MJTelco needs you to create a schema in Google Bigtable that will allow for the historical analysis of the last 2 years of records. Each record that comes in is sent every 15 minutes, and contains a unique identifier of the device and a data record. The most common query...

December 15, 2023

What are two of the characteristics of using online prediction rather than batch prediction?

A. It is optimized to handle a high volume of data instances in a job and to run more complex models.
B. Predictions are returned in the response message.
C. Predictions are written to output files...

December 15, 2023

Which component will be used for the data processing operation?

You are developing a software application using Google's Dataflow SDK and want to use conditionals, for loops, and other complex programming structures to create a branching pipeline. Which component will be used for the data processing operation?
A. PCollection
B. Transform
C. Pipeline
D. Sink API
Answer: B
Explanation: In...

December 15, 2023

What are two of the benefits of using denormalized data structures in BigQuery?

A. Reduces the amount of data processed, reduces the amount of storage required
B. Increases query speed, makes queries simpler
C. Reduces the amount of storage required, increases query speed
D. Reduces the amount of data processed,...

December 15, 2023

Why do you need to split a machine learning dataset into training data and test data?

A. So you can try two different sets of features
B. To make sure your model is generalized for more than just the training data
C. To allow you to create unit tests in...

December 14, 2023

Which three steps should you take?

Your company handles data processing for a number of different clients. Each client prefers to use their own suite of analytics tools, with some allowing direct query access via Google BigQuery. You need to secure the data so that clients cannot see each other’s data. You want to ensure appropriate...

December 14, 2023

Which of the following is not true about Dataflow pipelines?

A. Pipelines are a set of operations
B. Pipelines represent a data processing job
C. Pipelines represent a directed graph of steps
D. Pipelines can share data between instances
Answer: D
Explanation: The data and transforms in a pipeline are...

December 14, 2023