Which action should you try first to increase the efficiency of your pipeline?

You are profiling the training performance of your TensorFlow model and notice a bottleneck in the input data pipeline, which reads a single 5-terabyte CSV file from Cloud Storage. You need to optimize the input pipeline performance.

Which action should you try first to increase the efficiency of your pipeline?
A. Preprocess the input CSV file into a TFRecord file.
B. Randomly select a 10 gigabyte subset of the data to train your model.
C. Split into multiple CSV files and use a parallel interleave transformation.
D. Set the reshuffle_each_iteration parameter to true in the tf.data.Dataset.shuffle method.

Answer: C
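The approach in option C can be sketched roughly as follows. This is a minimal illustration, not a production recipe: `split_csv` and `make_dataset` are hypothetical helper names, the CSV is assumed to have a header row, and a 5-terabyte file would realistically be sharded by a distributed job rather than a local loop. The reading side uses `tf.data.Dataset.interleave` with `num_parallel_calls`, which is the current form of the parallel interleave transformation.

```python
import csv
import os


def split_csv(src_path, out_dir, rows_per_shard):
    """Split one CSV into shards, repeating the header row in each shard."""
    os.makedirs(out_dir, exist_ok=True)
    shard_paths = []
    with open(src_path, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)  # assumes the file has a header row
        out, writer = None, None
        for i, row in enumerate(reader):
            if i % rows_per_shard == 0:
                # Start a new shard every rows_per_shard data rows.
                if out is not None:
                    out.close()
                path = os.path.join(out_dir, f"shard-{len(shard_paths):05d}.csv")
                shard_paths.append(path)
                out = open(path, "w", newline="")
                writer = csv.writer(out)
                writer.writerow(header)
            writer.writerow(row)
        if out is not None:
            out.close()
    return shard_paths


def make_dataset(file_pattern, cycle_length=8):
    """Read many CSV shards concurrently instead of one file sequentially."""
    import tensorflow as tf  # imported lazily so split_csv works without TF

    files = tf.data.Dataset.list_files(file_pattern, shuffle=True)
    lines = files.interleave(
        # One line-reader per shard; skip(1) drops each shard's header row.
        lambda path: tf.data.TextLineDataset(path).skip(1),
        cycle_length=cycle_length,            # shards read at the same time
        num_parallel_calls=tf.data.AUTOTUNE,  # let tf.data pick parallelism
        deterministic=False,                  # allow out-of-order reads
    )
    return lines.prefetch(tf.data.AUTOTUNE)
```

With the shards on Cloud Storage, `make_dataset` would be called with a glob such as `gs://my-bucket/shards/shard-*.csv` (a hypothetical bucket path). A single file forces one sequential read stream, whereas multiple shards let the pipeline overlap several reads, which is why this is usually tried before converting formats.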
