
Cloudera CCA175 CCA Spark and Hadoop Developer Exam Online Training

Question #1

Problem Scenario 1:

You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.categories

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Please accomplish the following activities.


Correct Answer: Solution:

Step 1: Connect to the existing MySQL database: mysql --user=retail_dba --password=cloudera retail_db

Step 2: Show all the available tables: show tables;

Step 3: View/count data from a table in MySQL: select count(1) from categories;

Step 4: Check the currently available data in the HDFS home directory: hdfs dfs -ls

Step 5: Import a single table (without specifying a target directory).

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories

Note: Please check that you don't have a space before or after the '=' sign. Sqoop uses the MapReduce framework to copy data from the RDBMS to HDFS.

Step 6: Read the data from one of the partitions created by the above command: hdfs dfs -cat categories/part-m-00000

Step 7: Specify a target directory in the import command (we are using number of mappers = 1; you can change it accordingly): sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --target-dir=categories_target -m 1

Step 8: Check the content in one of the partition files.

hdfs dfs -cat categories_target/part-m-00000

Step 9: Specify a parent (warehouse) directory so that you can copy more than one table into a single location. Command to specify the warehouse directory:

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_warehouse -m 1
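The difference between --target-dir and --warehouse-dir shows up in the resulting HDFS layout; the listing below is a minimal sketch of what to expect on the quickstart VM (exact part-file counts depend on the number of mappers used):

hdfs dfs -ls categories_target                 # part files land directly in the directory
hdfs dfs -ls categories_warehouse              # a sub-directory named after the table is created
hdfs dfs -ls categories_warehouse/categories   # the part files live one level down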

Question #2

Problem Scenario 2:

There is a parent organization called "ABC Group Inc", which has two child companies named Tech Inc and MPTech.

Both companies' employee information is given in two separate text files as below. Please do the following activity for the employee details.

Tech Inc.txt


Correct Answer: Solution:

Step 1: Check all the available commands: hdfs dfs

Step 2: Get help on an individual command: hdfs dfs -help get

Step 3: Create a directory in HDFS named Employee and create a dummy file in it, e.g. Techinc.txt: hdfs dfs -mkdir Employee

Now create an empty file in the Employee directory using Hue.

Step 4: Create a directory on the local file system and then create the two files with the data given in the problem.

Step 5: Now that we have an existing directory with content in it, use the HDFS command line to override this existing Employee directory while copying the files from the local file system to HDFS. cd /home/cloudera/Desktop/ hdfs dfs -put -f Employee

Step 6: Check that all files in the directory were copied successfully: hdfs dfs -ls Employee

Step 7: Now merge all the files in the Employee directory: hdfs dfs -getmerge -nl Employee MergedEmployee.txt

Step 8: Check the content of the file: cat MergedEmployee.txt

Step 9: Copy the merged file from the local file system into the Employee directory on HDFS: hdfs dfs -put MergedEmployee.txt Employee/

Step 10: Check whether the file was copied or not: hdfs dfs -ls Employee

Step 11: Change the permissions of the merged file on HDFS: hdfs dfs -chmod 664 Employee/MergedEmployee.txt

Step 12: Get the file from HDFS to the local file system: hdfs dfs -get Employee Employee_hdfs
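A minimal consolidated sketch of the HDFS commands above, assuming the local Employee directory sits under /home/cloudera/Desktop (paths and names are taken from the steps above):

cd /home/cloudera/Desktop/
hdfs dfs -put -f Employee                            # overwrite the existing Employee content in HDFS
hdfs dfs -getmerge -nl Employee MergedEmployee.txt   # merge all files, adding a newline between them
hdfs dfs -put MergedEmployee.txt Employee/
hdfs dfs -chmod 664 Employee/MergedEmployee.txt
hdfs dfs -get Employee Employee_hdfs                 # copy the whole directory back to the local file system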

Question #3

Problem Scenario 3: You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.categories

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Please accomplish the following activities.


Correct Answer: Solution:

Step 1: Import a single table (subset of the data). Note: Here the ` is the backquote character, the one you find on the ~ key.

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset --where '`category_id`=22' -m 1

Step 2: Check the output partition

hdfs dfs -cat categories_subset/categories/part-m-00000

Step 3: Change the selection criteria (subset of the data)

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_2 --where '`category_id`>22' -m 1

Step 4: Check the output partition

hdfs dfs -cat categories_subset_2/categories/part-m-00000

Step 5: Use a between clause (subset of the data)

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_3 --where '`category_id` between 1 and 22' -m 1

Step 6: Check the output partition

hdfs dfs -cat categories_subset_3/categories/part-m-00000

Step 7: Change the delimiter during import.

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_6 --where '`category_id` between 1 and 22' --fields-terminated-by='|' -m 1

Step 8: Check the output partition

hdfs dfs -cat categories_subset_6/categories/part-m-00000

Step 9: Select a subset of the columns

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_col --where '`category_id` between 1 and 22' --fields-terminated-by='|' --columns=category_name,category_id -m 1

Step 10: Check the output partition

hdfs dfs -cat categories_subset_col/categories/part-m-00000

Step 11: Insert a record with null values (using mysql): ALTER TABLE categories modify category_department_id int(11); INSERT INTO categories values (NULL, NULL, 'TESTING'); select * from categories;

Step 12: Encode the non-string null column

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --warehouse-dir=categories_subset_17 --where '`category_id` between 1 and 61' --fields-terminated-by='|' --null-string='\N' --null-non-string='\N' -m 1

Step 13: View the content

hdfs dfs -cat categories_subset_17/categories/part-m-00000

Step 14: Import all the tables from a schema (this step will take a little time)

sqoop import-all-tables --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --warehouse-dir=categories_subset_all_tables

Step 15: View the contents

hdfs dfs -ls categories_subset_all_tables

Step 16: Cleanup, or back to originals.

delete from categories where category_id in (59, 60);

ALTER TABLE categories modify category_department_id int(11) NOT NULL;

ALTER TABLE categories modify category_name varchar(45) NOT NULL;

desc categories;
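The two null-handling flags in Step 12 only matter for the TESTING row inserted in Step 11; a minimal sketch of how to confirm the encoding in the imported files (the exact category_id of that row depends on your data, so the grep is illustrative):

hdfs dfs -cat categories_subset_17/categories/part-m-* | grep TESTING
# expected shape: <category_id>|\N|TESTING   (the NULL department id is written as the --null-non-string value)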

Question #4

Problem Scenario 4: You have been given a MySQL DB with the following details.

user=retail_dba

password=cloudera

database=retail_db

table=retail_db.categories

jdbc URL = jdbc:mysql://quickstart:3306/retail_db

Please accomplish the following activities.

Import the single table categories (subset of data) into a Hive managed table, where category_id is between 1 and 22.


Correct Answer: Solution:

Step 1: Import a single table (subset of the data)

sqoop import --connect jdbc:mysql://quickstart:3306/retail_db --username=retail_dba --password=cloudera --table=categories --where '`category_id` between 1 and 22' --hive-import -m 1

Note: Here the ` is the backquote character, the one you find on the ~ key.

This command will create a managed table and its content will be stored in the following directory.

/user/hive/warehouse/categories

Step 2: Check whether table is created or not (In Hive)

show tables;

select * from categories;
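To confirm where the managed table's data actually lives, the table's Location can be checked from the shell; a minimal sketch (hive -e runs a statement non-interactively):

hive -e 'describe formatted categories;'        # the Location field should point at /user/hive/warehouse/categories
hdfs dfs -ls /user/hive/warehouse/categories    # the part files imported into the managed table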

Question #9

Import departments table as a text file in /user/cloudera/departments.


Correct Answer: Solution:

Step 1: List tables using sqoop

sqoop list-tables --connect jdbc:mysql://quickstart:3306/retail_db --username retail_dba --password cloudera

Step 2: Eval command; just run a count query on one of the tables.

sqoop eval

--connect jdbc:mysql://quickstart:3306/retail_db

--username retail_dba

--password cloudera

--query "select count(1) from order_items"

Step 3: Import all the tables as Avro files.

sqoop import-all-tables

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--as-avrodatafile

--warehouse-dir=/user/hive/warehouse/retail_stage.db

-m 1

Step 4: Import the departments table as a text file in /user/cloudera/departments.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--as-textfile

--target-dir=/user/cloudera/departments

Step 5: Verify the imported data.

hdfs dfs -ls /user/cloudera/departments

hdfs dfs -ls /user/hive/warehouse/retail_stage.db

hdfs dfs -ls /user/hive/warehouse/retail_stage.db/products
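--as-avrodatafile writes Avro container files while --as-textfile writes plain delimited text; a minimal sketch of how the two outputs differ on the quickstart VM (file names are the ones Sqoop typically produces):

hdfs dfs -ls /user/hive/warehouse/retail_stage.db/departments   # Avro import: part files end in .avro
hdfs dfs -cat /user/cloudera/departments/part-m-00000           # text import: one comma-delimited record per line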

Question #13

Store all the Java files in a directory called java_output to evaluate them further.


Correct Answer: Solution:

Step 1: Before implementing the solution, drop all the tables which we created in the previous problems.

Log in to hive and execute the following commands.

show tables;

drop table categories;

drop table customers;

drop table departments;

drop table employee;

drop table order_items;

drop table orders;

drop table products;

show tables;

Check the warehouse directory. hdfs dfs -ls /user/hive/warehouse

Step 2: Now we have a clean database. Import the entire retail_db with all the required parameters, as the problem statement asks.

sqoop import-all-tables

-m 3

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--hive-import

--hive-overwrite

--create-hive-table

--compress

--compression-codec org.apache.hadoop.io.compress.SnappyCodec

--outdir java_output

Step 3: Verify whether the work is accomplished or not.

a. Go to hive and check all the tables: hive

show tables;

select count(1) from customers;

b. Check the warehouse directory and the number of partitions.

hdfs dfs -ls /user/hive/warehouse

hdfs dfs -ls /user/hive/warehouse/categories

c. Check the output Java directory.

ls -ltr java_output/
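The files under java_output are the ORM classes Sqoop generates for each imported table (--outdir only controls where that generated code goes, not the data); a minimal sketch of what to expect, with illustrative file names that follow the table names:

ls -ltr java_output/                            # e.g. departments.java, categories.java, orders.java ...
hdfs dfs -ls /user/hive/warehouse/categories    # data files are Snappy-compressed because of --compress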

Question #17

Also make sure you have imported only two columns from the table: department_id and department_name.


Correct Answer: Solutions:

Step 1: Clean the HDFS file system; if these directories already exist, remove them.

hadoop fs -rm -R departments

hadoop fs -rm -R categories

hadoop fs -rm -R products

hadoop fs -rm -R orders

hadoop fs -rm -R order_items

hadoop fs -rm -R customers

Step 2: Now import the departments table as per the requirement.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--target-dir /user/cloudera/departments

-m 2

--boundary-query "select 1, 25 from departments"

--columns department_id,department_name

Step 3: Check the imported data.

hdfs dfs -ls departments

hdfs dfs -cat departments/part-m-00000

hdfs dfs -cat departments/part-m-00001
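With --boundary-query "select 1, 25 from departments", Sqoop uses 1 and 25 as the minimum and maximum of the split column instead of computing min()/max() itself, and divides that range evenly across the mappers; a minimal sketch of the effect with -m 2 (the exact split points are decided by Sqoop at run time):

# mapper 0 covers roughly department_id in [1, 13), mapper 1 covers roughly [13, 25]
hdfs dfs -cat departments/part-m-00000   # rows from the lower half of the range
hdfs dfs -cat departments/part-m-00001   # rows from the upper half of the range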

Question #21

Also make sure you use the order_id column for Sqoop to use for boundary conditions.


Correct Answer: Solutions:

Step 1: Clean the HDFS file system; if these directories already exist, remove them.

hadoop fs -rm -R departments

hadoop fs -rm -R categories

hadoop fs -rm -R products

hadoop fs -rm -R orders

hadoop fs -rm -R order_items

hadoop fs -rm -R customers

Step 2: Now import the joined data as per the requirement.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--query="select * from orders join order_items on orders.order_id = order_items.order_item_order_id where \$CONDITIONS"

--target-dir /user/cloudera/order_join

--split-by order_id

--num-mappers 2

Step 3: Check the imported data.

hdfs dfs -ls order_join

hdfs dfs -cat order_join/part-m-00000

hdfs dfs -cat order_join/part-m-00001
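When a free-form --query is used, the literal token $CONDITIONS must reach Sqoop intact so that each mapper's split predicate can be substituted into it; a minimal sketch of the two shell quoting styles that work (the query text is the one used above):

# double quotes: escape the dollar sign so bash does not expand it
--query "select * from orders join order_items on orders.order_id = order_items.order_item_order_id where \$CONDITIONS"

# single quotes: bash passes $CONDITIONS through literally, no escaping needed
--query 'select * from orders join order_items on orders.order_id = order_items.order_item_order_id where $CONDITIONS'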

Question #25

Also make sure your result fields are terminated by '|' and lines are terminated by '\n'.


Correct Answer: Solutions:

Step 1: Clean the HDFS file system; if these directories already exist, remove them.

hadoop fs -rm -R departments

hadoop fs -rm -R categories

hadoop fs -rm -R products

hadoop fs -rm -R orders

hadoop fs -rm -R order_items

hadoop fs -rm -R customers

Step 2: Now import the departments table as per the requirement.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--target-dir=departments

--fields-terminated-by '|'

--lines-terminated-by '\n'

-m 1

Step 3: Check the imported data.

hdfs dfs -ls departments

hdfs dfs -cat departments/part-m-00000

Step 4: Now import the data again; this time it needs to be appended.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--target-dir departments

--append

--fields-terminated-by '|'

--lines-terminated-by '\n'

-m 1

Step 5: Check the results again.

hdfs dfs -ls departments

hdfs dfs -cat departments/part-m-00001
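--append keeps the existing part files and writes the new run's output alongside them with the next available file numbers; a minimal sketch of the directory after the second import (numbering assumes one mapper per run, as above):

hdfs dfs -ls departments
# part-m-00000   <- first import
# part-m-00001   <- appended second import, same '|' field and '\n' line terminators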

Question #29

Please import data into a non-existing table, meaning that while importing you create a hive table named hadoopexam.departments_new.


Correct Answer: Solution:

Step 1: Go to the hive interface and create a database.

hive

create database hadoopexam;

Step 2: Use the database created in the above step and then create a table in it. use hadoopexam; show tables;

Step 3: Create a table in it.

create table departments (department_id int, department_name string);

show tables;

desc departments;

desc formatted departments;

Step 4: Please check that the following directory does not exist, otherwise the import will give an error: hdfs dfs -ls /user/cloudera/departments

If the directory already exists, make sure it is not needed and then delete it.

This is the staging directory where Sqoop stores the intermediate data before pushing it into the hive table.

hadoop fs -rm -R departments

Step 5: Now import data into the existing table.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--hive-home /user/hive/warehouse

--hive-import

--hive-overwrite

--hive-table hadoopexam.departments

Step 6: Check whether the data has been loaded or not.

hive

use hadoopexam;

show tables;

select * from departments;

desc formatted departments;

Step 7: Import data into a non-existing table in hive and create the table while importing.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--hive-home /user/hive/warehouse

--hive-import

--hive-overwrite

--hive-table hadoopexam.departments_new

--create-hive-table

Step 8: Check whether the data has been loaded or not.

hive

use hadoopexam;

show tables;

select * from departments_new;

desc formatted departments_new;
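--create-hive-table makes the job fail if the target Hive table already exists, which is what distinguishes Step 7 from Step 5; a minimal sketch of a quick verification from the shell (hive -e runs a statement non-interactively):

hive -e 'use hadoopexam; show tables;'
hive -e 'use hadoopexam; describe formatted departments_new;'   # the Location field should sit under /user/hive/warehouse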

Question #33

Now import only the newly inserted records and append them to the existing directory, which was created in the first step.


Correct Answer: Solution:

Step 1: Clean the already imported data. (In the real exam, please make sure you don't delete data generated by a previous exercise.)

hadoop fs -rm -R departments

Step 2: Import data into the departments directory.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--target-dir /user/cloudera/departments

Step 3: Insert five records into the departments table.

mysql --user=retail_dba --password=cloudera retail_db

Insert into departments values(10, "physics"); Insert into departments values(11, "Chemistry"); Insert into departments values(12, "Maths"); Insert into departments values(13, "Science"); Insert into departments values(14, "Engineering"); commit;

select * from departments;

Step 4: Get the maximum value of department_id from the last import: hdfs dfs -cat /user/cloudera/departments/part* ; it should be 7.

Step 5: Do the incremental import based on last import and append the results.

sqoop import

--connect "jdbc:mysql://quickstart.cloudera:3306/retail_db"

--username=retail_dba

--password=cloudera

--table departments

--target-dir /user/cloudera/departments

--append

--check-column "department_id"

--incremental append

--last-value 7

Step 6: Now check the result.

hdfs dfs -cat /user/cloudera/departments/part*
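A minimal consolidated sketch of the incremental run above as one runnable command (values are taken from the steps; at the end of the job Sqoop prints the new --last-value to use for the next incremental import):

sqoop import \
  --connect "jdbc:mysql://quickstart.cloudera:3306/retail_db" \
  --username=retail_dba --password=cloudera \
  --table departments \
  --target-dir /user/cloudera/departments \
  --append \
  --check-column department_id \
  --incremental append \
  --last-value 7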

Question #39

Now do the incremental import based on the created_date column.


Correct Answer: Solution:

Step 1: Log in to the MySQL db.

mysql --user=retail_dba --password=cloudera

show databases;

use retail_db; show tables;

Step 2: Create a table as given in the problem statement.

CREATE table departments_new (department_id int(11), department_name varchar(45), created_date TIMESTAMP DEFAULT NOW());

show tables;

Step 3: Insert records from the departments table into departments_new: insert into departments_new select a.*, null from departments a;

Step 4: Import data from the departments_new table to HDFS.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments_new

--target-dir /user/cloudera/departments_new

--split-by department_id

Step 5: Check the imported data.

hdfs dfs -cat /user/cloudera/departments_new/part*

Step 6: Insert the following 5 records into the departments_new table.

Insert into departments_new values(110, "Civil", null);

Insert into departments_new values(111, "Mechanical", null);

Insert into departments_new values(112, "Automobile", null);

Insert into departments_new values(113, "Pharma", null);

Insert into departments_new values(114, "Social Engineering", null);

commit;

Step 7: Import the incremental data based on the created_date column.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments_new

--target-dir /user/cloudera/departments_new

--append

--check-column created_date

--incremental lastmodified

--split-by department_id

--last-value "2016-01-30 12:07:37.0"

Step 8: Check the imported value.

hdfs dfs -cat /user/cloudera/departments_new/part*
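Unlike append mode, --incremental lastmodified re-imports any row whose check column is newer than --last-value, so an appended target directory can end up holding both the old and the new version of a row; a minimal sketch of the same run using --merge-key instead of --append to collapse duplicates (this is an assumption about how you might extend the exercise, not part of the original solution):

sqoop import \
  --connect jdbc:mysql://quickstart:3306/retail_db \
  --username=retail_dba --password=cloudera \
  --table departments_new \
  --target-dir /user/cloudera/departments_new \
  --check-column created_date \
  --incremental lastmodified \
  --last-value "2016-01-30 12:07:37.0" \
  --merge-key department_id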

Question #42

Now import the data from the following directory into the departments_export table: /user/cloudera/departments_new


Correct Answer: Solution:

Step 1: Log in to the MySQL db.

mysql --user=retail_dba --password=cloudera

show databases; use retail_db; show tables;

Step 2: Create a table as given in the problem statement.

CREATE table departments_export (department_id int(11), department_name varchar(45), created_date TIMESTAMP DEFAULT NOW());

show tables;

Step 3: Export data from /user/cloudera/departments_new to the new table departments_export

sqoop export --connect jdbc:mysql://quickstart:3306/retail_db

--username retail_dba

--password cloudera

--table departments_export

--export-dir /user/cloudera/departments_new

--batch

Step 4: Now check whether the export was done correctly or not. mysql --user=retail_dba --password=cloudera

show databases;

use retail_db;

show tables;

select * from departments_export;
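sqoop export parses the files under --export-dir with its default delimiters (comma-separated fields, newline-terminated records); a minimal sketch of the same export with the field delimiter stated explicitly, which would only be needed if /user/cloudera/departments_new had been written with a non-default terminator (an assumption for illustration, not part of the original solution):

sqoop export \
  --connect jdbc:mysql://quickstart:3306/retail_db \
  --username retail_dba --password cloudera \
  --table departments_export \
  --export-dir /user/cloudera/departments_new \
  --input-fields-terminated-by ',' \
  --batch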

Question #49

Now export this data from HDFS to the MySQL retail_db.departments table. During the upload, make sure existing departments are only updated and no new departments are inserted.

Reveal Solution Hide Solution

Correct Answer: Solution:

Step 1: Create a CSV file named updated_departments.csv with the given content.

Step 2: Now upload this file to HDFS.

Create a directory called new_data.

hdfs dfs -mkdir new_data

hdfs dfs -put updated_departments.csv new_data/

Step 3: Check whether the file was uploaded. hdfs dfs -ls new_data

Step 4: Export this file to departments table using sqoop.

sqoop export --connect jdbc:mysql://quickstart:3306/retail_db

--username retail_dba

--password cloudera

--table departments

--export-dir new_data

--batch

-m 1

--update-key department_id

--update-mode allowinsert

Step 5: Check whether the required upsert completed. mysql --user=retail_dba --password=cloudera

show databases;

use retail_db;

show tables;

select * from departments;

Step 6: Update the updated_departments.csv file.

Step 7: Overwrite the existing file in HDFS.

hdfs dfs -put -f updated_departments.csv new_data/

Step 8: Now do the Sqoop export as per the requirement.

sqoop export --connect jdbc:mysql://quickstart:3306/retail_db

--username retail_dba

--password cloudera

--table departments

--export-dir new_data

--batch

-m 1

--update-key department_id

--update-mode updateonly

Step 9: Check whether the required update completed. mysql --user=retail_dba --password=cloudera

show databases;

use retail_db;

show tables;

select * from departments;
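For reference (a hedged illustration, not Sqoop's literal output): with --update-key department_id, each exported record is applied as an UPDATE keyed on that column. In updateonly mode, records with no matching department_id are simply skipped, whereas allowinsert mode inserts them as new rows. Conceptually, the updateonly export behaves like:

UPDATE departments SET department_name = ? WHERE department_id = ?;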

Question #54

Please import the departments table into a directory called departments_enclosedby; the file should be processable by a downstream system.

Reveal Solution Hide Solution

Correct Answer: Solution:

Step 1: Connect to mysql database.

mysql --user=retail_dba --password=cloudera

show databases; use retail_db; show tables;

Insert record

Insert into departments values(9999, '"Data Science"');

select * from departments;

Step 2: Import data as per requirement.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--target-dir /user/cloudera/departments_enclosedby

--enclosed-by \" --escaped-by \\ --fields-terminated-by '|' --lines-terminated-by ':'

Step 3: Check the result.

hdfs dfs -cat /user/cloudera/departments_enclosedby/part*
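As an additional sanity check (not part of the original solution), and assuming the ':' line terminator used above, a downstream consumer can be simulated by splitting the output back into one record per line:

hdfs dfs -cat /user/cloudera/departments_enclosedby/part* | tr ':' '\n'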

Question #57

Now import data from the MySQL table departments into this Hive table. Please make sure that the data is visible using the below Hive command: select * from departments_hive;

Reveal Solution Hide Solution

Correct Answer: Solution:

Step 1: Create hive table as said.

hive

show tables;

create table departments_hive(department_id int, department_name string);

Step 2: The important point here is that when we create a table without specifying field delimiters, Hive's default field delimiter is ^A (\001). Hence, while importing data we have to provide the proper delimiter.

sqoop import

--connect jdbc:mysql://quickstart:3306/retail_db

--username=retail_dba

--password=cloudera

--table departments

--hive-home /user/hive/warehouse

--hive-import

--hive-overwrite

--hive-table departments_hive

--fields-terminated-by ‘