When would you usually consider adding a clustering key to a table?

A. The performance of the query has deteriorated over a period of time.
B. The number of users querying the table has increased.
C. It is a multi-terabyte table.
D. The table has more than 20 columns.

Answer: A, C

Explanation:

Clustering keys are not intended for all tables. The size of a table, as well as the query performance for the table, should dictate whether to define a clustering key for the table. In particular, to see performance improvements from a clustering key, a table has to be large enough to consist of a sufficiently large number of micro-partitions, and the column(s) defined in the clustering key have to provide sufficient filtering to select a subset of these micro-partitions.

In general, tables in the multi-terabyte (TB) range will experience the most benefit from clustering, particularly if DML is performed regularly/continually on these tables.

Also, before explicitly choosing to cluster a table, Snowflake strongly recommends that you test a representative set of queries on the table to establish some performance baselines.
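As a minimal sketch of that evaluation step, the statements below inspect how well a table is already clustered on candidate columns and then define the key. The table name `sales` and the columns `sale_date` and `region` are hypothetical, chosen only for illustration; substitute the columns your representative queries actually filter on.

```sql
-- Hypothetical table and column names, used only for illustration.
-- Check how well the table is clustered on the candidate columns.
-- Passing the column list lets this run before any clustering key exists.
select system$clustering_information('sales', '(sale_date, region)');

-- If the baseline queries justify it, define the clustering key.
alter table sales cluster by (sale_date, region);
```

Rerunning the same representative queries after the key has been in place for a while gives a like-for-like comparison against the baseline.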

Apart from the above, it helps to understand why query performance on a table deteriorates over time. Snowflake physically stores data in immutable micro-partitions (the documentation describes each as holding 50 MB to 500 MB of uncompressed data, stored compressed; 16 MB is the commonly quoted compressed size). Because they are immutable, constantly inserting or updating records causes the affected micro-partitions to be recreated, and Snowflake cannot guarantee that related records stay clustered together when that happens. Hence, the clustering deteriorates over time. If you define a clustering key, Automatic Clustering is enabled for the table and Snowflake reclusters the records in the background. It does not recluster the entire table at once; it does so gradually.
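To observe this on a clustered table, something like the following could be used. It is a sketch only: `sales` is a hypothetical table name that is assumed to already have a clustering key defined.

```sql
-- Hypothetical table name; assumes a clustering key is already defined.
-- Average depth of overlapping micro-partitions for the clustering key.
-- A lower value generally indicates better clustering.
select system$clustering_depth('sales');

-- Automatic Clustering can be paused and resumed per table.
alter table sales suspend recluster;
alter table sales resume recluster;
```

Checking the depth periodically while DML continues gives a rough picture of how quickly clustering degrades and how well background reclustering keeps up.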

If you have a table with a clustering key and the proper access (the MONITOR USAGE privilege or the ACCOUNTADMIN role), run the query below; it shows the last 12 hours of clustering history:

select *
from table(information_schema.automatic_clustering_history(
  date_range_start => dateadd(h, -12, current_timestamp)));
