Partition Key, Clustering Key

October 27, 2024

Apache Cassandra Partition Key, Clustering Key

Partition Key:

In Cassandra, the partition key is specified during table creation as part of the PRIMARY KEY definition.

Cassandra hashes the partition key to a token, and that token determines which node in the cluster stores the data. This spreads data across the nodes and balances load.
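To make the key-to-node mapping concrete, here is a minimal Python sketch. It is a toy model, not Cassandra's implementation: the node names are hypothetical, MD5 stands in for the real partitioner, and a simple modulo replaces the token ranges a real cluster uses.

```python
import hashlib
import uuid

NODES = ["node-a", "node-b", "node-c"]  # hypothetical cluster members


def token_for(partition_key: str) -> int:
    """Hash the partition key to an integer token (MD5 is an illustrative stand-in)."""
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def node_for(partition_key: str) -> str:
    """Pick a node from the token. Real clusters assign token ranges, not modulo."""
    return NODES[token_for(partition_key) % len(NODES)]


user_id = str(uuid.uuid4())
print(user_id, "->", node_for(user_id))
```

The point of the sketch is that the mapping is deterministic: the same partition key always lands on the same node, while different keys spread across the cluster.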

Syntax for Defining a Partition Key

When creating a table, the partition key is the first part of the PRIMARY KEY definition.

Single-Column Partition Key

If the partition key is based on a single column, it is simply the first column in the PRIMARY KEY definition:

CREATE TABLE users (
    user_id UUID,  -- Partition key
    name TEXT,
    age INT,
    PRIMARY KEY (user_id)
);

user_id is the partition key in this table. Because there are no clustering columns, each user_id value identifies a partition that holds a single row.

Composite Partition Key

To use more than one column as the partition key (a composite partition key), wrap the partition key columns in an extra set of parentheses inside the PRIMARY KEY clause:

CREATE TABLE orders (
    customer_id UUID,  -- Part of the partition key
    order_id UUID,     -- Part of the partition key
    order_date TIMESTAMP,
    total DECIMAL,
    PRIMARY KEY ((customer_id, order_id))
);

customer_id and order_id together form the composite partition key, so the data is partitioned by the combination of these two columns. A query must supply both values to locate a partition.
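The effect of a composite partition key can be sketched in Python. This is a toy model under stated assumptions: partition_token is a hypothetical helper, and MD5 over the joined column values stands in for Cassandra's real partitioner.

```python
import hashlib


def partition_token(*key_columns: str) -> int:
    """Hash all partition key columns together into one token (toy stand-in)."""
    joined = "\x00".join(key_columns)
    digest = hashlib.md5(joined.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


# Same (customer_id, order_id) pair: same token, hence same partition.
same = partition_token("customer-1", "order-1") == partition_token("customer-1", "order-1")

# Same customer, different order: a different token, hence a different partition.
different = partition_token("customer-1", "order-1") != partition_token("customer-1", "order-2")

print(same, different)
```

Because the token depends on every column in the partition key, two orders from the same customer generally end up in different partitions, possibly on different nodes.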

Clustering Key:

Any additional columns in the PRIMARY KEY serve as clustering columns, which define how the data is sorted within a partition.

CREATE TABLE events (
    event_id UUID,        -- Partition key
    timestamp TIMESTAMP,  -- Clustering column
    event_description TEXT,
    PRIMARY KEY (event_id, timestamp)
);

• event_id is the partition key.

• timestamp is a clustering column, so rows within a partition are sorted by timestamp (ascending by default).
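The sorting behaviour can be illustrated with a small Python sketch. It is only a model of the idea: a dictionary plays the role of the partitions, and insert_event is a hypothetical helper that keeps each partition's rows ordered by the clustering value.

```python
import bisect
from collections import defaultdict
from datetime import datetime

# partition key -> list of (clustering value, payload), kept sorted on insert
partitions = defaultdict(list)


def insert_event(event_id: str, ts: datetime, description: str) -> None:
    """Place the row into its partition at the position given by the clustering value."""
    bisect.insort(partitions[event_id], (ts, description))


# Rows arrive out of order but are stored sorted within the partition.
insert_event("e1", datetime(2024, 10, 27, 12, 0), "second")
insert_event("e1", datetime(2024, 10, 27, 9, 0), "first")
insert_event("e2", datetime(2024, 10, 27, 10, 0), "other partition")

for ts, description in partitions["e1"]:
    print(ts.isoformat(), description)
```

Reading a partition returns its rows in clustering order regardless of the order in which they were written, which is what makes range queries on the clustering column cheap.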

Primary Key:

• The primary key is a combination of the partition key and the clustering key(s) (if any).

• It uniquely identifies each row in the table.

• The partition key is used to determine which node stores the row, while the clustering key is used to organize rows within the same partition.
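One consequence of the primary key identifying a row is that writing again with the same primary key does not create a second row. The Python sketch below models this with a dictionary keyed by (partition key, clustering key). The write helper is hypothetical, and the model replaces whole rows, whereas Cassandra merges writes column by column.

```python
# Toy table keyed by the full primary key: (partition key, clustering key).
table = {}


def write(event_id: str, ts: str, description: str) -> None:
    """Store a row; an existing row with the same primary key is replaced."""
    table[(event_id, ts)] = {"event_description": description}


write("e1", "2024-10-27T09:00", "created")
write("e1", "2024-10-27T09:00", "updated")    # same primary key: overwrites
write("e1", "2024-10-27T10:00", "follow-up")  # new clustering value: new row

print(len(table))
```

Two rows remain after three writes, because the first two writes share the same primary key.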
