
Software Architect Handbook


Database scaling

There are two broad approaches to database scaling: vertical and horizontal.

Vertical scaling (Scaling up)

Vertical scaling means adding more power (CPU, RAM, disk, etc.) to an existing machine. Powerful database servers exist; however, vertical scaling comes with some serious drawbacks:

Horizontal scaling (Sharding)

Horizontal scaling is the practice of adding more servers. Sharding separates a large database into smaller, more easily managed parts called shards. Each shard shares the same schema, but the data on each shard is unique to that shard.

There are different techniques for sharding. For example, you could partition user data by user ID: whenever you access data, a hash function determines the corresponding shard (such as user_id % 4 if you have 4 shards).
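
The user_id % 4 rule can be sketched in a few lines of Python. This is only an illustration of the modulo example above: the function name shard_for and the constant NUM_SHARDS are hypothetical, and integer user IDs are assumed.

```python
NUM_SHARDS = 4  # assumption: matches the 4-shard example in the text


def shard_for(user_id: int, num_shards: int = NUM_SHARDS) -> int:
    """Return the index of the shard that should hold this user's data."""
    if num_shards <= 0:
        raise ValueError("num_shards must be positive")
    # Modulo acts as the (very simple) hash function from the example.
    return user_id % num_shards


# Every lookup for the same user ID lands on the same shard.
for uid in (7, 8, 10):
    print(f"user {uid} -> shard {shard_for(uid)}")
```

Because the result depends on the number of shards, changing num_shards moves most keys to a different shard, which is one reason the shard count is usually chosen carefully up front.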

Sharding Key

The most important factor to consider when implementing a sharding strategy is the choice of the sharding key, also known as the partition key. One of the most important criteria is to choose a key that distributes data evenly across shards.
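
To make "evenly distribute" concrete, the sketch below counts how many rows land on each shard for two candidate keys. The data is made up for illustration: 1000 sequential user IDs, and a hypothetical country_id column where most users share one country. Modulo placement over 4 shards is assumed.

```python
from collections import Counter


def shard_counts(keys, num_shards=4):
    """Count how many keys land on each shard under modulo placement."""
    counts = Counter(key % num_shards for key in keys)
    return [counts.get(i, 0) for i in range(num_shards)]


# Hypothetical data sets, invented for this illustration.
user_ids = range(1, 1001)                      # 1000 distinct user IDs
country_ids = [1] * 900 + [2] * 60 + [3] * 40  # most users in one country

even = shard_counts(user_ids)       # user_id as the sharding key
skewed = shard_counts(country_ids)  # country_id as the sharding key

print("user_id:   ", even)
print("country_id:", skewed)
```

With these inputs, user_id spreads the rows equally across the four shards, while country_id sends 900 of the 1000 rows to a single shard and leaves one shard empty.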

A sharding key consists of one or more columns that determine how data is distributed. For example, "user_id" could be the sharding key.

A sharding key allows you to retrieve and modify data efficiently by routing database queries to the correct database.
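
The routing idea can be sketched as a small router that picks a database connection from the sharding key. This is a minimal sketch, not a production design: the ShardRouter class and the users table are hypothetical, and in-memory SQLite databases stand in for separate database servers.

```python
import sqlite3


class ShardRouter:
    """Route reads and writes to a shard chosen by user_id (the sharding key)."""

    def __init__(self, num_shards=4):
        # Assumption: each in-memory SQLite database stands in for one shard.
        self.shards = [sqlite3.connect(":memory:") for _ in range(num_shards)]
        for conn in self.shards:
            # Every shard has the same schema; only the rows differ.
            conn.execute(
                "CREATE TABLE users (user_id INTEGER PRIMARY KEY, name TEXT)"
            )

    def _conn(self, user_id):
        # Modulo placement, as in the user_id % 4 example.
        return self.shards[user_id % len(self.shards)]

    def insert_user(self, user_id, name):
        conn = self._conn(user_id)
        conn.execute(
            "INSERT INTO users (user_id, name) VALUES (?, ?)", (user_id, name)
        )
        conn.commit()

    def get_user(self, user_id):
        row = self._conn(user_id).execute(
            "SELECT name FROM users WHERE user_id = ?", (user_id,)
        ).fetchone()
        return row[0] if row else None


router = ShardRouter()
router.insert_user(42, "ada")   # 42 % 4 == 2, so the row is stored on shard 2
print(router.get_user(42))
```

A query that includes the sharding key touches exactly one shard. A query that does not include it (for example, a lookup by name) would have to be sent to every shard, which is why access patterns matter when choosing the key.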

Drawbacks

Sharding introduces complexities and new challenges to the system: