Sharding vs Partitioning
A database scales by splitting its data into smaller pieces. The two main strategies are sharding, which spreads the data across many machines, and partitioning, which splits the data inside a single database.
Sharding
Sharding splits a large database into smaller pieces called shards. Each shard is a separate database that holds a subset of the data. You usually shard by a key, such as user ID or location. Each shard can live on a different server, so you add capacity by adding servers. This is horizontal scaling.
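A minimal sketch of sharding by key may help here. It assumes four shards and a user ID as the sharding key, and it models each shard as an in-memory dict standing in for a separate database. The names NUM_SHARDS, shard_for, put, and get are hypothetical and chosen only for illustration.

```python
import hashlib

NUM_SHARDS = 4  # assumed shard count, for illustration only


def shard_for(user_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a sharding key to a shard index using a stable hash."""
    digest = hashlib.sha256(user_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each dict stands in for a separate database, possibly on its own server.
shards = [dict() for _ in range(NUM_SHARDS)]


def put(user_id: str, record: dict) -> None:
    """Store a record on the shard that owns this user ID."""
    shards[shard_for(user_id)][user_id] = record


def get(user_id: str):
    """Read a record from the shard that owns this user ID."""
    return shards[shard_for(user_id)].get(user_id)
```

A stable hash such as SHA-256 is used instead of Python's built-in hash(), because hash() on strings changes between interpreter runs. Note that plain modulo mapping moves most keys when the shard count changes, which is one reason consistent hashing is often preferred.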
How is sharding achieved?
- Consistent hashing can decide which shard holds a piece of data.
- App logic or a database load balancer can use the sharding key to route each query to the right shard.
- A database load balancer also helps when a query spans many shards. Vitess for MySQL is one example.
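The three points above can be sketched together in a few lines. This is an illustrative toy, not how Vitess or any particular product works: HashRing, Router, the vnodes parameter, and the shard names are all hypothetical, and each shard is again an in-memory dict. The ring places several virtual points per shard on a hash circle, and a key belongs to the shard that owns the first point after the key's hash.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Stable 64-bit hash used for both ring points and keys."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes (hypothetical sketch)."""

    def __init__(self, shard_names, vnodes=100):
        # Each shard gets `vnodes` points on the ring to even out the load.
        self._ring = sorted(
            (_hash(f"{name}#{i}"), name)
            for name in shard_names
            for i in range(vnodes)
        )
        self._points = [point for point, _ in self._ring]

    def shard_for(self, key: str) -> str:
        # First ring point clockwise from the key's hash, wrapping at the end.
        idx = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._ring[idx][1]


class Router:
    """Routes single-key queries to one shard and fans out cross-shard queries."""

    def __init__(self, shard_names):
        self.ring = HashRing(shard_names)
        self.shards = {name: {} for name in shard_names}

    def put(self, key, record):
        self.shards[self.ring.shard_for(key)][key] = record

    def get(self, key):
        return self.shards[self.ring.shard_for(key)].get(key)

    def scan(self, predicate):
        """Cross-shard query: ask every shard, then merge the results."""
        results = []
        for shard in self.shards.values():
            results.extend(r for r in shard.values() if predicate(r))
        return results
```

With this layout, adding a shard only moves the keys that now fall on the new shard's ring points. Every other key stays where it was.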
Partitioning
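Partitioning splits a single database into smaller pieces called partitions. You usually create partitions by a rule, such as date ranges or categories. Unlike shards, all partitions belong to the same database. They usually live on the same server. Partitioning can improve query speed, because a query that filters on the partitioning rule only has to scan the matching partitions. It also makes one database easier to manage. For example, you can drop an old partition instead of deleting its rows one by one.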
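As a rough illustration of partitioning by date range, the sketch below models one logical table whose rows are kept in per-quarter lists. RangePartitionedTable and its methods are hypothetical names, and real databases implement this inside the storage engine, not in application code. The point is only to show how a date filter lets a query skip partitions that cannot match.

```python
import bisect
from datetime import date


class RangePartitionedTable:
    """One logical table split into partitions by date range (toy model)."""

    def __init__(self, boundaries):
        # Sorted lower bounds: partition i holds rows with
        # boundaries[i] <= day < boundaries[i + 1]; the last one is open-ended.
        self.boundaries = sorted(boundaries)
        self.partitions = [[] for _ in self.boundaries]

    def _index(self, day: date) -> int:
        idx = bisect.bisect_right(self.boundaries, day) - 1
        if idx < 0:
            raise ValueError(f"{day} is before the first partition")
        return idx

    def insert(self, day: date, row) -> None:
        self.partitions[self._index(day)].append((day, row))

    def query(self, start: date, end: date):
        """Return (rows, partitions_scanned) for start <= day < end."""
        first = max(bisect.bisect_right(self.boundaries, start) - 1, 0)
        last = bisect.bisect_left(self.boundaries, end)  # exclusive upper index
        scanned = range(first, last)
        rows = [
            row
            for i in scanned
            for day, row in self.partitions[i]
            if start <= day < end
        ]
        return rows, len(scanned)
```

For example, with quarterly boundaries for 2024, a query for the second half of April touches one partition out of four, no matter how many rows the other three hold.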