Talk on “database scalability”
This is a very interesting talk by Jonathan Ellis on database scalability. He designed and implemented multi-petabyte storage for Mozy and is currently the project chair for Apache Cassandra.
What every developer should know about database scalability, PyCon 2010
- Scalability is not improving latency, but increasing throughput
- But overall performance shouldn’t degrade
- Throw hardware, not people, at the problem
- Traditional databases use B-tree indexes, but these require the entire index to be in memory in one place (on a single machine).
- Easy bandaid #1 – SSD storage helps when B-tree index lookups have to hit disk
- Easy bandaid #2 – Buy a faster server every 2 years. This works as long as your userbase doesn’t grow faster than Moore’s law
- Easy bandaid #3 – Use distributed caching to handle hotspots
- Memcache server failures can change which server each key hashes to
- Consistent hashing solves the problem by mapping keys to tokens. The tokens can be moved around as servers are added or removed, and apps can still figure out which keys live where.
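
The last two points are easier to see in code. Below is a minimal sketch, not taken from the talk: it compares naive modulo placement with a small consistent-hash ring. The names `stable_hash`, `modulo_server`, `HashRing`, the `cache-a` … `cache-d` server names, the choice of MD5, and the 100 tokens per server are all assumptions made for illustration.

```python
import bisect
import hashlib


def stable_hash(value: str) -> int:
    """Deterministic hash; Python's built-in hash() is salted per process."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


def modulo_server(key: str, servers: list[str]) -> str:
    """Naive placement: hash the key, take it modulo the server count."""
    return servers[stable_hash(key) % len(servers)]


class HashRing:
    """Consistent hashing: every server owns several tokens on a ring."""

    def __init__(self, servers, tokens_per_server=100):
        self.tokens_per_server = tokens_per_server
        self._tokens = []  # sorted token positions on the ring
        self._owner = {}   # token position -> server name
        for server in servers:
            self.add(server)

    def _server_tokens(self, server):
        # Tokens are derived from the server name, so they are reproducible.
        return [stable_hash(f"{server}#{i}") for i in range(self.tokens_per_server)]

    def add(self, server):
        for token in self._server_tokens(server):
            self._owner[token] = server
            bisect.insort(self._tokens, token)

    def remove(self, server):
        for token in self._server_tokens(server):
            del self._owner[token]
            self._tokens.remove(token)

    def lookup(self, key):
        # The key belongs to the first token at or after its hash, wrapping around.
        index = bisect.bisect_right(self._tokens, stable_hash(key)) % len(self._tokens)
        return self._owner[self._tokens[index]]


if __name__ == "__main__":
    servers = ["cache-a", "cache-b", "cache-c", "cache-d"]  # hypothetical names
    survivors = servers[:-1]  # pretend cache-d has failed
    keys = [f"user:{i}" for i in range(10_000)]

    moved_modulo = sum(
        modulo_server(k, servers) != modulo_server(k, survivors) for k in keys
    )

    ring = HashRing(servers)
    before = {k: ring.lookup(k) for k in keys}
    ring.remove("cache-d")
    moved_ring = sum(before[k] != ring.lookup(k) for k in keys)

    print("keys moved with modulo placement:", moved_modulo)
    print("keys moved with consistent hashing:", moved_ring)
```

With modulo placement, dropping one server changes the divisor, so most keys land on a different server and the cache is effectively cold. With the ring, only the keys that belonged to the removed server's tokens move; every other key stays where it was.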