Readings

Alex Robinson edited this page Jul 18, 2016 · 10 revisions

A good starting point is Andy Pavlo's survey paper "What's New with NewSQL", which also covers the history of SQL and NoSQL databases. You might also find lectures from his database internals class useful, especially if you need a crash course on a specific topic such as query planning or concurrency control.

Academic papers on Spanner/F1 (and related predecessors) are also a valuable resource, since Spanner was the original inspiration for CockroachDB:

  • The Spanner paper itself.
  • The F1 system builds more features on top of Spanner, some of which might make it into CockroachDB.
  • Online Asynchronous Schema Change in F1 describes how to evolve schemas in F1.
  • The Spanner paper assumes in many places an understanding of Bigtable and Colossus, so you might want to read about those (at the very least, the Bigtable paper) before attempting to understand Spanner.

Once you have the high-level picture, you might want to study specific building blocks of CockroachDB in more detail. Here are some subtopics, although this section could use more expansion.

Distributed systems lower levels:

Database internals:

  • Trying to understand a complex database concept that is new to you can be very difficult, particularly in a distributed setting. The MySQL documentation can be a good place to start, so that you first get a firm grasp of the single-machine setting before extending it to the distributed setting:

Query optimization:

Other database systems: