pageserver: support 1 million relations #9516
Labels: c/storage/pageserver, t/feature
We do not currently define a maximum number of relations that we support, but it is known that things get dicey beyond about 10k relations. The full set of issues is unknown, but the primary architectural one is that we store RelDirectory as a monolithic blob that gets rewritten whenever we add or remove a relation.
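To see why the monolithic blob is the architectural bottleneck, here is a toy sketch (the class and tuple layout are illustrative, not the pageserver's actual code): when the whole directory is one serialized value, every relation create or drop rewrites the entire thing, so write amplification grows linearly with the number of relations.

```python
import pickle

class MonolithicRelDirectory:
    """Toy model of a directory stored as one serialized blob."""
    def __init__(self):
        self.rels = set()

    def add_rel(self, rel_tag):
        self.rels.add(rel_tag)
        # Re-serialize the *entire* directory on every change: the bytes
        # written per update grow with the total number of relations.
        return pickle.dumps(self.rels)

d = MonolithicRelDirectory()
# rel_tag here is a made-up (spcnode, dbnode, relnode, forknum) tuple.
blob_sizes = [len(d.add_rel((1663, 13000, oid, 0))) for oid in range(1, 1001)]
print(blob_sizes[0], blob_sizes[-1])
```

At 1 million relations the same pattern means every `CREATE TABLE` rewrites a multi-megabyte value, which is why a per-relation key layout is the natural fix.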
Postgres itself does not define a practical limit on relations per database: the hard limit is approximately one billion, but it is well known that the practical limit is much lower and depends on hardware and configuration.
To pick an arbitrary but realistic goal, let's support and test 1 million tables: large enough to be well beyond typical workloads today, yet far below Postgres's hard limit.
A tiny initial step in this direction is #9507, which adds a test that creates 8000 tables (not very many!) to reproduce a specific scaling bug in transaction aborts. That test currently has a relatively long runtime (tens of seconds) because our code for tracking timeline metadata is still very inefficient.
The goal is to make it work "fast enough", in the sense that a database is usable and things don't time out, but not necessarily to implement every possible optimisation. For example, logical size calculations will be expensive with 1 million relations (requiring many megabytes of reads from storage), and that is okay as long as the expense does not cause the system to fail from the user's point of view.
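As a rough back-of-envelope for the logical size point (the numbers here are assumptions, not measurements): if computing logical size requires reading one small size record per relation, 1 million relations already implies tens of megabytes of metadata reads, which is expensive but survivable as long as it does not block or time out user-visible operations.

```python
# Assumed figures, purely illustrative: one size record per relation,
# ~24 bytes each (key + 4-byte size + per-record overhead).
NUM_RELS = 1_000_000
BYTES_PER_RECORD = 24
total_mb = NUM_RELS * BYTES_PER_RECORD / (1024 * 1024)
print(f"{total_mb:.1f} MiB")  # ≈ 22.9 MiB of reads per size calculation
```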
- Implement code for rewriting metadata to the new format on startup. This should run very early during startup so that no other parts of the code need to understand the old format: we can then maintain this for a long time. The alternative is that we will have two read paths instead.
- `test_historic_storage_formats`
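The "rewrite on startup" idea can be sketched as follows (every key name and function here is hypothetical, not the pageserver's real layout): if the old monolithic blob exists, explode it into per-relation keys once, then delete it, so all later read paths only ever see the new format.

```python
import json

def migrate_rel_directory(store: dict) -> None:
    """One-time migration, run very early in startup (hypothetical sketch)."""
    old_blob = store.get("rel_directory_v1")
    if old_blob is None:
        return  # fresh timeline, or the migration has already run
    for rel in json.loads(old_blob):
        # One presence-marker key per relation in the new format.
        store[f"rel_v2/{rel}"] = "1"
    del store["rel_directory_v1"]  # old format is gone after this point

# Example: a timeline still carrying the old monolithic directory.
store = {"rel_directory_v1": json.dumps(["16384", "16385"])}
migrate_rel_directory(store)
print(sorted(store))  # only rel_v2/* keys remain
```

Running it idempotently (the early `return` when the old key is absent) means a crash mid-startup or a second restart is harmless.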
- `get_rel_exists()` during WAL ingestion with many relations (#9855)

Out of scope: