tests: Speed up test_pgdata_import_smoke on Postgres v17 (#10567)
The test runs this query:

    select count(*), sum(data::bigint)::bigint from t

to validate the results between the parts of the test. It performs a
simple sequential scan and aggregation, but it was taking an order of
magnitude longer on v17 than on previous Postgres versions, which
sometimes caused the test to time out. There were two reasons for that:

1. On v17, the planner estimates the table to have only one row. In
reality it has 305790 rows; older versions estimated it at 611580,
which is not too bad given that the table has not been analyzed, so
the planner bases that estimate just on the number of pages and the
widths of the datatypes. The new estimate of 1 row is much worse, and
it leads the planner to disregard parallel plans, whereas on older
versions you got a Parallel Seq Scan.

I tracked this down to upstream commit 29cf61ade3, "Consider fillfactor
when estimating relation size". With that commit, the
table_block_relation_estimate_size() function calculates that each page
accommodates less than one row once the fillfactor is taken into
account, which rounds down to 0. In reality, the executor will always
place at least one row on a page regardless of fillfactor, but the new
estimation formula doesn't take that into account (a back-of-the-envelope
sketch below illustrates the arithmetic).

I reported this to pgsql-hackers
(https://www.postgresql.org/message-id/2bf9d973-7789-4937-a7ca-0af9fb49c71e%40iki.fi);
we don't need to do anything more about it in Neon. It's OK not to use
parallel scans here: once issue 2 below is addressed, the queries are
fast enough without parallelism.

2. On v17, prefetching was not happening for the sequential scan. That's
because starting with v17, buffers are reserved in the shared buffer
cache before prefetching is initiated, and we use a tiny
shared_buffers=1MB setting in the tests. With such a small
shared_buffers setting, prefetching is effectively disabled to protect
the system from completely starving out of buffers.

   To address that, simply bump up shared_buffers in the test (the
buffer-count sketch below shows the difference).

This patch addresses the second issue, which is enough to fix the
problem.
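
To make issue 1 concrete, here is a back-of-the-envelope sketch of how
the per-page density estimate can round down to zero. All numbers (page
overhead, fillfactor, tuple width) are illustrative assumptions rather
than values taken from the test; the real logic lives in
table_block_relation_estimate_size() in upstream Postgres.

    # Illustrative sketch in Python; every number is an assumption for
    # demonstration, except the default 8KB Postgres block size.
    BLCKSZ = 8192            # default Postgres block size
    PAGE_OVERHEAD = 24       # assumed page-header size in bytes
    fillfactor = 0.10        # assume a table created with a low fillfactor
    tuple_width = 900        # assumed estimated row width, header included

    usable_bytes = (BLCKSZ - PAGE_OVERHEAD) * fillfactor  # space the planner assumes
    rows_per_page = int(usable_bytes // tuple_width)      # 816.8 // 900 truncates to 0
    print(rows_per_page)  # 0 rows/page, so the whole table is estimated at ~1 row,
                          # although the executor always fits at least 1 row per page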
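
Similarly for issue 2, a rough buffer-count sketch (again illustrative,
not code from the commit) of why shared_buffers=1MB throttles the v17
streaming reads, which keep a buffer pinned for every page they
prefetch:

    # Buffer-pool arithmetic for the two shared_buffers settings in play;
    # 8KB is the default Postgres block size.
    BLCKSZ = 8192
    buffers_before = 1 * 1024 * 1024 // BLCKSZ   # 128 buffers at shared_buffers=1MB
    buffers_after = 64 * 1024 * 1024 // BLCKSZ   # 8192 buffers at shared_buffers=64MB
    print(buffers_before, buffers_after)
    # With only 128 buffers for the entire server, pinning even a modest
    # prefetch distance would starve everything else, so prefetching is
    # effectively disabled; 64MB leaves ample headroom.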
hlinnaka authored Jan 30, 2025
1 parent 5e0c407 commit df87a55
Showing 1 changed file with 20 additions and 4 deletions.
24 changes: 20 additions & 4 deletions test_runner/regress/test_import_pgdata.py
@@ -70,6 +70,12 @@ def handler(request: Request) -> Response:
env.pageserver.stop()
env.pageserver.start()

# By default our tests run with a tiny shared_buffers=1MB setting. That
# doesn't allow any prefetching on v17 and above, where the new streaming
# read machinery keeps buffers pinned while prefetching them. Use a higher
# setting to enable prefetching and speed up the tests.
ep_config = ["shared_buffers=64MB"]

#
# Put data in vanilla pg
#
@@ -246,7 +252,11 @@ def validate_vanilla_equivalence(ep):
#

ro_endpoint = env.endpoints.create_start(
branch_name=import_branch_name, endpoint_id="ro", tenant_id=tenant_id, lsn=last_record_lsn
branch_name=import_branch_name,
endpoint_id="ro",
tenant_id=tenant_id,
lsn=last_record_lsn,
config_lines=ep_config,
)

validate_vanilla_equivalence(ro_endpoint)
@@ -276,7 +286,10 @@ def validate_vanilla_equivalence(ep):
# validate that we can write
#
rw_endpoint = env.endpoints.create_start(
branch_name=import_branch_name, endpoint_id="rw", tenant_id=tenant_id
branch_name=import_branch_name,
endpoint_id="rw",
tenant_id=tenant_id,
config_lines=ep_config,
)
rw_endpoint.safe_psql("create table othertable(values text)")
rw_lsn = Lsn(rw_endpoint.safe_psql_scalar("select pg_current_wal_flush_lsn()"))
@@ -296,7 +309,7 @@ def validate_vanilla_equivalence(ep):
ancestor_start_lsn=rw_lsn,
)
br_tip_endpoint = env.endpoints.create_start(
branch_name="br-tip", endpoint_id="br-tip-ro", tenant_id=tenant_id
branch_name="br-tip", endpoint_id="br-tip-ro", tenant_id=tenant_id, config_lines=ep_config
)
validate_vanilla_equivalence(br_tip_endpoint)
br_tip_endpoint.safe_psql("select * from othertable")
@@ -309,7 +322,10 @@ def validate_vanilla_equivalence(ep):
ancestor_start_lsn=initdb_lsn,
)
br_initdb_endpoint = env.endpoints.create_start(
branch_name="br-initdb", endpoint_id="br-initdb-ro", tenant_id=tenant_id
branch_name="br-initdb",
endpoint_id="br-initdb-ro",
tenant_id=tenant_id,
config_lines=ep_config,
)
validate_vanilla_equivalence(br_initdb_endpoint)
with pytest.raises(psycopg2.errors.UndefinedTable):

1 comment on commit df87a55

@github-actions
7565 tests run: 7204 passed, 0 failed, 361 skipped (full report)


Code coverage* (full report)

  • functions: 33.4% (8517 of 25538 functions)
  • lines: 49.1% (71523 of 145568 lines)

* collected from Rust tests only


The comment gets automatically updated with the latest test results
df87a55 at 2025-01-31T01:26:37.116Z :recycle:
