Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Teamcity build #3

Closed
wants to merge 1,926 commits into from
Closed

Teamcity build #3

wants to merge 1,926 commits into from

Conversation

sambhav-jain-16
Copy link
Owner

No description provided.

DrewKimball and others added 30 commits October 31, 2024 23:57
This commit copies the dep-rules for triggers to the release_24_3
directory. This is in a separate commit so that it can be omitted from
the backport to v24.3.

Informs cockroachdb#126359

Release note: None
- `resolved_age` is now NULL if there is no resolved timestamp.
- `created` is now a TIMESTAMPTZ rather than an INT.

Epic: none
Release note: None
Previously, the event stream poller would track only ranges undergoing an
initial scan. This patch adds tracking for ranges that are over 2 minutes
behind.

Epic: none

Release note: none
133432: roachtest: add gomock target for Test interface r=DarrylWong a=herkolategan

Previously, any updates to the test interface would cause the build to fail since the mock for it in the `clusterstats` package was not automatically generated. This change adds it to the bazel build in order to regenerate the mock for the roachtest Test interface.

Epic: None
Release note: None

Co-authored-by: Herko Lategan <[email protected]>
Previously, we added a new test to TestReaderCatalogTSAdvance to disable
AOST timestamps for reader catalogs and confirm errors were surfaced by
the leasing subsystem when timestamps are mixed. This test relied on a
fixed number of iterations to reproduce this error, which wasn't
reliable. To address this, this patch will use a lease manager hook to
cause descriptor updates to be partially be picked up when the AOST is
disabled. This ensures the scenario is encountered without extra
iterations.

Fixes: cockroachdb#133897
Fixes: cockroachdb#133902

Release note: None
133974: kvclient/kvcoord: one distsender.Rangefeed() api to rule them all r=dt a=dt

...and in the spanTimePairs bind them.

Release note: none.
Epic: none.

Co-authored-by: David Taylor <[email protected]>
133979: crosscluster/producer: poll for lagging ranges r=dt a=msbutler

Previously, the event stream poller would track only ranges undergoing an initial scan. This patch adds tracking for ranges that are over 2 minutes behind.

Epic: none

Release note: none

Co-authored-by: Michael Butler <[email protected]>
133986: catalog/replication: deflake TestReaderCatalogTSAdvance r=fqazi a=fqazi

Previously, we added a new test to TestReaderCatalogTSAdvance to disable AOST timestamps for reader catalogs and confirm errors were surfaced by the leasing subsystem when timestamps are mixed. This test relied on a fixed number of iterations to reproduce this error, which wasn't reliable. To address this, this patch will use a lease manager hook to cause descriptor updates to be partially be picked up when the AOST is disabled. This ensures the scenario is encountered without extra iterations.

Fixes: cockroachdb#133897
Fixes: cockroachdb#133902

Release note: None

Co-authored-by: Faizan Qazi <[email protected]>
Previously this test would fail due to 0A000 and 2BP01
even though those failures are expected.
In these code changes we properly handle these errors.

Epic: CRDB-25314

Fixes: cockroachdb#66662
Release note: None
133991: kvclient: fix rangefeed retry reason counters r=stevendanna a=stevendanna

Previously, retriable errors would often be recorded in the wrong
retry counter because the code that initialised the metrics assumed a
stable ordering when iterating a map.

Given the amount of work that is about to happen when restarting a
rangefeed, I doubt the optimization of a slice lookup vs a map lookup
is that important here.

Further, we almost surely do not want to panic just because we
couldn't find a metric for the reason a node might have sent to us, so
now rather than panic'ing we have a fallback counter for unknown error
types.

Epic: none

Release note: Fix bug that could result in incorrect metrics related
to retriable rangefeed errors.

Co-authored-by: Steven Danna <[email protected]>
Update script to check 24.2 and 24.3 branches as well.

Epic: none
Release note: None
…type changes

When altering a column’s type in a way that requires a backfill, we drop
the old column and add a new one. Previously, the new column was always
added to the end of the column family. This change ensures the column
family order is preserved.

In the DSC, the column family is updated when a new ColumnType element
is added. I introduced a new field to this element to control the order,
which requires specifying the column ID that the new column should
follow.

Epic: CRDB-25314
Closes cockroachdb#133040
Release note: none
Retrieving replica counts and region data via the `SHOW RANGES FROM
DATABASE` and `SHOW RANGES FROM TABLE` queries is extremely costly,
especially on a large cluster.

Executing these queries should not be done automatically and it's
unclear if the cost of computation is worth the data that the customer
can see. It would be better for a customer to explicitly opt-in to
computing this data in special cases.

A cluster setting (`ui.database_locality_metadata.enabled`) has
been introduced to allow customers to turn this functionality off if
they're managing a large fleet where it's difficult to prevent users
from using this feature. By default this data is enabled.

The implementation is admittedly a hack, designed to reduce the size
of this diff and the risk of the backport. Implementing this change
purely on the client-side resulted in too much code needing to be
changed as we would both need to conditionally change the queries
being used to load data, and gate them on cluster setting being loaded
successfully prior to the SQL requests being made.

Resolves: CRDB-42482

Release note (ops change, ui change): A new cluster setting
`ui.database_locality_metadata.enabled` allows operators to disable
loading extended database and table region information in the
DB Console Database and Table pages. This information can cause
significant CPU load on large clusters with many ranges. Versions of
this page from 24.3 onwards do not have this problem. If customers
will require this data, they can use the `SHOW RANGES FROM {DATABASE|
TABLE}` query via SQL to compute on-demand.
133863: server, ui: tweak language for throttling, remove obsolete alerts, add test probe r=dhartunian a=dhartunian

_Note: these 3 small changes are grouped for easy review + backporting_

----

[server: allow env var override for telemetry ping](cockroachdb@6f44b58) 

When testing the license throttling behavior, it's helpful to easily
override the telemetry ping timestamp. This timestamp will only be
accepted if it's smaller than the current recorded timestamp, which
reduces the chance that this ability can be used to extend the grace
period.

Epic: [CRDB-40209](https://cockroachlabs.atlassian.net/browse/CRDB-40209)
Release note: None
`@[dhartunian](https://github.com/cockroachdb/cockroach/commits?author=dhartunian)`
dhartunian committed 2 minutes ago

----
[ui: tweak license expiration text for throttle bar](cockroachdb@be48767) 

Instead of asking a user to "renew" we tell them to "add" a new
license, which matches other language we've used.

Epic: [CRDB-40853](https://cockroachlabs.atlassian.net/browse/CRDB-40853)
Release note: None
`@[dhartunian](https://github.com/cockroachdb/cockroach/commits?author=dhartunian)`
dhartunian committed 1 minute ago

----
[ui: remove core deprecation notification](cockroachdb@3190cf8) 

Prior to 24.3 release, we added a notification in the DB Console to
alert customers to the licensing changes and give them time to
prepare. Now that they'll be rolling out, the notice is removed since
it's no longer in the future.

Epic: [CRDB-40853](https://cockroachlabs.atlassian.net/browse/CRDB-40853)
Release note: None

Co-authored-by: David Hartunian <[email protected]>
Changes:

 * [`c1712d15`](cockroachdb/pebble@c1712d15) sstable: lazily allocate ValueFetcher

Release note: none.
Epic: none.
Epic: CRDB-17171
Release note: None
Release justification: Non-production code changes
134074: bazel-github-helper: correct typo r=celiala a=rickystewart

Epic: CRDB-17171
Release note: None
Release justification: Non-production code changes

Co-authored-by: Ricky Stewart <[email protected]>
133131: workload/schemachange: setColumnType operations will occasionally fail r=Dedej-Bergin a=Dedej-Bergin

Previously this test would fail due to 0A000 and 2BP01 even though those failures are expected.
In these code changes we properly handle these errors.

Epic: CRDB-25314

Fixes: cockroachdb#66662
Release note: None

133845: sql/schemachanger: Preserving column family order for complex column type changes r=spilchen a=spilchen

When altering a column’s type in a way that requires a backfill, we drop the old column and add a new one. Previously, the new column was always added to the end of the column family. This change ensures the column family order is preserved.

In the DSC, the column family is updated when a new ColumnType element is added. I introduced a new field to this element to control the order, which requires specifying the column ID that the new column should follow.

Epic: CRDB-25314
Closes cockroachdb#133040
Release note: none

Co-authored-by: Bergin Dedej <[email protected]>
Co-authored-by: Matt Spilchen <[email protected]>
In e60feee, we started persisting leader information in the HardState
and loading it upon restart. As a result, upon restart, every raft peer
begin its life in a heartbeat lease. This meant that for the first 2-4
seconds upon restart, a leader could not be elected.

This regression was particularly bad for quiesced ranges, as this 2-4
second timer would only start once the replica is ticked. An observed
effect of this was changefeedds over quiesced ranges, where we serially
acquire leases on the constituent ranges to publish the rangefeeds
closed timestamp. This 2-4 second timer would end up stacking, which
could cause O(ranges) seconds resolved timestamp lag. See the
accompanying test which constructs this scenario for more details.

This patch resolves this regression by only loading leader information
if a leader was fortified at shutdown time. If it wasn't, we don't load
leader information, and therefore don't begin life in a heartbeat lease.
Note that because leader leases are not quiesced, the 2s regression
is not as pernicious as it is for epoch based leases.

Epic: none

Release note: None
Validation sometimes takes longer than the current timeout to complete.
Increasing the test timeout until we can improve the performance of the
test validation.

Epic: none
Fixes: cockroachdb#134025
Fixes: cockroachdb#133936
Fixes: cockroachdb#133921
Fixes: cockroachdb#133799

Release note: None
…achdb#133496

133284: demo: remove copy referring to telemetry disabling env var r=angles-n-daemons a=angles-n-daemons

demo: remove copy referring to telemetry disabling env var

`COCKROACH_SKIP_ENABLING_DIAGNOSTIC_REPORTING` can still be used to disable telemetry for CRDB. With the core deprecation happening however, we want to streamline how people enable / disable telemetry in their cluster, so the copy to this variable has been removed.

It will still function as before, the only difference is that it will not show up in the demo startup message.

Epic: CRDB-40209
Fixes: cockroachdb#132688
Release note (general change):
COCKROACH_SKIP_ENABLING_DIAGNOSTIC_REPORTING is no longer mentioned in the demo command.

133334: sql,util: fix incorrect logtags.Buffer usage r=arulajmani a=stevendanna

(*logtags.Buffer).Add creates a new Buffer, copies the log tags from the receiver, adds the new tag to the new Buffer, and return the buffer. These callers weren't doing anything with the response so they weren't actually adding the tags they thought they were adding.

This seems like some wasted work, but I suppose the more common case is `(*logtags.Buffer).Merge` which is more efficient. (Note repeated calls with `logtags.AddTag(ctx, key, val)` is similarly unfortunate, but also rare in the codebase from what I can see).

Epic: none
Release note: None

133417: num32: add more functions to the num32 library r=DrewKimball,mw5h a=andy-kimball

Add additional numeric functions that vector indexing will use. Use the gonum BLAS library where it gives a performance advantage.

Epic: CRDB-42943

Release note: None

133496: tree: optionally treat string constants as collated r=mw5h a=mw5h

Previously, the type checker would reject comparisons between string constants and collated collated strings without an explicit type cast. This patch relaxes that restriction so that comparison of collated strings against string literals does the collated comparison as one would expect.

Fixes cockroachdb#133141

Release note (bug fix): String constants can now be compared against collated strings.

Co-authored-by: Brian Dillmann <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Andrew Kimball <[email protected]>
Co-authored-by: Matt White <[email protected]>
133075: apiv2,server: filter expensive db queries from sql-over-http r=dhartunian a=dhartunian

Retrieving replica counts and region data via the `SHOW RANGES FROM DATABASE` and `SHOW RANGES FROM TABLE` queries is extremely costly, especially on a large cluster.

Executing these queries should not be done automatically and it's unclear if the cost of computation is worth the data that the customer can see. It would be better for a customer to explicitly opt-in to computing this data in special cases.

A cluster setting (`ui.database_locality_metadata.enabled`) has been introduced to allow customers to turn this functionality off if they're managing a large fleet where it's difficult to prevent users from using this feature. By default this data is enabled.

The implementation is admittedly a hack, designed to reduce the size of this diff and the risk of the backport. Implementing this change purely on the client-side resulted in too much code needing to be changed as we would both need to conditionally change the queries being used to load data, and gate them on cluster setting being loaded successfully prior to the SQL requests being made.

Resolves: CRDB-42482

Release note (ops change, ui change): A new cluster setting `ui.database_locality_metadata.enabled` allows operators to disable loading extended database and table region information in the DB Console Database and Table pages. This information can cause significant CPU load on large clusters with many ranges. Versions of this page from 24.3 onwards do not have this problem. If customers will require this data, they can use the `SHOW RANGES FROM {DATABASE| TABLE}` query via SQL to compute on-demand.

Co-authored-by: David Hartunian <[email protected]>
133512: sql/schemachanger: add dep rules between trigger element and its dependents r=DrewKimball a=DrewKimball

This commit adds a set of dependency rules to the declarative schemachanger to ensure that triggers and their dependent elements (name, references etc.) are added and removed in the correct order.

Informs cockroachdb#126359

Release note: None

Co-authored-by: Drew Kimball <[email protected]>
…gers

This commit prevents a possible bug in uniqueness checks when the table has
row-level BEFORE triggers. The issue is an optimization that attempts to
inline INSERT values into the check, to avoid buffering. This inlining
logic uses the input of the INSERT operator *before* row-level triggers
are added, and so it does not observe any modifications made to the rows.
This could in theory result in the check failing to detect duplicates,
although I was unable to create a logictest to produce this result.

Fixes cockroachdb#133329

Release note: None
134080: kv: ensure leader election is not delayed for quiesced ranges on restart r=nvanbenschoten a=arulajmani

In e60feee, we started persisting leader information in the HardState and loading it upon restart. As a result, upon restart, every raft peer begin its life in a heartbeat lease. This meant that for the first 2-4 seconds upon restart, a leader could not be elected.

This regression was particularly bad for quiesced ranges, as this 2-4 second timer would only start once the replica is ticked. An observed effect of this was changefeedds over quiesced ranges, where we serially acquire leases on the constituent ranges to publish the rangefeeds closed timestamp. This 2-4 second timer would end up stacking, which could cause O(ranges) seconds resolved timestamp lag. See the accompanying test which constructs this scenario for more details.

This patch resolves this regression by only loading leader information if a leader was fortified at shutdown time. If it wasn't, we don't load leader information, and therefore don't begin life in a heartbeat lease. Note that because leader leases are not quiesced, the 2s regression is not as pernicious as it is for epoch based leases.

Epic: none

Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
This commit modifies the following tests to allow them to run in
both store liveness enabled/disabled configurations:

1) TestCommitWithoutNewTermEntry
2) TestConfChangeCheckBeforeCampaign
3) TestConfChangeV2CheckBeforeCampaign
4) TestLeaderCycle
5) TestLeaderCyclePreVote
6) TestLeaderTransferReceiveHigherTermVote
7) TestLogReplication
8) TestOldMessages

References: cockroachdb#132241

Release note: None
133670: raft: enable store liveness in raft unit tests phase 1 r=iskettaneh a=iskettaneh

This commit modifies the following tests to allow them to run in both store liveness enabled/disabled configurations:

1) TestCommitWithoutNewTermEntry
2) TestConfChangeCheckBeforeCampaign
3) TestConfChangeV2CheckBeforeCampaign
4) TestLeaderCycle
5) TestLeaderCyclePreVote
6) TestLeaderTransferReceiveHigherTermVote
7) TestLogReplication
8) TestOldMessages

References: cockroachdb#132241

Release note: None

134065: scripts: update check-pebble-dep.sh r=RaduBerinde a=RaduBerinde

Update script to check 24.2 and 24.3 branches as well.

Epic: none
Release note: None

134081: roachtest: increase timeout for cdc/bank roachtest r=rharding6373 a=rharding6373

Validation sometimes takes longer than the current timeout to complete. Increasing the test timeout until we can improve the performance of the test validation.

Epic: none
Fixes: cockroachdb#134025
Fixes: cockroachdb#133936
Fixes: cockroachdb#133921
Fixes: cockroachdb#133799

Release note: None

Co-authored-by: Ibrahim Kettaneh <[email protected]>
Co-authored-by: Radu Berinde <[email protected]>
Co-authored-by: rharding6373 <[email protected]>
134048: sql: further crdb_internal.active_range_feed cleanups r=dt a=stevendanna

- `resolved_age` is now NULL if there is no resolved timestamp.
- `created` is now a TIMESTAMPTZ rather than an INT.

Epic: none
Release note: None

Co-authored-by: Steven Danna <[email protected]>
Right now, the bench doesn't assert that the commit converges.
Moreover, it benches a test-only type that we don't use outside
the testing environment.

Fixes: cockroachdb#133973

Release note: None
craig[bot] and others added 29 commits November 7, 2024 04:10
134470: settings: retire kvadmission.low_pri_read_elastic_control.enabled r=dt a=dt

Release note (ops change): The kvadmission.low_pri_read_elastic_control.enabled has been removed as all bulk requests are now subject to eleastic admission by default.
Epic: https://cockroachlabs.atlassian.net/browse/CRDB-39798.

Co-authored-by: David Taylor <[email protected]>
134407: schemachange/mixed-versions: handle possible syntax error for BIT(0) column usage r=spilchen a=spilchen

A recent fix enabled support for BIT(0) columns (see issue cockroachdb#132944), which has since been backported. However, since this backport hasn’t been released yet, running against an older down-level client may still trigger a syntax error. This change adds a temporary check to handle such errors until the backports for cockroachdb#132944 are publicly released.

This adjustment will be backported to older branches, with slight variations per release:
- release-24.3: Reuse the same fix as in master.
- release-24.2: Use isClusterVersionLessThan with clusterversion.V24.2.
- release-24.1: Use isClusterVersionLessThan with clusterversion.V24.1.
- release-23.2: Use isClusterVersionLessThan with clusterversion.V23.2.

Epic: None
Release note: None
Closes cockroachdb#133339
Closes cockroachdb#133790
Closes cockroachdb#131162

Co-authored-by: Matt Spilchen <[email protected]>
The extra layer of wrapper/indirection between the batching logic and
retrying in the LDR processor and the actual kv or sql batch handler was
not really helping us and indeed was going to make extending the KV
handler to support multi-row batches harder. Instead the KV and SQL
processors now directly implement HandleBatch themselves.

Release note: none.
Epic: none.
134377: sql/schemachanger: fix a misattributed metric and noisy log r=rafiss a=rafiss

### sql: only increment sql.ddl.count for DDL operations

This was leasing to a highly inflated count of the metric due to TCL
statements such as DISCARD ALL. Those are not DDL statements, but they
can modify schema by dropping temporary sequences. Using the statement
return type is a more accurate way to check.

Release note (bug fix): Fixed a minor bug where DISCARD ALL statements
were counted under the sql.ddl.count metric. Now these will be counted
under the sql.misc.count metric.

### sql: move a spammy log to a higher verbosity level

This was getting logged too frequently, since the function is called during the
planning phase to check if a statement works in the declarative schema
changer. This would get logged even if the statement does not use the
declarative schema changer, which is a little misleading.

Epic: None
Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Previously, all of our PCR reader catalog tests tested fairly short
txn's when advancing the timestamp. This confirmed that in the common
case the reader catalog timestamp could be pushed forward. However, when
long running txn's are executing we will need to wait for their leases
to expire before the timestamps can be moved forward. To address this,
this patch adds a test to confirm that we wait for leases to expire and
that user txn's cannot be impacted because they have a AOST set.

Fixes: cockroachdb#133731

Release note: None
In cockroachdb#125267 we added CREATE STATISTICS and ANALYZE statements to all of
the sqlsmith-based randomized tests. But because CREATE STATISTICS and
ANALYZE produce table statistics with nondeterministic `createdAt`
times, this makes the forecasts based on these stats nondeterministic as
well. I think the proper fix here would be to add an option to CREATE
STATISTICS to set the `createdAt` time, but for now simply disable stats
forecasting.

Epic: None

Release note: None
134240: catalog/replication: test long running txns while advancing timestamps r=fqazi a=fqazi

Previously, all of our PCR reader catalog tests tested fairly short txn's when advancing the timestamp. This confirmed that in the common case the reader catalog timestamp could be pushed forward. However, when long running txn's are executing we will need to wait for their leases to expire before the timestamps can be moved forward. To address this, this patch adds a test to confirm that we wait for leases to expire and that user txn's cannot be impacted because they have a AOST set.

Fixes: cockroachdb#133731

Release note: None

Co-authored-by: Faizan Qazi <[email protected]>
134362: raft: fix TODOs in TestLeaderTransfer{StaleFollower,StepsDownImmediately} r=nvanbenschoten a=arulajmani

These TODOs were written because we hadn't implemented the de-fortification protocl back then. Now that we have, we can address them.

Epic: none
Release note: None

Co-authored-by: Arul Ajmani <[email protected]>
134272: crosscluster/logical: eliminate RowProcessor indirection r=dt a=dt

The extra layer of wrapper/indirection between the batching logic and retrying in the LDR processor and the actual kv or sql batch handler was not really helping us and indeed was going to make extending the KV handler to support multi-row batches harder. Instead the KV and SQL processors now directly implement HandleBatch themselves.

Release note: none.
Epic: none.

Co-authored-by: David Taylor <[email protected]>
134324: cloud: fix missing shared config profile error on kms backups r=msbutler a=kev-cao

Backups with KMS keys fail due to a missing shared config profile `default` after the upgrade to the AWS SDK V2. This is due to a bad port over to the V2 SDK where we enforce that the `default` shared config must be used for implicit auth on AWS KMS. As we now use `config.LoadDefaultConfig` to load the client configuration, it will automatically attempt to load the shared config but ignore the error if it does not exist. We should not explicitly specify that the shared config must be used or else it will error upon failing to find it.

Fixes: cockroachdb#134214

Epic: none

Release note (bug fix): Fixed AWS backup/restore with kms failing due to missing shared config.

Co-authored-by: Kevin Cao <[email protected]>
In cockroachdb#133092 the min version was set at 23.2 for the test, however it was
only applied to the "locality=global" variation. This also applies to
the single-region variation.

Fixes: cockroachdb#133433

Release note: None
- Gate status server grpc UpdateTableMetadataCache on ADMIN
This should only be accessed via the api v2 http handler, which
performs finer privilege checks on if the user has CONNECT privs
on a database.
- When filtering dbs and tables to return in the apis, permit
dbs and tables that have CONNECT on the public role.
- Move TestUpdateTableMetadataCacheJobRunsOnRPCTrigger test
to status_test.go. This test is more well-suited to live here since
it tests functionality in the status server.

Epic: CRDB-37558
Fixes: cockroachdb#134431
Fixes: cockroachdb#130245

Release note (ui change): Users may access DB Console's db pages
(db overview, tables overview, table details) if they have CONNECT
privilege on the database.
Previously many of the test utilities were inside the tests package.
This forced any tests to also be inside that package. By moving them
into roachtestutil (and making public) it becomes easier to split tests
out.

Epic: None

Release note: None
…achdb#134422 cockroachdb#134514

134209: roachtest: move utilities to roachtestutil r=srosenberg a=andrewbaptist

Previously many of the test utilities were inside the tests package. This forced any tests to also be inside that package. By moving them into roachtestutil (and making public) it becomes easier to split tests out.

Epic: None

Release note: None

134213: roachtest: fix multi-store-remove roachtest r=itsbilal a=jbowens

Fix the multi-store-remove roachtest. Setting --wal-failover=disabled on a multi-store node that has never had WAL failover enabled is not yet supported.

Fixes cockroachdb#133939.
Informs cockroachdb#131468.
Epic: none
Release note: none

134335: roachtest: disable 23.1 -> 23.2 testing for follower reads r=miraradeva a=andrewbaptist

In cockroachdb#133092 the min version was set at 23.2 for the test, however it was only applied to the "locality=global" variation. This also applies to the single-region variation.

Fixes: cockroachdb#133433

Release note: None

134422: crosscluster: require 24.3 for LDR r=stevendanna a=stevendanna

We now require a number of options that are only available on 24.3.

Epic: none
Release note: None

134514: sqlsmith: disable stats forecasting in sqlsmith-based tests r=mgartner,rytaft a=michae2

In cockroachdb#125267 we added CREATE STATISTICS and ANALYZE statements to all of the sqlsmith-based randomized tests. But because CREATE STATISTICS and ANALYZE produce table statistics with nondeterministic `createdAt` times, this makes the forecasts based on these stats nondeterministic as well. I think the proper fix here would be to add an option to CREATE STATISTICS to set the `createdAt` time, but for now simply disable stats forecasting.

Epic: None

Release note: None

Co-authored-by: Andrew Baptist <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
Co-authored-by: Steven Danna <[email protected]>
Co-authored-by: Michael Erickson <[email protected]>
134458: server: databases apis privilege fixes r=xinhaoz a=xinhaoz

- Gate status server grpc UpdateTableMetadataCache on ADMIN This should only be accessed via the api v2 http handler, which performs finer privilege checks on if the user has CONNECT privs on a database.
- When filtering dbs and tables to return in the apis, permit dbs and tables that have CONNECT on the public role.
- Move TestUpdateTableMetadataCacheJobRunsOnRPCTrigger test to status_test.go. This test is more well-suited to live here since it tests functionality in the status server.

Epic: CRDB-37558
Fixes: cockroachdb#134431
Fixes: cockroachdb#130245

Release note (ui change): Users may access DB Console's db pages (db overview, tables overview, table details) if they have CONNECT privilege on the database.

134466: pebbleiter: implement ValueAndErr r=RaduBerinde a=jbowens

Implement ValueAndErr on the storage/pebbleiter package's assertionIter to ensure correct use of value slices. Additionally, assert that no callers make use of the deprecated Value method.

Epic: none
Release note: none

Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Jackson Owens <[email protected]>
Currently, if the heartbeat loop's connection closes, the heartbeater will fail
repeatedly until the producer job times out and fails. In this patch, the
heartbeat call will attempt to recover from this failure mode by attempting
re-establish the connection.

Epic: none

Release note: none
This will allow us to access useful fields in the EvalCtx, such as the
fields relating to transaction state.

Release note: None
133893: colexec: harden eager cancellation in parallel unordered sync r=yuzefovich a=yuzefovich

This commit hardens the eager cancellation mechanism in the parallel unordered synchronizer. It was recently fixed in dda8b3a, but the newly added test exposed a bug where the eager cancellation in a child PUS could poison the query execution of another input of the parent PUS, incorrectly failing the query altogether. More detailed description can be found [here](cockroachdb#127942 (comment)), but in short, due sharing of the same leaf txn between most operators in a flow, eager cancellation of one operator could lead to poisoning the execution of another operator which could only happen with a hierarchy of PUSes. This commit fixes such situation by swallowing all context cancellation errors in draining state of a PUS _even if_ that particular PUS didn't eagerly cancel its inputs.

The rationale for why this behavior is safe is the following:
- if the query should result in an error, then some other error must have been propagated to the client, and this is what caused the sync to transition into the draining state in the first place. (We do replace errors for the client in one case - set `DistSQLReceiver.SetError` where some errors from KV have higher priority then others, but it isn't applicable here.)
- if the query should not result in an error and should succeed, yet we have some pending context cancellation errors, then it must be the case that query execution was short-circuited (e.g. because of the LIMIT), so we can pretend the part of the execution that hit the pending error didn't actually run since clearly it wasn't necessary to compute the query result.

Note that we couldn't swallow all types of errors in the draining state (e.g. ReadWithinUncertaintyIntervalError that comes from the KV layer results in "poisoning" the txn, so we need to propagate it to the client), so we only have a single error type that we swallow.

Also note that having another PUS is needed for this problem to occur because we must have concurrency between the child PUS that performs the eager cancellation and another operator that gets poisoned, and while we have two sources of concurrency within a single flow, only PUS is applicable (the other being outboxes but we only have eager cancellation for local plans).

Additionally, while working on this change I realized another reason for why we don't want to lift the restriction for having eager cancellation only on "leaf" PUSes, so I extended the comment. This commit also adds a few more logic tests.

Fixes: cockroachdb#127942.

Release note: None

133960: roachtest: add cs op to toggle automatic table metadata job r=xinhaoz a=xinhaoz

This commit adds a roachtest operation that toggles automatic runs of the update table metadata job, which collects and writes information about storage statistics for all tables in the cluster.

Epic: none

Release note: None

134373: sql/schemachanger: plumb EvalCtx into BuildCtx r=rafiss a=rafiss

This will allow us to access useful fields in the EvalCtx, such as the fields relating to transaction state.

Epic: None
Release note: None

Co-authored-by: Yahor Yuzefovich <[email protected]>
Co-authored-by: Xin Hao Zhang <[email protected]>
Co-authored-by: Rafi Shamim <[email protected]>
This commit disallows replacing a trigger function that is actively
used in a trigger. This is necessary because triggers keep an inlined
version of the function body (specialized for the table the trigger is on)
and the inlined version currently does not change when the function is
replaced. Note that it is still possible to use CREATE OR REPLACE syntax
if the function is not yet used in a trigger.

Fixes cockroachdb#134549

Release note: None
Once we advance the version to to 25.1, we will bump again to 24.3

Epic: REL-1292
Release note: None
134307: crosscluster/streamclient: attempt to reconnect conn in heartbeat call r=msbutler a=msbutler

Currently, if the heartbeat loop's connection closes, the heartbeater will fail repeatedly until the producer job times out and fails. In this patch, the heartbeat call will attempt to recover from this failure mode by attempting re-establish the connection.

Epic: none

Release note: none

Co-authored-by: Michael Butler <[email protected]>
134352: raft: don't remember leader after stepping down if not fortified r=nvanbenschoten a=nvanbenschoten

This commit extends the change in bee163a to not remember that a replica was the leader if it is not fortified when stepping down. This gives us time to stabilize the following state as the v24.3 release is rolled out:

```
r.state = StateFollower && r.lead = r.id
```

Once this state is stabilized (within and above `pkg/raft`), we can remove this special case.

Epic: None
Release note: None

Co-authored-by: Nathan VanBenschoten <[email protected]>
134347: jobs: don't redact job ID in log tags; don't redact internal executor opName r=rafiss a=rafiss

This will assist with debugging.

fixes cockroachdb#132113
Release note: None

134370: quantize: add rabitq quantization r=mw5h a=andy-kimball

This commit implements RaBitQ quantization according to the algorithm described in this paper:

    "RaBitQ: Quantizing High-Dimensional Vectors with a Theoretical Error Bound
    for Approximate Nearest Neighbor Search" by Jianyang Gao & Cheng Long.

The RaBitQ quantization method provides good accuracy, produces compact codes, provides practical error bounds, is easy to implement, and can be accelerated with fast SIMD instructions. RaBitQ quantization codes use only 1 bit per dimension in the original vector.

Epic: CRDB-42943

Release note: None

Co-authored-by: Rafi Shamim <[email protected]>
Co-authored-by: Andrew Kimball <[email protected]>
134575: sql: disallow replacing a referenced trigger function r=DrewKimball a=DrewKimball

This commit disallows replacing a trigger function that is actively used in a trigger. This is necessary because triggers keep an inlined version of the function body (specialized for the table the trigger is on) and the inlined version currently does not change when the function is replaced. Note that it is still possible to use CREATE OR REPLACE syntax if the function is not yet used in a trigger.

Fixes cockroachdb#134549

Release note: None

Co-authored-by: Drew Kimball <[email protected]>
134598: Revert "c-deps: upgrade krb5 to 1.21.3 upstream" r=rickystewart a=msbutler

This reverts commit eb8c145.

Co-authored-by: Michael Butler <[email protected]>
134340: clusterversion: bump MinSupported to 24.2 r=RaduBerinde a=RaduBerinde

Once we advance the version to to 25.1, we will bump again to 24.3

Epic: REL-1292
Release note: None

Co-authored-by: Radu Berinde <[email protected]>
cockroachdb#129221 and
cockroachdb#132023 added changes to
exporters to emit openmetrics. This PR makes changes to the roachtests
to make use of the changes in the above PRs. This change also made some
changes to some roachtests that use neither of the above approaches

Epic: https://cockroachlabs.atlassian.net/browse/CRDB-41852

Release note: None
<what was there before: Previously, ...>
<why it needed to change: This was inadequate because ...>
<what you did about it: To address this, this patch ...>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.