Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

roachtest: Add LDAP conn. latency test via roachtest #130998

Merged

Conversation

sanchit-CRL
Copy link
Collaborator

@sanchit-CRL sanchit-CRL commented Sep 19, 2024

No test previously existed to compute and monitor LDAP
connection latency

Created a roachtest which leverages the workload
to get the stats for LDAP connection latency
The test provisions an openLDAP service and it's user jdoe
which is authenticated on the CRDB via LDAP.
The test

  • provisions openLDAP with TLS connection
  • Creates a user named jdoe into CRDB
  • Sets the HBA conf and custom CA into the cluster settings
  • runs the workload binary to compute the connection latency

Epic: CRDB-40412
Fixes: #127358

Release note: None

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch from 76ab3b7 to 4cea168 Compare September 19, 2024 08:14
@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch from 4cea168 to d0d7c01 Compare September 19, 2024 09:38
Copy link
Contributor

@souravcrl souravcrl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @pritesh-lahoti and @sanchit-CRL)


-- commits line 2 at r1:
nit: since the title is prefixed with a package name, we don't capitalize the title. The text that follows after colon(:) is not capitalized.

https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/73072807/Git+Commit+Messages#Commit-title


-- commits line 13 at r1:
nit: spell error

Suggestion:

  * Sets the HBA conf and custom CA into the cluster settings

-- commits line 17 at r1:
nit: we need to provide a release note at the end of the commit even if there is none, as that helps docs team to go through the commits for documentation.
https://cockroachlabs.atlassian.net/wiki/spaces/CRDB/pages/186548364/Release+notes#High-level-rules

Suggestion:

  Release Note: none

build/teamcity/cockroach/nightlies/roachtest_nightly_gce.sh line 10 at r1 (raw file):

source "$dir/teamcity-bazel-support.sh"  # For run_bazel

BAZEL_SUPPORT_EXTRA_DOCKER_ARGS="-e LITERAL_ARTIFACTS_DIR=$root/artifacts -e BUILD_VCS_NUMBER -e CLOUD -e COCKROACH_DEV_LICENSE -e TESTS -e COUNT -e GITHUB_API_TOKEN -e GITHUB_ORG -e GITHUB_REPO -e GOOGLE_EPHEMERAL_CREDENTIALS -e GOOGLE_KMS_KEY_A -e GOOGLE_KMS_KEY_B -e GOOGLE_CREDENTIALS_ASSUME_ROLE -e GOOGLE_SERVICE_ACCOUNT -e SLACK_TOKEN -e TC_BUILDTYPE_ID -e TC_BUILD_BRANCH -e TC_BUILD_ID -e TC_SERVER_URL -e SELECT_PROBABILITY -e COCKROACH_RANDOM_SEED -e ROACHTEST_ASSERTIONS_ENABLED_SEED -e ROACHTEST_FORCE_RUN_INVALID_RELEASE_BRANCH -e GRAFANA_SERVICE_ACCOUNT_JSON -e GRAFANA_SERVICE_ACCOUNT_AUDIENCE -e ARM_PROBABILITY -e USE_SPOT -e SELECTIVE_TESTS -e SFUSER -e SFPASSWORD -e SIDE_EYE_API_TOKEN -e COCKROACH_EA_PROBABILITY -e LDAP_TEST_USER_PASSWORD" \

nit: for SnowFlake I see we use both username and password credentials. It should be more secure to pass in the full credential here.


pkg/cmd/roachtest/tests/connection_latency.go line 170 at r1 (raw file):

		Cluster: r.MakeClusterSpec(numNodes+1,
			spec.WorkloadNode(), spec.GCEZones(regionUsCentral)),
		RequiresLicense:            false,

nit: this is required since LDAP feature is entirely behind cockroach enterprise license. Setting this to true sets the COCKROACH_DEV_LICENSE for the test.


pkg/cmd/roachtest/tests/connection_latency.go line 184 at r1 (raw file):

	ldapTestUserPassword, ok := os.LookupEnv("LDAP_TEST_USER_PASSWORD")
	if !ok {
		t.L().Printf("environment variable LDAP_TEST_USER_PASSWORD is not set")

nit: this should fail the test if environment variable is not set. We do this for github API token here:

if !ok {
t.Fatalf("GitHub API token environment variable %s is not set", githubAPITokenEnv)
}


pkg/cmd/roachtest/tests/connection_latency.go line 233 at r1 (raw file):

	// Currently the test runs for a single region, this is for future use
	// if the test has to be extended for multiple regions.
	if numZones > 1 {

This would probably need change to HBA as each connection for a different region will need to be mapped to a separate LDAP server. This seems beyond scope of current work, and can be left out I feel.


pkg/cmd/roachtest/testdata/ldap_authentication_hba_conf line 2 at r1 (raw file):

host    all           roachprod           0.0.0.0/0          password
host    all           all            all                 ldap ldapserver=crlcloud.dev ldapport=636 \"ldapbasedn=OU=AADDC Users,DC=crlcloud,DC=dev\" \"ldapbinddn=CN=Test User,OU=AADDC Users,DC=crlcloud,DC=dev\" ldapbindpasswd=~cockroachTest ldapsearchattribute=sAMAccountName \"ldapsearchfilter=(memberOf=CN=crl-crlcloud-dev-domain-sync-users,OU=AADDC Users,DC=crlcloud,DC=dev)\"

nit: we should also redact the ldap password and username from HBA conf. Effectively this could be set from the same environment variable used for the latency test.

Copy link
Contributor

@pritesh-lahoti pritesh-lahoti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 6 of 6 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @sanchit-CRL and @souravcrl)


pkg/cmd/roachtest/tests/connection_latency.go line 204 at r1 (raw file):

	urlTemplate := func(host string) string {
		return fmt.Sprintf("postgresql://testuser:%s@%s:{pgport:1}", ldapTestUserPassword, host)

nit: can move testuser to a var/constant and then use it here and at prepareSQLUserForLDAP(ctx, t, c, "testuser").


pkg/cmd/roachtest/tests/connection_latency.go line 233 at r1 (raw file):

Previously, souravcrl wrote…

This would probably need change to HBA as each connection for a different region will need to be mapped to a separate LDAP server. This seems beyond scope of current work, and can be left out I feel.

I agree.


pkg/cmd/roachtest/tests/connection_latency.go line 280 at r1 (raw file):

		c.Node(1), roachprod.PGURLOptions{})
	require.NoError(t, err)
	conn, err := pgx.Connect(ctx, pgURL[0])

Should we add require.NotEmpty(t, pgURL) before accessing pgURL[0]?


pkg/cmd/roachtest/tests/registry.go line 46 at r1 (raw file):

	registerClusterReplicationDisconnect(r)
	registerConnectionLatencyTest(r)
	registerLDAPConnectionLatencyTest(r)

nit: this may need to be arranged in lexicographical order.

@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch from d0d7c01 to 4e75bea Compare October 3, 2024 09:34
angles-n-daemons added a commit to angles-n-daemons/cockroach that referenced this pull request Nov 1, 2024
Historically, to estimate which tables and indexes existed within the
span of a range, we used the Collection's `GetAllDescriptors` function
and returned the values which fell in between the start and end keys of
the range's span. This approach had the benefit of being precise, but
the drawback of being computationally expensive - since the system can
have hundreds of nodes and theoretically millions of tables, using
`GetAllDescriptors` begins to become a bottleneck.

This approach was modified so that instead of pulling all descriptors
for the system when computing range contents, the system instead used
the start key of the range to identify the exact table + index at that
point within the keyspace
([change](https://github.com/cockroachdb/cockroach/pull/77277/files)).
This works if each table, index has their own range, but quickly breaks
down if tables share a range. Consider the following layout of data:

```
T1=Table 1
T2=Table 2
R1=Range1

└─────T1─────┴─────T2────┴─────T3─────┘
└────────┴─────────R1───────┴─────────┘
```

Since the start key of the range falls with Table 1, the system
associates the range with only Table 1, despite it containing Tables 2
and 3.

Using this information, it becomes necessary to identify a set of
descriptors within a certain span. This PR introduces the
`ScanDescriptorsInSpans` function which does just that, allows the user
to specify a set of spans whose descriptors are important and then
return a catalog including those descriptors.

It does this by translating the span keys into descriptor span keys and
scanning them from the descriptors table. For example given a span
`[/Table/Users/PKEY/1, /Table/Users/SECONDARY/chicago]` where the ID for
the Users table is `5`, it will generate a descriptor span
`[/Table/Descriptors/PKEY/5, /Table/Descriptors/PKEY/6]`.

This descriptor too comes with its drawbacks in that within the
descriptor space, keys are scoped by table, and not necessarily indexes.
That means in a following PR, the status server will be responsible for
taking these descriptors, which include all indexes in the tables
pulled, and filtering it down to only the indexes which appear in the
specified range.

The bulk of the changeset is in updating the datadriven tests to test
this behavior, the primary area of focus for review should be the
`pkg/sql/catalog/internal/catkv/catalog_reader.go` file (~75 LOC).

Epic: CRDB-43151
Fixes: cockroachdb#130998

Release note: None
@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch from 4e75bea to 1259519 Compare December 12, 2024 08:55
@sanchit-CRL sanchit-CRL marked this pull request as ready for review December 12, 2024 08:55
@sanchit-CRL sanchit-CRL requested a review from a team as a code owner December 12, 2024 08:55
@sanchit-CRL sanchit-CRL requested review from shailendra-patel and csgourav and removed request for a team December 12, 2024 08:55
@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch 3 times, most recently from 05e8b1d to 6d7729b Compare December 13, 2024 06:22
Copy link
Collaborator Author

@sanchit-CRL sanchit-CRL left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @csgourav, @pritesh-lahoti, @shailendra-patel, and @souravcrl)


pkg/cmd/roachtest/tests/connection_latency.go line 170 at r1 (raw file):

Previously, souravcrl wrote…

nit: this is required since LDAP feature is entirely behind cockroach enterprise license. Setting this to true sets the COCKROACH_DEV_LICENSE for the test.

the field has been removed from the roachtest spec now


pkg/cmd/roachtest/tests/connection_latency.go line 233 at r1 (raw file):

Previously, pritesh-lahoti wrote…

I agree.

Done.


pkg/cmd/roachtest/tests/connection_latency.go line 280 at r1 (raw file):

Previously, pritesh-lahoti wrote…

Should we add require.NotEmpty(t, pgURL) before accessing pgURL[0]?

Done.


pkg/cmd/roachtest/testdata/ldap_authentication_hba_conf line 2 at r1 (raw file):

Previously, souravcrl wrote…

nit: we should also redact the ldap password and username from HBA conf. Effectively this could be set from the same environment variable used for the latency test.

Not required anymore as the password is for a user created locally

@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch from 6d7729b to f30fd00 Compare December 16, 2024 04:59
Copy link
Contributor

@pritesh-lahoti pritesh-lahoti left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 7 of 8 files at r2, 1 of 1 files at r3, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @csgourav, @shailendra-patel, @souravcrl, and @vidit-bhat)

pkg/cmd/roachtest/tests/connection_latency.go Show resolved Hide resolved
No test previously existed to compute and monitor LDAP
connection latency

Created a roachtest which leverages the workload
to get the stats for LDAP connection latency
The test provisions an openLDAP service and it's user `jdoe`
which is authenticated on the CRDB via LDAP.
The test
* provisions openLDAP with TLS connection
* Creates a user named jdoe into CRDB
* Sets the HBA conf and custom CA into the cluster settings
* runs the workload binary to compute the connection latency

Epic: CRDB-40412
Fixes: cockroachdb#127358

Release note: None
@sanchit-CRL sanchit-CRL force-pushed the sanchit_ldap_connection_latency branch from f30fd00 to 69b8862 Compare December 17, 2024 17:03
@sanchit-CRL
Copy link
Collaborator Author

bors r+

@craig craig bot merged commit 5169369 into cockroachdb:master Dec 18, 2024
22 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ldap: scalability & performance testing
6 participants