Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Source MySQL\MsSql\Postgres: added RDS base performance tests #8215

Merged
merged 31 commits into from
Dec 12, 2021

Conversation

etsybaev
Copy link
Contributor

@etsybaev etsybaev commented Nov 23, 2021

What

Need to test benchmarks

How

Added tests for MySql, MsSql and Postgres RDS DBs.

🚨 User Impact 🚨

No user impacts, tests only

Pre-merge Checklist

Expand the relevant checklist and delete the others.

New Connector

Community member or Airbyter

  • Community member? Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • docs/SUMMARY.md
    • docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
    • docs/integrations/README.md
    • airbyte-integrations/builds.md
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the connector is published, connector added to connector index as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

Updating a connector

Community member or Airbyter

  • Grant edit access to maintainers (instructions)
  • Secrets in the connector's spec are annotated with airbyte_secret
  • Unit & integration tests added and passing. Community members, please provide proof of success locally e.g: screenshot or copy-paste unit, integration, and acceptance test output. To run acceptance tests for a Python connector, follow instructions in the README. For java connectors run ./gradlew :airbyte-integrations:connectors:<name>:integrationTest.
  • Code reviews completed
  • Documentation updated
    • Connector's README.md
    • Connector's bootstrap.md. See description and examples
    • Changelog updated in docs/integrations/<source or destination>/<name>.md including changelog. See changelog example
  • PR name follows PR naming conventions

Airbyter

If this is a community PR, the Airbyte engineer reviewing this PR is responsible for the below items.

  • Create a non-forked branch based on this PR and test the below items on it
  • Build is successful
  • Credentials added to Github CI. Instructions.
  • /test connector=connectors/<name> command is passing.
  • New Connector version released on Dockerhub by running the /publish command described here
  • After the new connector version is published, connector version bumped in the seed directory as described here
  • Seed specs have been re-generated by building the platform and committing the changes to the seed spec files, as described here

@github-actions github-actions bot added the area/connectors Connector related issues label Nov 23, 2021
@etsybaev etsybaev temporarily deployed to more-secrets November 23, 2021 20:14 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 23, 2021 21:07 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 24, 2021 17:18 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 24, 2021 17:42 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 24, 2021 17:48 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 08:55 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 15:58 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 16:12 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets November 25, 2021 16:31 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 16:31 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 16:39 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 25, 2021 18:16 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets November 29, 2021 10:43 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 29, 2021 11:17 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets November 29, 2021 12:07 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 29, 2021 12:08 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets November 29, 2021 13:04 Inactive
@etsybaev
Copy link
Contributor Author

etsybaev commented Nov 29, 2021

/test-performance connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1516129500
❌ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1516129500
🐛 https://gradle.com/s/bbyoelwbtnk2u
🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1516129500
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1516129500
🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1516129500
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1516129500

@jrhizor jrhizor temporarily deployed to more-secrets November 29, 2021 13:26 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets November 29, 2021 14:45 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets November 29, 2021 16:01 Inactive
…of github.com:airbytehq/airbyte into akorotkov/source-perfomance-test-memory-and-cpu
@andriikorotkov andriikorotkov temporarily deployed to more-secrets December 10, 2021 13:38 Inactive
@andriikorotkov andriikorotkov temporarily deployed to more-secrets December 10, 2021 13:40 Inactive
@andriikorotkov
Copy link
Contributor

@tuliren, I work on being able to pass CPU and memory values as gradle parameters for performance tests. for this I used this pull request from @etsybaev. I added my minor changes to this pull request, for buildSrc / src / main / groovy / airbyte-performance-test-java.gradle and airbyte-integrations / bases / standard-source-test / src / main / java / io / airbyte /integrations/standardtest/source/AbstractSourceConnectorTest.java

@etsybaev etsybaev temporarily deployed to more-secrets December 10, 2021 15:06 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets December 10, 2021 22:34 Inactive
Copy link
Contributor

@tuliren tuliren left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that there are still four databases we are testing against:

ID Tables Columns Records Purpose
D2 1K 240 200 Many tables.
D3 5K 240 200 Lots of tables.
D4 1K 8 10K Many tables + moderate records.
D6 25 8 50K Lots of records.

The cloud cannot handle 5000 tables currently. It fails at retrieving the catalog. Let's drop D3, and only run against D2 for now.

@tuliren
Copy link
Contributor

tuliren commented Dec 11, 2021

Another thought indirectly related to this PR. Method AbstractSourceConnectorTest#runReadVerifyNumberOfReceivedMsgs mimics DefaultReplicationWorker#getReplicationRunnable to test the source connector.

  1. It is vital that the former follows the same pattern as the latter. Is there a way to ensure that?
  2. The latter method currently uses the sourceMessageTracker to track the stats of a sync on the source side. Ideally the former method should do the same.

Hi @tuliren . If I got you correctly - in the first case, we collect all received messages to collection and then verify that received messages are exactly as was expected. Meanwhile, such a direct approach wouldn't work for performance tests. We will quickly end up with outOfMemory (for the test itself thread, not for connector's) error if try to collect all messages to a collection. In the case of performance tests, we are not interested in how properly it was read, but we care about the stream name and number of messages. So the former approach checks more carefully but doesn't work as it is for cases with big DBs. Hope that was what you asked for. Please let me know if I got your question incorrectly. Thanks

I guess we are probably talking about different things. My first bullet point means the performance test should mimics the how the source is read as much as possible. Currently, we do that by replicating the production procedure (DefaultReplicationWorker#getReplicationRunnable) in AbstractSourceConnectorTest#runReadVerifyNumberOfReceivedMsgs. However, if the production procedure changes, the performance will not use the new procedure. So the test cannot truly reflect what happens in production. Ideally, the performance test should reference some of the production procedure directly, so that it is change-proof.

This can be worked on in a follow up PR though. Don't worry about it in this one.

I created an issue to track this: #8718.

@andriikorotkov andriikorotkov temporarily deployed to more-secrets December 11, 2021 15:49 Inactive
@etsybaev etsybaev temporarily deployed to more-secrets December 12, 2021 12:51 Inactive
@etsybaev
Copy link
Contributor Author

etsybaev commented Dec 12, 2021

/test-performance connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/1569568144
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/1569568144

@etsybaev
Copy link
Contributor Author

etsybaev commented Dec 12, 2021

/test-performance connector=connectors/source-mssql

🕑 connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/1569568506
✅ connectors/source-mssql https://github.com/airbytehq/airbyte/actions/runs/1569568506

@etsybaev
Copy link
Contributor Author

etsybaev commented Dec 12, 2021

/test-performance connector=connectors/source-mysql

🕑 connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1569568636
✅ connectors/source-mysql https://github.com/airbytehq/airbyte/actions/runs/1569568636

@jrhizor jrhizor temporarily deployed to more-secrets December 12, 2021 13:21 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets December 12, 2021 13:21 Inactive
@jrhizor jrhizor temporarily deployed to more-secrets December 12, 2021 13:21 Inactive
@etsybaev
Copy link
Contributor Author

etsybaev commented Dec 12, 2021

/test connector=connectors/source-postgres

🕑 connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/1569775596
✅ connectors/source-postgres https://github.com/airbytehq/airbyte/actions/runs/1569775596
No Python unittests run

@jrhizor jrhizor temporarily deployed to more-secrets December 12, 2021 15:17 Inactive
@etsybaev etsybaev merged commit 70d2f46 into master Dec 12, 2021
@etsybaev etsybaev deleted the etsybaev/source-mysql-added-base-rds-performance-test branch December 12, 2021 17:29
@tuliren tuliren mentioned this pull request Dec 12, 2021
6 tasks
schlattk pushed a commit to schlattk/airbyte that referenced this pull request Jan 4, 2022
…ehq#8215)

* Added RDS base performance tests for source-postgres, source-mssql and source-mysql
* updated perfomance test with cpu and memory limit


Co-authored-by: andriikorotkov <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/connectors Connector related issues
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants