[EPIC] E2E testing tool #15152

Closed · 56 of 66 tasks · DoNotPanicUA opened this issue Aug 1, 2022 · 8 comments

@DoNotPanicUA (Contributor) commented Aug 1, 2022

Tell us about the problem you're trying to solve

Our main goal is to implement an E2E testing tool that tests Airbyte connections via CI. The tool will exercise different connector versions in a way that is close to the real user experience, which will help us detect possible integration issues before a version release.
Potential issues:

  • Performance degradation (Benchmark)
  • Critical changes (backward compatibility check)
  • Incompatible with other connectors (integration compatibility check)
  • Incompatible with Airbyte core (core compatibility check)

Note: this solution is inspired by a similar, previously designed tool (#8243).

Describe the solution you’d like

Stage 1. POC - Done ✔️

The initial stage provides the fundamental functionality.
It is also enough to start integration with potential benchmark frameworks.

  • Configure the new project in a separate repository - Repository
  • Fill in the readme
  • Implement common core
    • Scenario model
    • Scenario consistency validation
    • Scenario config parser
    • Scenario executor
    • Credential model
    • Credential config parser (local)
    • Mapper credentials and scenario
    • Make log formatting similar to Airbyte
  • Implement basic sync runner
    • Connect to existing Airbyte instance
    • Create Source
    • Create Destination
    • Create Connection
    • Run sync
    • Return the sync result (success/failure)
  • Prepare test Airbyte instance (before we automate the tool)
  • Prepare test source instances with test data (before we automate the tool)
  • Prepare test destination instances (before we automate the tool)

Main flow diagram (image attached in the original issue)

Scenario example

{
  "scenarioName" : "Poc Scenario",
  "usedInstances" : [
    {
      "instanceName" : "airbyte_1",
      "instanceType" : "AIRBYTE"
    },
    {
      "instanceName" : "source_1",
      "instanceType" : "SOURCE"
    },
    {
      "instanceName": "destination_1",
      "instanceType": "DESTINATION"
    },
    {
      "instanceName": "connection_1",
      "instanceType": "CONNECTION"
    }
  ],
  "preparationActions" : [
    {
      "action" : "CONNECT_AIRBYTE_API",
      "resultInstance" : "airbyte_1"
    },
    {
      "action" : "CREATE_SOURCE",
      "requiredInstances" : ["airbyte_1"],
      "resultInstance" : "source_1"
    },
    {
      "action": "CREATE_DESTINATION",
      "requiredInstances" : ["airbyte_1"],
      "resultInstance": "destination_1"
    },
    {
      "action" : "CREATE_CONNECTION",
      "requiredInstances" : ["airbyte_1", "source_1", "destination_1"],
      "resultInstance" : "connection_1"
    }
  ],
  "scenarioActions" : [
    {
      "action" : "SYNC_CONNECTION",
      "requiredInstances" : ["airbyte_1", "connection_1"]
    }
  ]
}

Stage 2. Credential customization - Done ✔️

This stage allows specifying the Airbyte instance, source, and destination credentials.

  • Parsing incoming args
  • Extend the readme with a section on scenarios and call examples (with args)
  • Retrieve credentials
    • Implement reading credentials from local files
    • Implement reading credentials from secret storage
    • Handle incoming Airbyte instance credentials
    • Handle source/destination credentials
  • Extend the scenario model to provide customizations for Actions
  • Implement Update version scenario action
  • Implement Scenario helper
  • Add the possibility to call the helper for a scenario
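
For illustration only, a local credentials file for this stage might look roughly like the sketch below; the field names (credentialName, credentialType, params) are assumptions made for this example, not the tool's actual schema.

{
  "credentialName" : "source_postgres_local",
  "credentialType" : "SOURCE",
  "params" : {
    "host" : "localhost",
    "port" : 5432,
    "database" : "test_db",
    "username" : "test_user",
    "password" : "********"
  }
}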

Stage 3. Run configuration - Done ✔️

  • Extend the scenario structure with a description
  • Show description and validation results in the help and list commands
  • New credential type source_with_connector_settings
  • Implement actions which can provide credentials
  • Implement new action create_custom_connector
  • New scenario for incremental sync
  • Add result parameter to the Scenario Action model
  • Implement new action get_source_version
  • Implement new action get_destination_version
  • Upgrade the version update scenarios so that they restore the original version after a run
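
As an illustration of the extended action model (actions that return a value), a scenario action from this stage might look roughly like the sketch below; the resultParameter field name is hypothetical.

{
  "action" : "GET_SOURCE_VERSION",
  "requiredInstances" : ["airbyte_1", "source_1"],
  "resultParameter" : "original_source_version"
}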

Stage 4. Docker & CI - Done ✔️

  • Configure docker
  • Configure CI commands
  • Provide summary result class
  • Store the result class in a file
  • Read the result class in the GitHub Action and post it as a comment
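
The summary result stored to a file (and later posted as a comment by the GitHub Action) might be shaped roughly like the JSON below; this structure is an assumption for illustration, not the actual result class.

{
  "scenarioName" : "Poc Scenario",
  "status" : "SUCCESS",
  "durationSeconds" : 184,
  "failedActions" : []
}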

Example command output (screenshots attached in the original issue):

  • List all scenarios command
  • Help command
  • Run sync command
  • Fail sync run command

Checkpoint - Reached 🎉

We have a fully operational E2E test tool that can interact with an existing Airbyte instance and with running sources or destinations.
The CI commands and predefined configs allow us to run integration tests for specific source-destination combinations.
In this state, we can already cover the following cases:

  • Incompatible with other connectors (integration compatibility check)
  • Incompatible with Airbyte core (core compatibility check)

Stage 6. Benchmark - In progress 🏗️

  • Integrate the benchmark framework with the testing tool

Stage 5. Autonomous run - Done ✔️

  • Extend the core to handle autonomous instances
  • Add the possibility to spin up a local Airbyte instance
  • Add the possibility to spin up source/destination instances (common logic plus implementations for a few of the most popular source/destination connectors)
  • Use normalization by default
  • Implement autonomous Postgres destination instance
  • Add a GitHub Action that pushes the project image to Docker Hub
  • Publish the project docker image
  • Pull the image in the GitHub Actions instead of building it
  • Integrate the tool with the main repository's GitHub Actions
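
For autonomous runs, an instance declaration in a scenario could plausibly distinguish locally started containers from existing deployments; the sketch below is purely illustrative (the creationType field and its values are assumptions).

{
  "instanceName" : "destination_postgres_1",
  "instanceType" : "DESTINATION",
  "creationType" : "AUTONOMOUS"
}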

Stage 5.1. Implement destination containers

  • Implement autonomous MySql destination instance
  • Implement autonomous Oracle destination instance
  • Implement autonomous MsSql destination instance
  • Implement autonomous MariaDb destination instance
    ...

Stage 7. Test data generation on the fly

  • Extend the source/destination handlers with test data population methods
  • Design test data config files
  • Implement test data generation based on the config files
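
A test data config file for this stage might look roughly like the sketch below; the schema (stream, recordCount, fields, generator) is a hypothetical illustration of the idea, not a designed format.

{
  "stream" : "users",
  "recordCount" : 1000,
  "fields" : [
    { "name" : "id", "type" : "integer", "generator" : "sequence" },
    { "name" : "email", "type" : "string", "generator" : "random_email" }
  ]
}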

Stage 8. Result comparison

To detect possible issues in a new version, we should compare the results produced by the current version with those produced by the new version. If we don't expect any changes in the result, the structure and data should be equal.
Note: some changes (like fixes) legitimately lead to different results. In this case, we will accept a flag like diff_is_expected.

  • Add the possibility to run a few different versions and collect their results
  • Implement common comparison logic
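
A comparison run could then report something shaped roughly like the JSON below, where the diff_is_expected flag is carried through from the run configuration; all field names here are assumptions for illustration.

{
  "baselineVersion" : "1.0.22",
  "candidateVersion" : "1.0.23",
  "diffIsExpected" : false,
  "schemaEqual" : true,
  "dataEqual" : false,
  "differences" : [ "table users: 3 rows differ" ]
}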

Checkpoint

Here we have an automated testing tool that can be scheduled as a CI task or run on demand with different configurations and data sets.
The main advantage of the tool is that it is true E2E: such testing guarantees that we validate the whole system before a version release.

@DoNotPanicUA (Contributor, Author) commented:

@alexandr-shegeda
Please review

@alexandr-shegeda (Contributor) commented:

@DoNotPanicUA all looks good, the only suggestion is to move Stage 8. Test data population closer to 1-2 stages

@DoNotPanicUA (Contributor, Author) commented:

> @DoNotPanicUA all looks good, the only suggestion is to move Stage 8. Test data population closer to 1-2 stages

This step means filling in test data using config files; the tool will generate data on the fly.
Before automation and for local runs, we will prepare test data manually and reuse it.
I will rephrase it a bit to make it clearer.

@grishick (Contributor) commented Aug 4, 2022

Tagging @bleonard, @sherifnada and @davinchia for review

@grishick (Contributor) commented Aug 5, 2022

I like the approach. Please file GitHub issues for the first stage and include @davinchia and me as reviewers when creating PRs.

@evantahler (Contributor) commented:

Some suggestions:

  • Use the Octavia CLI! We have a CLI tool for setting up sources, destinations, and syncs. It might be helpful. This repo (https://github.com/airbytehq/airflow-summit-airbyte-2022) has some examples of automating the Octavia CLI within GitHub Actions CI.
  • For setting up sample data, maybe source-faker can help: this source produces N "user", "purchase", and "product" records. They can be randomly seeded or given a fixed seed to always produce the same data (see the rough config sketch below).
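
As a rough sketch, a source-faker configuration with a fixed seed might look like this (exact field names depend on the connector version):

{
  "count" : 1000,
  "seed" : 42
}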

@alafanechere (Contributor) commented Oct 3, 2022

> Use the Octavia CLI!

+1, using the CLI will reduce the maintenance burden in the case of Airbyte API evolutions: the CLI is responsible for adapting to Airbyte API changes.

@DoNotPanicUA (Contributor, Author) commented:

I've looked into using the Octavia CLI as part of the solution, and I don't see a good way to integrate it with the E2E testing tool.
That said, once the original architecture is finished and the main use cases are listed, I assume we can trade some of the tool's flexibility for reuse of other modules to improve its nonfunctional aspects.
