Integrate the DpathEnhancingExtractor in the UI of Airbyte.
Created a DPath Enhancing Extractor
Refactored the record enhancement logic - moved to the extracted class
Split the tests of DPathExtractor and DPathEnhancingExtractor

Fix the failing tests:

FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_custom_components[test_create_custom_component_with_subcomponent_that_uses_parameters]
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_custom_components_do_not_contain_extra_fields
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_parse_custom_component_fields_if_subcomponent
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_page_increment
FAILED unit_tests/sources/declarative/parsers/test_model_to_component_factory.py::test_create_offset_increment
FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[simple_unstructured_scenario]
FAILED unit_tests/sources/file_based/test_file_based_scenarios.py::test_file_based_read[no_file_extension_unstructured_scenario]

They fail because string and int values of the public page_size attribute are compared.
Imposed an invariant:
  on construction, page_size can be set to a string or an int
  page_size keeps values of one type only, so comparisons are uniform (values of the other type are converted)
  _page_size holds the internal / working value
The invariant holds unless the attributes are manipulated directly.
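
The page_size invariant described above can be sketched as follows. This is a minimal illustration, not the code from this commit: the class name PageSizeHolder is hypothetical, and the choice of int as the single kept type is an assumption, since the message does not say which type is kept.

```python
from dataclasses import dataclass, field
from typing import Optional, Union


@dataclass
class PageSizeHolder:
    """Hypothetical stand-in for a pagination strategy; not the CDK class."""

    # Public attribute: callers may pass an int or a numeric string.
    page_size: Optional[Union[int, str]] = None
    # Internal working value, set from page_size during construction.
    _page_size: Optional[int] = field(init=False, default=None, repr=False)

    def __post_init__(self) -> None:
        # Assumption: normalize to int. The commit message does not name the kept type.
        if isinstance(self.page_size, str):
            self.page_size = int(self.page_size)
        self._page_size = self.page_size


from_str = PageSizeHolder(page_size="50")
from_int = PageSizeHolder(page_size=50)
print(from_str.page_size == from_int.page_size)  # True

# Direct assignment bypasses __post_init__, so the invariant no longer holds.
from_str.page_size = "50"
print(from_str.page_size == from_int.page_size)  # False
```

Normalizing once in the constructor keeps every later comparison on a single type. The last two lines show the caveat from the message: assigning to page_size directly skips the conversion.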

Merged:
feat(low-code concurrent): Allow async job low-code streams that are incremental to be run by the concurrent framework (airbytehq#228)
fix(low-code): Fix declarative low-code state migration in SubstreamPartitionRouter (airbytehq#267)
feat: combine slash command jobs into single job steps (airbytehq#266)
feat(low-code): add items and property mappings to dynamic schemas (airbytehq#256)
feat: add help response for unrecognized slash commands (airbytehq#264)
ci: post direct links to html connector test reports (airbytehq#252) (airbytehq#263)
fix(low-code): Fix legacy state migration in SubstreamPartitionRouter (airbytehq#261)
fix(airbyte-cdk): Fix RequestOptionsProvider for PerPartitionWithGlobalCursor (airbytehq#254)
feat(low-code): add profile assertion flow to oauth authenticator component (airbytehq#236)
feat(Low-Code Concurrent CDK): Add ConcurrentPerPartitionCursor (airbytehq#111)
fix: don't mypy unit_tests (airbytehq#241)
fix: handle backoff_strategies in CompositeErrorHandler (airbytehq#225)
feat(concurrent cursor): attempt at clamping datetime (airbytehq#234)
ci: use `ubuntu-24.04` explicitly (resolves CI warnings) (airbytehq#244)
Fix(sdm): module ref issue in python components import (airbytehq#243)
feat(source-declarative-manifest): add support for custom Python components from dynamic text input (airbytehq#174)
chore(deps): bump avro from 1.11.3 to 1.12.0 (airbytehq#133)
docs: comments on what the `Dockerfile` is for (airbytehq#240)
chore: move ruff configuration to dedicated ruff.toml file (airbytehq#237)
feat(low-code): add DpathFlattenFields (airbytehq#227)
rpopov committed Jan 26, 2025
1 parent 4d95444 commit 9dece24
Showing 81 changed files with 6,915 additions and 941 deletions.
61 changes: 58 additions & 3 deletions .github/workflows/connector-tests.yml
@@ -25,7 +25,7 @@ concurrency:
jobs:
cdk_changes:
name: Get Changes
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
permissions:
statuses: write
pull-requests: read
@@ -62,7 +62,7 @@ jobs:
# Forked PRs are handled by the community_ci.yml workflow
# If the condition is not met the job will be skipped (it will not fail)
# runs-on: connector-test-large
runs-on: ubuntu-latest
runs-on: ubuntu-22.04
timeout-minutes: 360 # 6 hours
strategy:
fail-fast: false
@@ -96,6 +96,8 @@ jobs:
name: "Check: '${{matrix.connector}}' (skip=${{needs.cdk_changes.outputs['src'] == 'false' || needs.cdk_changes.outputs[matrix.cdk_extra] == 'false'}})"
permissions:
checks: write
contents: write # Required for creating commit statuses
pull-requests: read
steps:
- name: Abort if extra not changed (${{matrix.cdk_extra}})
id: no_changes
@@ -123,6 +125,26 @@ jobs:
repository: airbytehq/airbyte
ref: master
path: airbyte
- name: Set up Python
uses: actions/setup-python@v5
with:
python-version: "3.10"
# Create initial pending status for test report
- name: Create Pending Test Report Status
if: steps.no_changes.outputs.status != 'cancelled'
env:
GH_TOKEN: ${{ secrets.GH_PAT_MAINTENANCE_OCTAVIA }}
run: |
HEAD_SHA="${{ github.event.pull_request.head.sha || github.sha }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/$HEAD_SHA \
-f state="pending" \
-f description="Running connector tests..." \
-f context="${{ matrix.connector }} Test Report"
- name: Test Connector
if: steps.no_changes.outputs.status != 'cancelled'
timeout-minutes: 90
@@ -131,7 +153,7 @@ jobs:
POETRY_DYNAMIC_VERSIONING_BYPASS: "0.0.0"
run: |
cd airbyte
make tools.airbyte-ci-binary.install
make tools.airbyte-ci-dev.install
airbyte-ci \
--ci-report-bucket-name=airbyte-ci-reports-multi \
connectors \
@@ -169,6 +191,39 @@ jobs:
echo "success=${success}" >> $GITHUB_OUTPUT
echo "html_report_url=${html_report_url}" >> $GITHUB_OUTPUT
# Update the test report status with results
- name: Update Test Report Status
if: always() && steps.no_changes.outputs.status != 'cancelled' && steps.evaluate_output.outcome == 'success'
env:
GH_TOKEN: ${{ secrets.GH_PAT_MAINTENANCE_OCTAVIA }}
run: |
HEAD_SHA="${{ github.event.pull_request.head.sha || github.sha }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/$HEAD_SHA \
-f state="${{ steps.evaluate_output.outputs.success == 'true' && 'success' || 'failure' }}" \
-f target_url="${{ steps.evaluate_output.outputs.html_report_url }}" \
-f description="Click Details to view the test report" \
-f context="${{ matrix.connector }} Test Report"
# Create failure status if report generation failed
- name: Create Report Generation Failed Status
if: always() && steps.no_changes.outputs.status != 'cancelled' && steps.evaluate_output.outcome != 'success'
env:
GH_TOKEN: ${{ secrets.GH_PAT_MAINTENANCE_OCTAVIA }}
run: |
HEAD_SHA="${{ github.event.pull_request.head.sha || github.sha }}"
gh api \
--method POST \
-H "Accept: application/vnd.github+json" \
-H "X-GitHub-Api-Version: 2022-11-28" \
repos/${{ github.repository }}/statuses/$HEAD_SHA \
-f state="failure" \
-f description="Failed to run connector tests." \
-f context="${{ matrix.connector }} Test Report"
# Upload the job output to the artifacts
- name: Upload Job Output
id: upload_job_output
2 changes: 1 addition & 1 deletion .github/workflows/pdoc_preview.yml
@@ -8,7 +8,7 @@ on:

jobs:
preview_docs:
runs-on: ubuntu-latest
runs-on: ubuntu-24.04

steps:
- name: Checkout code
2 changes: 1 addition & 1 deletion .github/workflows/pdoc_publish.yml
@@ -22,7 +22,7 @@ concurrency:

jobs:
publish_docs:
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
environment:
name: "github-pages"
url: ${{ steps.deployment.outputs.page_url }}
8 changes: 4 additions & 4 deletions .github/workflows/pypi_publish.yml
@@ -35,7 +35,7 @@ on:
jobs:
build:
name: Build Python Package
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- name: Detect Release Tag Version
if: startsWith(github.ref, 'refs/tags/v')
@@ -107,7 +107,7 @@ jobs:

publish_cdk:
name: Publish CDK version to PyPI
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
needs: [build]
permissions:
id-token: write
@@ -156,7 +156,7 @@ jobs:
(github.event_name == 'workflow_dispatch' &&
github.event.inputs.publish_to_dockerhub == 'true'
)
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
needs: [build]
environment:
name: DockerHub
@@ -257,7 +257,7 @@ jobs:
env:
VERSION: ${{ needs.build.outputs.VERSION }}
IS_PRERELEASE: ${{ needs.build.outputs.IS_PRERELEASE }}
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- uses: actions/setup-python@v5
with:
4 changes: 2 additions & 2 deletions .github/workflows/pytest_fast.yml
@@ -9,7 +9,7 @@ on:
jobs:
test-build:
name: Build and Inspect Python Package
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- name: Checkout code
uses: actions/checkout@v4
@@ -36,7 +36,7 @@ jobs:
pytest-fast:
name: Pytest (Fast)
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
# Common steps:
- name: Checkout code
6 changes: 3 additions & 3 deletions .github/workflows/python_lint.yml
@@ -9,7 +9,7 @@ on:
jobs:
ruff-lint-check:
name: Ruff Lint Check
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
# Common steps:
- name: Checkout code
@@ -32,7 +32,7 @@ jobs:

ruff-format-check:
name: Ruff Format Check
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
# Common steps:
- name: Checkout code
@@ -55,7 +55,7 @@ jobs:

mypy-check:
name: MyPy Check
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
# Common steps:
- name: Checkout code
2 changes: 1 addition & 1 deletion .github/workflows/release_drafter.yml
@@ -16,7 +16,7 @@ jobs:
permissions:
contents: write
pull-requests: write
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
# Drafts the next Release notes as Pull Requests are merged into "main"
- uses: release-drafter/release-drafter@v6
2 changes: 1 addition & 1 deletion .github/workflows/semantic_pr_check.yml
@@ -14,7 +14,7 @@ permissions:
jobs:
validate_pr_title:
name: Validate PR title
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- uses: amannn/action-semantic-pull-request@v5
if: ${{ github.event.pull_request.draft == false }}
36 changes: 34 additions & 2 deletions .github/workflows/slash_command_dispatch.yml
@@ -8,11 +8,13 @@ jobs:
slashCommandDispatch:
# Only allow slash commands on pull request (not on issues)
if: ${{ github.event.issue.pull_request }}
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- name: Slash Command Dispatch
id: dispatch
uses: peter-evans/slash-command-dispatch@v4
# TODO: Revert to `peter-evans/slash-command-dispatch@v4` after PR merges:
# - https://github.com/peter-evans/slash-command-dispatch/pull/372/files
uses: aaronsteers/slash-command-dispatch@aj/fix/add-dispatched-bool-output
with:
repository: ${{ github.repository }}
token: ${{ secrets.GH_PAT_MAINTENANCE_OCTAVIA }}
@@ -36,3 +38,33 @@ jobs:
comment-id: ${{ github.event.comment.id }}
body: |
> Error: ${{ steps.dispatch.outputs.error-message }}
- name: Generate help text
id: help
if: >
startsWith(github.event.comment.body, '/') &&
!steps.dispatch.outputs.dispatched
run: |
HELP_TEXT="The following slash commands are available:
- \`/autofix\` - Corrects any linting or formatting issues
- \`/test\` - Runs the test suite
- \`/poetry-lock\` - Re-locks dependencies and updates the poetry.lock file
- \`/help\` - Shows this help message"
if [[ "${{ github.event.comment.body }}" == "/help" ]]; then
echo "body=$HELP_TEXT" >> $GITHUB_OUTPUT
else
echo "body=It looks like you are trying to enter a slash command. Either the slash command is unrecognized or you don't have access to call it.
$HELP_TEXT" >> $GITHUB_OUTPUT
fi
- name: Post help message
if: >
startsWith(github.event.comment.body, '/') &&
!steps.dispatch.outputs.dispatched
uses: peter-evans/create-or-update-comment@v4
with:
comment-id: ${{ github.event.comment.id }}
body: ${{ steps.help.outputs.body }}
6 changes: 3 additions & 3 deletions .github/workflows/test-command.yml
@@ -15,7 +15,7 @@ on:
jobs:
start-workflow:
name: Append 'Starting' Comment
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- name: Get PR JSON
id: pr-info
@@ -127,7 +127,7 @@ jobs:
log-success-comment:
name: Append 'Success' Comment
needs: [pytest-on-demand]
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- name: Append success comment
uses: peter-evans/create-or-update-comment@v4
@@ -143,7 +143,7 @@ jobs:
# This job will only run if the workflow fails
needs: [pytest-on-demand, start-workflow]
if: always() && needs.pytest-on-demand.result == 'failure'
runs-on: ubuntu-latest
runs-on: ubuntu-24.04
steps:
- name: Append failure comment
uses: peter-evans/create-or-update-comment@v4
7 changes: 7 additions & 0 deletions Dockerfile
@@ -1,3 +1,10 @@
# This Dockerfile is used to build `airbyte/source-declarative-manifest` image that in turn is used
# 1. to build Manifest-only connectors themselves
# 2. to run manifest (Builder) connectors published into a particular user's workspace in Airbyte
#
# A new version of source-declarative-manifest is built for every new Airbyte CDK release, and their versions are kept in sync.
#

FROM docker.io/airbyte/python-connector-base:3.0.0@sha256:1a0845ff2b30eafa793c6eee4e8f4283c2e52e1bbd44eed6cb9e9abd5d34d844

WORKDIR /airbyte/integration_code
3 changes: 2 additions & 1 deletion airbyte_cdk/__init__.py
@@ -92,7 +92,7 @@
from .sources.declarative.declarative_stream import DeclarativeStream
from .sources.declarative.decoders import Decoder, JsonDecoder
from .sources.declarative.exceptions import ReadException
from .sources.declarative.extractors import DpathExtractor, RecordSelector
from .sources.declarative.extractors import DpathEnhancingExtractor, DpathExtractor, RecordSelector
from .sources.declarative.extractors.record_extractor import RecordExtractor
from .sources.declarative.extractors.record_filter import RecordFilter
from .sources.declarative.incremental import DatetimeBasedCursor
@@ -234,6 +234,7 @@
"DefaultPaginator",
"DefaultRequestOptionsProvider",
"DpathExtractor",
"DpathEnhancingExtractor",
"FieldPointer",
"HttpMethod",
"HttpRequester",
6 changes: 6 additions & 0 deletions airbyte_cdk/cli/source_declarative_manifest/_run.py
@@ -171,6 +171,12 @@ def create_declarative_source(
"Invalid config: `__injected_declarative_manifest` should be provided at the root "
f"of the config but config only has keys: {list(config.keys() if config else [])}"
)
if not isinstance(config["__injected_declarative_manifest"], dict):
raise ValueError(
"Invalid config: `__injected_declarative_manifest` should be a dictionary, "
f"but got type: {type(config['__injected_declarative_manifest'])}"
)

return ConcurrentDeclarativeSource(
config=config,
catalog=catalog,
1 change: 1 addition & 0 deletions airbyte_cdk/connector_builder/connector_builder_handler.py
@@ -52,6 +52,7 @@ def get_limits(config: Mapping[str, Any]) -> TestReadLimits:
def create_source(config: Mapping[str, Any], limits: TestReadLimits) -> ManifestDeclarativeSource:
manifest = config["__injected_declarative_manifest"]
return ManifestDeclarativeSource(
config=config,
emit_connector_builder_messages=True,
source_config=manifest,
component_factory=ModelToComponentFactory(