fix: Async Retriever
change url path for download retriever
#192
Signed-off-by: Artem Inzhyyants <[email protected]>
📝 Walkthrough
This pull request introduces modifications across multiple files in the Airbyte CDK, focusing on enhancing record processing and stream slice handling. The changes primarily involve updating the ...
Sequence Diagram
sequenceDiagram
participant Factory as ModelToComponentFactory
participant Retriever as AsyncRetriever
participant Selector as RecordSelector
Factory->>Retriever: create_async_retriever()
Retriever->>Selector: Initialize with transformations
Selector-->>Retriever: Configured selector
Actionable comments posted: 0
🧹 Nitpick comments (2)
airbyte_cdk/sources/types.py (1)
156-157: Should we consider extending the docstring for clarity?
Currently, the method is straightforward, but it might help future readers if we explained that a StreamSlice is considered truthy whenever its main or extra fields are non-empty. wdyt?
unit_tests/sources/declarative/requesters/test_http_job_repository.py (1)
87-87: Could we safeguard against missing 'url' in extra_fields?
When referencing {{stream_slice.extra_fields['url']}}, a KeyError could occur if 'url' is absent. Would it make sense to provide a default or fail gracefully? wdyt?
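The "fail gracefully" option raised above can be sketched in plain Python, outside of the CDK's template interpolation. This is only an illustration, assuming extra_fields behaves like an ordinary mapping; the helper name resolve_download_url is hypothetical and is not part of the Airbyte CDK.

```python
from typing import Any, Mapping


def resolve_download_url(extra_fields: Mapping[str, Any]) -> str:
    """Hypothetical helper: return the download URL or fail with a clear message.

    Assumes `extra_fields` is a plain mapping; this is not the CDK's own lookup.
    """
    url = extra_fields.get("url")
    if not url:
        # A descriptive error is easier to debug than a bare KeyError.
        raise ValueError("stream slice has no 'url' in extra_fields; cannot download job results")
    return str(url)
```

Raising a descriptive error keeps the failure close to its cause, whereas a silent default could send the download request to an unintended path.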
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
airbyte_cdk/sources/declarative/requesters/http_job_repository.py (1 hunks)
airbyte_cdk/sources/types.py (1 hunks)
unit_tests/sources/declarative/requesters/test_http_job_repository.py (1 hunks)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/requesters/http_job_repository.py (1)
192-197: Any concerns about overwriting an existing 'url' in extra_fields?
When merging extra_fields with {"url": url}, the new key unconditionally overrides. If the extra_fields dictionary already contained a url entry, it would be lost. Is this desired, or should we handle it differently? wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2257-2257: Are transformations intended for all download operations?
We are passing transformations=transformations into the RecordSelector for the download retriever. Should we allow users to configure a distinct transformations list exclusively for downloads? wdyt?
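The overwrite behaviour described for the extra_fields merge is ordinary dict semantics and can be shown with plain Python. The sketch below assumes extra_fields is a simple mapping; the helper name with_url and its collision check are hypothetical and are not taken from http_job_repository.py.

```python
from typing import Any, Dict, Mapping

# Plain dict merging: when a key appears twice, the right-most value wins.
merged = {**{"url": "old", "job_id": 1}, "url": "new"}


def with_url(extra_fields: Mapping[str, Any], url: str) -> Dict[str, Any]:
    """Hypothetical helper: merge `url` into a copy of extra_fields.

    Raises instead of silently replacing a different, pre-existing 'url'.
    """
    if "url" in extra_fields and extra_fields["url"] != url:
        raise ValueError("extra_fields already contains a different 'url'")
    return {**extra_fields, "url": url}
```

Whether a collision should raise, log, or be accepted is a design choice for the maintainers; the check is shown only to make the trade-off concrete.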
📝 Walkthrough
This pull request introduces modifications across multiple files in the Airbyte CDK, focusing on enhancing record processing and stream slice handling. The changes primarily affect the ...
Sequence Diagram
sequenceDiagram
participant Factory as ModelToComponentFactory
participant Selector as RecordSelector
participant Job as AsyncHttpJobRepository
participant Slice as StreamSlice
Factory->>Selector: Create with transformations
Job->>Slice: Construct with job parameters
Slice-->>Job: Provide context for record fetching
Actionable comments posted: 0
🧹 Nitpick comments (3)
airbyte_cdk/sources/types.py (1)
155-157: Would you consider expanding the docstring to explain the new boolean evaluation?
Currently, __bool__ returns true if either the main slice or extra_fields is non-empty. It might be helpful to clarify this in the docstring or comment, so future maintainers understand why it’s deemed “truthy” if either portion is present. wdyt?
airbyte_cdk/sources/declarative/requesters/http_job_repository.py (1)
192-197: Would you consider verifying whether the “url” key already exists in job_slice.extra_fields?
Merging "url" into extra_fields might accidentally overwrite an existing entry of the same name. Checking this in advance or documenting the assumption could prevent unexpected behavior. wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2257-2257: Any interest in logging or clarifying transformations usage here?
We’re now passing transformations to the SimpleRetriever’s RecordSelector. It might be good to describe in code comments how these transformations are applied when reading job results. wdyt?
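The boolean evaluation discussed for types.py can be illustrated with a simplified stand-in. This is a sketch under stated assumptions: SliceSketch is a hypothetical class with two dict fields, and it is not the CDK's StreamSlice; it only mirrors the rule quoted above, that a slice is truthy when either the main slice or extra_fields is non-empty.

```python
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class SliceSketch:
    """Hypothetical, simplified stand-in for StreamSlice (not the CDK class)."""

    stream_slice: Dict[str, Any] = field(default_factory=dict)
    extra_fields: Dict[str, Any] = field(default_factory=dict)

    def __bool__(self) -> bool:
        # Truthy when either the main slice or the extra fields carry data.
        return bool(self.stream_slice) or bool(self.extra_fields)


empty = SliceSketch()
only_extra = SliceSketch(extra_fields={"url": "https://example.com/result"})
```

With this rule, a slice that carries only a url in extra_fields still passes an `if stream_slice:` check, which is the case the docstring suggestion is meant to document.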
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1 hunks)
airbyte_cdk/sources/declarative/requesters/http_job_repository.py (1 hunks)
airbyte_cdk/sources/types.py (1 hunks)
unit_tests/sources/declarative/requesters/test_http_job_repository.py (1 hunks)
🔇 Additional comments (1)
unit_tests/sources/declarative/requesters/test_http_job_repository.py (1)
87-87
: Could we handle missing “url” more gracefully?In
path="{{stream_slice.extra_fields['url']}}"
, a KeyError could arise if'url'
is absent fromextra_fields
. Perhaps we could add a default or an assertion? wdyt?
What
url as extra_field to ignore it in state manager
transformations to download retriever
Caution
Changing the url path in stream_slice for the download retriever is technically a breaking change, but I don't want to bump the major version since AsyncRetriever is an ExperimentalClass.
Reason
see #192 (comment)
Summary by CodeRabbit
Release Notes
New Features
StreamSlice class with boolean evaluation support
Improvements
The changes introduce more dynamic and flexible data processing capabilities within the Airbyte CDK, allowing for more nuanced record transformations and stream handling.