fix(concurrent cdk): Properly call set_initial_state() on the cursor that is initialized on the ClientSideIncrementalRecordFilterDecorator #310
…n the ClientSideIncrementalRecordFilterDecorator instance
📝 Walkthrough
The pull request refines the initialization comments in the
Sequence Diagram(s)
sequenceDiagram
participant T as Test
participant CDS as ConcurrentDeclarativeSource
participant CIFD as ClientSideIncrementalRecordFilterDecorator
T ->> CDS: Initialize with manifest (client-side incremental enabled)
CDS ->> CIFD: Initialize cursor state
T ->> CDS: Retrieve stream
CDS -->> T: Return DefaultStream with proper cursor state
Actionable comments posted: 0
🧹 Nitpick comments (1)
unit_tests/sources/declarative/test_concurrent_declarative_source.py (1)
1653-1689: LGTM! Comprehensive test coverage for client-side incremental cursor state.
The test thoroughly verifies:
- Proper state initialization
- Correct type casting of components
- Accurate cursor value propagation
One suggestion: Would you consider adding a test case with a null/empty state to verify the behavior when no previous state exists? wdyt?
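A minimal sketch of the empty-state cases that suggestion points at. The helper `read_cursor_value` is hypothetical and only models "no previous state means no cursor value"; it is not the CDK's implementation, and the real test would go through `ConcurrentDeclarativeSource` instead.

```python
from typing import Any, Mapping, Optional


def read_cursor_value(
    stream_state: Optional[Mapping[str, Any]], cursor_field: str
) -> Optional[str]:
    """Hypothetical helper: pull the cursor value out of state, tolerating missing state."""
    if not stream_state:
        # Covers both None and an empty mapping.
        return None
    return stream_state.get(cursor_field)


def check_empty_state_cases() -> None:
    # No previous state at all.
    for empty in (None, {}):
        assert read_cursor_value(empty, "updated_at") is None
    # State exists but has no entry for the cursor field.
    assert read_cursor_value({"other": 1}, "updated_at") is None
    # State with a cursor value is passed through unchanged.
    assert read_cursor_value({"updated_at": "2024-06-01"}, "updated_at") == "2024-06-01"


check_empty_state_cases()
```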
📜 Review details
📒 Files selected for processing (2)
- airbyte_cdk/sources/declarative/concurrent_declarative_source.py (1 hunks)
- unit_tests/sources/declarative/test_concurrent_declarative_source.py (2 hunks)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
478-479: LGTM! Clear documentation improvement.
The comment split provides better clarity about the state initialization for StopConditionPaginationStrategyDecorator.
483-492: LGTM! Good fix for client-side incremental filtering.
The addition ensures proper state initialization for the ClientSideIncrementalRecordFilterDecorator cursor, which is crucial for semi-incremental streams using is_client_side_incremental to filter properly.
That's a good point. I'm fine with this fix for now. Soon we should have a plan to remove declarative cursors. Maybe it'll be easier when we won't use DeclarativeStream anymore?
Also note that this seems weird to me, as I would expect to have DatetimeBasedCursor too as part of the isinstance, else we create a new cursor where set_initial_state won't be called. Should we update this?
I think it will get easier, but I worry there may still be some gaps during deprecation where we find out we're using
My guess here is that it's actually synonymous with this code within airbyte-python-cdk/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py Lines 1733 to 1754 in 126e233
So if it doesn't match those either
I think I'll investigate this separately because it's a good callout, but I want to get this current PR in to unblock the community dev work we're slating in the coming days, which depends on this.
What
While I was scoping some of the custom cursor deprecation issues, I noticed that our client-side semi-incremental filtering might not be working correctly in the CDK. This PR fixes how we assign state to the cursor so that it filters records correctly.
How
The issue arises because in model_to_component_factory.py, we actually instantiate a separate DatetimeBasedCursor or any cursor when we do client-side filtering. See this block of code: https://github.com/airbytehq/airbyte-python-cdk/blob/main/airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py#L1547-L1569.
Because these are two separate instances, we need to call set_initial_state() on both. It feels a little odd to have to do this twice, but I was unsure if we made this separate for a reason, so I wanted to keep the flow as close to before as possible.
Tested against source-chargebee + CDK unit tests.
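To illustrate why both instances need the call, here is a toy model of the situation described under How. `ToyCursor`, `ToyRecordFilter`, and `build_filter` are hypothetical stand-ins written for this sketch, not the CDK's `DatetimeBasedCursor` or `ClientSideIncrementalRecordFilterDecorator`; only the method name `set_initial_state` mirrors the real code, and the assumption that a cursor with no state lets every record through belongs to the sketch.

```python
from dataclasses import dataclass
from typing import Any, List, Mapping, Optional


@dataclass
class ToyCursor:
    """Hypothetical stand-in for a datetime cursor; not the CDK class."""

    cursor_field: str
    _state_value: Optional[str] = None

    def set_initial_state(self, stream_state: Mapping[str, Any]) -> None:
        self._state_value = stream_state.get(self.cursor_field)

    def should_be_synced(self, record: Mapping[str, Any]) -> bool:
        # Assumption of this sketch: with no state, every record passes.
        if self._state_value is None:
            return True
        # ISO-8601 date strings compare correctly as plain strings.
        return record[self.cursor_field] >= self._state_value


@dataclass
class ToyRecordFilter:
    """Hypothetical stand-in for the client-side incremental filter."""

    cursor: ToyCursor

    def filter_records(self, records: List[Mapping[str, Any]]) -> List[Mapping[str, Any]]:
        return [r for r in records if self.cursor.should_be_synced(r)]


def build_filter(stream_state: Mapping[str, Any], init_filter_cursor: bool) -> ToyRecordFilter:
    stream_cursor = ToyCursor("updated_at")
    filter_cursor = ToyCursor("updated_at")  # a separate instance, as in the factory
    stream_cursor.set_initial_state(stream_state)
    if init_filter_cursor:
        # The second call: without it the filter's cursor never sees the state.
        filter_cursor.set_initial_state(stream_state)
    return ToyRecordFilter(filter_cursor)


records = [
    {"id": 1, "updated_at": "2024-01-15"},
    {"id": 2, "updated_at": "2024-06-01"},
    {"id": 3, "updated_at": "2024-09-30"},
]
state = {"updated_at": "2024-06-01"}

without_second_call = build_filter(state, init_filter_cursor=False).filter_records(records)
with_second_call = build_filter(state, init_filter_cursor=True).filter_records(records)
```

In this toy setup, `without_second_call` keeps all three records because the filter's cursor has no state, while `with_second_call` keeps only the records at or after the state value.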