-
Notifications
You must be signed in to change notification settings - Fork 6
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(low-code cdk): add component resolver and http component resolver #88
Conversation
Warning Rate limit exceeded@lazebnyi has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 22 minutes and 31 seconds before requesting another review. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. 📝 Walkthrough📝 Walkthrough📝 Walkthrough📝 WalkthroughWalkthroughThe pull request introduces several changes across multiple files within the Airbyte CDK. Key modifications include updates to the Changes
Possibly related PRs
Suggested reviewers
Wydt about reaching out to these reviewers for their insights? Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media? 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 5
🧹 Outside diff range and nitpick comments (19)
airbyte_cdk/sources/declarative/partition_routers/__init__.py (1)
9-9
: LGTM! Consider grouping related imports?The changes look good! The new
PartitionRouter
import and its addition to__all__
are well-structured.Hey, what do you think about grouping the router imports together? We could move the base
PartitionRouter
import before its implementations for better readability, wdyt? 😊from airbyte_cdk.sources.declarative.partition_routers.partition_router import PartitionRouter from airbyte_cdk.sources.declarative.partition_routers.cartesian_product_stream_slicer import CartesianProductStreamSlicer from airbyte_cdk.sources.declarative.partition_routers.list_partition_router import ListPartitionRouter from airbyte_cdk.sources.declarative.partition_routers.single_partition_router import SinglePartitionRouter from airbyte_cdk.sources.declarative.partition_routers.substream_partition_router import SubstreamPartitionRouterAlso applies to: 11-11
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py (3)
1-11
: Consider adding module and class level docstrings?The file structure looks clean! Would you consider adding a module-level docstring to describe the purpose of these tests? This could help other developers understand the test coverage better. wdyt?
# # Copyright (c) 2024 Airbyte, Inc., all rights reserved. # +""" +Unit tests for the HttpComponentsResolver class. + +These tests verify the component resolution logic with different mapping configurations +and conditions, ensuring proper template population based on HTTP retriever responses. +""" + import pytest
13-52
: Test cases look good! Consider adding more edge cases?The parametrized test cases effectively cover basic scenarios. Would you consider adding a few more edge cases to strengthen the test coverage? Some suggestions:
- Empty component values
- Invalid value types
- Complex nested conditions
- Multiple records in retriever_data
What do you think about adding descriptions for each test case using ids in parametrize? This could make test failures more descriptive:
@pytest.mark.parametrize( "components_mapping, retriever_data, stream_template_config, expected_result", [ ( # ... existing test case 1 ), ( # ... existing test case 2 ), ], + ids=[ + "multiple_keys_with_conditions", + "single_key_update", + ] )
53-75
: Test implementation looks solid! Want to add some additional assertions?The test implementation is clean and well-structured! A few suggestions to make it even more robust:
- Maybe we could verify that the retriever was called with the expected parameters?
- How about adding assertions for the number of times the retriever was called?
- Would it make sense to verify the type of the resolved values?
Here's what that might look like:
# Assert the resolved components match the expected result assert result == expected_result + # Verify retriever interaction + mock_retriever.read_records.assert_called_once() + # Verify types of resolved values + for resolved_item in result: + for key, value in resolved_item.items(): + if value is not None: + assert isinstance(value, str)airbyte_cdk/sources/declarative/concurrent_declarative_source.py (1)
184-185
: Consider enhancing documentation for stream configuration handlingThe code handles stream configurations carefully, but would it be helpful to add some documentation about:
- How dynamic streams interact with the concurrent processing model
- Any limitations or considerations when using dynamic streams with concurrent processing
- Examples of valid stream configurations
This could help future maintainers understand the interaction between dynamic streams and concurrent processing. wdyt? 📝
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (4)
2707-2753
: The ComponentMappingDefinition looks good, but would you consider enhancing the documentation?The component is well-structured, but a few additions could make it even more user-friendly:
- Could we add examples of common use cases for the
condition
field? For instance, filtering based on date ranges or status values.- What happens when type inference fails for
value_type
? Should we document the fallback behavior?What do you think about adding these clarifications to help users better understand the component's behavior?
2754-2778
: The HttpComponentsResolver looks solid, but could we enhance the documentation for production readiness?Since this is an experimental component that makes HTTP calls, would you consider adding documentation about:
- Error handling: How does it handle network failures or invalid responses?
- Retry behavior: Are there built-in retries for failed requests?
- Caching: Is there any caching mechanism to prevent redundant requests?
These additions would help users understand the component's behavior in production scenarios. What are your thoughts?
2779-2790
: The DynamicDeclarativeStream looks promising, but could we add more implementation guidance?For this experimental component, would you consider adding:
- Documentation about stream naming conventions and potential conflicts
- Clarification about state management for dynamic streams
- Examples of common use cases (e.g., multi-tenant scenarios, dynamic API endpoints)
These additions would help users understand when and how to use this component effectively. What do you think?
22-25
: Consider adding version information for experimental featuresThe dynamic_streams array is well-integrated into the main schema. Since this is part of experimental features, would you consider:
- Adding a comment about the minimum CDK version required
- Including a note about potential breaking changes in future versions
- Adding a link to migration guides (when available)
This would help users make informed decisions about adopting these experimental features. Thoughts?
airbyte_cdk/sources/declarative/resolvers/components_resolver.py (2)
35-35
: Is using '@deprecated' appropriate for marking an experimental class?The
@deprecated
decorator is typically used to indicate that a class or method is obsolete and may be removed in future versions. SinceComponentsResolver
is experimental rather than deprecated, perhaps using a custom warning or simply adding a note in the docstring might be clearer. Would you consider adjusting this to more accurately reflect its experimental status? WDYT?
21-21
: Is an empty string the appropriate default for 'condition'?The
condition
field is set to an empty string by default, but its type allows forNone
(Optional[...]
). Would setting the default toNone
be more appropriate, especially if an empty string isn't a meaningful condition? WDYT?Also applies to: 32-32
airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py (1)
21-21
: Consider using a custom warning instead of@deprecated
Currently, the
@deprecated
decorator is used to indicate thatHttpComponentsResolver
is experimental:@deprecated("This class is experimental. Use at your own risk.", category=ExperimentalClassWarning)
Since
@deprecated
typically signals that a feature is outdated and should be avoided because it may be removed in future versions, perhaps it's clearer to use a custom warning or decorator to mark experimental classes. This could prevent confusion for users who might think the class is being phased out rather than being in an experimental phase. Wdyt?airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
305-307
: Appending dynamic stream configs—any concerns about duplicates?By extending
stream_configs
with dynamic stream configurations, is there a possibility of duplicate stream names causing conflicts? Should we consider handling or warning about duplicates to prevent potential issues? Wdyt?airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)
1056-1064
: Consider usingNone
as the default value forcondition
fieldCurrently, the
condition
field inComponentMappingDefinition
has a default value of an empty string""
. Since it's declared asOptional[str]
, would setting the default toNone
improve consistency with other optional fields? Wdyt?
1784-1784
: Add a description tocomponents_mapping
fieldFor clarity and consistency, should we add a
Field
with a description to thecomponents_mapping
attribute inHttpComponentsResolver
? Wdyt?
1792-1795
: Correct the description ofcomponents_resolver
fieldThe current description is "Component resolve and populates stream templates with components values." Perhaps it should be "Component that resolves and populates stream templates with component values." Wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
1295-1319
: Should we support other retriever models in_build_stream_slicer_from_partition_router
?Currently, the method checks if the model is an instance of
SimpleRetrieverModel
. Would it make sense to extend this to supportAsyncRetrieverModel
orCustomRetrieverModel
for greater flexibility? WDYT?
2207-2221
: Would adding a docstring tocreate_components_mapping_definition
improve clarity?Providing a docstring can help future maintainers understand the purpose and usage of this method. What do you think?
2222-2255
: Consider adding a docstring tocreate_http_components_resolver
for better maintainability.Adding a docstring could enhance readability and assist others in understanding the functionality of this method. WDYT?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (11)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
(1 hunks)airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(2 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(3 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(3 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(7 hunks)airbyte_cdk/sources/declarative/partition_routers/__init__.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/__init__.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/components_resolver.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
(1 hunks)unit_tests/sources/declarative/resolvers/__init__.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(1 hunks)
✅ Files skipped from review due to trivial changes (2)
- airbyte_cdk/sources/declarative/resolvers/init.py
- unit_tests/sources/declarative/resolvers/init.py
🔇 Additional comments (13)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py (2)
184-185
: Implementation looks solid! 👍
The change to use self._stream_configs()
is clean and maintains all the existing thread-safety guarantees while enabling dynamic stream support. Nice work!
184-185
: Verify stream configuration handling with dynamic streams
The change to use self._stream_configs()
aligns well with the PR's objective of supporting dynamic streams. However, we should ensure this works correctly with concurrent processing.
A few questions to consider:
- How does this handle the case when dynamic streams are added/removed during runtime?
- Could there be any race conditions between stream configuration resolution and concurrent processing?
wdyt about adding some debug logging here to track stream configuration changes? 🤔
Let's verify the stream configuration handling:
✅ Verification successful
No race conditions found in stream configuration handling 🎯
After analyzing the code, I can confirm that the change to use self._stream_configs()
is safe in concurrent processing. The method is called during stream initialization before any concurrent processing begins, and its results are used to create a mapping of stream configurations. The concurrent processing only starts after this initialization is complete.
A few observations that support this:
- Stream configs are resolved once during initialization
- The mapping is created before any concurrent stream processing begins
- No shared state modifications during stream config handling
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Check for potential race conditions or inconsistencies in stream handling
# Look for any existing debug/error logging related to stream configuration
rg -A 3 "stream.*config" --type py
# Check for any existing race condition handling
rg -A 3 "concurrent|parallel|race|atomic" --type py
# Look for any similar pattern in other declarative sources
ast-grep --pattern 'self._stream_configs($manifest, $config)'
Length of output: 116614
Script:
#!/bin/bash
# Let's check how stream configuration is handled in concurrent processing
# Look for any concurrent stream configuration handling
ast-grep --pattern 'def _stream_configs($manifest, $config)'
# Check for any race condition handling in stream configuration
rg -A 5 'self._stream_configs.*config' --type py
# Look for any concurrent stream processing related to configuration
rg -A 5 'concurrent.*stream.*config' --type py
Length of output: 2395
airbyte_cdk/sources/declarative/resolvers/components_resolver.py (1)
20-20
: Is the use of 'InitVar' for 'parameters' intentional?
In both ComponentMappingDefinition
and ResolvedComponentMappingDefinition
, the parameters
field is defined as an InitVar
, which means it won't be stored as an instance attribute after initialization. Is this your intention, or should parameters
be accessible as an instance variable? WDYT?
Also applies to: 31-31
airbyte_cdk/sources/declarative/manifest_declarative_source.py (4)
33-34
: Importing COMPONENTS_RESOLVER_TYPE_MAPPING
—looks good!
The import statement correctly adds COMPONENTS_RESOLVER_TYPE_MAPPING
, which is needed for component resolution.
124-124
: Updated method call with new config
parameter—great job!
The call to _stream_configs
now includes the config
parameter, aligning with the updated method signature.
299-301
: Adding config
parameter to _stream_configs
—are all usages updated?
The _stream_configs
method now accepts config
as a parameter. Have we ensured that all calls to this method are updated accordingly? Also, are there any subclasses or overrides that might need adjustment? Wdyt?
327-331
:
Safely accessing 'type' in components_resolver_config
—what do you think?
We access components_resolver_config["type"]
directly, which might lead to a KeyError
if 'type' is missing. Should we use get()
to handle this more gracefully? Wdyt?
Suggested change:
-resolver_type = components_resolver_config["type"]
+resolver_type = components_resolver_config.get("type")
if not resolver_type:
raise ValueError(
f"Missing 'type' in components resolver configuration: {components_resolver_config}"
)
Likely invalid or redundant comment.
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
1372-1372
: Addition of dynamic_streams
field to DeclarativeSource
looks good
Including dynamic_streams
enhances the flexibility of the declarative schema by allowing dynamic stream configurations.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (5)
33-36
: LGTM!
The new imports for HttpComponentsResolver
and ComponentMappingDefinition
are correctly added.
284-289
: LGTM!
The model imports for HttpComponentsResolverModel
and ComponentMappingDefinitionModel
are properly added.
304-304
: LGTM!
The import statement for PartitionRouter
and related slicers looks good.
475-476
: LGTM!
Adding the new models to PYDANTIC_MODEL_TO_CONSTRUCTOR
is appropriate.
1320-1329
: Is it intentional to create a ResumableFullRefreshCursor
only when stream_slicer
is None
?
In _build_resumable_cursor_from_paginator
, the method returns a ResumableFullRefreshCursor
only if there's no stream_slicer
. Should we consider handling cases where both a paginator and a stream_slicer
are present to ensure all scenarios are covered? WDYT?
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/resolvers/components_resolver.py
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it make sense to have one integration tests where we provide a manifest and run a small read operation on it? I already started on branch maxi297/playing-with-dynamic-streams
(see this compare) in order to make sure I understood properly and I got blocked on the check
method. I would like to see a test for read as well.
I'm almost wondering if we need a test with a parent stream as well to confirm that this would be working. I know there are some specifics like _initialize_cache_for_parent_streams
being called to set up caching and things like that.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/resolvers/components_resolver.py
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (8)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
311-352
: Consider adding debug logging and extracting validation logic.The method implementation looks solid! A couple of suggestions to consider:
Would it be helpful to add debug logging similar to what we have in
_stream_configs
? This could help with troubleshooting dynamic stream resolution.The validation logic (lines 317-335) could potentially be extracted into a separate method for better readability. Wdyt?
Here's a possible refactor approach:
def _dynamic_stream_configs( self, manifest: Mapping[str, Any], config: Mapping[str, Any] ) -> List[Dict[str, Any]]: dynamic_stream_definitions: List[Dict[str, Any]] = manifest.get("dynamic_streams", []) dynamic_stream_configs: List[Dict[str, Any]] = [] + def _validate_resolver_config(resolver_config: Dict[str, Any]) -> None: + if not resolver_config: + raise ValueError( + f"Missing 'components_resolver' in dynamic definition: {dynamic_definition}" + ) + + resolver_type = resolver_config.get("type") + if not resolver_type: + raise ValueError( + f"Missing 'type' in components resolver configuration: {resolver_config}" + ) + + if resolver_type not in COMPONENTS_RESOLVER_TYPE_MAPPING: + raise ValueError( + f"Invalid components resolver type '{resolver_type}'. " + f"Expected one of {list(COMPONENTS_RESOLVER_TYPE_MAPPING.keys())}." + ) for dynamic_definition in dynamic_stream_definitions: components_resolver_config = dynamic_definition["components_resolver"] + self.logger.debug(f"Processing dynamic stream definition: {dynamic_definition}") + _validate_resolver_config(components_resolver_config) - if not components_resolver_config: - raise ValueError( - f"Missing 'components_resolver' in dynamic definition: {dynamic_definition}" - ) - - resolver_type = components_resolver_config.get("type") - if not resolver_type: - raise ValueError( - f"Missing 'type' in components resolver configuration: {components_resolver_config}" - ) - - if resolver_type not in COMPONENTS_RESOLVER_TYPE_MAPPING: - raise ValueError( - f"Invalid components resolver type '{resolver_type}'. " - f"Expected one of {list(COMPONENTS_RESOLVER_TYPE_MAPPING.keys())}." - )airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)
1034-1063
: Would adding validation examples help future maintainers? wdyt?The
ComponentMappingDefinition
class looks good, but consider adding docstring examples showing validation scenarios, like:
- Invalid field paths
- Type mismatches
- Parameter validation
This would help other developers understand the expected behavior and constraints.
1811-1820
: Consider enhancing the field documentation for better clarity?The
HttpComponentsResolver
class looks solid, but thecomponents_mapping
field could use more detailed documentation explaining:
- The relationship between retriever and mapping
- Expected mapping structure
- Error handling behavior
This would make it easier for developers to understand how to use this class correctly.
Line range hint
1362-1423
: Would adding migration guidance help users upgrading their code?The split of
DeclarativeSource
into two variants (DeclarativeSource1
andDeclarativeSource2
) is a good approach for type safety. Consider adding:
- Migration guide for users upgrading from the old version
- Examples showing when to use each variant
- Documentation explaining the rationale behind the split
This would help users understand which variant to use in their specific case.
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (4)
1295-1318
: Consider adding type hints for the return value.The method implementation looks good, but adding a return type hint would improve code clarity and maintainability. What do you think about adding
-> Optional[PartitionRouter]
?def _build_stream_slicer_from_partition_router( self, model: Union[AsyncRetrieverModel, CustomRetrieverModel, SimpleRetrieverModel], config: Config, - ): + ) -> Optional[PartitionRouter]:
1320-1329
: Consider adding docstring for clarity.The method looks good but could benefit from a docstring explaining its purpose and return value. What do you think about adding one?
def _build_resumable_cursor_from_paginator( self, model: Union[AsyncRetrieverModel, CustomRetrieverModel, SimpleRetrieverModel], stream_slicer: Optional[StreamSlicer], ) -> Optional[StreamSlicer]: + """ + Builds a resumable cursor from a paginator if needed. + + Args: + model: The retriever model + stream_slicer: The current stream slicer + + Returns: + A ResumableFullRefreshCursor if the model has a paginator and no stream slicer, + otherwise returns None + """
2225-2258
: Consider adding error handling for invalid configurations.The implementation looks good but might benefit from some defensive programming. What do you think about adding validation for required components and proper error messages?
def create_http_components_resolver( self, model: HttpComponentsResolverModel, config: Config ) -> Any: + if not model.retriever: + raise ValueError("HTTP components resolver requires a retriever") + if not model.components_mapping: + raise ValueError("HTTP components resolver requires at least one component mapping") stream_slicer = self._build_stream_slicer_from_partition_router(model.retriever, config)Also, consider being more specific with the return type instead of
Any
:- ) -> Any: + ) -> HttpComponentsResolver:
Line range hint
1330-1374
: Consider breaking down the complex merge logic.The
_merge_stream_slicers
method handles multiple cases and could benefit from being broken down into smaller, more focused methods. What do you think about extracting the cursor creation logic into separate methods?For example:
def _create_global_substream_cursor( self, incremental_sync_model, stream_slicer: StreamSlicer, config: Config ) -> GlobalSubstreamCursor: cursor_component = self._create_component_from_model( model=incremental_sync_model, config=config ) return GlobalSubstreamCursor( stream_cursor=cursor_component, partition_router=stream_slicer ) def _create_partition_with_global_cursor( self, incremental_sync_model, stream_slicer: StreamSlicer, config: Config ) -> PerPartitionWithGlobalCursor: cursor_component = self._create_component_from_model( model=incremental_sync_model, config=config ) return PerPartitionWithGlobalCursor( cursor_factory=CursorFactory( lambda: self._create_component_from_model( model=incremental_sync_model, config=config ), ), partition_router=stream_slicer, stream_cursor=cursor_component, )
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (8)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
(1 hunks)airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(4 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(4 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(7 hunks)airbyte_cdk/sources/declarative/resolvers/components_resolver.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- airbyte_cdk/sources/declarative/concurrent_declarative_source.py
- airbyte_cdk/sources/declarative/resolvers/components_resolver.py
- airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
- unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
🔇 Additional comments (8)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (3)
33-34
: LGTM! Import statement is well-placed.
The import of COMPONENTS_RESOLVER_TYPE_MAPPING
is appropriately positioned with related imports.
241-242
: LGTM! Validation logic properly handles both stream types.
The validation now correctly checks for either static streams or dynamic streams, which aligns well with the new dynamic stream capability.
124-127
:
Consider protecting against multiple calls to _stream_configs.
Hey! I noticed we're extending the stream configs directly. Since _stream_configs
might be called multiple times, we could end up adding the same dynamic streams repeatedly. What do you think about creating a deep copy of the configs before combining them? Wdyt?
Here's a suggested approach:
- stream_configs = self._stream_configs(self._source_config) + self._dynamic_stream_configs(
- self._source_config, config
- )
+ stream_configs = deepcopy(self._stream_configs(self._source_config)) + deepcopy(self._dynamic_stream_configs(
+ self._source_config, config
+ ))
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
33-36
: LGTM! Clean import organization.
The new imports for HttpComponentsResolver and ComponentMappingDefinition are properly grouped with related resolvers.
475-476
: LGTM! Factory mapping registration.
The new components are properly registered in the PYDANTIC_MODEL_TO_CONSTRUCTOR mapping.
2208-2223
: LGTM! Well-structured component mapping definition.
The implementation properly handles:
- Interpolated string creation for values
- Field path interpolation
- Value type conversion
- Parameter passing
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
11-15
: LGTM! The root schema changes look good.
The addition of anyOf
constraint and dynamic_streams
property maintains backward compatibility while adding support for dynamic streams. Nice work on ensuring that either streams
or dynamic_streams
must be present!
Also applies to: 26-29
2711-2756
: Consider moving the experimental warning to the beginning of the description?
The experimental nature of this component is quite important for users to know upfront. Would you consider moving "This component is experimental. Use at your own risk." to the start of the description? This way, users will see it immediately before reading about the component's functionality, wdyt?
airbyte_cdk/sources/declarative/models/declarative_component_schema.py
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (6)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (3)
1040-1069
: Consider adding value type validation?The
ComponentMappingDefinition
looks good! One thought - would it make sense to add a validator to ensure the providedvalue_type
matches the actual type of thevalue
when specified? This could help catch type mismatches early, wdyt?@validator('value_type') def validate_value_type(cls, v, values): if v and 'value' in values: # Add type validation logic here pass return v
Line range hint
1368-1429
: Consider using inheritance to reduce duplication?The current implementation splits the source definition into two separate models with a lot of duplicate fields. What do you think about using inheritance to share common fields between
DeclarativeSource1
andDeclarativeSource2
? Something like:class BaseDeclarativeSource(BaseModel): type: Literal["DeclarativeSource"] check: CheckStream version: str schemas: Optional[Schemas] = None definitions: Optional[Dict[str, Any]] = None spec: Optional[Spec] = None # ... other common fields class DeclarativeSource1(BaseDeclarativeSource): streams: List[DeclarativeStream] dynamic_streams: Optional[List[DynamicDeclarativeStream]] = None class DeclarativeSource2(BaseDeclarativeSource): streams: Optional[List[DeclarativeStream]] = None dynamic_streams: List[DynamicDeclarativeStream]This could make the code more maintainable and reduce the chance of inconsistencies when updating common fields. What are your thoughts?
1829-1850
: Add more detailed documentation for dynamic streams?The implementation looks solid! Would it be helpful to add more detailed documentation about:
- How the
stream_template
is used- The lifecycle of component resolution
- Examples of common use cases
This could help other developers understand and use these features effectively. What do you think?
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3)
2711-2756
: Two suggestions for the ComponentMappingDefinition schema.
Would you consider moving the experimental warning to the start of the description for better visibility? This is consistent with the pattern used in other components.
There seems to be an inconsistency in the field_path property - it uses
interpolation_content
while other properties useinterpolation_context
. Should we align these? wdyt?
2757-2781
: LGTM! One minor suggestion for HttpComponentsResolver.The component is well-defined with proper required fields and references. Would you consider moving the experimental warning to the start of the description for better visibility? wdyt?
2782-2800
: Two suggestions for the DynamicDeclarativeStream schema.
Would you consider moving the experimental warning to the start of the description for better visibility?
The schema is missing the
required
field. Given thatstream_template
andcomponents_resolver
seem essential for this component to work, should we add:required: - type - stream_template - components_resolverWhat are your thoughts on this?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (2)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/models/declarative_component_schema.py
(10 hunks)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
11-15
: LGTM! The root schema changes look well-structured.
The addition of anyOf
to enforce either streams
or dynamic_streams
must be present is a good approach. The dynamic_streams
property is properly defined as an array of DynamicDeclarativeStream
references.
Also applies to: 26-29
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🧹 Outside diff range and nitpick comments (7)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
1295-1319
: Consider adding docstring to explain the method's purposeThe new helper method
_build_stream_slicer_from_partition_router
looks well-structured, but could benefit from documentation explaining its role and return value semantics. What do you think about adding a docstring?def _build_stream_slicer_from_partition_router( self, model: Union[AsyncRetrieverModel, CustomRetrieverModel, SimpleRetrieverModel], config: Config, ) -> Optional[PartitionRouter]: + """ + Builds a stream slicer from the partition router configuration if one exists. + + Args: + model: The retriever model containing partition router configuration + config: The connector configuration + + Returns: + PartitionRouter if the model has a partition_router configured, None otherwise + """
1320-1329
: Consider adding docstring and improving variable namingThe
_build_resumable_cursor_from_paginator
method could be clearer with documentation and more descriptive variable names. What do you think about these suggestions?def _build_resumable_cursor_from_paginator( self, model: Union[AsyncRetrieverModel, CustomRetrieverModel, SimpleRetrieverModel], stream_slicer: Optional[StreamSlicer], ) -> Optional[StreamSlicer]: + """ + Creates a resumable cursor for pagination if needed. + + Args: + model: The retriever model containing paginator configuration + stream_slicer: Existing stream slicer if any + + Returns: + ResumableFullRefreshCursor if paginator exists and no stream slicer is present, None otherwise + """ - if hasattr(model, "paginator") and model.paginator and not stream_slicer: + has_paginator = hasattr(model, "paginator") and model.paginator + if has_paginator and not stream_slicer:
2207-2258
: Consider adding error handling for empty components_mappingThe
create_http_components_resolver
method looks solid, but might benefit from validation of the components_mapping list. What do you think about adding a check?def create_http_components_resolver( self, model: HttpComponentsResolverModel, config: Config ) -> Any: + if not model.components_mapping: + raise ValueError("components_mapping cannot be empty for HttpComponentsResolver") + stream_slicer = self._build_stream_slicer_from_partition_router(model.retriever, config)Also, consider being more specific with the return type instead of
Any
:-) -> Any: +) -> HttpComponentsResolver:airbyte_cdk/sources/declarative/declarative_component_schema.yaml (4)
2713-2713
: Consider moving the experimental warning to the start of the description?The experimental warning might be more visible if it's at the beginning of the description, similar to how it's done in other components. wdyt?
2759-2759
: Consider moving the experimental warning to the start of the description?The experimental warning might be more visible if it's at the beginning of the description, similar to how it's done in other components. wdyt?
2759-2760
: Consider enhancing the component description?The current description is quite brief. Would it be helpful to add more details about:
- When to use this component
- How it interacts with other components
- Example usage scenarios
This could help users better understand its purpose and proper usage. wdyt?
2784-2784
: Consider moving the experimental warning to the start of the description?The experimental warning might be more visible if it's at the beginning of the description, similar to how it's done in other components. wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (3)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(3 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(7 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
🔇 Additional comments (4)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (3)
33-36
: LGTM: Clean import organization
The new imports for HttpComponentsResolver
and ComponentMappingDefinition
are properly grouped with related imports.
Line range hint 1330-1374
: LGTM: Well-structured stream slicer merging logic
The refactored _merge_stream_slicers
method has good separation of concerns and handles different cursor types appropriately. The early returns and clear conditionals make the flow easy to follow.
2208-2223
: LGTM: Well-implemented component mapping definition
The create_components_mapping_definition
method properly handles interpolation and type conversion. The use of list comprehension for field_path transformation is clean and efficient.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
11-15
: LGTM! Good validation addition
The anyOf constraint ensures either streams
or dynamic_streams
must be present, which is a good validation to have.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (2)
125-127
: Consider defensive copying for stream configurationsHey! To prevent any potential side effects from list modifications, what do you think about using defensive copying here? Something like:
- stream_configs = self._stream_configs(self._source_config) + self._dynamic_stream_configs( - self._source_config, config - ) + stream_configs = list(self._stream_configs(self._source_config)) + list(self._dynamic_stream_configs( + self._source_config, config + ))This ensures we're working with new list instances. Wdyt? 🤔
325-330
: Consider enhancing error message clarityThe error message could be more helpful by including the available resolver types in the error message itself. What do you think about something like:
- raise ValueError( - f"Missing 'type' in components resolver configuration: {components_resolver_config}" - ) + available_types = list(COMPONENTS_RESOLVER_TYPE_MAPPING.keys()) + raise ValueError( + f"Missing 'type' in components resolver configuration: {components_resolver_config}. " + f"Available types are: {available_types}" + )airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
2208-2225
: Consider refining type annotations to eliminatetype: ignore
?In the
create_components_mapping_definition
method, there's a# type: ignore[arg-type]
comment for thefield_path
parameter. Perhaps adjusting the type annotations can remove the need for this comment, enhancing type safety and code clarity. Wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (6)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
(1 hunks)airbyte_cdk/sources/declarative/manifest_declarative_source.py
(4 hunks)airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py
(8 hunks)airbyte_cdk/sources/declarative/resolvers/components_resolver.py
(1 hunks)airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (4)
- airbyte_cdk/sources/declarative/concurrent_declarative_source.py
- airbyte_cdk/sources/declarative/resolvers/components_resolver.py
- airbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
- unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
🔇 Additional comments (5)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (3)
42-42
: LGTM! Import follows existing patterns.
The new import for COMPONENTS_RESOLVER_TYPE_MAPPING
is well-placed within the existing import structure.
241-242
: LGTM! Validation logic properly handles both stream types
The validation correctly checks for either static or dynamic streams, aligning with the new dynamic stream capability.
344-346
: Verify template processing security
Could we verify that the template processing is secure against injection attacks?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
2226-2259
: Great implementation of the create_http_components_resolver
method!
The create_http_components_resolver
method is well-crafted and integrates the component creation smoothly. Nice work!
Line range hint 1296-1375
: The refactoring of stream slicer construction looks solid
Splitting _merge_stream_slicers
into _build_stream_slicer_from_partition_router
and _build_resumable_cursor_from_paginator
improves modularity and readability. Great job!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Outside diff range and nitpick comments (3)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (3)
2713-2713
: Consider moving the experimental warning to the startThe experimental nature of this component is important information that might be missed in the middle of the description. Would you consider moving it to the start? wdyt?
- description: (This component is experimental. Use at your own risk.) Specifies a mapping definition to update or add fields in a record or configuration. This allows dynamic mapping of data by interpolating values into the template based on provided contexts. + description: | + THIS COMPONENT IS EXPERIMENTAL. USE AT YOUR OWN RISK. + + Specifies a mapping definition to update or add fields in a record or configuration. This allows dynamic mapping of data by interpolating values into the template based on provided contexts.
2759-2759
: Consider moving the experimental warning to the startSimilar to ComponentMappingDefinition, would you consider moving the experimental warning to the start of the description for better visibility? wdyt?
- description: Component resolve and populates stream templates with components fetched via an HTTP retriever. (This component is experimental. Use at your own risk.) + description: | + THIS COMPONENT IS EXPERIMENTAL. USE AT YOUR OWN RISK. + + Component resolve and populates stream templates with components fetched via an HTTP retriever.
2784-2784
: Consider moving the experimental warning to the startSimilar to other components, would you consider moving the experimental warning to the start of the description for better visibility? wdyt?
- description: A component that described how will be created declarative streams based on stream template. (This component is experimental. Use at your own risk.) + description: | + THIS COMPONENT IS EXPERIMENTAL. USE AT YOUR OWN RISK. + + A component that describes how declarative streams will be created based on stream template.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(4 hunks)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
1332-1332
: LGTM: Enhanced field_path flexibility with interpolation context
The addition of interpolation context to field_path allows for dynamic configuration, which is a nice improvement.
2726-2729
:
Consider enhancing field_path validation
The items schema for field_path looks incorrect. Should it be:
field_path:
type: array
items:
- - type: string
+ type: string
This matches the pattern used in other components. wdyt?
Likely invalid or redundant comment.
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
Outdated
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (3)
241-242
: Enhance validation error message clarityThe validation now correctly checks for either static or dynamic streams, but the error message only mentions
streams
. Should we update it to be more descriptive? Wdyt?- raise ValidationError(f"A valid manifest should have at least one stream defined. Got {streams}") + raise ValidationError(f"A valid manifest should have at least one stream defined (either in 'streams' or 'dynamic_streams'). Got streams: {streams}, dynamic_streams: {dynamic_streams}")
311-355
: Consider breaking down the method for better maintainabilityThe method handles multiple responsibilities (validation, resolver creation, stream configuration). Would it make sense to extract some of these into helper methods? For example:
def _validate_resolver_config(self, components_resolver_config: Dict[str, Any]) -> str: """Validates resolver config and returns resolver type""" if not components_resolver_config: raise ValueError(...) resolver_type = components_resolver_config.get("type") if not resolver_type: raise ValueError(...) if resolver_type not in COMPONENTS_RESOLVER_TYPE_MAPPING: raise ValueError(...) return resolver_typeThis could make the code more maintainable and testable. Wdyt?
337-338
: Document caching behavior for dynamic streamsThe code enables caching for all retrievers in dynamic streams, which is great for performance. Should we add a comment explaining this behavior and its implications? Something like:
+ # Enable caching for all retrievers in dynamic streams to improve performance + # when the same component needs to be resolved multiple times if "retriever" in components_resolver_config: components_resolver_config["retriever"]["requester"]["use_cache"] = True
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/manifest_declarative_source.py
(4 hunks)
🔇 Additional comments (1)
airbyte_cdk/sources/declarative/manifest_declarative_source.py (1)
124-127
: Consider thread-safety implications of concatenating stream configs
The concatenation of static and dynamic streams looks clean, but since streams()
can be called multiple times, we should ensure thread-safety. Would it make sense to create a deep copy of the configs before concatenation to prevent any potential side effects? Wdyt?
/autofix
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🧹 Outside diff range and nitpick comments (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (1)
2711-2756
: Consider enhancing the experimental warning visibilityThe experimental warning is currently embedded in the description. Would you consider moving it to the start for better visibility? This matches the pattern used in other experimental components. wdyt?
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (1)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
(4 hunks)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
11-15
: LGTM! Clean implementation of the dynamic streams feature
The schema changes elegantly handle both traditional and dynamic streams by:
- Using
anyOf
to require eitherstreams
ordynamic_streams
- Defining
dynamic_streams
as an array ofDynamicDeclarativeStream
references
Also applies to: 26-29
1332-1332
: LGTM! Enhanced field path interpolation capabilities
The addition of config
, components_values
, and stream_template_config
to the interpolation context enables more flexible path extraction.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (4)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (4)
1034-1063
: LGTM! The ComponentMappingDefinition model looks well-structured.The model is well documented with clear examples and type definitions. I particularly like the flexibility provided by the optional value_type field that allows for both explicit and inferred typing.
One thought - would it make sense to add validation to ensure that when value_type is specified, the provided value matches that type? wdyt?
@validator('value') def validate_value_type(cls, v, values): if 'value_type' in values and values['value_type']: # Add type validation logic here pass return v
1811-1820
: The HttpComponentsResolver implementation looks clean and focused.The model effectively combines a retriever with component mappings. The design is simple yet flexible enough to handle various resolution scenarios.
Quick thought - should we add validation to ensure components_mapping is not empty? wdyt?
@validator('components_mapping') def validate_components_mapping(cls, v): if not v: raise ValueError('components_mapping cannot be empty') return v
1822-1832
: The DynamicDeclarativeStream model looks well designed.Good job on making the stream_template and components_resolver required fields. This ensures that all necessary components are present for dynamic stream generation.
Have you considered adding an optional description field to help users understand the purpose of each dynamic stream? wdyt?
description: Optional[str] = Field( None, description="A description of what this dynamic stream represents and how it's configured.", )
Line range hint
1362-1423
: Smart approach to handling backward compatibility with DeclarativeSource.The use of Union type with DeclarativeSource1 and DeclarativeSource2 is a clean way to support both static and dynamic streams while maintaining backward compatibility.
One suggestion - would it be helpful to add a deprecation warning to DeclarativeSource1 to encourage migration to the newer format? wdyt?
class DeclarativeSource1(BaseModel): def __init__(self, **data): warnings.warn( "DeclarativeSource1 is deprecated. Please migrate to the new format supporting dynamic streams.", DeprecationWarning, stacklevel=2 ) super().__init__(**data)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm pre-emptively approving but I would like the feedback from @lmossman on this eventually. We have marked the interfaces as experimental and will keep them as such until it is fully integrated with the Connector Builder so let's not stop our progress for that and we can always fix them later if needs be
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
Outdated
Show resolved
Hide resolved
unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
Outdated
Show resolved
Hide resolved
…hq/airbyte-python-cdk into lazebnyi/add-components-resolver
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Outside diff range and nitpick comments (3)
unit_tests/sources/declarative/test_manifest_declarative_source.py (2)
74-82
: Consider enhancing the docstring for better clarity, wdyt?The docstring could provide more context about the fixture's purpose and usage. For example:
- """Base manifest without streams or dynamic streams.""" + """ + Provides a base manifest configuration without streams or dynamic streams. + Used as a foundation for testing manifest validation scenarios. + """
83-134
: Consider extracting nested configurations into separate fixtures for better reusability, wdyt?The
declarative_stream_config
function contains deeply nested configurations that could be split into separate fixtures. This would improve maintainability and allow reuse in other test cases. For example:
requester_config
paginator_config
schema_loader_config
airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (1)
Line range hint
11-91
: Would you consider some readability improvements? 🤔A couple of suggestions to make the code even more maintainable:
- What do you think about adding type hints to the mapping values? Something like:
DEFAULT_MODEL_TYPES: Mapping[str, Literal["DeclarativeStream", "HttpComponentsResolver", ...]] = {
- Maybe we could group related mappings together with comments? For example:
DEFAULT_MODEL_TYPES: Mapping[str, str] = { # Stream Components "DeclarativeStream.retriever": "SimpleRetriever", "DynamicDeclarativeStream.stream_template": "DeclarativeStream", # Resolver Components "HttpComponentsResolver.retriever": "SimpleRetriever", "HttpComponentsResolver.components_mapping": "ComponentMappingDefinition", # ... }What are your thoughts on these suggestions? They're not critical but might help with maintainability! 😊
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/concurrent_declarative_source.py
(1 hunks)airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py
(1 hunks)unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
(1 hunks)unit_tests/sources/declarative/test_manifest_declarative_source.py
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- airbyte_cdk/sources/declarative/concurrent_declarative_source.py
- unit_tests/sources/declarative/resolvers/test_http_components_resolver.py
🔇 Additional comments (4)
unit_tests/sources/declarative/test_manifest_declarative_source.py (2)
135-183
: LGTM! Well-structured fixture reusing _declarative_stream
The fixture effectively demonstrates the composition of a DynamicDeclarativeStream by reusing the _declarative_stream fixture as a template.
629-659
: LGTM! Well-structured test cases with clear scenarios
The test method effectively covers all validation scenarios:
- Fails when both streams and dynamic streams are missing
- Succeeds with regular streams
- Succeeds with dynamic streams
Good use of fixtures and clear assertions make the test cases easy to understand and maintain.
airbyte_cdk/sources/declarative/parsers/manifest_component_transformer.py (2)
Line range hint 11-91
: Clean integration with existing transformation logic! 👍
The new mappings integrate seamlessly with the existing DEFAULT_MODEL_TYPES, and the transformation logic will handle them automatically. The placement in DEFAULT_MODEL_TYPES (rather than CUSTOM_COMPONENTS_MAPPING) aligns well with their role as core components.
34-39
: LGTM! The new mappings look well-structured.
The mappings for DynamicDeclarativeStream and HttpComponentsResolver follow the established pattern and create a clear hierarchy. Would you like me to verify that all the referenced types exist in the codebase? 🤔
✅ Verification successful
All referenced types exist and are properly defined in the codebase! 🎉
The verification shows that all the types referenced in the new mappings are present:
DeclarativeStream
is defined inairbyte_cdk/sources/declarative/declarative_stream.py
HttpComponentsResolver
is defined inairbyte_cdk/sources/declarative/resolvers/http_components_resolver.py
SimpleRetriever
exists and is widely used throughout the codebaseComponentMappingDefinition
is defined inairbyte_cdk/sources/declarative/resolvers/components_resolver.py
The mappings follow the established pattern and create proper type relationships. Looking good! 👍
🏁 Scripts executed
The following scripts were executed for the analysis:
Script:
#!/bin/bash
# Description: Verify that all referenced types exist in the codebase
# Search for class definitions of the referenced types
echo "Searching for referenced types..."
for type in "DeclarativeStream" "HttpComponentsResolver" "SimpleRetriever" "ComponentMappingDefinition"; do
echo "Looking for $type..."
ast-grep --pattern "class $type"
done
Length of output: 1024
Script:
#!/bin/bash
# Let's try a different approach using ripgrep to search for these types
# This will help us find not just class definitions but also imports and type hints
echo "Searching for DeclarativeStream..."
rg "DeclarativeStream" -A 2
echo "\nSearching for HttpComponentsResolver..."
rg "HttpComponentsResolver" -A 2
echo "\nSearching for SimpleRetriever..."
rg "SimpleRetriever" -A 2
echo "\nSearching for ComponentMappingDefinition..."
rg "ComponentMappingDefinition" -A 2
Length of output: 67933
Script:
#!/bin/bash
# Let's specifically look for the HttpComponentsResolver and ComponentMappingDefinition classes
# since they weren't clearly visible in the previous results
echo "Looking for HttpComponentsResolver class definition..."
fd -e py | xargs grep -l "class HttpComponentsResolver"
echo "\nLooking for ComponentMappingDefinition class definition..."
fd -e py | xargs grep -l "class ComponentMappingDefinition"
echo "\nLooking for imports of ComponentMappingDefinition..."
rg "from.*import.*ComponentMappingDefinition"
Length of output: 1024
What
This PR add dynamic stream that can dynamically generate streams based on response.
Fixed: https://github.com/airbytehq/airbyte-internal-issues/issues/10787?issue=airbytehq%7Cairbyte-internal-issues%7C10788
How
Described schema, models and data classes for
DynamicDeclarativeStream
,HttpComponentsResolver
andComponentMappingDefinition
. Defined behavior for components resolving and managing parsing of the stream template. Integrated the dynamic stream handling into the existing framework, ensuring seamless operation alongside static streams.Summary by CodeRabbit
Release Notes
New Features
dynamic_streams
in the schema.HttpComponentsResolver
andComponentMappingDefinition
for enhanced component mapping and resolution.ManifestDeclarativeSource
.Improvements
DeclarativeSource
for more flexible configurations.PartitionRouter
in the public interface.Tests
HttpComponentsResolver
to validate component mapping resolution.ManifestDeclarativeSource
with new fixtures and validation cases.