feat: add CustomSchemaNormalization
#194
base: main
Signed-off-by: Artem Inzhyyants <[email protected]>
CustomSchemaNormalization
a few nits and one question, otherwise seems pretty straightforward!
airbyte_cdk/sources/declarative/declarative_component_schema.yaml
airbyte_cdk/sources/declarative/models/declarative_component_schema.py
schema_normalization = (
    TypeTransformer(SCHEMA_TRANSFORMER_TYPE_MAPPING[model.schema_normalization])
    if isinstance(model.schema_normalization, SchemaNormalizationModel)
    else self._create_component_from_model(model.schema_normalization, config=config)  # type: ignore[arg-type] # custom normalization model expected here
)
Presumably this `CustomNormalization` component would need to already adhere to or inherit from the `TypeTransformer` interface/class? And that's why we don't need to invoke the `TypeTransformer` constructor?
Yes, most of the time we use a custom `TypeTransformer` by registering some custom function, but generally speaking we only need the `transform` interface. I can add an abstract class for this.
airbyte-python-cdk/airbyte_cdk/sources/declarative/extractors/record_selector.py
Lines 110 to 116 in 3984559
        self, records: Iterable[Mapping[str, Any]], schema: Optional[Mapping[str, Any]]
    ) -> Iterable[Mapping[str, Any]]:
        if schema:
            # record has type Mapping[str, Any], but dict[str, Any] expected
            for record in records:
                normalized_record = dict(record)
                self.schema_normalization.transform(normalized_record, schema)
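The excerpt above copies each record into a `dict` before calling `transform`, which suggests the normalizer mutates its argument in place. A minimal sketch of that loop follows. `UppercaseNames` and the `normalize` helper are hypothetical names, and the pass-through branch for a missing schema is an assumption, since the excerpt is cut off before that branch.

```python
from types import MappingProxyType
from typing import Any, Dict, Iterable, Iterator, Mapping, Optional


class UppercaseNames:
    """Hypothetical normalizer; only the transform(record, schema) method matters."""

    def transform(self, record: Dict[str, Any], schema: Mapping[str, Any]) -> None:
        # Mutates the record in place and returns nothing.
        if "name" in record:
            record["name"] = str(record["name"]).upper()


def normalize(
    records: Iterable[Mapping[str, Any]],
    schema: Optional[Mapping[str, Any]],
    normalizer: UppercaseNames,
) -> Iterator[Mapping[str, Any]]:
    if not schema:
        # Assumption: without a schema, records pass through unchanged.
        yield from records
        return
    for record in records:
        mutable = dict(record)  # copy, because the input may be a read-only Mapping
        normalizer.transform(mutable, schema)
        yield mutable


# A read-only mapping stands in for a record that cannot be mutated directly.
frozen = MappingProxyType({"name": "ada"})
result = list(normalize([frozen], {"type": "object"}, UppercaseNames()))
```

The copy is what lets an in-place `transform` work on inputs typed as `Mapping`: the original record stays untouched and the normalized copy is what gets yielded.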
Yeah, I think we need to create an interface in the declarative package for this. I don't think we want users to be re-implementing `TypeTransformer` for a couple of reasons:
- It looks like it has a lot of methods we don't actually care about
- We want to avoid the pattern of customers re-implementing concrete classes, since it couples our internal implementation to customer usage w/ unexpected breaking changes
So we should just create a new `TypeTransformer` interface in low-code (I'll leave naming to you), and then I think `RecordSelector.schema_normalization` will need to be changed to a union of these two classes/interfaces. Once we have these added, will 👍 the PR. Thanks for making the other adjustments!
added AbstractTypeTransformer
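For readers following the thread, here is a rough sketch of what a narrow interface of this kind might look like. The class below is a local stand-in that only borrows the name `AbstractTypeTransformer`; its exact shape in the CDK is not shown in this conversation, and `StringifyDeclaredStrings` is a purely hypothetical custom implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Mapping


class AbstractTypeTransformer(ABC):
    """Local stand-in for the interface discussed in this thread (assumed shape)."""

    @abstractmethod
    def transform(self, record: Dict[str, Any], schema: Mapping[str, Any]) -> None:
        """Normalize `record` in place so that it conforms to `schema`."""


class StringifyDeclaredStrings(AbstractTypeTransformer):
    """Hypothetical custom normalization: cast values whose schema type is "string"."""

    def transform(self, record: Dict[str, Any], schema: Mapping[str, Any]) -> None:
        for key, prop in schema.get("properties", {}).items():
            value = record.get(key)
            if value is not None and prop.get("type") == "string":
                record[key] = str(value)


schema = {"properties": {"id": {"type": "string"}, "count": {"type": "integer"}}}
record = {"id": 42, "count": 3}
StringifyDeclaredStrings().transform(record, schema)
```

Keeping the interface down to a single `transform` method means a custom component does not have to re-implement the other methods of a concrete transformer, which is the coupling concern raised above.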
📝 Walkthrough
This pull request introduces a new …
Changes
Sequence Diagram
sequenceDiagram
    participant RecordSelector
    participant SchemaTransformer
    participant CustomNormalization
    RecordSelector->>SchemaTransformer: Check normalization type
    alt Standard Normalization
        SchemaTransformer-->>RecordSelector: Apply standard normalization
    else Custom Normalization
        RecordSelector->>CustomNormalization: Instantiate custom normalization
        CustomNormalization-->>RecordSelector: Apply custom normalization strategy
    end
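The branching in the diagram can be sketched in plain Python. Everything here is illustrative: `StandardMode`, `StandardNormalizer`, `CustomNormalizationSpec`, and `build_normalizer` are hypothetical names, not CDK classes, and the integer casting is only a placeholder for whatever the built-in normalization does. The explicit `ValueError` is one possible way to surface an unsupported setting.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable, Dict, Mapping, Protocol, Union


class Normalizer(Protocol):
    def transform(self, record: Dict[str, Any], schema: Mapping[str, Any]) -> None: ...


class StandardMode(Enum):
    """Hypothetical stand-in for the built-in normalization modes."""

    NONE = "None"
    DEFAULT = "Default"


class StandardNormalizer:
    def __init__(self, mode: StandardMode) -> None:
        self.mode = mode

    def transform(self, record: Dict[str, Any], schema: Mapping[str, Any]) -> None:
        if self.mode is StandardMode.NONE:
            return  # leave the record untouched
        # Placeholder behaviour: cast fields declared as "integer" in the schema.
        for key, prop in schema.get("properties", {}).items():
            if prop.get("type") == "integer" and key in record:
                record[key] = int(record[key])


@dataclass
class CustomNormalizationSpec:
    """Hypothetical description of a user-supplied normalization component."""

    factory: Callable[[], Normalizer]


def build_normalizer(setting: Union[StandardMode, CustomNormalizationSpec]) -> Normalizer:
    if isinstance(setting, StandardMode):
        return StandardNormalizer(setting)  # standard branch of the diagram
    if isinstance(setting, CustomNormalizationSpec):
        return setting.factory()  # custom branch: instantiate the user component
    raise ValueError(f"Unsupported schema normalization setting: {setting!r}")
```

Either branch hands back an object with the same `transform(record, schema)` method, so the caller does not need to know which kind of normalization it received.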
Actionable comments posted: 0
🧹 Nitpick comments (8)
airbyte_cdk/sources/declarative/extractors/record_selector.py (2)
17-17: Consider clarifying the import usage.
You’re importing `TypeTransformer` here but it’s not referenced until much later. Would consolidating related imports help readability and keep them closer to the usage site, wdyt?
Line range hint 2012-2015: Add a safety check for unsupported modes.
In this conditional expression, we assume `model.schema_normalization` must be either `SchemaNormalizationModel` or a custom type. Would it help to explicitly handle unknown modes (e.g., raising a descriptive exception), to avoid silent misconfigurations, wdyt?
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2)
Line range hint 187-189: Confirm the naming alignment.
We’re aliasing `CustomSchemaNormalization` to `CustomSchemaNormalizationModel` here. Would it be clearer if we kept consistent naming across all imports, e.g., dropping “Model” to preserve brevity and clarity, wdyt?
1532-1536: Question about the default.
The `Field` default for `schema_normalization` is set to `SchemaNormalization.None_`. Did you intend to freely toggle between `None` and a custom transformer? If so, do you think an explicit `None` might be clearer, wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (2)
500-500: Request more clarity around custom component creation.
You’re adding `CustomSchemaNormalizationModel` to `create_custom_component`. Could we ensure it enforces an interface like `TypeTransformer`? This might reduce future confusion, wdyt?
2012-2015: Evaluate future extension for merging transformations.
Either a `TypeTransformer` or a `CustomSchemaNormalizationModel` is chosen. One day we might combine standard transformations with custom transformations. Is that something you’d consider supporting in your logic, wdyt?
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2)
670-691: Alphabetical order for consistency.
We introduced `CustomSchemaNormalization` near line 670. Could we keep alphabetical ordering (like other custom components) to make searching simpler, wdyt?
2580-2584: Improve clarity around “anyOf” usage.
We allow either a standard `SchemaNormalization` or `CustomSchemaNormalization` in `RecordSelector`. Would clarifying in the schema that “None” is the baseline default help reduce confusion for integrators, wdyt?
📜 Review details
📒 Files selected for processing (4)
airbyte_cdk/sources/declarative/declarative_component_schema.yaml (2 hunks)
airbyte_cdk/sources/declarative/extractors/record_selector.py (1 hunks)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (2 hunks)
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (6 hunks)
🔇 Additional comments (2)
airbyte_cdk/sources/declarative/models/declarative_component_schema.py (1)
Line range hint 100-102: Validate usage of `CustomStateMigration`.
Would it be useful to verify that `CustomStateMigration` indeed inherits from a recognized migration interface or base class to prevent runtime errors, wdyt?
airbyte_cdk/sources/declarative/parsers/model_to_component_factory.py (1)
447-451: Suggest verifying mapping coverage.
You’ve reintroduced `SCHEMA_TRANSFORMER_TYPE_MAPPING` for `SchemaNormalizationModel`. Would it be safer to confirm all possible enum values in `SchemaNormalizationModel` are covered here to avoid mismatches, wdyt?
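The coverage check suggested in the last comment could look roughly like the sketch below. The enum and mapping are local stand-ins that reuse the names from the comment; their members and values are placeholders, and `unmapped_members` is a hypothetical helper, not something in the CDK.

```python
from enum import Enum
from typing import Any, List, Mapping, Type


class SchemaNormalizationModel(Enum):
    """Stand-in enum; the members are assumed for illustration."""

    None_ = "None"
    Default = "Default"


# Stand-in mapping; the values are placeholder strings, not real transformer configs.
SCHEMA_TRANSFORMER_TYPE_MAPPING = {
    SchemaNormalizationModel.None_: "no-transform",
    SchemaNormalizationModel.Default: "default-type-casting",
}


def unmapped_members(enum_cls: Type[Enum], mapping: Mapping[Enum, Any]) -> List[Enum]:
    """Return the enum members that have no entry in the mapping."""
    return [member for member in enum_cls if member not in mapping]


missing = unmapped_members(SchemaNormalizationModel, SCHEMA_TRANSFORMER_TYPE_MAPPING)
```

A check like this would fit naturally in a unit test, so that adding a new enum member without a mapping entry fails early instead of raising a `KeyError` while a component is being built.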
Actionable comments posted: 0
🧹 Nitpick comments (3)
airbyte_cdk/sources/declarative/extractors/type_transformer.py (2)
10-11: Using `@dataclass` on an abstract class
Seems elegant. Are you sure you need dataclass features for an ABC that currently has no fields? If a future extension is planned, this is fine, otherwise a simple ABC might suffice. wdyt?
35-55: Potential return of the transformed record
Currently, the method does not return a record, but modifies it in place. Would returning a new record be clearer, or is in-place mutation the intended design? wdyt?
airbyte_cdk/sources/declarative/extractors/record_selector.py (1)
13-13: Importing `AbstractTypeTransformer`
Great to see the usage of `AbstractTypeTransformer`. Any thoughts on adding a brief mention in the class docstring to clarify that both `TypeTransformer` and `AbstractTypeTransformer` can be provided? wdyt?
📜 Review details
📒 Files selected for processing (3)
airbyte_cdk/sources/declarative/extractors/__init__.py (1 hunks)
airbyte_cdk/sources/declarative/extractors/record_selector.py (2 hunks)
airbyte_cdk/sources/declarative/extractors/type_transformer.py (1 hunks)
🔇 Additional comments (3)
airbyte_cdk/sources/declarative/extractors/__init__.py (2)
12-12: Good addition of `AbstractTypeTransformer` import.
Nice step to unify type transformation strategies. Would it be helpful to add a short comment here indicating its primary usage? wdyt?
15-15: Exporting `AbstractTypeTransformer` in `__all__`
This makes it publicly available, which is great. Might be worth ensuring that external users are guided toward this new abstraction. wdyt?
airbyte_cdk/sources/declarative/extractors/record_selector.py (1)
37-37: Union type for `schema_normalization`
Allowing both `TypeTransformer` and `AbstractTypeTransformer` is flexible. Are there any potential pitfalls with inconsistent method signatures between them? Maybe reinforcing type hints or documentation could help. wdyt?
What
Add `CustomSchemaNormalization` to declarative schema
Summary by CodeRabbit
New Features
Improvements
`RecordSelector` to support custom normalization approaches.
Technical Updates