add protocol methods to protocol source #17

Open
wants to merge 2 commits into
base: main
@@ -0,0 +1,28 @@
# Protocol Methods

The purpose of this doc is to define the interface on the docker containers that implement the Airbyte Protocol. It is purposefully lean, as we describe the protocol at length in [our docs](https://docs.airbyte.com/understanding-airbyte/airbyte-protocol). We just want to define the interface here, so that as it evolves, we can version the change as part of the protocol version.

## Method Interfaces

We describe these interfaces in pseudocode for clarity. Clarifications on the pseudocode semantics:
* Any `Stream~ that is mentioned as an input arg is passed to the docker container via STDIN.
Contributor

I feel like the ~ is throwing me off here, I was expecting it to mean something, maybe just empty brackets or filled with a generic Type would be clearer?

Stream<>
Stream<T>
Stream<...>

* All other parameters are passed in as command line args (e.g. --config <path to config file>).
Contributor

Maybe a controversial opinion - I think trying to represent both the stdin/stdout values and the method parameters in the same step is confusing.

Maybe state that the method returns an I/O stream, and then define what that I/O accepts and returns as a secondary step?

Contributor

FWIW, I also think this is a problem in our other docs describing these methods

* Each input parameter is described by its type (as defined in airbyte_protocol.yml) and the name of the parameter.
* If an input parameter has no name, then it is passed via STDIN.
Contributor

I don't think this holds up, or I am misunderstanding it.

Reading this I expected all signatures to be like

methodName(argName: ArgType)
// or
methodName(ArgType argName)

And then there would be a distinction for args passed via stdin, which would be given as a type only


In addition to the return types mentioned below, all methods can return the following message types: `AirbyteLogMessage | AirbyteTraceMessage`.
Comment on lines +12 to +13
Contributor

👍 I missed this note at first


Contributor

A note that additional arguments should be ignored and not validated should be here somewhere.
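
As an illustration of the semantics above (named parameters arrive as command line flags, and unrecognized arguments are ignored rather than validated), here is a minimal Python sketch. The function name `parse_connector_args` is hypothetical, and the `--state` flag is an assumption inferred from the `State` parameter of `read`; only `--config` and `--catalog` appear in this thread.

```python
import argparse


def parse_connector_args(argv):
    """Parse the method name and its file-path flags, tolerating unknown arguments.

    Hypothetical helper: flag names other than --config and --catalog are assumptions.
    """
    # allow_abbrev=False so that e.g. --conf is not silently treated as --config.
    parser = argparse.ArgumentParser(prog="connector", allow_abbrev=False)
    parser.add_argument("method", choices=["spec", "check", "discover", "read", "write"])
    parser.add_argument("--config", help="path to the config file")
    parser.add_argument("--catalog", help="path to the configured catalog file")
    parser.add_argument("--state", help="path to the state file (assumed flag name)")
    # parse_known_args returns unrecognized arguments instead of raising an error,
    # which matches "ignored and not validated".
    known, unknown = parser.parse_known_args(argv)
    return known, unknown


if __name__ == "__main__":
    known, unknown = parse_connector_args(
        ["read", "--config", "/secrets/config.json", "--future-flag=x"]
    )
    print(known.method, known.config, unknown)
```

Returning the unknown arguments, rather than dropping them, leaves room to log them without failing the call.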

**interface of both source and destination**
```
spec() -> Stream<AirbyteConnectorSpecification>
Contributor

Is it worth making these objects links into the protocol.yaml file so it is easy to lookup the AirbyteConnectorSpecification, etc or having an appendix at the bottom with links to them?

check(Config, Optional<ConfiguredCatalog>) -> Stream<AirbyteConnectionStatus>
```
**source only**
```
discover(Config) -> AirbyteCatalog
read(Config, ConfiguredAirbyteCatalog, State) -> Stream<AirbyteRecordMessage | AirbyteStateMessage | AirbyteControlMessage>
Contributor

I think this method-style representation is a bit misleading. The flag argument isn't --ConfiguredAirbyteCatalog, it is --catalog. We also don't explain whether we are passing the object itself (e.g. stringified JSON) or a file path.

Contributor

Perhaps a way to keep this method-like signature could be:

read(config -> File<Config>, catalog -> File<ConfiguredAirbyteCatalog>, state -> File<State>) -> Stream<...>

... but now I'm just making things up.

I think JSONSchema might work better for this:

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "additionalProperties": true, // this is important now
  "required": ["config", "catalog"], // showing that state is optional
  "arguments": {
    "config": { "type": "file_path", "$ref": "Config.yaml" },
    "catalog": { "type": "file_path", "$ref": "ConfiguredAirbyteCatalog.yaml" },
    "state": { "type": "file_path", "$ref": "State.yaml" }
  }
}

Contributor

Here's a real READ command for reference:

docker run --rm -v $(pwd)/secrets:/secrets -v $(pwd)/integration_tests:/integration_tests airbyte/source-faker:dev read --config /secrets/config.json --catalog /integration_tests/configured_catalog.json

Contributor

I do think the JSON schema version is a little clearer. Maybe there's a way to distinguish STDIN/STDOUT parameters from arguments to the method call?

```
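
To make the mapping from the method-style signature to an actual invocation concrete, the following Python sketch builds the argument list for a call such as the READ command quoted in the thread. The helper `build_command` and its parameters are hypothetical; the sketch assumes every named parameter is passed as `--<name> <file path>`, which matches the `--config` and `--catalog` flags shown in that command but is not confirmed here for `--state`.

```python
def build_command(image, method, files, mounts=()):
    """Return an argv list for running one protocol method in a connector image.

    Hypothetical helper. `files` maps a flag name (e.g. "config") to a path
    inside the container, or to None when the argument is omitted.
    """
    argv = ["docker", "run", "--rm"]
    for mount in mounts:
        # Each mount is a "host_path:container_path" string, as in `docker run -v`.
        argv += ["-v", mount]
    argv += [image, method]
    for name, path in files.items():
        if path is not None:  # optional arguments are simply left out
            argv += [f"--{name}", path]
    return argv


if __name__ == "__main__":
    print(" ".join(build_command(
        "airbyte/source-faker:dev",
        "read",
        {
            "config": "/secrets/config.json",
            "catalog": "/integration_tests/configured_catalog.json",
            "state": None,
        },
    )))
```

Keeping the flag name separate from the type it carries is one way to address the point that the flag is `--catalog`, not `--ConfiguredAirbyteCatalog`.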
**destination only**
```
write(Config config, AirbyteCatalog catalog, Stream<AirbyteMessage>) -> Stream<AirbyteStateMessage | AirbyteControlMessage>
```
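
The `write` signature combines file-based arguments with a message stream on STDIN and another on STDOUT. A rough Python sketch of the stream half is below, assuming messages are serialized as one JSON object per line with a `type` field of `RECORD` or `STATE`. The function `write_messages` and the in-memory `sink` list are hypothetical stand-ins for a real destination.

```python
import io
import json


def write_messages(input_stream, sink):
    """Consume newline-delimited JSON messages and yield the lines to emit on STDOUT.

    Hypothetical sketch: records are appended to `sink`; each state message is
    echoed back once the records received before it have been stored.
    """
    for line in input_stream:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        message = json.loads(line)
        if message.get("type") == "RECORD":
            sink.append(message["record"])
        elif message.get("type") == "STATE":
            yield json.dumps(message)
        # Other message types are ignored in this sketch.


if __name__ == "__main__":
    stdin = io.StringIO(
        '{"type": "RECORD", "record": {"stream": "users", "data": {"id": 1}}}\n'
        '{"type": "STATE", "state": {"data": {"cursor": 1}}}\n'
    )
    stored = []
    for out_line in write_messages(stdin, stored):
        print(out_line)
```

In a real connector `input_stream` would be `sys.stdin`, and the yielded lines would be printed to STDOUT.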