feat(data catalog): add head routes #1875

Open
wants to merge 6 commits into
base: main
19 changes: 10 additions & 9 deletions docs/data_catalog/data_catalog_fabric_bff.mdx
Original file line number Diff line number Diff line change
@@ -85,19 +85,19 @@ Here can be found an example of configuration that assumes Fabric BFF and Job Ru

#### Console Communication

In order for Data Catalog UI to know which environments can be linked to [Mia Platform CRUD connections](#todo-crud-section),
In order for Data Catalog UI to know which environments can be linked to [Mia Platform CRUD connections](/data_catalog/frontend/data_catalog_connections.mdx#mia-platform-crud),
the service needs to contact Mia-Platform Console and retrieve the list of Projects that should be accessible from this Data Catalog instance.

To achieve this, your Company Owner must first create a dedicated [Service Account](/development_suite/identity-and-access-management/manage-service-accounts.md)
on your Mia-Platform Console instance and assign it the permissions needed to list the Console projects of interest.

:::tip Good practices in permissions assignment
Pay attention to the level of access to the resources that you assign to the Service Account.
For Control Plane use case, a good practice may be to assign the role of `guest` at Company level while granting
the `reporter` role to all the projects that should be visible by Control Plane.
For Data Catalog use case, a good practice may be to assign the role of `guest` at Company level while granting
the `reporter` role to all the projects that should be visible by Data Catalog.
Permissions can be assigned at an even finer granularity if you want to allow visibility of only a subset of the runtime environments of a specific project.
To do that, you may assign the `guest` role at Project level as well, while granting
the `reporter` role solely to those runtime environments that should be visible by Control Plane.
the `reporter` role solely to those runtime environments that should be visible by Data Catalog.
:::

Once the service account has been registered, your Company Owner needs to hand over to you its credentials, which are:
@@ -115,7 +115,7 @@ These details then should be inserted in your Fabric BFF service configuration u
:::caution
It is the responsibility of your Company Owner to ensure that service account credentials are properly processed according to your company security policies.

Furthermore, it is of <u>extreme importance</u> understanding that **any** Control Plane user will be able to list the project name
Furthermore, it is of <u>extreme importance</u> to understand that **any** Data Catalog user with enough permissions for [connections management](/data_catalog/frontend/data_catalog_connections.mdx) will be able to view the project name
and available environments of all the projects that can be accessed by the service account configured on Fabric BFF.
:::

@@ -125,10 +125,10 @@ This is an example of `console` property configuration:

:::tip
The following properties support [secret resolution](/fast_data/configuration/secrets_resolution.md):
- `console.target`
- `console.auth.credentials.clientId`
- `console.auth.credentials.clientKeyId`
- `console.auth.credentials.privateKey`
- `console.rest.target`
- `console.rest.auth.credentials.clientId`
- `console.rest.auth.credentials.clientKeyId`
- `console.rest.auth.credentials.privateKey`
:::
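
As a rough illustration of the renamed property paths listed above, the following Python sketch walks a configuration that has already been loaded into a dictionary and reports which of the four `console.rest.*` paths are absent. The dictionary layout and the helper names `lookup` and `missing_console_properties` are assumptions made for this example. The sketch does not perform secret resolution and is not part of Fabric BFF.

```python
from typing import Any

# Dotted paths copied from the list above (the new names, under `console.rest`).
CONSOLE_SECRET_PATHS = (
    "console.rest.target",
    "console.rest.auth.credentials.clientId",
    "console.rest.auth.credentials.clientKeyId",
    "console.rest.auth.credentials.privateKey",
)

# Sentinel used to tell "absent" apart from a value that is legitimately None.
_MISSING = object()


def lookup(config: dict, dotted_path: str) -> Any:
    """Return the value at `dotted_path`, or `_MISSING` if any segment is absent."""
    node: Any = config
    for key in dotted_path.split("."):
        if not isinstance(node, dict) or key not in node:
            return _MISSING
        node = node[key]
    return node


def missing_console_properties(config: dict) -> list:
    """List the documented `console.rest.*` paths that the configuration lacks."""
    return [path for path in CONSOLE_SECRET_PATHS if lookup(config, path) is _MISSING]
```

A configuration still using the previous `console.target` and `console.auth.*` layout would be reported as missing all four paths, which can help spot files that were not migrated.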

A custom x509 certificate can be added to the default root keychain of certificates for any client/reverse proxy reached by Fabric BFF.
@@ -230,6 +230,7 @@ The following routes are exposed over the `/api/job-runner` endpoint and are for

| Route | Type | Method | Description |
|------------------------------|-----------|--------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `/feedback` | Websocket | HEAD | Route used by Websocket for authentication. |
| `/feedback` | Websocket | GET | Route for receiving feedback messages from the `Status` method of the [Job Runner gRPC service.](/data_catalog/data_catalog_job_runner.mdx#job-runner) |
| `/job-runner/*` | REST | * | Routes prefixed with `/job-runner` are converted into gRPC requests towards Job Runner service.<br/> For more details please read [corresponding documentation](/data_catalog/data_catalog_job_runner.mdx#grpc-services)|
| `/agent/drivers` | REST | GET | Route for invoking the `ListDrivers` method of the [ODBC Client gRPC service.](/data_catalog/data_catalog_job_runner.mdx#odbc-client) |
1 change: 1 addition & 0 deletions docs/data_catalog/data_catalog_open_lineage.mdx
@@ -217,6 +217,7 @@ The following routes are exposed over the `/api/data-catalog` endpoint.
|------------------------------------------------------------|-----------|--------|-----------------------------------------------------------------------------------------------------------|
| `/assets/search` | REST | GET | Search for dataset assets and their metadata |
| `/assets/search-parents` | REST | GET | Search for name of system of record or table name |
| `/bulk-actions` | Websocket | HEAD | Route used by Websocket for authentication. |
| `/bulk-actions` | Websocket | GET | Route for handling async operations over multiple datasets records |
| `/tags/count` | REST | GET | Count how many unique tags exist among all data assets |
| `/tags/items` | REST | GET | List existing tags associated to data assets |
14 changes: 8 additions & 6 deletions docs/data_catalog/secure_access.mdx
@@ -599,8 +599,8 @@ all the functionalities of Data Catalog system, both the frontend and backend co
| Endpoint | Service | Authentication Required | User Group Permission |
|---------------------|-----------------|:-----------------------:|-------------------------------------|
| `/api/connections` | fabric-bff | ✅ | `false` |
| `/api/data-catalog` | fabric-bff | ✅ | `permissions["update:bulk-action"]` |
| `/api/job-runner` | fabric-bff | ✅ | `permissions["admin:connections"]` |
| `/api/data-catalog` | fabric-bff | ✅ | `false` |
| `/api/job-runner` | fabric-bff | ✅ | `false` |
| `/api/open-lineage` | fabric-bff | ✅ | `false` |
| `/data-catalog` | data-catalog-fe | ✅ | `true` |

@@ -670,6 +670,7 @@ so that each operation is covered with the correct grant.
| `/metadata-registry/items/:name` | REST | PATCH | `permissions["update:metadata-assets"]` |
| `/metadata-registry/items/:name` | REST | DELETE | `permissions["update:metadata-assets"]` |
| `/metadata-registry/search` | REST | GET | `permissions["read:data-assets"]` |
| `/bulk-actions` | Websocket | HEAD | `permissions["update:bulk-action"]` |
| `/bulk-actions` | Websocket | GET | `permissions["update:bulk-action"]` |

#### Job Runner API
@@ -685,6 +686,7 @@ so that each operation is covered with the correct grant.
| `/agent/dsn` | REST | GET | `permissions["admin:connections"]` |
| `/agent/drivers` | REST | GET | `permissions["admin:connections"]` |
| `/feedback` | Websocket | GET | `permissions["admin:connections"]` |
| `/feedback` | Websocket | HEAD | `permissions["admin:connections"]` |
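
In the tables above, each Websocket route appears with both a `HEAD` and a `GET` row carrying the same grant. The Python sketch below checks that pairing on a small list of rows copied from those tables. Treating the pairing as a rule to enforce is an assumption of this example, as are the flattened row format (endpoint prefixes are omitted) and the function name `unpaired_websocket_routes`.

```python
# (route, type, method, permission) rows copied from the tables above.
ROUTES = [
    ("/bulk-actions", "Websocket", "HEAD", "update:bulk-action"),
    ("/bulk-actions", "Websocket", "GET", "update:bulk-action"),
    ("/feedback", "Websocket", "GET", "admin:connections"),
    ("/feedback", "Websocket", "HEAD", "admin:connections"),
]


def unpaired_websocket_routes(rows):
    """Return Websocket routes lacking a HEAD row, or whose HEAD and GET grants differ."""
    grants_by_route = {}
    for route, kind, method, permission in rows:
        if kind == "Websocket":
            grants_by_route.setdefault(route, {})[method] = permission
    return sorted(
        route
        for route, grants in grants_by_route.items()
        if grants.get("HEAD") is None or grants.get("HEAD") != grants.get("GET")
    )
```

With the rows as listed, the function returns an empty list. Dropping the `HEAD` row for `/feedback` makes it return `["/feedback"]`.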

#### Open Lineage API

@@ -698,10 +700,10 @@ so that each operation is covered with the correct grant.
| `/jobs/items` | REST | POST | `permissions["update:lineage"]` |
| `/jobs/items/:id` | REST | PATCH | `permissions["update:lineage"]` |
| `/jobs/items/:id` | REST | DELETE | `permissions["update:lineage"]` |
| `/dataset/items/:id` | REST | GET | `permissions["read:data-assets"]` |
| `/dataset/items` | REST | POST | `permissions["update:lineage"]` |
| `/dataset/items/:id` | REST | PATCH | `permissions["update:lineage"]` |
| `/dataset/items/:id` | REST | DELETE | `permissions["update:lineage"]` |
| `/datasets/items/:id` | REST | GET | `permissions["read:data-assets"]` |
| `/datasets/items` | REST | POST | `permissions["update:lineage"]` |
| `/datasets/items/:id` | REST | PATCH | `permissions["update:lineage"]` |
| `/datasets/items/:id` | REST | DELETE | `permissions["update:lineage"]` |
| `/facets/storage/items/:id` | REST | GET | `permissions["read:data-assets"]` |
| `/facets/storage/items/:id` | REST | DELETE | `permissions["update:lineage"]` |
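
To make the route-to-grant mapping in the table above concrete, here is a minimal Python sketch that resolves a method and path to the grant listed for it, then checks it against a set of user permissions. Only the `/datasets` and `/facets/storage` rows are copied from the table. The helper names, the `:param` matching rule, and the choice to deny unknown routes are assumptions of this example, and the real authorization layer may evaluate these expressions differently.

```python
import re

# (method, route pattern) -> grant, copied from the rows above.
OPEN_LINEAGE_GRANTS = {
    ("GET", "/datasets/items/:id"): "read:data-assets",
    ("POST", "/datasets/items"): "update:lineage",
    ("PATCH", "/datasets/items/:id"): "update:lineage",
    ("DELETE", "/datasets/items/:id"): "update:lineage",
    ("GET", "/facets/storage/items/:id"): "read:data-assets",
    ("DELETE", "/facets/storage/items/:id"): "update:lineage",
}


def pattern_to_regex(pattern):
    """Compile a route pattern; a ':name' segment matches exactly one path segment."""
    parts = [
        "[^/]+" if part.startswith(":") else re.escape(part)
        for part in pattern.split("/")
    ]
    return re.compile("^" + "/".join(parts) + "$")


def required_permission(method, path):
    """Return the grant listed for (method, path), or None if no row matches."""
    for (row_method, pattern), permission in OPEN_LINEAGE_GRANTS.items():
        if row_method == method.upper() and pattern_to_regex(pattern).match(path):
            return permission
    return None


def is_allowed(user_permissions, method, path):
    """Deny by default: unknown routes and missing grants are both rejected."""
    needed = required_permission(method, path)
    return needed is not None and needed in user_permissions
```

Note that the previous singular form `/dataset/items/:id` no longer matches any row, so `required_permission("GET", "/dataset/items/42")` returns `None` in this sketch.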

Original file line number Diff line number Diff line change
@@ -225,7 +225,7 @@ The **Advanced** tab is visible only if the Envoy API Gateway service is enabled
- **Rate limit** (_integer_): the maximum frequency (in terms of requests per second) with which requests are forwarded to the underlying service;
- **Request body size** (_decimal_): the maximum body size of user requests.
- **Iframe embedding options**: the X-Frame-Options directive that is considered when the endpoint response should be embedded in an iframe;
- **Protcol options**: this options instruct Envoy to process the request with the protocol coming from the downstream connection, allowing to dinamically infer the protocol to be used (HTTP/1.1 or HTTP/2);
- **Protocol options**: this option instructs Envoy to process the request with the protocol coming from the downstream connection, so that the protocol to be used (HTTP/1.1 or HTTP/2) is inferred dynamically;

:::warning
This `Iframe embedding` option is configurable only for the `Envoy API Gateway`; for the `Nginx API Gateway` it must be configured manually using the `Advanced` section of the Console
5 changes: 0 additions & 5 deletions docs/fast_data/configuration/projections.md
@@ -18,11 +18,6 @@ The creation of a System of Record requires you to insert a System ID, which is

The System of Record is then created.

:::note
In case it is not possible to find the button `Create new System of Record`, it means that a project may have been configured
to expose Systems of Record under the [Data Catalog](/data_catalog/overview.mdx) feature, which allows to visualize them in a read-only fashion.
:::

## Delete a System of Record

To delete a System of Record, you have to click the `Delete` button in the bottom-right corner of the System of Record detail page.
37 changes: 8 additions & 29 deletions docs/fast_data/runtime_management/control_plane.mdx
@@ -168,29 +168,6 @@ necessary to apply any further configuration. If in doubt, please contact your K
In case Control Plane service should be configured to be reachable from outside your cluster, for example because your Fast Data Control Plane application is located in a cluster
different from the one where Fast Data Runtimes are deployed, a few extra advanced configuration steps are necessary. These are listed here:

Contributor


Suggested change
- **Introduce further Envoy [`cluster`](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto) to expose the gRPC port of Control Plane service through the API Gateway.**
This can be obtained by opening your Mia-Platform Console Project and selecting the Advanced tab from the Design area's sidebar.
:::info
This step is necessary if you are using Mia-Platform Console versions lower than v13.5.1. Otherwise, you can skip the modification to the file `api-gateway-envoy/clusters.yaml` here described.
:::
Within this area of Console it is possible
to extend Envoy API Gateway default configuration, by editing the file `api-gateway-envoy/clusters.yaml` with the following content:
```yaml title=api-gateway-envoy/clusters.yaml
- "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
  name: control-plane
  connect_timeout: 30s
  http2_protocol_options:
    max_concurrent_streams: 100
  type: LOGICAL_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: control-plane
    endpoints:
      - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 'control-plane'
                  port_value: 50051
```

- **Introduce further Envoy [`cluster`](https://www.envoyproxy.io/docs/envoy/latest/api-v3/config/cluster/v3/cluster.proto) to expose the gRPC port of Control Plane service through the API Gateway.**
This can be obtained by opening your Mia-Platform Console Project and selecting the Advanced tab from the Design area's sidebar. Within this area of Console it is possible
to extend Envoy API Gateway default configuration, by editing the file `api-gateway-envoy/clusters.yaml` with the following content:

```yaml title=api-gateway-envoy/clusters.yaml
- "@type": type.googleapis.com/envoy.config.cluster.v3.Cluster
  name: control-plane
  connect_timeout: 30s
  http2_protocol_options:
    max_concurrent_streams: 100
  type: LOGICAL_DNS
  lb_policy: ROUND_ROBIN
  load_assignment:
    cluster_name: control-plane
    endpoints:
      - lb_endpoints:
          - endpoint:
              address:
                socket_address:
                  address: 'control-plane'
                  port_value: 50051
```

- **Expose a new subdomain from which Control Plane becomes accessible** (assuming [`traefik`](https://traefik.io/traefik/) is employed as ingress controller)
In order to carry out this operation you first need to have access to the repository of the Console Project where Control Plane application has been configured.
Within the project repository there is a folder named `overlays`, which should contain a folder for each environment available for the project.
@@ -219,12 +196,14 @@ or via Fabric BFF.
In case the service must be reached from outside the cluster, the endpoints listed in the table below should be [defined in Console](/development_suite/api-console/api-design/endpoints.md) and assigned to Control Plane service.
These endpoints are necessary to allow gRPC Control Plane Operator requests to reach Control Plane from outside the K8s cluster.

| Endpoint | Rewrite Base Path | Microservice |
|---------------------------------------------|---------------------------------------------|-----------------|
| `/grpc.reflection.v1.ServerReflection` | `/grpc.reflection.v1.ServerReflection` | `control-plane` |
| `/grpc.reflection.v1alpha.ServerReflection` | `/grpc.reflection.v1alpha.ServerReflection` | `control-plane` |
| `/control_plane_fabric.RuntimeManagement` | `/control_plane_fabric.RuntimeManagement` | `control-plane` |
| `/control_plane_fabric.ControlPlane` | `/control_plane_fabric.ControlPlane` | `control-plane` |
| Endpoint | Rewrite Base Path | Microservice | Container Port |
|---------------------------------------------|---------------------------------------------|-----------------|----------------|
| `/grpc.reflection.v1.ServerReflection` | `/grpc.reflection.v1.ServerReflection` | `control-plane` | `50051` |
| `/grpc.reflection.v1alpha.ServerReflection` | `/grpc.reflection.v1alpha.ServerReflection` | `control-plane` | `50051` |
Comment on lines +201 to +202
Contributor


is it correct to expose these endpoints?

Member Author


yes, they are needed to let control plane operator reach the control plane

| `/control_plane_fabric.RuntimeManagement` | `/control_plane_fabric.RuntimeManagement` | `control-plane` | `50051` |
| `/control_plane_fabric.ControlPlane` | `/control_plane_fabric.ControlPlane` | `control-plane` | `50051` |

Also, in the `Advanced Section` of each endpoint, be sure that the [Use DownStream Protocol option](/development_suite/api-console/api-design/endpoints.md#manage-advanced-endpoint-parameters) is set to `true`.

### Routes

Original file line number Diff line number Diff line change
@@ -78,10 +78,10 @@ This is an example of `console` property configuration:

:::tip
The following properties support [secret resolution](/fast_data/configuration/secrets_resolution.md):
- `console.target`
- `console.auth.credentials.clientId`
- `console.auth.credentials.clientKeyId`
- `console.auth.credentials.privateKey`
- `console.rest.target`
- `console.rest.auth.credentials.clientId`
- `console.rest.auth.credentials.clientKeyId`
- `console.rest.auth.credentials.privateKey`
:::

A custom x509 certificate can be added to the default root keychain of certificates for any client/reverse proxy reached by Fabric BFF.
@@ -187,6 +187,7 @@ Under the endpoint specified above, the following routes are served by Fabric BF

| Route | Type | Method | Description |
|-----------------------|-----------|--------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| `/fast-data/feedback` | Websocket | HEAD | Verifies whether the pipeline _update_ can be carried<br/>out by current user |
| `/fast-data/feedback` | Websocket | GET | Opens a websocket connection with the client to receive updates<br/>of runtimes and pipelines |
| `/fast-data/control` | REST | HEAD | Verifies whether the pipeline _change state action_ can be carried<br/>out by current user |
| `/fast-data/control` | REST | POST | Receives [JSON-RPC](https://www.jsonrpc.org/specification) from the frontend to change pipelines state |
Original file line number Diff line number Diff line change
@@ -92,6 +92,11 @@ The following properties support [secret resolution](/fast_data/configuration/se
- `upstream.headers.*`
:::

:::caution
In order for a _runtime_ to be recognized and managed by the main Control Plane instance, it has to belong to **one and only one** _runtime view_. Consequently,
_runtimes_ that do not belong to any _runtime view_ are <u>simply ignored</u> by the system, while it is forbidden for a _runtime_ to belong to multiple _runtime views_.
:::
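
The rule in the caution above can be sketched as a small classification step. The Python example below assumes, purely for illustration, that _runtime views_ are available as a mapping from view name to the _runtime_ names they contain. That data shape and the function name `classify_runtimes` are hypothetical and do not reflect the actual Control Plane configuration format.

```python
def classify_runtimes(runtime_views, known_runtimes):
    """Split runtimes into managed, ignored, and invalid according to view membership.

    runtime_views: mapping of view name -> iterable of runtime names (assumed shape).
    known_runtimes: iterable of all runtime names to classify.
    """
    # Collect, for every runtime, the set of views that reference it.
    membership = {}
    for view, runtimes in runtime_views.items():
        for runtime in runtimes:
            membership.setdefault(runtime, set()).add(view)

    # Exactly one view: recognized and managed.
    managed = sorted(r for r in known_runtimes if len(membership.get(r, ())) == 1)
    # No view at all: ignored by the system.
    ignored = sorted(r for r in known_runtimes if len(membership.get(r, ())) == 0)
    # More than one view: forbidden, reported with the offending views.
    invalid = {r: sorted(views) for r, views in membership.items() if len(views) > 1}
    return managed, ignored, invalid
```

Running such a check before deploying a configuration would surface both silently ignored runtimes and forbidden multiple memberships in one pass.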

### Pipelines Configuration

Control Plane Operator requires an additional configuration file to learn the topology of Fast Data Pipelines deployed within the runtime
@@ -156,4 +161,4 @@ The information that can be extracted from these messages is:

Workloads connected to a Control Plane Operator **periodically** generate and send these messages through the established bidirectional channel.

<SchemaViewer schema={FeedbackChannelSchema} />
<SchemaViewer schema={FeedbackChannelSchema} />