We want to let you know about some exciting developments coming to the Stream Registry project in the very near future. Many brands at Expedia Group are adopting Stream Registry as a critical component of the company's digital nervous system for key streams. As a result, HomeAway Stream Registry is finding a new home.
- We will be investing in the project by expanding the existing team with full-time resources in several locations across Expedia Group. Expect greatly increased project activity: contributors, commits, issues, features, releases
- The repository will relocate to the ExpediaGroup open source GitHub org in its entirety, preserving the history and community
- The original vision of Stream Registry as a Stream Discovery and Stream Orchestration platform will remain unchanged
- The project will remain open source, and will be joined shortly by other supporting Expedia Group stream platform components
- Licenses, conduct and contribution guidelines will remain unchanged
- We will continue to value your contributions - please keep them coming!
We expect the start of this journey to be a little bumpy, but please bear with us as we work towards the first release of the Expedia Group Stream Registry!
A Stream Registry is what its name implies: a registry of streams. As enterprises scale, the need to organize and develop around streams of data becomes paramount. Synchronous calls are attracted to the edge, and a variety of synchronous and asynchronous calls permeate the enterprise. The need for a declarative, central authority for discovery and orchestration of stream management emerges. This is what a stream registry provides. In much the same way that DNS provides a name translation service for an IP address, a Stream Registry provides a “metadata service” for streams. By centralizing stream metadata, a stream translation service for producer and/or consumer stream coördinates becomes possible. This centralized, yet democratized, stream metadata function thus reduces operational complexity via stream lifecycle management, stream discovery, stream availability and resiliency.
We believe that as changes to business requirements accelerate, time-to-market pressures increase, competitive measures grow, migrations to the cloud and to different platforms become necessary, and so on, systems will increasingly need to become more reactive and dynamic in nature.
The issue of state arises.
We see many systems adopting event-driven architectures to meet changing business needs in these high-stakes environments. We hypothesize that there is an emerging need in the industry for a centralized "stream metadata" service that helps reduce the complexity of deploying and operating stream platforms that serve as a distributed, federated nervous system in the enterprise.
Put simply, Stream Registry is a centralized service for stream metadata.
Stream Registry can answer questions about a stream and manage parts of its lifecycle, such as:
- Who owns the stream?
- Who are the producers and consumers of the stream?
- Management of stream replication across clusters and regions
- Management of stream storage for permanent access
- Management of stream triggers for legacy stream sources
See the architecture/northstar documentation for more details.
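As a rough illustration of the first two questions above, the sketch below models a registry as an in-memory lookup table keyed by stream name, in the spirit of the DNS analogy. The class names, fields, and sample values (`StreamMetadata`, `InMemoryStreamRegistry`, `team-a`, and so on) are hypothetical and are not Stream Registry's actual data model or API.

```python
from dataclasses import dataclass, field


@dataclass
class StreamMetadata:
    """Hypothetical metadata record; not the project's actual schema."""
    name: str
    owner: str
    producers: set = field(default_factory=set)
    consumers: set = field(default_factory=set)


class InMemoryStreamRegistry:
    """Toy registry keyed by stream name, similar to a name lookup table."""

    def __init__(self):
        self._streams = {}

    def register(self, metadata):
        # Registering the same name again replaces the earlier record.
        self._streams[metadata.name] = metadata

    def owner_of(self, stream_name):
        # Answers "Who owns the stream?"
        return self._streams[stream_name].owner

    def clients_of(self, stream_name):
        # Answers "Who are the producers and consumers of the stream?"
        meta = self._streams[stream_name]
        return sorted(meta.producers), sorted(meta.consumers)


registry = InMemoryStreamRegistry()
registry.register(StreamMetadata(
    name="sampleStream",
    owner="team-a",
    producers={"checkout-service"},
    consumers={"analytics-job", "audit-job"},
))
print(registry.owner_of("sampleStream"))
print(registry.clients_of("sampleStream"))
```

A real registry would persist this metadata and expose it over an API; the point here is only that both questions reduce to a lookup by stream name.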
Stream Registry is built using OpenJDK 11 and Maven. For convenience, we have wrapped each Maven command in a Makefile. If you do not have make installed, please consult this file for build commands.
Stream Registry is currently packaged as a shaded JAR file. We leave specific deployment considerations up to each team since this varies from enterprise to enterprise. We do, however, provide a vanilla Docker example that teams can use for demo, learning, or development purposes.
To build Stream Registry as a JAR file, please run
make build
To build Stream Registry as a Docker image, please run the following, which will use the Jib Maven Plugin to build and install the image
make build-docker
Required Local Environment
The local 'dev' version of Stream Registry requires a locally running version of Apache Kafka and Confluent's Schema Registry on ports 9092 and 8081, respectively.
To quickly get a local dev environment set up, we recommend using the provided Docker Compose file. Be sure to first build the Docker image using the command above.
docker-compose up
Alternatively, one can start Confluent Platform locally after downloading the Confluent CLI and running the following commands.
Note: The confluent command is currently only available for macOS and Linux. If using Windows, you'll need to use Docker, or run ZooKeeper, Kafka, and the Schema Registry individually.
confluent start zookeeper
confluent start kafka
confluent start schema-registry
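Before starting Stream Registry, it can help to confirm that Kafka and the Schema Registry are reachable on the ports mentioned above. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is ours, and the check only verifies that something accepts TCP connections on each port, not that the service behind it is healthy.

```python
import socket


def is_port_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable hosts, and timeouts.
        return False


if __name__ == "__main__":
    # Ports taken from the local 'dev' requirements described above.
    for name, port in (("Kafka", 9092), ("Schema Registry", 8081)):
        status = "reachable" if is_port_open("localhost", port) else "NOT reachable"
        print(f"{name} on localhost:{port} is {status}")
```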
Stream Registry can then be started
make run
Once Stream Registry has started, check that the application's Swagger API is running at http://localhost:8080/swagger
First create your cluster
curl -X PUT --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "clusterKey": {
    "vpc": "localRegion",
    "env": "local",
    "hint": "primary",
    "type": null
  },
  "clusterValue": {
    "clusterName": "localCluster",
    "bootstrapServers": "localhost:9092",
    "zookeeperQuorum": "zookeeper:2181",
    "schemaRegistryURL": "http://localhost:8081"
  }
}' 'http://localhost:8080/v0/clusters'
Now, declare your stream
Here is a sample stream
curl -X PUT --header 'Content-Type: application/json' --header 'Accept: application/json' -d '{
  "name": "sampleStream",
  "schemaCompatibility": "BACKWARD",
  "latestKeySchema": {
    "id": "string",
    "version": 0,
    "schemaString": "\"string\"",
    "created": "string",
    "updated": "string"
  },
  "latestValueSchema": {
    "id": "string",
    "version": 0,
    "schemaString": "\"string\"",
    "created": "string",
    "updated": "string"
  },
  "owner": "string",
  "created": 0,
  "updated": 0,
  "githubUrl": "http://github.com",
  "isDataNeededAtRest": true,
  "isAutomationNeeded": true,
  "tags": {
    "productId": 0,
    "portfolioId": 0,
    "brand": "string",
    "assetProtectionLevel": "string",
    "componentId": "string",
    "hint": "primary"
  },
  "vpcList": [
    "localRegion"
  ],
  "replicatedVpcList": [],
  "topicConfig": {},
  "partitions": 1,
  "replicationFactor": 1
}' 'http://localhost:8080/v0/streams/sampleStream'
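For scripted setups, the same PUT calls can be issued from Python instead of curl. The sketch below uses only the standard library and reuses the cluster payload and URL from the curl example above; the helper name `build_put_request` and the `BASE_URL` constant are ours, not part of Stream Registry. The stream payload can be passed to the same helper with the `/streams/sampleStream` path.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v0"  # same host and port as the curl examples


def build_put_request(url, payload):
    """Build a JSON PUT request mirroring the curl calls above (not sent yet)."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="PUT",
    )


# Cluster payload copied from the curl example above.
cluster = {
    "clusterKey": {"vpc": "localRegion", "env": "local", "hint": "primary", "type": None},
    "clusterValue": {
        "clusterName": "localCluster",
        "bootstrapServers": "localhost:9092",
        "zookeeperQuorum": "zookeeper:2181",
        "schemaRegistryURL": "http://localhost:8081",
    },
}

cluster_request = build_put_request(f"{BASE_URL}/clusters", cluster)

if __name__ == "__main__":
    try:
        with urllib.request.urlopen(cluster_request, timeout=5) as response:
            print("cluster PUT returned HTTP", response.status)
    except OSError as error:
        # URLError, HTTPError, and socket timeouts are all OSError subclasses.
        print("cluster PUT failed:", error)
```

Building the request separately from sending it makes it easy to inspect the method, headers, and body before anything touches a running instance.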
Stream Registry development and initial deployment started with Kafka 0.11.0 / Confluent Platform 3.3.0, and has also been deployed against Kafka 1.1.1 / Confluent Platform 4.1.2.
As per the Kafka Compatibility Matrix, we expect Stream Registry to be compatible with Kafka 0.10.0 and newer. The internal Java Kafka clients used by Stream Registry can be found in the pom.xml.
To run the Stream Registry tests, please run
make tests
Special thanks to the following for making stream-registry possible at HomeAway and beyond!
- Adam Westerman 💻
- Arun Vasudevan 💻 🎨
- Nathan Walther 💻 👀
- Jordan Moore 💻 💁
- Carlos Cordero 💻
- Ishan Dikshit 💻 📖
- Vinayak Ponangi 💻 📢 🎨 👀
- Prabhakaran Thatchinamoorthy 💻 🎨
- Rui Zhang 💻
- Miguel Lucero 💻 💁
- René X Parra 💻 📖 📝 📢 🎨 👀
This project follows the all-contributors specification.