go-streams provides a lightweight and efficient stream processing framework for Go. Its concise DSL allows
for easy definition of declarative data pipelines using composable sources, flows, and sinks.
Wiki:
In computing, a pipeline, also known as a data pipeline, is a set of data processing elements connected in series, where the output of one element is the input of the next one. The elements of a pipeline are often executed in parallel or in time-sliced fashion. Some amount of buffer storage is often inserted between elements.
The core module has no external dependencies and provides three key components for constructing stream processing pipelines:
- Source: The entry point of a pipeline, emitting data into the stream. (One open output)
- Flow: A processing unit, transforming data as it moves through the pipeline. (One open input, one open output)
- Sink: The termination point of a pipeline, consuming processed data and often acting as a subscriber. (One open input)
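The three roles above can be sketched with plain Go channels. This is only an illustration of the open input/output shape of each component under simplifying assumptions; the names `source`, `upperFlow`, and `collectSink` are hypothetical and are not the library's API.

```go
package main

import (
	"fmt"
	"strings"
)

// source (hypothetical) emits each value on its single open output,
// then closes the channel to signal the end of the stream.
func source(values ...string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for _, v := range values {
			out <- v
		}
	}()
	return out
}

// upperFlow (hypothetical) has one open input and one open output.
// It upper-cases every element that passes through it.
func upperFlow(in <-chan string) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		for v := range in {
			out <- strings.ToUpper(v)
		}
	}()
	return out
}

// collectSink (hypothetical) drains its single open input and
// returns everything it consumed once the input is closed.
func collectSink(in <-chan string) []string {
	var got []string
	for v := range in {
		got = append(got, v)
	}
	return got
}

func main() {
	result := collectSink(upperFlow(source("a", "b", "c")))
	fmt.Println(result) // prints [A B C]
}
```

Each stage runs in its own goroutine, so elements move through the pipeline concurrently, and closing a channel propagates completion downstream.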
The flow package provides a collection of Flow implementations for common stream processing operations. These building blocks can be used to transform and manipulate data within pipelines.
- Map: Transforms each element in the stream.
- FlatMap: Transforms each element into a slice of zero or more elements and emits them individually.
- Filter: Selects elements from the stream based on a condition.
- Reduce: Combines elements of the stream with the last reduced value and emits the new value.
- PassThrough: Passes elements through unchanged.
- Split [1]: Divides the stream into two streams based on a boolean predicate.
- FanOut [1]: Duplicates the stream to multiple outputs for parallel processing.
- RoundRobin [1]: Distributes elements evenly across multiple outputs.
- Merge [1]: Combines multiple streams into a single stream.
- ZipWith [1]: Combines elements from multiple streams using a function.
- Flatten [1]: Flattens a stream of slices of elements into a stream of elements.
- Batch: Breaks a stream of elements into batches based on size or timing.
- Throttler: Limits the rate at which elements are processed.
- SlidingWindow: Creates overlapping windows of elements.
- TumblingWindow: Creates non-overlapping, fixed-size windows of elements.
- SessionWindow: Creates windows based on periods of activity and inactivity.
- Keyed: Groups elements by key for parallel processing of related data.

[1] Utility Flows
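To make the semantics of a few of these operations concrete, here is a minimal sketch of Filter, Map, and Reduce built on plain channels and Go generics (Go 1.18+). It is not the flow package itself: every function name below is hypothetical, and the choice to emit the first element unchanged from the reduce step is an assumption made for this example.

```go
package main

import "fmt"

// fromSlice is a hypothetical source that emits the given values.
func fromSlice[T any](values []T) <-chan T {
	out := make(chan T)
	go func() {
		defer close(out)
		for _, v := range values {
			out <- v
		}
	}()
	return out
}

// mapFlow transforms each element of the stream with fn.
func mapFlow[T, R any](in <-chan T, fn func(T) R) <-chan R {
	out := make(chan R)
	go func() {
		defer close(out)
		for v := range in {
			out <- fn(v)
		}
	}()
	return out
}

// filterFlow forwards only the elements for which keep returns true.
func filterFlow[T any](in <-chan T, keep func(T) bool) <-chan T {
	out := make(chan T)
	go func() {
		defer close(out)
		for v := range in {
			if keep(v) {
				out <- v
			}
		}
	}()
	return out
}

// reduceFlow combines each element with the last reduced value and
// emits the new value. The first element is emitted as is (assumption).
func reduceFlow[T any](in <-chan T, fn func(acc, v T) T) <-chan T {
	out := make(chan T)
	go func() {
		defer close(out)
		var acc T
		first := true
		for v := range in {
			if first {
				acc, first = v, false
			} else {
				acc = fn(acc, v)
			}
			out <- acc
		}
	}()
	return out
}

// toSlice is a hypothetical sink that collects the stream into a slice.
func toSlice[T any](in <-chan T) []T {
	var got []T
	for v := range in {
		got = append(got, v)
	}
	return got
}

// runPipeline keeps even numbers, squares them, and emits running sums.
func runPipeline() []int {
	numbers := fromSlice([]int{1, 2, 3, 4, 5, 6})
	evens := filterFlow(numbers, func(n int) bool { return n%2 == 0 })
	squares := mapFlow(evens, func(n int) int { return n * n })
	sums := reduceFlow(squares, func(acc, n int) int { return acc + n })
	return toSlice(sums)
}

func main() {
	fmt.Println(runPipeline()) // prints [4 20 56]
}
```

The evens 2, 4, 6 become 4, 16, 36 after the map step, and the reduce step emits one running total per input element rather than a single final value.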
Standard Source and Sink implementations are located in the extension package.
- Go channel inbound and outbound connector
- File inbound and outbound connector
- Standard Go io.Reader Source and io.Writer Sink connectors
- os.Stdout and Discard Sink connectors (useful for development and debugging)
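As a rough idea of what reader and writer connectors do, the sketch below wraps an io.Reader as a line-emitting source and an io.Writer as a sink, using only the standard library. The names `readerSource`, `writerSink`, and `pipe` are hypothetical, and splitting the input on newlines is an assumption; the extension package's actual constructors and options may differ.

```go
package main

import (
	"bufio"
	"fmt"
	"io"
	"os"
	"strings"
)

// readerSource (hypothetical) emits each line read from r and closes
// its output when the reader is exhausted.
func readerSource(r io.Reader) <-chan string {
	out := make(chan string)
	go func() {
		defer close(out)
		scanner := bufio.NewScanner(r)
		for scanner.Scan() {
			out <- scanner.Text()
		}
	}()
	return out
}

// writerSink (hypothetical) writes each element to w as one line.
// After a write error it keeps draining the input so the upstream
// goroutine can finish, and returns the first error it saw.
func writerSink(w io.Writer, in <-chan string) error {
	var firstErr error
	for line := range in {
		if firstErr != nil {
			continue
		}
		if _, err := fmt.Fprintln(w, line); err != nil {
			firstErr = err
		}
	}
	return firstErr
}

// pipe copies input line by line into an in-memory writer and returns
// what was written, which makes the round trip easy to inspect.
func pipe(input string) (string, error) {
	var sb strings.Builder
	err := writerSink(&sb, readerSource(strings.NewReader(input)))
	return sb.String(), err
}

func main() {
	// Writing to os.Stdout is handy while developing a pipeline.
	src := readerSource(strings.NewReader("alpha\nbeta\n"))
	if err := writerSink(os.Stdout, src); err != nil {
		fmt.Fprintln(os.Stderr, "sink error:", err)
	}

	// io.Discard consumes the stream without producing any output.
	_ = writerSink(io.Discard, readerSource(strings.NewReader("ignored\n")))
}
```

Passing io.Discard as the writer gives the same effect as a discard sink: the pipeline runs to completion, but nothing is written anywhere.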
The following connectors are available as separate modules and have their own dependencies.
- Aerospike
- Apache Kafka
- Apache Pulsar
- AWS (S3)
- Azure (Blob Storage)
- GCP (Storage)
- NATS
- Redis
- WebSocket
See the examples directory for practical code samples demonstrating how to build complete stream processing pipelines, covering various use cases and integration scenarios.
Licensed under the MIT License.