Apache Streams unifies a diverse world of digital profiles and online activities into common formats and vocabularies, and makes these datasets accessible across a variety of databases, devices, and platforms for streaming, browsing, searching, sharing, and analytics use-cases.
Apache Streams contains JRE-based modules that developers can use to easily integrate with online data sources and build polyglot indexes of activities, entities, and relationships - all based on public standards such as Activity Streams, or other published organizational standards.
Streams contains libraries and patterns for specifying, publishing, and inter-linking schemas, and assists with conversion of activities (posts, shares, likes, follows, etc.) and objects (profiles, pages, photos, videos, etc.) between the representation, format, and encoding preferred by supported data providers (Twitter, Instagram, etc.) and storage services (Cassandra, Elasticsearch, HBase, HDFS, Neo4J, etc.).
The project aims to provide simple two-way data interchange with all popular REST APIs in activity streams formats using a universal protocol. No other active open-source project has this ambitious goal, as well as production-worthy implementations for >10 services. Streams' compatibility with multiple storage back-ends and ability to be embedded within any Java-based real-time or batch data processing platform ensures that its interoperability features come with little technical baggage.