
v0.12.0

@BeciLambrecht BeciLambrecht released this 23 Oct 00:56
· 49 commits to develop since this release
9177831

Automated deduplication on topics is now available 🏋️ Let Ensign do the heavy lifting and manage data quality issues so that you don't have to write custom code.

No need to worry about duplicates in the data sources you ingest from. Several deduplication policies are available, which gives you a lot of flexibility, and you can switch from one policy to another without concern over data loss.

We've also made many improvements to Ensign core, including a memory-efficient task scheduler and the ability to destroy event objects when a topic is deleted. Much of the internal functionality is now faster, more efficient, and more consistent, and we have fixed several bugs.

Automated Deduplication Options

There are many ways to determine whether two events are duplicates of each other. Ensign uses a user-defined policy to decide. The policies you can set are:

  • Strict: Two events with identical metadata, data, mimetype, and type (though provenance, region, publisher, encryption, etc. may differ).
  • Datagram: Two events with identical data, regardless of metadata, mimetype, or type information.
  • Key Grouped: Two events with identical data and the same value for a user-specified key or keys in the metadata.
  • Unique Key Constraint: Two events with the same value for a user-specified key or keys in the metadata (unique key index).
  • Unique Field Constraint*: Two events with the same value for a field or fields in the data (unique field index).
  • None: No deduplication is performed; we store all events no matter what.

*Note that the unique field constraint requires us to be able to process your data, which we won't be able to do until we have a schema registry. So although this is technically a deduplication option, in practice it is not yet usable and returns "not implemented" errors.
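
To make the differences between the policies concrete, here is a minimal sketch of how each one could reduce an event to a "duplicate signature". It is an illustration only and not Ensign's implementation or API: the `Event` class, the policy names as strings, and the `signature` and `deduplicate` helpers are all hypothetical, and hashing with SHA-256 is an assumption made for the example.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class Event:
    """Hypothetical stand-in for an event; not Ensign's actual event type."""
    data: bytes
    metadata: dict = field(default_factory=dict)
    mimetype: str = "application/octet-stream"
    type: str = "Generic"


def _digest(*parts: bytes) -> str:
    """Hash length-prefixed parts so field boundaries stay unambiguous."""
    h = hashlib.sha256()
    for part in parts:
        h.update(len(part).to_bytes(8, "big"))
        h.update(part)
    return h.hexdigest()


def _meta_bytes(metadata: dict, keys=None) -> bytes:
    """Serialize metadata (optionally only the selected keys) in sorted key order."""
    selected = metadata if keys is None else {k: metadata.get(k) for k in keys}
    return repr(sorted(selected.items())).encode()


def signature(event: Event, policy: str, keys=()):
    """Return the duplicate signature for a policy, or None to keep every event."""
    if policy == "none":
        return None
    if policy == "strict":
        return _digest(_meta_bytes(event.metadata), event.data,
                       event.mimetype.encode(), event.type.encode())
    if policy == "datagram":
        return _digest(event.data)
    if policy == "key_grouped":
        return _digest(_meta_bytes(event.metadata, keys), event.data)
    if policy == "unique_key":
        return _digest(_meta_bytes(event.metadata, keys))
    if policy == "unique_field":
        # Mirrors the note above: this needs schema-aware access to the data.
        raise NotImplementedError("unique field constraint is not implemented")
    raise ValueError(f"unknown policy: {policy!r}")


def deduplicate(events, policy: str, keys=()):
    """Keep the first event for each signature, preserving the original order."""
    seen = set()
    kept = []
    for event in events:
        sig = signature(event, policy, keys)
        if sig is not None:
            if sig in seen:
                continue
            seen.add(sig)
        kept.append(event)
    return kept
```

With three sample events, `Event(b"hello", {"id": "1"})`, `Event(b"hello", {"id": "2"})`, and `Event(b"world", {"id": "1"})`, this sketch keeps all three under `"strict"`, `"key_grouped"` (with `keys=("id",)`), and `"none"`, but only two under `"datagram"` (the first two share data) and two under `"unique_key"` (the first and third share `id`).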

One quick caveat: deduplication currently runs as a background process, not in real time. However, in our next release we will add real-time deduplication and data quality checks!

Ensign UX/UI Updates

  • Beacon Home UI improvements that let users easily view and change their settings
  • Additional user support, including helpful starter videos and easier access to resources

Full Changelog: v0.11.0...v0.12.0