Releases: alexklibisz/elastiknn

0.1.0-PRE52

15 Dec 15:30
d200fd8
  • Bumped Elasticsearch version to 7.10.0.

0.1.0-PRE51

23 Nov 02:13
b7252af
  • No substantive changes. Just testing out new release setup.

0.1.0-PRE50

13 Nov 04:19
cbc9a1a
  • Bumped Elasticsearch version to 7.9.3.

0.1.0-PRE49

12 Nov 05:16
2d9f698
  • Fixed the function score query implementation. The first pass was kind of buggy for exact queries and totally wrong for approximate queries.
  • Addressed a perplexing edge case that was causing an out-of-bounds exception in the MatchHashesAndScoreQuery.

0.1.0-PRE48

10 Nov 17:12
34f9831

0.1.0-PRE47

06 Nov 00:50
e0a96cf
  • Improved the Python ElastiknnModel's handling of empty query responses (i.e. no results).
    Previously it threw an exception. Now it will just not populate the ID and distance arrays for that particular query.
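
As a rough illustration of the behavior described above, here is a minimal sketch of collecting results for a batch of queries while tolerating empty responses. This is not the actual ElastiknnModel code: the function name `collect_results` and the use of `None` as the "not populated" placeholder are assumptions made for the example. The `hits.hits[]._id` and `_score` fields follow the standard Elasticsearch search response shape, and the sketch stores `_score` where the notes above say "distance".

```python
from typing import Dict, List, Optional, Tuple


def collect_results(
    responses: List[Dict], k: int
) -> Tuple[List[List[Optional[str]]], List[List[Optional[float]]]]:
    """Collect up to k IDs and scores per query; empty responses stay unpopulated."""
    # Pre-allocate one row per query. None marks a slot that was never filled
    # (an assumed placeholder, chosen only for this sketch).
    ids: List[List[Optional[str]]] = [[None] * k for _ in responses]
    scores: List[List[Optional[float]]] = [[None] * k for _ in responses]

    for row, response in enumerate(responses):
        hits = response.get("hits", {}).get("hits", [])
        # A query with no results simply leaves its row untouched
        # instead of raising an exception.
        for col, hit in enumerate(hits[:k]):
            ids[row][col] = hit["_id"]
            scores[row][col] = hit["_score"]

    return ids, scores
```

Pre-allocating the rows keeps the output aligned with the input queries, so a caller can still tell which query came back empty.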

0.1.0-PRE46

02 Nov 05:30
9048b21
  • Upgraded to Elasticsearch version 7.9.2. No changes to the API.
    It did require quite a bit of internal refactoring, mostly to the way vector types are implemented.
  • Indices should be backwards compatible. However, if you indexed on an earlier version, I'd recommend re-indexing and
    setting the index.elastiknn setting to true.

0.1.0-PRE45

30 Oct 02:50
511a995
  • Adds an index-level setting: index.elastiknn = true|false, which defaults to false.
    Setting this to true tells Elastiknn to use a non-default storage format for doc values fields.
    Specifically, Elastiknn will use the latest Lucene formats for all fields except doc values, which will use the Lucene70DocValuesFormat.
    Using this specific doc values format is necessary to disable compression that makes Elastiknn extremely slow when upgraded past Elasticsearch version 7.6.x.
    Without this format, it's basically impossible to upgrade beyond 7.6.x.
    The root cause is a change that was made between Lucene 8.4.x and 8.5.x, which introduces more aggressive compression on binary doc values.
    This compression saves space, but becomes an extreme bottleneck for Elastiknn (40-100x slower queries), since Elastiknn stores vectors as binary doc values.
    Hopefully the Lucene folks will make this compression optional in the future.
    Read more here: https://issues.apache.org/jira/browse/LUCENE-9378
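
For illustration, here is a minimal sketch of a request body that turns the setting on. It only builds the JSON. Sending it (for example as the body of a create-index request) is left out, and the assumption that the setting is supplied when the index is created is mine, based on the fact that it changes the storage format. The helper name `elastiknn_index_body` is hypothetical.

```python
import json


def elastiknn_index_body(enabled: bool = True) -> str:
    """Build an index settings body that sets index.elastiknn (sketch only)."""
    # Elasticsearch accepts nested settings, so this expresses
    # index.elastiknn = true|false under the "index" key.
    body = {"settings": {"index": {"elastiknn": enabled}}}
    return json.dumps(body)


print(elastiknn_index_body())
```

The default of `enabled=True` is a choice made for this example only. The plugin's own default for the setting is false, as noted above.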

0.1.0-PRE44

29 Oct 00:38
45fee6f
  • Introduces a shorthand alternative format for dense and sparse vectors that makes it easier to work with ES-connectors that don't allow nested docs.
    • Dense vectors can be represented as a simple array: { "vec": [0.1, 0.2, 0.3, ...] } is equivalent to { "vec": { "values": [0.1, 0.2, 0.3, ...] }}.
    • Sparse vectors can be represented as an array where the first element is the array of true indices, and the second is the total number of indices: { "vec": [[1, 3, 5, ...], 100] } is equivalent to { "vec": { "true_indices": [1, 3, 5, ...], "total_indices": 100 }}.
  • Added a logger warning when the approximate query matches fewer candidates than the specified number of candidates.
  • Subtle modification to the DocIdSetIterator created by the MatchHashesAndScoreQuery to address issues 180 and 181.
    The gist of issue 180 is that the binary doc values iterator used to access vectors would attempt to visit the same
    document twice, and on the second visit the call to advanceExact would fail.
    The gist of the change is that the docID was previously initialized to be the smallest candidate docID.
    Initializing it to -1 seems to be the correct convention, and it makes that problem go away.
  • Renamed all exceptions explicitly thrown by Elastiknn to ElastiknnFooException, e.g. ElastiknnIllegalArgumentException.
    This just makes it a bit more obvious where to look when debugging exceptions and errors.
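
To make the shorthand vector formats from the first bullet concrete, here is a minimal sketch that expands a shorthand value into the equivalent nested form. It is not the plugin's parser: the function name `expand_vector` and the rule used to tell the two shorthands apart (a two-element array whose first element is itself an array is treated as sparse) are assumptions for this example.

```python
from typing import Any, Dict


def expand_vector(vec: Any) -> Dict[str, Any]:
    """Expand a shorthand dense or sparse vector into its nested form."""
    # Already in the nested form: return it unchanged.
    if isinstance(vec, dict):
        return vec
    if not isinstance(vec, list):
        raise ValueError(f"Unsupported vector representation: {vec!r}")
    # Sparse shorthand: [[true indices...], total number of indices].
    if len(vec) == 2 and isinstance(vec[0], list) and isinstance(vec[1], int):
        return {"true_indices": vec[0], "total_indices": vec[1]}
    # Dense shorthand: a flat array of numbers.
    return {"values": vec}


print(expand_vector([0.1, 0.2, 0.3]))
print(expand_vector([[1, 3, 5], 100]))
```

A connector that cannot emit nested documents can send the flat arrays, and the two representations carry the same information.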

0.1.0-PRE43

24 Oct 22:53
d5e8775
  • No longer caching the mapping for the field being queried. Instead, using the internal mapper service to retrieve the mapping.