article: vector database #6364

Merged 12 commits on Sep 10, 2024
3 changes: 2 additions & 1 deletion CHANGELOG.md
@@ -2,7 +2,8 @@
 # RxDB Changelog

 <!-- CHANGELOG NEWEST -->
-
+- FIX RxPipeline tries to store metadata that does not match the json schema.
+- ADD utilities function for vector search.
 <!-- ADD new changes here! -->

 <!-- /CHANGELOG NEWEST -->
600 changes: 600 additions & 0 deletions docs-src/docs/articles/javascript-vector-database.md

Large diffs are not rendered by default.

3 changes: 2 additions & 1 deletion docs-src/sidebars.js
@@ -139,8 +139,9 @@ const sidebars = {
'downsides-of-offline-first',
'slow-indexeddb',
'why-nosql',
'react-native-database',
'alternatives',
'react-native-database',
'articles/javascript-vector-database',
'articles/angular-database',
'articles/browser-database',
'articles/browser-storage',
3 changes: 0 additions & 3 deletions docs-src/src/css/custom.css
@@ -67,9 +67,6 @@ html[data-theme="dark"] {
     font-size: 1.75rem;
 }

-.markdown>p {
-    font-size: 1rem;
-}

 .footer {
     background-color: var(--ifm-navbar-background-color);
37 changes: 37 additions & 0 deletions docs-src/static/files/icons/transformers.js.svg
Binary file added docs-src/static/files/vector-database-result.png
5 changes: 5 additions & 0 deletions orga/before-next-major.md
@@ -19,6 +19,11 @@ Set the default to 14 and also remove all occurences of `jsonPositionSize`.
 In the migration-storage plugin we run a catch on `oldStorageInstance.cleanup(0)` to fix v14->v15 migration.
 We should remove that catch in the next major release.

+
+## Change the `RX_PIPELINE_CHECKPOINT_CONTEXT` to `rx-pipeline-checkpoint` in the rx-pipeline.ts file
+
+This was not possible before because it requires adding the new value to the schema enum.
+
 ---------------------------------
 # Maybe later (not sure if should be done)

8 changes: 7 additions & 1 deletion package.json
@@ -309,6 +309,12 @@
       "import": "./dist/esm/plugins/pipeline/index.js",
       "default": "./dist/esm/plugins/pipeline/index.js"
     },
+    "./plugins/vector": {
+      "types": "./dist/types/plugins/vector/index.d.ts",
+      "require": "./dist/cjs/plugins/vector/index.js",
+      "import": "./dist/esm/plugins/vector/index.js",
+      "default": "./dist/esm/plugins/vector/index.js"
+    },
     "./plugins/validate-ajv": {
       "types": "./dist/types/plugins/validate-ajv/index.d.ts",
       "require": "./dist/cjs/plugins/validate-ajv/index.js",
@@ -588,4 +594,4 @@
     "webpack-cli": "5.1.4",
     "webpack-dev-server": "5.0.4"
   }
-}
+}
8 changes: 5 additions & 3 deletions src/plugins/pipeline/rx-pipeline.ts
@@ -32,8 +32,10 @@ import { getChangedDocumentsSince } from '../../rx-storage-helper.ts';
 import { mapDocumentsDataToCacheDocs } from '../../doc-cache.ts';
 import { getPrimaryKeyOfInternalDocument } from '../../rx-database-internal-store.ts';
 import { FLAGGED_FUNCTIONS, blockFlaggedFunctionKey, releaseFlaggedFunctionKey } from './flagged-functions.ts';
-export const RX_PIPELINE_CHECKPOINT_CONTEXT = 'rx-pipeline-checkpoint';

+export const RX_PIPELINE_CHECKPOINT_CONTEXT = 'OTHER';
+// TODO change the context in the next major version.
+// export const RX_PIPELINE_CHECKPOINT_CONTEXT = 'rx-pipeline-checkpoint';

 export class RxPipeline<RxDocType> {
     processQueue = PROMISE_RESOLVE_VOID;
@@ -191,7 +193,7 @@ export class RxPipeline<RxDocType> {
         const writeResult = await insternalStore.bulkWrite([{
             previous: checkpointDoc,
             document: newDoc,
-        }], RX_PIPELINE_CHECKPOINT_CONTEXT);
+        }], 'rx-pipeline');
         if (writeResult.error.length > 0) {
             throw writeResult.error;
         }
@@ -243,7 +245,7 @@ export async function setCheckpointDoc<RxDocType>(
     const writeResult = await insternalStore.bulkWrite([{
         previous,
         document: newDoc,
-    }], RX_PIPELINE_CHECKPOINT_CONTEXT);
+    }], 'rx-pipeline');
     if (writeResult.error.length > 0) {
         throw writeResult.error;
     }
6 changes: 6 additions & 0 deletions src/plugins/utils/utils-array.ts
@@ -173,3 +173,9 @@ export function uniqueArray(arrArg: string[]): string[] {
     });
 }

+
+export function sortByObjectNumberProperty<T>(property: keyof T) {
+    return (a: T, b: T) => {
+        return (b as any)[property] - (a as any)[property];
+    }
+}
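Note for readers of this hunk: the comparator returned by `sortByObjectNumberProperty` computes `b - a`, so it sorts in descending order of the given numeric property. Below is a minimal sketch of how it might be used to rank scored results. The `ScoredDoc` type and the sample data are hypothetical and not part of this PR; the function is restated only so the snippet runs on its own.

```typescript
// Restated from src/plugins/utils/utils-array.ts in this PR.
function sortByObjectNumberProperty<T>(property: keyof T) {
    return (a: T, b: T) => {
        return (b as any)[property] - (a as any)[property];
    };
}

// Hypothetical result shape, used only for this illustration.
type ScoredDoc = { id: string; similarity: number };

const scored: ScoredDoc[] = [
    { id: 'a', similarity: 0.2 },
    { id: 'b', similarity: 0.9 },
    { id: 'c', similarity: 0.5 }
];

// Copy before sorting so the input array is left untouched.
// Because the comparator returns b - a, larger values come first.
const rankedDocs = [...scored].sort(sortByObjectNumberProperty<ScoredDoc>('similarity'));
const rankedDocIds = rankedDocs.map(d => d.id);
// rankedDocIds is ['b', 'c', 'a']
```

Descending order suits similarity scores, where larger means closer. For distance metrics such as `euclideanDistance`, where smaller means closer, the sorted result would need to be reversed or sorted with an ascending comparator instead.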
Empty file added src/plugins/vector/helper.ts
Empty file.
2 changes: 2 additions & 0 deletions src/plugins/vector/index.ts
@@ -0,0 +1,2 @@
+export type * from './types.ts';
+export * from './vector-distance.ts';
1 change: 1 addition & 0 deletions src/plugins/vector/types.ts
@@ -0,0 +1 @@
+export type Vector = number[];
34 changes: 34 additions & 0 deletions src/plugins/vector/vector-distance.ts
@@ -0,0 +1,34 @@
+import type { Vector } from './types';
+
+/**
+ * Vector comparison methods
+ * @link https://www.restack.io/p/vector-database-knowledge-answer-javascript-cat-ai
+ * @returns
+ */
+export function euclideanDistance(A: Vector, B: Vector): number {
+    return Math.sqrt(A.reduce((sum, a, i) => sum + Math.pow(a - B[i], 2), 0));
+}
+export function manhattanDistance(A: Vector, B: Vector) {
+    return A.reduce((sum, a, i) => sum + Math.abs(a - B[i]), 0);
+}
+
+
+export function cosineSimilarity(A: Vector, B: Vector): number {
+    const dotProduct = A.reduce((sum, a, i) => sum + a * B[i], 0);
+    const magnitudeA = Math.sqrt(A.reduce((sum, a) => sum + a * a, 0));
+    const magnitudeB = Math.sqrt(B.reduce((sum, b) => sum + b * b, 0));
+    return dotProduct / (magnitudeA * magnitudeB);
+}
+
+
+/**
+ * @link https://github.com/vector5ai/vector5db/blob/main/src/metrics/JaccardSimilarity.ts
+ */
+export function jaccardSimilarity(a: number[], b: number[]): number {
+    const setA = new Set(a);
+    const setB = new Set(b);
+    const intersection = new Set([...setA].filter((x) => setB.has(x))).size;
+    const union = new Set([...setA, ...setB]).size;
+
+    return 1 - (intersection / union);
+}
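A short sketch of how these helpers might be combined for a brute-force similarity ranking. The three functions are restated from the file above so the snippet runs on its own. The `candidates` array, its `id` and `embedding` field names, and the query vector are made up for illustration. In an application the functions would presumably be imported from the `rxdb/plugins/vector` entry that this PR adds to package.json.

```typescript
type Vector = number[];

// Restated from src/plugins/vector/vector-distance.ts in this PR.
function euclideanDistance(A: Vector, B: Vector): number {
    return Math.sqrt(A.reduce((sum, a, i) => sum + Math.pow(a - B[i], 2), 0));
}
function manhattanDistance(A: Vector, B: Vector): number {
    return A.reduce((sum, a, i) => sum + Math.abs(a - B[i]), 0);
}
function cosineSimilarity(A: Vector, B: Vector): number {
    const dotProduct = A.reduce((sum, a, i) => sum + a * B[i], 0);
    const magnitudeA = Math.sqrt(A.reduce((sum, a) => sum + a * a, 0));
    const magnitudeB = Math.sqrt(B.reduce((sum, b) => sum + b * b, 0));
    return dotProduct / (magnitudeA * magnitudeB);
}

// Hypothetical sample data; the field names are assumptions for this sketch.
const query: Vector = [1, 0];
const candidates: { id: string; embedding: Vector }[] = [
    { id: 'orthogonal', embedding: [0, 3] },
    { id: 'same-direction', embedding: [2, 0] },
    { id: 'diagonal', embedding: [1, 1] }
];

// Score every candidate against the query, then sort so the
// most similar candidate (largest cosine similarity) comes first.
const ranked = candidates
    .map(c => ({ id: c.id, similarity: cosineSimilarity(query, c.embedding) }))
    .sort((a, b) => b.similarity - a.similarity);

const rankedIds = ranked.map(r => r.id);
// rankedIds is ['same-direction', 'diagonal', 'orthogonal']

// The distance helpers return smaller values for closer vectors.
const straightLine = euclideanDistance([0, 0], [3, 4]); // 5
const cityBlock = manhattanDistance([0, 0], [3, 4]);    // 7
```

None of these helpers check that both vectors have the same length, and `cosineSimilarity` yields `NaN` when either vector has zero magnitude. Also note that `jaccardSimilarity` as written returns `1 - intersection / union`, so identical sets give `0`; it behaves like a distance despite its name.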
33 changes: 32 additions & 1 deletion test/unit/rx-pipeline.test.ts
@@ -8,12 +8,14 @@ import {
     randomCouchString
 } from '../../plugins/core/index.mjs';
 import {
-    HumanWithTimestampDocumentType
+    HumanWithTimestampDocumentType,
+    getConfig
 } from '../../plugins/test-utils/index.mjs';
 import {
     schemaObjects,
     humansCollection
 } from '../../plugins/test-utils/index.mjs';
+import { wrappedValidateAjvStorage } from '../../plugins/validate-ajv/index.mjs';
 import { RxDBPipelinePlugin } from '../../plugins/pipeline/index.mjs';
 addRxPlugin(RxDBPipelinePlugin);
 import { RxDBLeaderElectionPlugin } from '../../plugins/leader-election/index.mjs';
@@ -71,6 +73,35 @@ describeParallel('rx-pipeline.test.js', () => {
             await c1.database.destroy();
             await c2.database.destroy();
         });
+        it('write some document depending on another with schema validator', async () => {
+            const storage = wrappedValidateAjvStorage({
+                storage: getConfig().storage.getStorage()
+            });
+            const c1 = await humansCollection.create(0, undefined, undefined, undefined, storage);
+            const c2 = await humansCollection.create(0, undefined, undefined, undefined, storage);
+            await c1.addPipeline({
+                destination: c2,
+                handler: async (docs) => {
+                    for (const doc of docs) {
+                        await c2.insert(schemaObjects.humanData(doc.passportId));
+                    }
+                },
+                identifier: randomCouchString(10)
+            });
+
+            await c1.insert(schemaObjects.humanData('foobar'));
+
+            /**
+             * Here we run the query on the destination directly after
+             * a write to the source. The pipeline should automatically halt
+             * the reads to the destination until the pipeline is idle.
+             */
+            const doc2 = await c2.findOne().exec(true);
+            assert.strictEqual(doc2.passportId, 'foobar');
+
+            await c1.database.destroy();
+            await c2.database.destroy();
+        });
         // it('write some document depending on another', async () => {
         //     const dbs = await multipleOnSameDB(0);
         //     const c1 = dbs.collection;