This is an example Node.js application that processes a text corpus, generates embeddings for "chunks" of the text, and saves the embeddings to a local file. The embeddings can then be used in another application, such as a Retrieval-Augmented Generation (RAG) system or a 2D/3D clustering demonstration using UMAP dimensionality reduction.
There are two main scripts in this project:
- `embeddings-replicate.js`: Generates embeddings using the Llama model on Replicate.
- `embeddings-transformers.js`: Generates embeddings using the bge-small model with transformers.js.

Both scripts output the embeddings to `embeddings.json`.
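The exact contents of `embeddings.json` depend on the script, but it is easiest to think of it as an array of chunk/vector pairs that a downstream RAG or UMAP demo can load directly. Below is a minimal sketch of reading the file and comparing two chunk vectors; the `text`/`embedding` field names are illustrative assumptions, not necessarily the scripts' actual output:

```js
import fs from 'fs';

// Assumed shape (illustrative): [{ text: '...', embedding: [0.01, -0.02, ...] }, ...]
const entries = JSON.parse(fs.readFileSync('embeddings.json', 'utf-8'));

// Example downstream use: cosine similarity between two chunk vectors,
// the basic building block of the retrieval step in a RAG system.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity(entries[0].embedding, entries[1].embedding));
```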
- `.env`: API token for Replicate
- Uses open-source models for faster and cheaper text embeddings
- Uses the transformers.js package and the bge-small model for embeddings generation
- `embeddings-transformers.js`: Script to process a text file and generate embeddings using the bge-small model
- Install dependencies:

  ```
  npm install
  ```

- Set up the `.env` file with your Replicate API token:

  ```
  REPLICATE_API_TOKEN=your_api_token_here
  ```
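  How the scripts read the token is an assumption here, but a typical pattern with the `dotenv` package looks like this:

  ```js
  // Minimal sketch, assuming the dotenv package is used to load .env
  // into process.env before the Replicate client is created.
  import 'dotenv/config';

  const token = process.env.REPLICATE_API_TOKEN;
  if (!token) {
    throw new Error('REPLICATE_API_TOKEN is not set in .env');
  }
  ```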
- Generate the `embeddings.json` file with Replicate. You'll need to hard-code a text filename and adjust how the text is split up depending on the format of your data:

  ```js
  const raw = fs.readFileSync('text-corpus.txt', 'utf-8');
  let chunks = raw.split(/\n+/);
  ```

  Then:

  ```
  node embeddings-replicate.js
  ```
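  To see roughly what happens to those chunks, here is a minimal sketch of embedding them through the official `replicate` client. The model identifier and input/output fields are placeholders, not necessarily what `embeddings-replicate.js` actually uses:

  ```js
  // Minimal sketch; 'owner/embedding-model:version' is a hypothetical
  // model id -- check embeddings-replicate.js for the real one.
  import fs from 'fs';
  import Replicate from 'replicate';

  const replicate = new Replicate({ auth: process.env.REPLICATE_API_TOKEN });

  const raw = fs.readFileSync('text-corpus.txt', 'utf-8');
  const chunks = raw.split(/\n+/).filter((c) => c.trim().length > 0);

  const results = [];
  for (const text of chunks) {
    // Send each chunk to the embedding model and keep the vector it returns.
    const output = await replicate.run('owner/embedding-model:version', {
      input: { text },
    });
    results.push({ text, embedding: output });
  }

  fs.writeFileSync('embeddings.json', JSON.stringify(results, null, 2));
  ```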
- Or generate the `embeddings.json` file with transformers.js instead. Adjust the text filename and splitting method as needed:

  ```js
  const raw = fs.readFileSync('text-corpus.txt', 'utf-8');
  let chunks = raw.split(/\n+/);
  ```

  Then:

  ```
  node embeddings-transformers.js
  ```
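For reference, here is a minimal sketch of what the transformers.js path can look like, assuming the `@xenova/transformers` package and the `Xenova/bge-small-en-v1.5` checkpoint (the actual model id and options in `embeddings-transformers.js` may differ):

```js
// Minimal sketch of bge-small embeddings with transformers.js; the real
// script may use a different model id, pooling, or output format.
import fs from 'fs';
import { pipeline } from '@xenova/transformers';

const raw = fs.readFileSync('text-corpus.txt', 'utf-8');
const chunks = raw.split(/\n+/).filter((c) => c.trim().length > 0);

// The feature-extraction pipeline returns one vector per input text.
const extractor = await pipeline('feature-extraction', 'Xenova/bge-small-en-v1.5');

const results = [];
for (const text of chunks) {
  const output = await extractor(text, { pooling: 'mean', normalize: true });
  results.push({ text, embedding: Array.from(output.data) });
}

fs.writeFileSync('embeddings.json', JSON.stringify(results, null, 2));
```

Because this path runs the model locally in-process, no Replicate API token is needed for it.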