Add WASM/JS bindings #50

Open: wants to merge 254 commits into base: main
c685d99
moving registry into rust
eisber Feb 9, 2023
6dcfa91
return ref
jackgerrits Feb 9, 2023
25306a5
split into multiple libs
eisber Feb 10, 2023
ba69bd9
fixed core
eisber Feb 10, 2023
c98f824
Fix python setup.py for package in workspace
jackgerrits Feb 10, 2023
f248694
java bindings...
eisber Feb 10, 2023
2f371f6
more TODO
eisber Feb 10, 2023
1d3f707
Initial VERY rough impl of JNI layer
jackgerrits Feb 11, 2023
95aef95
Add wasm-bindgen, inline ranks
dqbd Feb 19, 2023
799df7f
v0.2.1
dqbd Feb 19, 2023
5a32b88
Update README.md, polish API
dqbd Feb 19, 2023
97d8dea
improve error handling
eisber Feb 22, 2023
3139d04
chore: add README
dqbd Feb 23, 2023
02d132e
feat: add option to extend special tokens and to provide custom bfe
dqbd Feb 23, 2023
46668df
Merge pull request #2 from dqbd/extendability
dqbd Feb 23, 2023
3a39f24
Bump version, update README
dqbd Feb 23, 2023
d0dc32b
Improve error handling, add support for parameters
dqbd Feb 23, 2023
90ee9f6
Fix `any` in TS files, add core tests
dqbd Feb 23, 2023
f7fe717
Update README.md, add tests, fix disallowed special bug
dqbd Feb 24, 2023
d38c936
Validate the values properly
dqbd Feb 24, 2023
53628a4
Improve error handling in JNI functions
jackgerrits Feb 24, 2023
87603e9
Bump version to 0.4.0
dqbd Feb 24, 2023
42548d8
Merge pull request #1 from eisber/error_handling
eisber Feb 27, 2023
e3ab3f6
moved config into json
eisber Feb 27, 2023
22584d4
add github action
eisber Feb 27, 2023
42efde4
add jar build
eisber Feb 27, 2023
484f15a
fix rust build
eisber Feb 27, 2023
6c45bf6
build jar
eisber Feb 27, 2023
b883d0c
fix path
eisber Feb 27, 2023
98070f1
fix java
eisber Feb 27, 2023
3b39f6b
Merge remote-tracking branch 'upstream/main' into main
eisber Feb 27, 2023
4e70da0
cleanup
eisber Feb 28, 2023
381d32b
update groupid
eisber Feb 28, 2023
f1560bd
remove comments
eisber Feb 28, 2023
01d4f9e
Move to separate js folder
dqbd Mar 1, 2023
4d8f9af
Merge branch 'eisber/main'
dqbd Mar 1, 2023
ef77b1a
Make sure JS builds
dqbd Mar 1, 2023
b04f0cf
Attempt to fix sdist
dqbd Mar 2, 2023
bbcb591
Match sdist
dqbd Mar 2, 2023
d1c4af2
Remove the _js suffix
dqbd Mar 2, 2023
33207e6
Update to newer build wheels
dqbd Mar 2, 2023
2370284
Fix wrong result for None
dqbd Mar 5, 2023
98ac953
Add CI step to build and test
dqbd Mar 5, 2023
d989c22
CI: install initialize wasm-pack
dqbd Mar 5, 2023
01bf979
Fix Python CI
dqbd Mar 5, 2023
82cd413
Replace yum with apt
dqbd Mar 5, 2023
7db26cb
Try yum once again
dqbd Mar 5, 2023
08403a1
debug ci
dqbd Mar 5, 2023
12805e3
Try CI again
dqbd Mar 5, 2023
ae30c13
Fix CI again
dqbd Mar 5, 2023
dabb296
invalid `-y` command
dqbd Mar 5, 2023
8e40682
Revert "debug ci"
dqbd Mar 5, 2023
4866cf9
Optimize performace, enable/disable features of core
dqbd Mar 5, 2023
33bb13d
Add custom initialisation, bundler changes
dqbd Mar 7, 2023
03cdbb3
Flatten the structure
dqbd Mar 8, 2023
7c8cd78
Run WASM build sequentially in CI for now
dqbd Mar 8, 2023
fcb52cf
Rename TiktokenEmbedding to TiktokenEncoding
dqbd Mar 8, 2023
945d4f2
Reverse order of default
dqbd Mar 8, 2023
9ea421b
Merge pull request #12 from dqbd/async-init-bundler
dqbd Mar 8, 2023
680fbc5
Update README.md
dqbd Mar 8, 2023
7c75b04
Merge remote-tracking branch 'upstream/main'
dqbd Mar 8, 2023
c1d11fb
Add gpt-3.5-turbo support in types and matchers
dqbd Mar 8, 2023
268bc5c
Add README.md
dqbd Mar 8, 2023
9f9ad0d
Merge pull request #13 from dqbd/gpt-3.5-turbo
dqbd Mar 8, 2023
15dd0f2
Add caveats for CFW
dqbd Mar 10, 2023
2c7e0e4
Create a lite build which defers loading of weights to consumers
dqbd Mar 11, 2023
586d205
Fix README.md
dqbd Mar 11, 2023
6d8c1dc
Cleanup
dqbd Mar 11, 2023
64178e5
Expose loading script
dqbd Mar 11, 2023
17dd0ba
Add polyfill for Buffer.from
dqbd Mar 11, 2023
4f6745f
Merge pull request #15 from dqbd/lite-build
dqbd Mar 11, 2023
e6f0726
Fix exports for CJS (Node ESM)
dqbd Mar 11, 2023
1ee26c3
Remove bundler, as it is unnecessary
dqbd Mar 11, 2023
8984ea7
Add disclaimer for CFW
dqbd Mar 11, 2023
9c8caec
Add support for GPT-4
dqbd Mar 14, 2023
103d010
Expose model_to_encoding.json and registry.json
dqbd Mar 15, 2023
79ac036
Bump to 1.0.0-alpha.5
dqbd Mar 15, 2023
f748754
Fix lite crash, add README.md
dqbd Mar 15, 2023
bfe3817
Update README.md
dqbd Mar 15, 2023
9847f4d
Bump to 1.0.0-alpha.6
dqbd Mar 15, 2023
c01af19
Add JSON importable modules
dqbd Mar 15, 2023
23bb57d
Bump to 1.0.0-alpha.7
dqbd Mar 15, 2023
a3baa6c
Bump to 1.0.0-alpha.8
dqbd Mar 15, 2023
6ac1a1a
Compress ranks
dqbd Mar 15, 2023
7efba72
Use compressed version to make main WASM smaller
dqbd Mar 15, 2023
4d4b921
Bump to 1.0.0-alpha.10
dqbd Mar 15, 2023
efe3728
Update README.md
dqbd Mar 15, 2023
8476eca
Bump to 1.0.0
dqbd Mar 15, 2023
e1c4313
Fix issues with duplicate initialization
dqbd Mar 16, 2023
c4dfaad
Bump to 1.0.1
dqbd Mar 16, 2023
68efa86
Clarifies usage
dqbd Mar 16, 2023
481fb45
Bump to 1.0.2
dqbd Mar 16, 2023
6823b8e
Update README.md with information about Create React App
dqbd Mar 19, 2023
b9a03a7
Add custom exception when not initialized
dqbd Mar 19, 2023
6bba615
Add gpt-4 to model_to_encoding.json
christophwitzko Apr 4, 2023
10e17c3
Update lib.rs
christophwitzko Apr 4, 2023
a3c88a5
Merge pull request #25 from christophwitzko/patch-1
dqbd Apr 4, 2023
ee98f78
Merge remote-tracking branch 'upstream/main'
dqbd Apr 5, 2023
7261f28
Bump to 1.0.3
dqbd Apr 5, 2023
ff230c8
Fix Next.js + Webpack import issue due to missing exports entry
dqbd Apr 8, 2023
83c2a85
Bump to 1.0.4
dqbd Apr 8, 2023
d40517f
Reimplement node_modules detection for Next 13 appDir
dqbd Apr 11, 2023
496168c
Bump to 1.0.5
dqbd Apr 11, 2023
0c0cf00
Fix broken resolution when used in tests
dqbd Apr 11, 2023
26bf591
Bump to 1.0.6
dqbd Apr 11, 2023
33405e9
Use override for edge-light / NextJS appHandlers
dqbd Apr 28, 2023
a54dd57
Publish 1.0.7-alpha.0 build
dqbd Apr 28, 2023
ea11d75
Bump to 1.0.7
dqbd Apr 28, 2023
754a378
Remove Java bindings
dqbd May 8, 2023
fe721cd
Merge remote-tracking branch 'upstream/main'
dqbd May 8, 2023
9390c03
Replace README.md for clarity
dqbd May 8, 2023
b8e7817
Remove duplicate README.md
dqbd May 8, 2023
83de511
Add weak refs
dqbd May 10, 2023
243c061
Add changesets folder
dqbd May 10, 2023
67d21c3
Move WASM bindings to wasm folder
dqbd May 12, 2023
150cf05
Add JS port
dqbd May 12, 2023
fdf6121
Rename WASM workflows
dqbd May 12, 2023
6b13f93
Generate ranks
dqbd May 13, 2023
7e5ab35
Monorepo changes
dqbd May 13, 2023
1bf31d3
Fix actions
dqbd May 13, 2023
1845548
More descriptive name
dqbd May 13, 2023
3debe21
Add CDN hosting of ranks
dqbd May 13, 2023
990d892
Fix sdist
dqbd May 13, 2023
81bd689
Rename js to `tiktoken-js`, rename wasm to `tiktoken`
dqbd May 13, 2023
f245e69
Rename description
dqbd May 13, 2023
6b00942
Add publish CI step
dqbd May 13, 2023
7c6a304
Normalise workflows
dqbd May 13, 2023
975ed0f
Rename package once again
dqbd May 13, 2023
14f108c
fix: typo
rpidanny May 14, 2023
3240a8e
Split into smaller files
dqbd May 14, 2023
dd6a6ef
Merge pull request #40 from rpidanny/fix-typo
dqbd May 14, 2023
2d18fcc
Expose model to encoding mapping, fix types
dqbd May 15, 2023
1da0b9d
Bump js-tiktoken to 1.0.1
dqbd May 15, 2023
621ae35
Add missing main and types for compat
dqbd May 15, 2023
de929d4
Add disallowed tokens
dqbd May 15, 2023
dba22a9
Version Packages
github-actions[bot] May 15, 2023
97fe465
Merge pull request #42 from dqbd/changeset-release/main
dqbd May 15, 2023
265e01d
Expose ranks
dqbd May 15, 2023
7fbd899
Bump to 1.0.4
dqbd May 15, 2023
4d3f97f
Bump js-tiktoken to 1.0.5
dqbd May 15, 2023
1fcb052
Bump to 1.0.6
dqbd May 17, 2023
3f2c59c
Add disclaimer about Svelte + Cloudflare Workers
dqbd Jun 2, 2023
c4a6180
Clarify the difference between `js-tiktoken` and `tiktoken`
dqbd Jun 2, 2023
c7faff0
Update README.md
dqbd Jun 2, 2023
ebfc814
Add new models
rodumani Jun 14, 2023
a5fb1f4
Merge remote-tracking branch 'upstream/main'
dqbd Jun 15, 2023
d1f8f9f
Merge pull request #51 from rodumani/main
dqbd Jun 15, 2023
844343f
Merge remote-tracking branch 'origin/main'
dqbd Jun 15, 2023
baca5b1
Add changeset
dqbd Jun 15, 2023
a25079d
Version Packages
github-actions[bot] Jun 15, 2023
0d7e051
Merge pull request #52 from dqbd/changeset-release/main
dqbd Jun 15, 2023
5a7be3c
feat: Add missing models
kdwkr Jun 16, 2023
1bc9ada
Update lib.rs
kdwkr Jun 16, 2023
b75f172
Merge pull request #53 from kdwkr/patch-1
dqbd Jun 22, 2023
3310bfe
Bump tiktoken to 1.0.9
dqbd Jun 22, 2023
15b8a8b
Remove python CI steps for now, write .npmrc in actions
dqbd Jun 22, 2023
71db434
Use NODE_AUTH_TOKEN
dqbd Jun 22, 2023
8e776d3
Remove prefix from folder
dqbd Jun 24, 2023
a691152
Add changeset
dqbd Jun 24, 2023
b1260d9
Merge pull request #57 from dqbd/prefix-fix
dqbd Jun 24, 2023
41bb68a
Version Packages
github-actions[bot] Jun 24, 2023
c86963f
Merge pull request #58 from dqbd/changeset-release/main
dqbd Jun 24, 2023
82d44e1
Add Electron setup instructions to README
nikwen Aug 5, 2023
072dd12
Merge pull request #66 from nikwen/electron-readme
dqbd Aug 7, 2023
db9a804
feat: add gpt3.5-turbo-instruct model
Prince-Mendiratta Sep 19, 2023
4a91a81
Merge remote-tracking branch 'upstream/main'
dqbd Oct 1, 2023
4cda379
Add new encodings
dqbd Nov 15, 2023
32882f7
Update WASM and JS to add new models
dqbd Nov 15, 2023
10302a2
Update examples
dqbd Nov 15, 2023
bd3a360
Add new changeset
dqbd Nov 15, 2023
20d610f
Version Packages
github-actions[bot] Nov 15, 2023
74c147e
Merge pull request #81 from dqbd/changeset-release/main
dqbd Nov 15, 2023
ff9c637
Add new models
risu729 Jan 26, 2024
e1b2472
Merge pull request #86 from risu729/main
dqbd Jan 28, 2024
5414d83
Merge pull request #71 from Prince-Mendiratta/main
dqbd Jan 28, 2024
221a30e
Fix build
dqbd Jan 28, 2024
43e99b4
Add support for new models, instruct models
dqbd Jan 28, 2024
708f77c
Version Packages
github-actions[bot] Jan 28, 2024
a020c6a
Merge pull request #87 from dqbd/changeset-release/main
dqbd Jan 28, 2024
de6d1c7
Fix invalid model
dqbd Jan 28, 2024
4491997
Backport changes to @dqbd/tiktoken
dqbd Jan 28, 2024
e3fcb12
Fix typings for wasm
dqbd Jan 28, 2024
f39912b
Fix typo
dqbd Jan 28, 2024
d54f04c
Fix invalid model when requesting an instruct model
dqbd Jan 28, 2024
f6b2819
Version Packages
github-actions[bot] Jan 28, 2024
ffa8283
Merge pull request #89 from dqbd/changeset-release/main
dqbd Jan 28, 2024
110eef4
Update development README.md
dqbd Jan 28, 2024
3768231
Add GPT-4 Turbo
akynau Apr 10, 2024
1d8a762
Merge pull request #97 from akynau/add-gpt-4-turbo
dqbd Apr 12, 2024
f386b28
Add changelog
dqbd Apr 12, 2024
f6165e7
Version Packages
github-actions[bot] Apr 12, 2024
045f0e4
Merge pull request #98 from dqbd/changeset-release/main
dqbd Apr 12, 2024
f2e1ac2
Add support for GPT-4-O, "Omni" model
pkallos May 13, 2024
3c118a6
Merge pull request #104 from neonredwood/support-gpt4-omni
dqbd May 13, 2024
ed9cc4f
fix(js): specify exports in js-tiktoken
dqbd May 13, 2024
e228b47
Add entry to tsup
dqbd May 13, 2024
5de6be5
Merge pull request #106 from dqbd/dqbd/o200k_base
dqbd May 13, 2024
791cf62
Add changeset
dqbd May 13, 2024
50f70dd
Version Packages
github-actions[bot] May 13, 2024
a7cce99
Merge pull request #107 from dqbd/changeset-release/main
dqbd May 13, 2024
28ae6b3
add support for gpt-4o-mini model
mbarretol Jul 18, 2024
25bb43a
Update js/src/core.ts
mbarretol Jul 22, 2024
396a8d0
Update tiktoken/model_to_encoding.json
mbarretol Jul 22, 2024
be5db34
Update wasm/src/lib.rs
mbarretol Jul 22, 2024
f7e3df1
Update wasm/src/lib.rs
mbarretol Jul 22, 2024
de6a052
Update js/src/core.ts
mbarretol Aug 10, 2024
fb1ca7b
Update tiktoken/model_to_encoding.json
mbarretol Aug 10, 2024
26ec6cc
Update wasm/src/lib.rs
mbarretol Aug 10, 2024
a32f0a4
Update wasm/src/lib.rs
mbarretol Aug 10, 2024
bba4ed6
Fix unreachable code
dqbd Aug 15, 2024
4796f27
Add changeset
dqbd Aug 15, 2024
9d385cb
Add text-embedding-3-small and text-embedding-3-large
dqbd Aug 15, 2024
831a7e8
Add changeset
dqbd Aug 15, 2024
e77e339
Merge pull request #112 from mbarretol/add-gpt4o-mini
dqbd Aug 15, 2024
ed2e785
Version Packages
github-actions[bot] Aug 15, 2024
60ff9bb
Merge pull request #116 from dqbd/changeset-release/main
dqbd Aug 15, 2024
a825d5f
fix(js): add missing rank files back to js-tiktoken
dqbd Aug 15, 2024
74e7870
Changeset
dqbd Aug 15, 2024
25f0656
Merge pull request #117 from dqbd/dqbd/js-tiktoken-missing-files
dqbd Aug 15, 2024
386acb8
Version Packages
github-actions[bot] Aug 15, 2024
f8aa455
Merge pull request #118 from dqbd/changeset-release/main
dqbd Aug 15, 2024
0ba7dbb
Add missing o1 models
dqbd Oct 3, 2024
53bd14a
Changeset
dqbd Oct 3, 2024
503da21
Merge pull request #121 from dqbd/dqbd/o1
dqbd Oct 3, 2024
7e6cddc
Version Packages
github-actions[bot] Oct 3, 2024
b742a86
Merge pull request #122 from dqbd/changeset-release/main
dqbd Oct 4, 2024
f40be08
Add new o1-2024-12-17 model
lmg8 Dec 17, 2024
c73f19e
Add new o1-2024-12-17 model
lmg8 Dec 17, 2024
8a20c2d
Merge remote-tracking branch 'origin/main'
lmg8 Dec 17, 2024
919eb6b
Add changeset
dqbd Dec 19, 2024
9c95cc5
Merge pull request #124 from lmg8/main
dqbd Dec 19, 2024
7a8687d
Version Packages
github-actions[bot] Dec 19, 2024
638e6a2
Merge pull request #125 from dqbd/changeset-release/main
dqbd Dec 19, 2024
46e3c7c
feat: add o3-mini, missing o1 model
dqbd Feb 2, 2025
5982def
Add changeset
dqbd Feb 2, 2025
335b291
Add missing types for text-embedding-*
dqbd Feb 2, 2025
a29d523
Merge pull request #128 from dqbd/dqbd/o3-mini
dqbd Feb 2, 2025
1c78c18
Version Packages
github-actions[bot] Feb 2, 2025
37c993f
Merge pull request #129 from dqbd/changeset-release/main
dqbd Feb 2, 2025
5568320
feat: add missing gpt-4o-2024-11-20 models
chenqianhe Feb 3, 2025
4dbb38e
feat: add missing o3-mini
chenqianhe Feb 3, 2025
9fc37ed
Correct other models as well
dqbd Feb 4, 2025
5f92348
Add changeset
dqbd Feb 4, 2025
de25276
Merge pull request #130 from chenqianhe/main
dqbd Feb 4, 2025
4c0b023
Version Packages
github-actions[bot] Feb 4, 2025
e3437a3
Merge pull request #131 from dqbd/changeset-release/main
dqbd Feb 4, 2025
77ba758
Update guidance on lite
dqbd Feb 14, 2025
fc87e46
Merge pull request #134 from dqbd/dqbd/js-readme
dqbd Feb 14, 2025
3827a62
Version Packages
github-actions[bot] Feb 14, 2025
8963e56
Merge pull request #135 from dqbd/changeset-release/main
dqbd Feb 14, 2025
11 changes: 11 additions & 0 deletions .changeset/config.json
@@ -0,0 +1,11 @@
{
  "$schema": "https://unpkg.com/@changesets/[email protected]/schema.json",
  "changelog": "@changesets/cli/changelog",
  "commit": false,
  "fixed": [],
  "linked": [],
  "access": "restricted",
  "baseBranch": "main",
  "updateInternalDependencies": "patch",
  "ignore": []
}
23 changes: 23 additions & 0 deletions .github/workflows/build_js.yml
Original file line number Diff line number Diff line change
@@ -0,0 +1,23 @@
name: Build and Test JS/WASM

on: [push, pull_request, workflow_dispatch]

concurrency:
  group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
  cancel-in-progress: true

jobs:
  build_js:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v1
        with:
          node-version: 18
          registry-url: "https://registry.npmjs.org"
          cache: yarn
      - name: Install
        run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
      - run: yarn install --frozen-lockfile
      - run: yarn run build
      - run: yarn run test
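
The `concurrency.group` expression in build_js.yml can be read as a small string-building rule. The sketch below is only an illustration of how that key resolves, not part of the PR: `GithubContext` and `concurrencyGroup` are hypothetical names, and it assumes `github.head_ref` is an empty string on events other than pull requests, so the `||` falls back to `github.run_id`.

```typescript
// Hypothetical model of the fields used by the `group:` expression in build_js.yml.
interface GithubContext {
  workflow: string;
  head_ref: string; // assumed empty on non-pull_request events
  run_id: string;
}

// Mirrors `${{ github.workflow }}-${{ github.head_ref || github.run_id }}`.
// `||` picks the left operand unless it is falsy (for example an empty string).
function concurrencyGroup(github: GithubContext): string {
  return `${github.workflow}-${github.head_ref || github.run_id}`;
}

const prRun = concurrencyGroup({
  workflow: "Build and Test JS/WASM",
  head_ref: "feature/wasm", // made-up branch name for illustration
  run_id: "101",
});

const pushRun = concurrencyGroup({
  workflow: "Build and Test JS/WASM",
  head_ref: "",
  run_id: "102",
});

console.log(prRun);   // Build and Test JS/WASM-feature/wasm
console.log(pushRun); // Build and Test JS/WASM-102
```

Under these assumptions, runs for the same pull request branch share one group, so `cancel-in-progress: true` cancels the older run, while each push run gets a unique group and is left alone.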
83 changes: 0 additions & 83 deletions .github/workflows/build_wheels.yml

This file was deleted.

30 changes: 30 additions & 0 deletions .github/workflows/publish.yml
@@ -0,0 +1,30 @@
name: Publish JS
on:
  push:
    branches:
      - "main"

concurrency: ${{ github.workflow }}-${{ github.ref }}

jobs:
  publish:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-node@v3
        with:
          node-version: 18
          registry-url: "https://registry.npmjs.org"
          cache: yarn
      - name: Install
        run: curl https://rustwasm.github.io/wasm-pack/installer/init.sh -sSf | sh
      - run: yarn install --frozen-lockfile
      - name: Create Release Pull Request or Publish
        id: changesets
        uses: changesets/action@v1
        with:
          publish: yarn run publish
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
          NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}
          NPM_TOKEN: ${{ secrets.NPM_TOKEN }}
4 changes: 4 additions & 0 deletions .gitignore
@@ -41,3 +41,7 @@ htmlcov
 
 Cargo.lock
 target/
+
+ranks/
+node_modules
+.turbo
28 changes: 11 additions & 17 deletions Cargo.toml
@@ -1,21 +1,15 @@
-[package]
-name = "tiktoken"
-version = "0.5.1"
-edition = "2021"
-rust-version = "1.57.0"
+[workspace]
 
-[lib]
-name = "_tiktoken"
-crate-type = ["cdylib"]
-
-[dependencies]
-pyo3 = { version = "0.19.0", features = ["extension-module"] }
-
-# tiktoken dependencies
-fancy-regex = "0.11.0"
-regex = "1.8.3"
-rustc-hash = "1.1.0"
-bstr = "1.5.0"
+members = [
+    "core",
+    "python",
+    "wasm"
+]
 
 [profile.release]
 incremental = true
+opt-level = 's' # Optimize for size
+lto = true # Enable link-time optimization
+codegen-units = 1 # Reduce number of codegen units to increase optimizations
+panic = 'abort' # Abort on panic
+strip = true # Strip symbols from binary*
12 changes: 11 additions & 1 deletion MANIFEST.in
@@ -1,8 +1,18 @@
 include *.svg
 include *.toml
 include *.md
+exclude yarn.lock
 include Makefile
 global-include py.typed
 recursive-include scripts *.py
 recursive-include tests *.py
-recursive-include src *.rs
+recursive-include core *.rs *.toml
+recursive-include python *.rs *.toml
+recursive-exclude jni *
+recursive-exclude java *
+recursive-exclude js *
+recursive-exclude wasm *
+recursive-exclude static *
+recursive-exclude .changeset *
+recursive-exclude scripts *.ts *.json
+include tiktoken *.json