feat: statistical chunker improvements #13

Merged
merged 3 commits into from
Jul 4, 2024

Conversation

simjak
Member

@simjak simjak commented Jul 3, 2024

Improved the async method for the statistical chunker. Timings on the same input:
call duration (sync): 46.23 seconds
acall duration (async): 2.05 seconds

However, I noticed some minor differences in how the text is split between sync and async. See the notebook in the docs.
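
For readers who want to reproduce this kind of comparison, here is a minimal sketch of timing a sync path against an async one and then checking where two chunk lists first diverge. It does not use the chunker's real API: `embed_batch`, `aembed_batch`, `encode_sync`, `encode_async`, `timed`, and `first_divergence` are all hypothetical names, and the 0.05 s sleep merely simulates encoder latency. It also assumes the speedup comes from batches being in flight concurrently, which the PR does not state.

```python
import asyncio
import time


def embed_batch(batch):
    """Hypothetical blocking encoder call (stand-in, not the library's API)."""
    time.sleep(0.05)  # simulated request latency
    return [float(len(text)) for text in batch]


async def aembed_batch(batch):
    """Hypothetical async encoder call with the same simulated latency."""
    await asyncio.sleep(0.05)
    return [float(len(text)) for text in batch]


def batches(docs, size):
    """Split docs into consecutive batches of at most `size` items."""
    return [docs[i:i + size] for i in range(0, len(docs), size)]


def encode_sync(docs, size=2):
    """Encode batches one after another; total time grows with batch count."""
    out = []
    for batch in batches(docs, size):
        out.extend(embed_batch(batch))
    return out


async def encode_async(docs, size=2):
    """Encode all batches concurrently; gather preserves input order."""
    results = await asyncio.gather(
        *(aembed_batch(batch) for batch in batches(docs, size))
    )
    return [value for batch in results for value in batch]


def timed(fn):
    """Run fn() and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn()
    return result, time.perf_counter() - start


def first_divergence(sync_chunks, async_chunks):
    """Return the index of the first differing chunk, or None if identical."""
    for i, (a, b) in enumerate(zip(sync_chunks, async_chunks)):
        if a != b:
            return i
    if len(sync_chunks) != len(async_chunks):
        return min(len(sync_chunks), len(async_chunks))
    return None
```

Calling `timed(lambda: encode_sync(docs))` and `timed(lambda: asyncio.run(encode_async(docs)))` on the same `docs` gives the two durations, and passing the two chunk lists to `first_divergence` points at the first place where the sync and async splits disagree.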

@simjak simjak requested review from jamescalam and ashraq1455 July 3, 2024 13:44
@simjak simjak self-assigned this Jul 3, 2024

github-actions bot commented Jul 3, 2024

Failed to generate code suggestions for PR

@jamescalam
Member

@simjak this is great, could I just ask that we do a couple of minor things before merging:

  • keep the original chunking notebook (i.e. sync) as notebook 00, then add the async notebook as a new example. We can add references/links to 00 along the lines of "Note: by using the [async methods here](link) docs can be processed *40x* faster." This way we show examples for both sync and async while keeping the notebooks simple and also demonstrating how much faster async can be.
  • resolve lint issues :)

@jamescalam jamescalam merged commit 168e96c into main Jul 4, 2024
6 checks passed
@jamescalam jamescalam deleted the simonas/async branch July 4, 2024 03:39
@jamescalam
Member

resolves #15

2 participants