[Fix]: add initial_sentences param and fix custom tokenizer not working #86
Feature Enhancement and Bug Fixes for Semantic Chunking
Introduced a new initial_sentences parameter to the SemanticChunker class
Allows users to specify the number of sentences used as initial context when starting a new chunk in accumulation mode
Default value set to 1
Gives finer control over how sentences are grouped into semantic chunks
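A minimal usage sketch of the new parameter (a hedged example, not code from this PR; any other constructor arguments are assumed to keep their defaults):

```python
# Minimal sketch of the new parameter; other constructor arguments are left at
# their defaults (an assumption about the SemanticChunker signature, not part of this PR).
from chonkie import SemanticChunker

chunker = SemanticChunker(initial_sentences=2)  # seed each new chunk with 2 sentences of context

text = (
    "Semantic chunking groups related sentences together. "
    "It compares sentence embeddings to decide where to split. "
    "A topically unrelated sentence starts a new chunk."
)
for chunk in chunker.chunk(text):
    print(chunk.text)  # assumes returned chunk objects expose a .text attribute
```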
Enhanced input validation for the tokenizer_or_token_counter parameter
Added support for bound methods via an inspect.ismethod() check
Makes validation of tokenizer inputs more robust
Enables a wider range of custom tokenization strategies
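A rough sketch of the kind of check described above; the helper name and the fallback branch are hypothetical, and only the inspect.ismethod() acceptance reflects the change in this PR:

```python
import inspect

def resolve_token_counter(tokenizer_or_token_counter):
    """Hypothetical helper illustrating the validation change.

    Plain functions pass inspect.isfunction(), but bound methods such as
    my_tokenizer.count_tokens do not, so they were previously rejected.
    Also accepting inspect.ismethod() objects fixes that.
    """
    if inspect.isfunction(tokenizer_or_token_counter) or inspect.ismethod(tokenizer_or_token_counter):
        # Any plain function or bound method is treated as a token-counting callable.
        return tokenizer_or_token_counter
    if hasattr(tokenizer_or_token_counter, "encode"):
        # Assumption: tokenizer objects are adapted via their encode() method.
        return lambda text: len(tokenizer_or_token_counter.encode(text))
    raise ValueError("Unsupported tokenizer_or_token_counter value")
```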
Fixes for Current Version (v0.2.2)
Resolves TypeError when setting initial_sentences
Addresses ValueError related to custom tokenizer usage
Supports a broader range of tokenization approaches
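Taken together, a call like the following should no longer raise after this fix (a sketch that assumes tokenizer_or_token_counter is accepted directly by the SemanticChunker constructor; the toy tokenizer class is invented for illustration):

```python
from chonkie import SemanticChunker

class WhitespaceTokenizer:
    """Toy tokenizer, invented here purely for illustration."""

    def count_tokens(self, text: str) -> int:
        return len(text.split())

tok = WhitespaceTokenizer()

chunker = SemanticChunker(
    tokenizer_or_token_counter=tok.count_tokens,  # bound method: previously raised ValueError
    initial_sentences=1,                          # previously raised TypeError
)
```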