Releases: Aleph-Alpha/intelligence-layer-sdk
v1.2.0
We did a major revamp of the ArgillaEvaluator to separate an AsyncEvaluator from the normal evaluation scenario.
This comes with easier-to-understand interfaces, more information in the EvaluationOverview, and a simplified aggregation step for Argilla that no longer depends on specific Argilla types.
Check the how-to for detailed information.
Breaking Changes
- rename: AggregatedInstructComparison to AggregatedComparison
- rename: InstructComparisonArgillaAggregationLogic to ComparisonAggregationLogic
- remove: ArgillaAggregator
  - the regular aggregator now does the job
- remove: ArgillaEvaluationRepository
  - ArgillaEvaluator now uses AsyncRepository, which extends the existing EvaluationRepository for the human-feedback use-case
- ArgillaEvaluationLogic now uses to_record and from_record instead of do_evaluate. The signature of to_record stays the same. The Field and Question are now defined in the logic instead of being passed to the ArgillaRepository
- ArgillaEvaluator now takes the ArgillaClient as well as the workspace_id. It inherits from the abstract AsyncEvaluator and no longer has evaluate_runs and evaluate. Instead it has submit and retrieve.
- EvaluationOverview gets the attributes end_date, successful_evaluation_count and failed_evaluation_count
  - rename: start is now called start_date and is no longer optional
- we refactored the internals of Evaluator. This is only relevant if you subclass from it. Most of the typing and data handling has moved to EvaluatorBase
New Features
- Add ComparisonEvaluation for the elo evaluation to abstract from the Argilla record
- Add AsyncEvaluator for human-feedback evaluation. ArgillaEvaluator inherits from this.
  - .submit pushes all evaluations to Argilla to label them
  - Add PartialEvaluationOverview to store the submission details.
  - .retrieve then collects all labelled records from Argilla and stores them in an AsyncRepository.
- Add AsyncEvaluationRepository to store and retrieve PartialEvaluationOverview. Also added AsyncFileEvaluationRepository and AsyncInMemoryEvaluationRepository
- Add EvaluatorBase and EvaluationLogicBase as base classes for both async and synchronous evaluation.
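The two-phase submit/retrieve flow of the AsyncEvaluator can be pictured roughly as below. This is a self-contained sketch, not the SDK's implementation: ToyLabelingService, ToyAsyncEvaluator, PartialOverview and Overview are hypothetical stand-ins for Argilla, ArgillaEvaluator, PartialEvaluationOverview and EvaluationOverview. Their fields, and the choice to count unlabelled records as failed, are assumptions made for illustration.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class PartialOverview:
    """Hypothetical stand-in for PartialEvaluationOverview (submission details)."""
    id: str
    start_date: datetime
    record_ids: list[str]


@dataclass
class Overview:
    """Hypothetical stand-in for EvaluationOverview (after retrieval)."""
    id: str
    start_date: datetime
    end_date: datetime
    successful_evaluation_count: int
    failed_evaluation_count: int


@dataclass
class ToyLabelingService:
    """Stands in for an external labelling tool such as Argilla."""
    records: dict[str, str] = field(default_factory=dict)
    labels: dict[str, int] = field(default_factory=dict)


class ToyAsyncEvaluator:
    def __init__(self, service: ToyLabelingService) -> None:
        self._service = service
        self._partial: dict[str, PartialOverview] = {}

    def submit(self, evaluation_id: str, outputs: dict[str, str]) -> PartialOverview:
        # Phase 1: push the records for labelling and remember what was sent.
        self._service.records.update(outputs)
        partial = PartialOverview(
            evaluation_id, datetime.now(timezone.utc), list(outputs)
        )
        self._partial[evaluation_id] = partial
        return partial

    def retrieve(self, evaluation_id: str) -> Overview:
        # Phase 2: collect the labels that exist by now.
        partial = self._partial[evaluation_id]
        labelled = sum(1 for r in partial.record_ids if r in self._service.labels)
        return Overview(
            id=partial.id,
            start_date=partial.start_date,
            end_date=datetime.now(timezone.utc),
            successful_evaluation_count=labelled,
            # Assumption of this sketch: unlabelled records count as failed.
            failed_evaluation_count=len(partial.record_ids) - labelled,
        )


service = ToyLabelingService()
evaluator = ToyAsyncEvaluator(service)
partial = evaluator.submit("eval-1", {"ex-1": "output A", "ex-2": "output B"})
service.labels["ex-1"] = 5  # a human labels one record in the meantime
overview = evaluator.retrieve(partial.id)
```

Separating the two phases lets the labelling happen at any time between the calls, with the partial overview carrying the information needed to finish the evaluation later.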
Fixes
- Improve description of using artifactory tokens for installation of IL
- Change confusion_matrix in SingleLabelClassifyAggregationLogic such that it can be persisted in a file repository
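The release notes do not say what exactly changed in confusion_matrix. One common reason such a structure cannot be written to a JSON-backed file is that it is keyed by tuples, which JSON does not accept as object keys, while a nested mapping serializes without problems. The sketch below illustrates that assumption with plain dictionaries; it does not show the SDK's actual types.

```python
import json

# Assumed "before" shape: (predicted, expected) tuples as keys.
tuple_keyed = {("happy", "happy"): 3, ("happy", "sad"): 1}

try:
    json.dumps(tuple_keyed)
    tuple_keyed_serializable = True
except TypeError:
    # The json module rejects tuple keys.
    tuple_keyed_serializable = False

# Assumed "after" shape: predicted label -> expected label -> count.
nested: dict[str, dict[str, int]] = {}
for (predicted, expected), count in tuple_keyed.items():
    nested.setdefault(predicted, {})[expected] = count

# The nested form survives a round trip through JSON unchanged.
round_tripped = json.loads(json.dumps(nested))
```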
Full Changelog: v1.1.0...v1.2.0
v1.1.0
New Features
- AlephAlphaModel now supports a context_size property
- Add new IncrementalEvaluator for easier addition of runs to existing evaluations without repeated evaluation.
  - Add IncrementalEvaluationLogic for use in IncrementalEvaluator
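The idea of adding runs to an existing evaluation without repeating earlier work can be sketched as follows. ToyIncrementalEvaluator and expensive_evaluation are hypothetical names used for illustration only; the actual IncrementalEvaluator interface may differ.

```python
from typing import Callable


class ToyIncrementalEvaluator:
    """Hypothetical sketch: evaluates each run at most once."""

    def __init__(self, evaluate_run: Callable[[str], float]) -> None:
        self._evaluate_run = evaluate_run
        self._results: dict[str, float] = {}

    def evaluate(self, run_ids: list[str]) -> dict[str, float]:
        for run_id in run_ids:
            if run_id not in self._results:  # skip runs evaluated earlier
                self._results[run_id] = self._evaluate_run(run_id)
        return dict(self._results)


calls: list[str] = []


def expensive_evaluation(run_id: str) -> float:
    calls.append(run_id)  # record every real evaluation
    return float(len(run_id))


evaluator = ToyIncrementalEvaluator(expensive_evaluation)
evaluator.evaluate(["run-a"])
results = evaluator.evaluate(["run-a", "run-bb"])  # only "run-bb" is new
```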
Full Changelog: v1.0.0...v1.1.0
Addendum to the initial release:
- The use-cases folder was renamed to examples.
Initial Release
Please see the readme for further details.
Thanks to all contributors and feedback!
v0.11.0
Breaking Changes
- breaking_change: HuggingFaceDatasetRepository now has a parameter caching, which caches the examples of a dataset once loaded. This is True by default. This drastically reduces network traffic. For a non-breaking change, set it to False.
- breaking_change: MultipleChunkRetrieverQa does not take an insert_chunk_size parameter but instead takes an ExpandChunks task
- breaking_change: the issue_classification_user_journey notebook moved to its own repository
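The effect of the caching parameter can be illustrated with a minimal sketch. ToyDatasetRepository and fetch_examples are hypothetical and only mimic the described behaviour: with caching enabled, the examples of a dataset are fetched once and then served from memory. This also shows why the default is a behaviour change, since later remote updates are not seen.

```python
from typing import Callable


class ToyDatasetRepository:
    """Hypothetical repository that optionally caches loaded examples."""

    def __init__(
        self, fetch: Callable[[str], list[str]], caching: bool = True
    ) -> None:
        self._fetch = fetch
        self._caching = caching
        self._cache: dict[str, list[str]] = {}

    def examples(self, dataset_id: str) -> list[str]:
        if self._caching and dataset_id in self._cache:
            return self._cache[dataset_id]  # no network access needed
        loaded = self._fetch(dataset_id)
        if self._caching:
            self._cache[dataset_id] = loaded
        return loaded


fetch_calls: list[str] = []


def fetch_examples(dataset_id: str) -> list[str]:
    fetch_calls.append(dataset_id)  # stands in for a network request
    return [f"{dataset_id}-example-{i}" for i in range(2)]


cached_repo = ToyDatasetRepository(fetch_examples, caching=True)
cached_repo.examples("ds")
cached_repo.examples("ds")
cached_fetches = len(fetch_calls)

uncached_repo = ToyDatasetRepository(fetch_examples, caching=False)
uncached_repo.examples("ds")
uncached_repo.examples("ds")
uncached_fetches = len(fetch_calls) - cached_fetches
```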
New Features
- feature: Llama2InstructModel to support llama-2 models in Aleph Alpha API
- feature: Llama3InstructModel to support llama-3 models in Aleph Alpha API
- feature: ExpandChunks task caches chunked documents by ID
- feature: DocumentIndexClient now supports
  - create_index
  - index_configuration
  - assign_index_to_collection
  - delete_index_from_collection
  - list_assigned_index_names
- feature: DocumentIndexRetriever now supports index_name
- feature: Runner.run_dataset now has a configurable number of workers via max_workers and defaults to the previous value, which is 10.
- feature: In case a BusyError is raised during a complete, the LimitedConcurrencyClient will retry until max_retry_time is reached.
- feature: FileTracer now accepts both a str and a Path as log_file_path
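The retry behaviour described for BusyError can be sketched generically as below. This is not the LimitedConcurrencyClient code: the BusyError class here is a local stand-in, retry_while_busy and flaky_complete are hypothetical names, and the exponential backoff between attempts is an assumption of the sketch.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class BusyError(Exception):
    """Local stand-in for the error raised when the API is busy."""


def retry_while_busy(
    call: Callable[[], T], max_retry_time: float, initial_delay: float = 0.01
) -> T:
    deadline = time.monotonic() + max_retry_time
    delay = initial_delay
    while True:
        try:
            return call()
        except BusyError:
            if time.monotonic() + delay > deadline:
                raise  # give up once the time budget is exhausted
            time.sleep(delay)
            delay *= 2  # backoff strategy is an assumption of this sketch


attempts: list[int] = []


def flaky_complete() -> str:
    attempts.append(1)
    if len(attempts) < 3:
        raise BusyError()  # busy on the first two attempts
    return "completion"


result = retry_while_busy(flaky_complete, max_retry_time=1.0)
```

Bounding the retries by elapsed time rather than by attempt count keeps the worst-case wait predictable regardless of how the delays are chosen.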
Fixes
- refactor: rename index parameter in DocumentIndex.search() to index_name
- fix: HuggingFaceRepository no longer is a dataset repository. This also means that HuggingFaceAggregationRepository no longer is a dataset repository.
Full Changelog: v0.10.0...v0.11.0
v0.10.0
Breaking Changes
- breaking change: ExpandChunksOutput now returns ChunkWithStartEndIndices instead of TextChunk
- breaking change: MultipleChunkRetrieverQa's AnswerSource now contains EnrichedChunk instead of just the TextChunk
New Features
Fixes
- fix: ChunkWithIndices now additionally returns end_index
- fix: DocumentPath and CollectionPath are now immutable
v0.9.1
Breaking Changes
- breaking change: MultipleChunkRetrieverQaOutput now returns sources and search_results
New Features
- feature: ExpandChunks task takes a retriever and some search results to expand the chunks to the desired length
Fixes
- fix: ExpectedSearchOutput has only relevant fields and supports generic document-ID rather than just str
- fix: SearchEvaluationLogic explicitly compares documents by ids
- fix: In RecursiveSummarize.do_run, num_generated_tokens is no longer uninitialized. See Issue 743.
- fix: Reverted pydantic to 2.6.* because of FastAPI incompatibility.
Full Changelog: v0.9.0...v0.9.1
v0.9.0
Breaking Changes
- breaking change: Renamed the field chunk of AnswerSource to search_result for multi chunk retriever qa.
- breaking change: The implementation of the HuggingFace repository creation and deletion got moved to HuggingFaceRepository
New Features
- feature: HuggingFaceDataset- & AggregationRepositories now have an explicit create_repository function.
- feature: Add MultipleChunkRetrieverBasedQa, a task that performs better and faster on retriever-QA, especially with longer context models
Full Changelog: v0.8.2...v0.9.0
v0.8.2
New Features
- feature: Add SearchEvaluationLogic and SearchAggregationLogic to evaluate Search use-cases
- feature: Trace viewer and IL python package are now deployed to artifactory
Fixes
- Documentation
  - fix: Add missing link to issue_classification_user_journey notebook to the tutorials section of README.
  - fix: Confusion matrix in issue_classification_user_journey now has rounded numbers.
Full Changelog: v0.8.1...v0.8.2
v0.8.1
v0.8.0
What's Changed
New Features
- feature: Expose start and end index in DocumentChunk
- feature: Add sorted_scores property to SingleLabelClassifyOutput.
- feature: Error information is printed to the console on failed runs and evaluations.
- feature: The stack trace of a failed run/evaluation is included in the FailedExampleRun/FailedExampleEvaluation object
- feature: The Runner.run_dataset(..) and Evaluator.evaluate_run(..) have an optional flag abort_on_error to stop running/evaluating when an error occurs.
- feature: Added Runner.failed_runs(..) and Evaluator.failed_evaluations(..) to retrieve all failed run / evaluation lineages
- feature: Added .successful_example_outputs(..) and .failed_example_outputs(..) to RunRepository to match the evaluation repository
- feature: Added optional argument to set an id when creating a Dataset via DatasetRepository.create_dataset(..)
- feature: Traces now log exceptions using the ErrorValue type.
- Documentation:
  - feature: Add info on how to run tests in VSCode
  - feature: Add issue_classification_user_journey notebook.
  - feature: Add documentation of newly added data retrieval methods how_to_retrieve_data_for_analysis
  - feature: Add documentation of release workflow
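The difference between collecting failures and aborting on the first error, as controlled by abort_on_error, can be sketched as follows. run_all and divide_ten_by are hypothetical helpers written for this illustration; they do not reproduce the Runner or Evaluator interfaces.

```python
from typing import Callable


def run_all(
    task: Callable[[int], int], inputs: list[int], abort_on_error: bool = False
) -> tuple[list[int], list[tuple[int, Exception]]]:
    successes: list[int] = []
    failures: list[tuple[int, Exception]] = []
    for example in inputs:
        try:
            successes.append(task(example))
        except Exception as error:
            if abort_on_error:
                raise  # stop immediately at the first failing example
            failures.append((example, error))  # keep going, remember the failure
    return successes, failures


def divide_ten_by(x: int) -> int:
    return 10 // x


successes, failures = run_all(divide_ten_by, [1, 0, 5])

try:
    run_all(divide_ten_by, [1, 0, 5], abort_on_error=True)
    aborted = False
except ZeroDivisionError:
    aborted = True
```

Collecting failures by default makes it possible to inspect them afterwards, in the spirit of failed_runs(..) and failed_evaluations(..), while the flag is useful when a single error should stop a long run early.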
Fixes
- fix: Fix version number in pyproject.toml in IL
- fix: Fix instructions for installing IL via pip.
Full Changelog: v0.7.0...v0.8.0