Releases: sacdallago/biotrainer

v0.9.4

29.10.2024 - Version 0.9.4

Bug fixes

  • Hotfix for incorrect precision mode setting by @SebieF in #116

Maintenance

  • Updating dependencies; dropping Python 3.9 support
  • Updating CI workflow to be compatible with Windows

Known problems

  • Currently, there are compatibility problems with ONNX on some machines; please refer to issue #111

v0.9.3

14.10.2024 - Version 0.9.3

Features

  • Adding support for ProstT5 embedder by @SebieF in #110
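
A config excerpt for selecting the new embedder might look as follows; the Hugging Face model id below is an assumption, and all other options are omitted:

```yaml
# Hypothetical biotrainer config excerpt; the model id Rostlab/ProstT5 is assumed.
embedder_name: Rostlab/ProstT5
```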

Bug fixes

  • Adding improved ONNX saving and inferencer fixes by @SebieF in #112

v0.9.2

26.08.2024 - Version 0.9.2

Features

  • Improving memory management of embedding calculation by @SebieF in #96
  • Use a strategy for sequence preprocessing by @SebieF in #99
  • Adding ONNX support by @SebieF in #101 (see the sketch after this list)
  • Adding saprot embedder example by @SebieF in #106
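
To illustrate how an exported model can be consumed, here is a minimal onnxruntime sketch; the model path and the embedding shape are placeholders, and the actual input names and dimensions depend on the trained model and protocol:

```python
# Minimal sketch: running an ONNX model exported by biotrainer with onnxruntime.
# "model.onnx" and the embedding shape (1, 1024) are placeholders.
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")
input_meta = session.get_inputs()[0]
print(input_meta.name, input_meta.shape)  # inspect the expected input

dummy_embedding = np.random.rand(1, 1024).astype(np.float32)
outputs = session.run(None, {input_meta.name: dummy_embedding})
print(outputs[0])
```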

Maintenance

  • BREAKING Improving masking mechanisms in CNN and LightAttention models by @SebieF in #102
  • Improving embedder model and tokenizer class recognition by @SebieF in #105
  • Optimize Memory Handling in Embedding Computations and Refactor EmbeddingService by @heispv in #103
  • Updating dependencies

v0.9.1

10.07.2024 - Version 0.9.1

Maintenance

  • Fixing error in type checking for device
  • Updating dependencies
  • Updating inference examples
  • Adding hint for version mismatch in inferencer
  • Adding class weights to out.yml if they are calculated
  • Adding contributors file

Features

  • Improving the fallback mechanism of embedder models. CPU mode is now exited once there is enough
    RAM again for shorter sequences
  • Changing the model storage format from .pt to .safetensors.
    Safetensors is safer for model sharing. The legacy .pt format is still supported and can be converted via:

```python
from biotrainer.inference import Inferencer

# Load the trained model from its out.yml; allow_torch_pt_loading permits reading the legacy .pt checkpoints
inferencer, out_file = Inferencer.create_from_out_file(out_file_path="out.yml", allow_torch_pt_loading=True)

# Re-save all checkpoints in the safetensors format
inferencer.convert_all_checkpoints_to_safetensors()
```

v0.9.0

16.06.2024 - Version 0.9.0

Maintenance

  • Adding more extensive code documentation
  • Optimizing imports
  • Applying consistent file naming
  • Updating dependencies. Note that jupyter was removed as a direct optional dependency.
    You can always add it via `poetry add jupyter`.
  • Adding simple differentiation between T5 and ESM tokenizers and models in the embedders module

Features

  • Adding new residues_to_value protocol.
    Similar to the residues_to_class protocol,
    this protocol predicts a value for each sequence, using per-residue embeddings. It might, in some situations, outperform
    the sequence_to_value protocol.
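
A config excerpt using the new protocol might look like this; file names as well as model and embedder choices are placeholders:

```yaml
# Hypothetical biotrainer config excerpt for the new protocol; all values are placeholders.
protocol: residues_to_value
sequence_file: sequences.fasta
model_choice: LightAttention
embedder_name: Rostlab/prot_t5_xl_uniref50
```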

Bug fixes

  • For huggingface_transformer_embedder.py, all special tokens are now always deleted from the final embedding
    (e.g. first/last token for ESM-1b, last token for T5)

v0.8.4

04.06.2024 - Version 0.8.4

Maintenance

  • Updating dependencies
  • Adding pip-audit dependency check to CI pipeline

v0.8.3

04.05.2024 - Version 0.8.3

Maintenance

  • Updating dependencies

Features

  • Adding mps device for macOS. Use it by setting the configuration option `device: mps`.
    Note that MPS is still under development; use it at your own risk.
  • Adding flags to the compute_embedding method of EmbeddingService
  1. force_output_dir: Do not change the given output directory within the method
  2. force_recomputing: Always re-compute the embeddings, even if an existing file is found

These changes make the embedders module of biotrainer easier to use outside the biotrainer pipeline itself.
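
A hedged sketch of how the new flags could be used from Python; the import path, the service construction and the exact call signature are assumptions, only the method and flag names come from the notes above:

```python
# Hedged sketch: calling compute_embedding with the new flags outside the pipeline.
# The import location, constructor arguments and parameter names other than the two
# flags (force_output_dir, force_recomputing) are assumptions.
from biotrainer.embedders import EmbeddingService  # assumed import path

service = EmbeddingService()  # construction details depend on the installed version
embeddings_file = service.compute_embedding(
    "sequences.fasta",           # placeholder input file
    output_dir="embeddings/",    # placeholder output directory
    force_output_dir=True,       # do not change the given output directory
    force_recomputing=True,      # recompute even if an embeddings file already exists
)
```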

v0.8.2

28.02.2024 - Version 0.8.2

Maintenance

  • Updating dependencies

Features

  • Adding option to ignore verification of files in configurator.py. This makes it possible to verify a biotrainer
    configuration independently of the provided files.
  • Adding new compute_embeddings_from_list function to embedding_service.py. This allows computing embeddings
    directly from sequence strings.
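
A hedged sketch of embedding plain sequence strings; only the function name compute_embeddings_from_list comes from the notes above, while the import path and the assumption that it is a method of EmbeddingService may not match the actual code:

```python
# Hedged sketch: embedding plain sequence strings with compute_embeddings_from_list.
# Import path and call signature are assumptions; only the function name is given
# in the release notes.
from biotrainer.embedders import EmbeddingService  # assumed import path

service = EmbeddingService()  # construction details are an assumption
embeddings = service.compute_embeddings_from_list(["SEQWENCE", "PRTEINS"])
print(len(embeddings))
```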

v0.8.1

12.01.2024 - Version 0.8.1

Maintenance

  • Updating dependencies after removing bio_embeddings, notably upgrading torch and adding accelerate
  • Updating examples, documentation, config and test files for inferencer tests to match the new compile mode
  • Replacing the exception with a warning if dropout_rate was set for a model that does not support it (e.g. LogReg)

Features

  • Enabling PyTorch compile mode. The feature has existed since torch 2.0 and is now available in biotrainer.
    It can be enabled via:

```yaml
disable_pytorch_compile: False
```

v0.8.0

04.01.2024 - Version 0.8.0

Maintenance

  • Removing the dependency on bio_embeddings entirely. bio_embeddings is not really maintained
    anymore (last commit 2 years ago), and depending on a specific external module for embedding calculation
    limits the overall capabilities of biotrainer. Now, for example, adding LoRA layers becomes much easier.
    While bio_embeddings does have its advantages, such as a well-defined pipeline and a lot of utilities, it also
    provides a lot of functionality that is not used by biotrainer. Therefore, a new embedders module was introduced
    to biotrainer that mimics some aspects of bio_embeddings and takes inspiration from it. However, it is built in a more
    generic way and enables, in principle, all huggingface transformer embedders to be used by biotrainer.
  • The custom Ankh embedder was removed because Ankh can now be used directly in biotrainer via
    `embedder_name: ElnaggarLab/ankh-large`
  • Adding new use_half_precision option for transformer embedders
  • Adding missing device option
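
Put together, a config excerpt using these options might look like this; surrounding keys are omitted and the device value is just an example:

```yaml
# Config excerpt combining the options mentioned above; other keys omitted.
embedder_name: ElnaggarLab/ankh-large   # any huggingface transformer embedder id
use_half_precision: True                # new option for transformer embedders
device: cuda                            # explicit device selection (example value)
```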

Bug fixes

  • Fixed a minor problem with model saving in Solver.py:
    if a newly trained model did not improve before early_stop was triggered, it was not saved as a checkpoint.