Version 3.2.0

@hbredin released this 08 May 09:51
· 40 commits to develop since this release

New features

  • feat(task): add option to cache task training metadata to speed up training (with @clement-pages)
  • feat(model): add receptive_field, num_frames and dimension to models (with @Bilal-Rahou)
  • feat(model): add fbank_only property to WeSpeaker models
  • feat(util): add Powerset.permutation_mapping to help with permutation in powerset space (with @FrenchKrab)
  • feat(sample): add sample file at pyannote.audio.sample.SAMPLE_FILE
  • feat(metric): add reduce option to diarization_error_rate metric (with @Bilal-Rahou)
  • feat(pipeline): add Waveform and SampleRate preprocessors
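To give a feel for what "permutation in powerset space" means, here is a minimal pure-Python sketch. It is an illustration of the idea only, not the pyannote.audio implementation: the helper names `powerset_classes` and `permute_powerset`, the ordering of powerset classes (by subset size, then lexicographically), and the convention that `permutation[i]` is the new index of speaker `i` are all assumptions made for this example.

```python
from itertools import combinations


def powerset_classes(num_classes, max_set_size):
    """List every subset of speakers with at most `max_set_size` members.

    Assumed ordering: by subset size, then lexicographically.
    """
    subsets = []
    for size in range(max_set_size + 1):
        subsets.extend(combinations(range(num_classes), size))
    return subsets


def permute_powerset(num_classes, max_set_size, permutation):
    """Turn a speaker permutation into the matching powerset-class permutation.

    `permutation[i]` is the new index of speaker i (assumed convention).
    Returns a list `p` where `p[k]` is the new index of powerset class k.
    """
    subsets = powerset_classes(num_classes, max_set_size)
    index = {subset: k for k, subset in enumerate(subsets)}
    return [
        index[tuple(sorted(permutation[i] for i in subset))]
        for subset in subsets
    ]


# 3 speakers, at most 2 active at once: 7 powerset classes
# (), (0,), (1,), (2,), (0, 1), (0, 2), (1, 2)
mapping = permute_powerset(3, 2, (1, 0, 2))  # swap speakers 0 and 1
print(mapping)
```

Swapping speakers 0 and 1 leaves the "no speaker" class and the `(0, 1)` overlap class in place, while the single-speaker classes `(0,)` and `(1,)` trade places, as do the overlap classes `(0, 2)` and `(1, 2)`.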

Fixes

  • fix(task): fix random generators and their reproducibility (with @FrenchKrab)
  • fix(task): fix estimation of training set size (with @FrenchKrab)
  • fix(hook): fix torch.Tensor support in ArtifactHook
  • fix(doc): fix typo in Powerset docstring (with @lukasstorck)

Improvements

  • improve(metric): add support for number of speakers mismatch in diarization_error_rate metric
  • improve(pipeline): track both Model and nn.Module attributes in Pipeline.to(device)
  • improve(io): switch to torchaudio >= 2.2.0
  • improve(doc): update tutorials (with @clement-pages)

Breaking changes

  • BREAKING(model): get rid of Model.example_output in favor of num_frames method, receptive_field property, and dimension property
  • BREAKING(task): custom tasks need to be updated (see "Add your own task" tutorial)
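The quantities that replace `Model.example_output` can be derived from layer hyper-parameters alone, without running a forward pass. The sketch below shows the standard arithmetic for a stack of 1D convolutions in plain Python. It is an illustration under stated assumptions, not the pyannote.audio code: the helper names `num_frames_after_conv`, `stack_num_frames` and `stack_receptive_field` are hypothetical, and the two-layer stack is a made-up example.

```python
def num_frames_after_conv(num_samples, kernel_size, stride=1, padding=0, dilation=1):
    """Output length of a 1D convolution (same formula as torch.nn.Conv1d)."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return max(0, 1 + (num_samples + 2 * padding - effective_kernel) // stride)


def stack_num_frames(num_samples, layers):
    """Propagate an input length through a list of (kernel_size, stride) layers."""
    for kernel_size, stride in layers:
        num_samples = num_frames_after_conv(num_samples, kernel_size, stride)
    return num_samples


def stack_receptive_field(layers, num_frames=1):
    """Number of input samples seen by `num_frames` consecutive output frames.

    Walks the stack backwards: n frames of a layer span
    (n - 1) * stride + kernel_size frames of the layer below.
    """
    size = num_frames
    for kernel_size, stride in reversed(layers):
        size = (size - 1) * stride + kernel_size
    return size


# Hypothetical two-layer stack: (kernel_size, stride)
layers = [(10, 5), (3, 2)]

frames = stack_num_frames(16000, layers)      # frames for 16000 input samples
field = stack_receptive_field(layers)         # samples seen by one output frame
print(frames, field)
```

For this made-up stack, one output frame covers 20 input samples and 16000 input samples yield 1599 output frames. Custom code that used to inspect the shape of an example output can rely on this kind of closed-form computation instead.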

Community contributions