Version 3.1.0
TL;DR
pyannote/speaker-diarization-3.1 no longer requires the unpopular ONNX runtime
Full changelog
New features
- feat(model): add WeSpeaker embedding wrapper based on PyTorch
- feat(model): add support for multi-speaker statistics pooling
- feat(pipeline): add `TimingHook` for profiling processing time
- feat(pipeline): add `ArtifactHook` for saving internal steps
- feat(pipeline): add support for list of hooks with `Hooks`
- feat(utils): add `"soft"` option to `Powerset.to_multilabel`
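
The new hooks can be combined in a single pipeline call. The sketch below is one possible way to do it, not an official recipe. It assumes that the hook classes are importable from `pyannote.audio.pipelines.utils.hook`, that `Hooks` accepts the individual hooks as positional arguments, and that results are written back into the `file` mapping (under keys such as `"timing"` and `"artifact"`). The helper name `diarize_with_hooks` and its arguments are hypothetical.

```python
def diarize_with_hooks(audio_path: str, auth_token: str):
    """Run the 3.1 diarization pipeline with timing and artifact hooks.

    Hypothetical helper: import path, constructor arguments and the keys
    under which hooks store their results are assumptions to check against
    the installed version of pyannote.audio.
    """
    # Imported lazily so this sketch can be loaded without pyannote.audio.
    from pyannote.audio import Pipeline
    from pyannote.audio.pipelines.utils.hook import (
        ArtifactHook,
        Hooks,
        TimingHook,
    )

    pipeline = Pipeline.from_pretrained(
        "pyannote/speaker-diarization-3.1",
        use_auth_token=auth_token,
    )

    # A mapping is used (rather than a bare path) so that hooks have
    # somewhere to store what they collect.
    file = {"audio": audio_path}

    # `Hooks` wraps several hooks so that all of them are called
    # at each internal step of the pipeline.
    with Hooks(TimingHook(), ArtifactHook()) as hook:
        diarization = pipeline(file, hook=hook)

    return diarization, file
```

If the assumptions hold, `file` can be inspected after the call for per-step processing times and saved internal steps. This is also the suggested replacement for the removed `logging_hook`.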
Fixes
- fix(pipeline): add missing "embedding" hook call in `SpeakerDiarization`
- fix(pipeline): fix `AgglomerativeClustering` to honor `num_clusters` when provided
- fix(pipeline): fix frame-wise speaker count exceeding `max_speakers` or detected `num_speakers` in `SpeakerDiarization` pipeline
Improvements
- improve(pipeline): compute `fbank` on GPU when requested
Breaking changes
- BREAKING(pipeline): rename `WeSpeakerPretrainedSpeakerEmbedding` to `ONNXWeSpeakerPretrainedSpeakerEmbedding`
- BREAKING(setup): remove `onnxruntime` dependency.
  You can still use ONNX `hbredin/wespeaker-voxceleb-resnet34-LM` but you will have to install `onnxruntime` yourself.
- BREAKING(pipeline): remove `logging_hook` (use `ArtifactHook` instead)
- BREAKING(pipeline): remove `onset` and `offset` parameters in `SpeakerDiarizationMixin.speaker_count`.
  You should now binarize segmentations before passing them to `speaker_count`.
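
To illustrate what "binarize before counting" means, here is a minimal pure-Python sketch of hysteresis thresholding followed by a frame-wise speaker count. It is only an illustration of the idea and not the library's implementation: `binarize_track` and `count_speakers` are hypothetical names, and the scores are made up for the example.

```python
from typing import List


def binarize_track(
    scores: List[float], onset: float = 0.5, offset: float = 0.5
) -> List[int]:
    """Hysteresis thresholding of one speaker's frame-wise scores.

    A speaker becomes active when the score rises above `onset`
    and stays active until the score falls below `offset`.
    """
    active = False
    decisions = []
    for score in scores:
        if active and score < offset:
            active = False
        elif not active and score > onset:
            active = True
        decisions.append(1 if active else 0)
    return decisions


def count_speakers(binarized: List[List[int]]) -> List[int]:
    """Number of active speakers per frame.

    `binarized[s][t]` is 1 when speaker `s` is active at frame `t`.
    """
    return [sum(frame) for frame in zip(*binarized)]


# Made-up scores: 2 speakers, 5 frames.
scores = [
    [0.1, 0.7, 0.6, 0.4, 0.2],
    [0.0, 0.2, 0.8, 0.9, 0.1],
]

# Thresholds are applied first...
binarized = [binarize_track(track, onset=0.5, offset=0.3) for track in scores]
# ... and counting then operates on binary decisions only.
counts = count_speakers(binarized)

print(binarized)  # [[0, 1, 1, 1, 0], [0, 0, 1, 1, 0]]
print(counts)     # [0, 1, 2, 2, 0]
```

Keeping the thresholds in the binarization step means the counting step no longer needs `onset` and `offset` at all, which matches the direction of this breaking change.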