Skip to content

Beethoven Piano Sonatas Dataset: Python scripts to convert raw data into the structured format

Notifications You must be signed in to change notification settings

groupmm/BPSD_scripts

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Beethoven Piano Sonatas Dataset (BPSD): Scripts

Johannes Zeitler ([email protected]), 2024

Accompanying code for J. Zeitler, C. Weiß, V. Arifi-Müller, and M. Müller, “BPSD: A coherent multi-version dataset for analyzing the first movements of Beethoven’s piano sonatas.” Transactions of the International Society for Music Information Retrieval, 7(1), 195-212, 2024.

Overview

This repository contains all scripts that are used to convert raw data (audio recordings, score information, annotations) into the structured format of the BPSD. All contents of this folder should be copied in ./3_Scripts/ in the BPSD folder structure.

Prerequisites

Contents

  • 01_audio_modifications.ipynb: Copy audio files from 0_RawData/audio_ripped to 1_Audio, apply structural modifications if necessary
  • 02_transcription.ipynb: Transcribe all audio files using the NoteEM transcription model from Maman and Bermano
  • 03_audio_audio_measureTransfer.ipynb: Transfer manual measure annotations from Kempff recordings (WK64) to all other audio versions
  • 04_xml_to_csv.ipynb: parse score from muxicXML to csv format
  • 05_audio_score_sync.ipynb: Align notes from score (in csv format) with audio recordings, using measure positions as anchors.
  • 06_snap_notes_to_measures.ipynb: "snap" all note onsets that occur on downbeats to the annotated measure positions
  • 07_create_syncInfo.ipynb: Create tuples for note onsets (musical and physical time) as sampling points for the transfer of score-based annotations
  • 08_getBeats_and_timeSig.ipynb: Export beat positions and time signatures
  • 09_score_audio_transfer.ipynb: Transfer score-based annotations to individual audio versions
  • 10_write_midis.ipynb: Export MIDIs from note event annotations, snthesize MIDIs using FluidSynth
  • 11_check_sync_accuracy.ipynb: use a heuristic to quantify the accuracy of the audio synchronization pipeline
  • 12_get_duration.ipynb: Compute and output statistics for duration of individual sonatas, audio versions, etc.
  • midi_utils.py: Provides some utility functions for writing MIDI files

About

Beethoven Piano Sonatas Dataset: Python scripts to convert raw data into the structured format

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published