Chapter 5 - Audio generation with NSynth and GANSynth

Magenta Version 1.1.7

This chapter covers audio generation. We'll first provide an overview of WaveNet, an existing model for audio generation that is especially efficient in text-to-speech applications. In Magenta, we'll use NSynth, a WaveNet autoencoder model, to generate small audio clips that can serve as instruments for a backing MIDI score. NSynth also enables audio transformations such as scaling, time stretching, and interpolation. We'll also use GANSynth, a faster approach based on generative adversarial networks (GANs).
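Interpolation in NSynth happens in the model's latent space rather than on raw audio: two sounds are first encoded, their encodings are blended, and the blend is decoded back to audio. A minimal NumPy sketch of the blending step (the encoding shape below is illustrative; real encodings come from the WaveNet autoencoder):

```python
import numpy as np

# Hypothetical latent encodings: NSynth's WaveNet autoencoder compresses
# audio into a (time, channels) matrix; this shape is illustrative only.
rng = np.random.default_rng(0)
encoding_flute = rng.normal(size=(125, 16))
encoding_bass = rng.normal(size=(125, 16))

def interpolate_encodings(enc_a, enc_b, alpha=0.5):
    """Linearly blend two encodings: alpha=0 returns enc_a, alpha=1 enc_b."""
    return (1.0 - alpha) * enc_a + alpha * enc_b

blended = interpolate_encodings(encoding_flute, encoding_bass)
print(blended.shape)  # (125, 16)
```

Decoding `blended` through the WaveNet decoder produces a sound that shares characteristics of both sources.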

Magenta Versioning

- A newer version of this code is available.

This branch shows the code for Magenta v1.1.7, which corresponds to the code in the book. For a more recent version, use the updated Magenta v2.0.1 branch.

Utils

The audio_utils.py file contains audio utilities for saving and loading encodings (save_encoding and load_encodings), time stretching them (timestretch), and saving spectrogram plots (save_spectrogram_plot and save_rainbowgram_plot).
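Time stretching can be applied directly to an encoding by resampling it along the time axis. A minimal sketch of the idea, independent of the helpers in audio_utils.py (the encoding shape is assumed):

```python
import numpy as np

def timestretch_encoding(encoding, factor):
    """Resample a (time, channels) encoding along the time axis.

    factor > 1 lengthens the clip, factor < 1 shortens it; each latent
    channel is linearly interpolated onto the new time grid.
    """
    old_len, channels = encoding.shape
    new_len = max(1, int(round(old_len * factor)))
    old_grid = np.arange(old_len)
    new_grid = np.linspace(0, old_len - 1, new_len)
    return np.stack(
        [np.interp(new_grid, old_grid, encoding[:, c]) for c in range(channels)],
        axis=1)

encoding = np.random.default_rng(1).normal(size=(100, 16))
print(timestretch_encoding(encoding, 1.5).shape)  # (150, 16)
```

Stretching in latent space changes the clip's length while the decoder keeps the timbre intact, which is why it works better than naive resampling of the waveform.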

Sounds and MIDI

We provide some sound samples and MIDI files for the examples.

Code

Before you start, follow the installation instructions for Magenta 1.1.7.

This example shows how to use NSynth to interpolate between pairs of sounds.

# Runs the example; the output audio will be in the "output/nsynth" folder
python chapter_05_example_01.py

This example shows how to use GANSynth to generate instruments for a backing score from a MIDI file.

# Runs the example; the output audio will be in the "output/gansynth" folder
python chapter_05_example_02.py
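GANSynth generates an entire note in one forward pass from a latent vector, so a backing score can morph between timbres by interpolating latents across the notes. A minimal NumPy sketch of such a latent schedule (the latent size and note list are illustrative, not the example's actual values):

```python
import numpy as np

LATENT_SIZE = 256  # illustrative latent dimensionality
rng = np.random.default_rng(42)

# Hypothetical backing score: (MIDI pitch, onset time in seconds).
notes = [(60, 0.0), (64, 1.0), (67, 2.0), (72, 3.0)]

# Sample a latent "instrument" for the start and the end of the score,
# then assign each note a blend based on its onset, so the timbre morphs
# gradually from one instrument to the other.
z_start, z_end = rng.normal(size=(2, LATENT_SIZE))
duration = notes[-1][1]
latents = [z_start + (onset / duration) * (z_end - z_start)
           for _, onset in notes]
print(len(latents), latents[0].shape)  # 4 (256,)
```

Feeding each note's pitch and latent vector to the generator would then synthesize the score with a smoothly evolving instrument.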