- Baselines
  - HMM
  - RTRBM
  - RNN-RBM
  - N-gram language model: might be hard to sample from, but could be used to score outputs
- Interpolating between two sound bytes
  - Run forward and backward LSTMs, combine hidden states, emit from those states
  - Multiple ways to combine hidden states:
    - Concatenate at each time step and use. Might be bad because states further away are neglected.
    - Attention mechanism, as in "Neural Machine Translation by Jointly Learning to Align and Translate": take a weighted combination of all hidden states being interpolated over.
- Constraining length of output
  - Motivation: generated phrases do not have plausible lengths
  - Can introduce a countdown to 0 in training and sampling procedures
- t-SNE of learned neural embedding: do similar chords map to similar locations?
- Plot activations of hidden state over time
- How does it know a chord has at most 4 notes? (I'm expecting to see a "chord-end" memory cell)
- Modulation: how to get LSTM to do it?
- Interpolation: how to combine hidden states and emit? How to train?
- Constraining length: sanity check, how to carry information forward across phrases?
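The countdown idea for constraining output length could look roughly like this. It is only a sketch of one way to do it, not something already implemented in these notes: the names `add_countdown` and `sample_with_countdown`, and the choice to feed the remaining-step count as an extra input, are assumptions.

```python
def add_countdown(tokens):
    """Pair each token with the number of steps remaining after it.

    The last token of a phrase gets countdown 0, so a model that sees the
    countdown as an extra input can learn to end the phrase when it hits 0.
    """
    n = len(tokens)
    return [(tok, n - 1 - i) for i, tok in enumerate(tokens)]


def sample_with_countdown(next_token, length):
    """Generate exactly `length` tokens, passing the countdown at each step.

    `next_token(prev, remaining)` is a hypothetical stand-in for the model.
    """
    out, prev = [], None
    for remaining in range(length - 1, -1, -1):
        prev = next_token(prev, remaining)
        out.append(prev)
    return out


# Training-side view: every token carries its distance to the phrase end.
print(add_countdown(["C4", "E4", "G4"]))
```

At sampling time the caller picks the phrase length up front, which is what makes the output length controllable.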
Keras notes
- To do a convolutional/time-distributed operation, use `TimeDistributed`, which assumes the 1st axis (excluding sample axis 0) is the time dimension. This means that `Permute` should be used to satisfy this assumption.
- For some reason my sharing of embedding matrices is only supported by the tensorflow backend...
"Modelling, Visualising and Summarising Documents with a Single Convolutional Neural Network", Denil et al 2014
- Music is also a sequence built up of individual measures, phrases, parts, etc., amenable to time-invariant convolution
- Try building a convolutional representation for music then put discriminative classifiers on top
- e.g. Bach vs XYZ, Major vs Minor Key
Pitch Classes
- Don't really matter... 0.96 accuracy on pitch classes vs 0.95 without
- Future experiments should include octaves, since doing so significantly improves generated output
Many papers use pitch classes (i.e. mod 12), removing octave information...
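For reference, the mod-12 reduction is easy to see on MIDI note numbers. A small sketch, assuming pitches are available as MIDI numbers and using the common convention that note 60 is C4; the helper name `split_pitch` is made up here.

```python
def split_pitch(midi_pitch):
    """Split a MIDI note number into (octave, pitch_class).

    pitch_class is midi_pitch mod 12 (0 = C, 9 = A); the octave is what a
    pitch-class-only encoding throws away.
    """
    octave, pitch_class = divmod(midi_pitch, 12)
    return octave - 1, pitch_class  # assumed convention: MIDI 60 is C4


# C4 and C5 collapse to the same pitch class once the octave is dropped.
print(split_pitch(60), split_pitch(72))
```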
Things to try:
- Segment phrases based on fermatas
- Encode (pitch, duration, chord) like (Lichtenwalter 2009)
Spent rest of day setting up `keras` and `tensorflow`; seems to be easier to use for building new models...
Previous inputs had one new input per `(pitch|REST,duration)` pair; these experiments expand each pair into `(pitch|REST)` tokens output at constant duration intervals.
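A minimal sketch of that expansion, assuming durations are whole multiples of a fixed step (here a sixteenth note, written as 0.25 quarter lengths). The function name `expand_to_frames` and the choice of step are assumptions, not the actual preprocessing code.

```python
def expand_to_frames(events, step):
    """Expand (pitch_or_rest, duration) events into fixed-step tokens.

    Each event is repeated duration / step times. Durations that are not a
    whole multiple of `step` are rejected rather than silently rounded.
    """
    frames = []
    for pitch, duration in events:
        n_steps, remainder = divmod(duration, step)
        if remainder != 0:
            raise ValueError(f"duration {duration} is not a multiple of {step}")
        frames.extend([pitch] * int(n_steps))
    return frames


events = [("C4", 1.0), ("REST", 0.5), ("E4", 0.25)]
print(expand_to_frames(events, step=0.25))
```

One consequence of this encoding: a held note and the same note re-struck look identical unless an extra marker is added.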
Best result:
{u'num_layers': 1.0,
u'rnn_size': 512.03541493830664,
u'seq_length': 16.002948219489955,
u'val_loss': 0.15576986968517303,
u'wordvec_size': 6.0277702732883034}
seq_length | rnn_size | val_loss | wordvec_size | num_layers |
---|---|---|---|---|
16.0029 | 512.035 | 0.15577 | 6.02777 | 1 |
16.0029 | 471.29 | 0.175634 | 60.1177 | 1 |
16.0029 | 512.035 | 0.194531 | 1 | 1 |
11.3981 | 512.035 | 0.216492 | 128.825 | 1 |
11.1927 | 122.295 | 0.222993 | 128.825 | 2.13249 |
16.0029 | 512.035 | 0.241461 | 128.825 | 1 |
6.25468 | 460.95 | 0.247955 | 128.596 | 2.88079 |
6.92759 | 72.8999 | 0.272705 | 128.825 | 8.00018 |
4.00037 | 22.6282 | 0.306297 | 11.3501 | 2.82846 |
14.4828 | 13.1409 | 0.522982 | 1 | 1 |
1.00973 | 391.191 | 0.527153 | 59.2535 | 6.22639 |
15.3933 | 1 | 0.934034 | 128.825 | 1 |
15.9987 | 1 | 0.998908 | 128.825 | 1 |
1.8637 | 1.47925 | 1.16437 | 94.7932 | 5.22018 |
1 | 1 | 1.30671 | 1 | 1 |
16.0029 | 1 | 1.37111 | 1.35772 | 7.85394 |
1.85749 | 312.957 | 1.43217 | 27.7185 | 1.2719 |
1.15772 | 512.035 | 2.05152 | 128.825 | 3.09914 |
3.79838 | 512.035 | 3.99939 | 3.2196 | 8.00018 |
4.28328 | 1.08049 | 9.4367 | 128.825 | 7.73301 |
1.89024 | 1.44436 | 13.1816 | 12.317 | 8.00018 |
- Used `Spearmint` to do hyperparam optimization over major soprano monophonic LSTM models
  - Best result `val_loss=1.13967` with `seq_length=6.94253`, `rnn_size=29.5404`, `wordvec_size=126.366`, `num_layers=1.00082`, all floored.
  - Sampling with `temp=0.8` yielded believable melody lines

All `Spearmint` results:
val_loss | seq_length | rnn_size | wordvec_size | num_layers |
---|---|---|---|---|
1.81279 | 1 | 1 | 1 | 1 |
1.25221 | 4.00037 | 22.6282 | 11.3501 | 2.82846 |
1.54454 | 16.0029 | 86.9708 | 128.825 | 7.15475 |
1.91175 | 12.0003 | 10.0163 | 11.2472 | 6.57259 |
9.7702 | 1.03734 | 438.864 | 114.248 | 7.57239 |
3.72573 | 16.0029 | 1 | 128.825 | 1 |
10.5524 | 2.18817 | 142.192 | 1 | 2.83128 |
1.94992 | 1.34606 | 1 | 17.2647 | 8.00018 |
4.68856 | 16.0029 | 1 | 4.0772 | 1 |
3.09284 | 1 | 54.9462 | 21.2293 | 3.10958 |
1.84045 | 1 | 1.54607 | 12.446 | 2.11687 |
1.45445 | 16.0029 | 512.035 | 14.0071 | 1.09933 |
20.4365 | 2.49828 | 512.035 | 8.91201 | 1.17965 |
1.92739 | 15.4095 | 210.947 | 11.8957 | 3.7034 |
3.39995 | 7.74704 | 1 | 13.2716 | 6.58817 |
1.44211 | 16.0029 | 512.035 | 16.0497 | 1 |
1.83048 | 1 | 1 | 2.04846 | 1 |
4.70739 | 16.0029 | 1 | 38.4966 | 8.00018 |
3.84151 | 16.0029 | 1 | 128.825 | 1 |
1.41617 | 16.0029 | 512.035 | 128.825 | 8.00018 |
6.90625 | 5.17655 | 512.035 | 97.6266 | 8.00018 |
1.82316 | 3.89022 | 1.28814 | 1 | 1 |
1.58859 | 9.94785 | 512.035 | 128.825 | 1 |
1.61969 | 12.3247 | 512.035 | 128.825 | 1 |
1.33559 | 10.284 | 26.3633 | 128.825 | 6.37936 |
1.68185 | 4.7608 | 2.06105 | 128.825 | 1.00727 |
1.21108 | 4.29163 | 13.2033 | 128.825 | 1.62681 |
1.69945 | 1.39915 | 37.5867 | 2.38933 | 5.68261 |
2.26959 | 1.30102 | 7.32666 | 1 | 3.71789 |
1.20485 | 6.03521 | 14.0379 | 128.825 | 1 |
13.4733 | 4.98242 | 27.8648 | 1 | 8.00018 |
1.82551 | 4.37409 | 1.1129 | 128.825 | 1.0037 |
1.99838 | 3.55431 | 1 | 1.22266 | 1 |
1.57108 | 1 | 14.1768 | 3.16177 | 8.00018 |
1.36905 | 3.86123 | 4.4941 | 128.825 | 1 |
1.27876 | 5.44864 | 50.7731 | 128.825 | 1 |
1.67601 | 16.0029 | 483.363 | 128.825 | 6.73011 |
2.17016 | 1.02489 | 1 | 1.51668 | 6.55008 |
1.92507 | 1.52215 | 1 | 128.825 | 5.84352 |
1.17033 | 4.04997 | 48.6965 | 128.825 | 1.18371 |
1.45292 | 1 | 47.8396 | 128.825 | 7.77974 |
1.46929 | 16.0029 | 91.7265 | 128.825 | 5.99395 |
1.19416 | 4.06726 | 27.956 | 128.825 | 1.28507 |
2.51698 | 1.60321 | 20.3026 | 1 | 7.90567 |
4.53957 | 16.0029 | 512.035 | 1 | 7.85759 |
1.8674 | 1 | 1.72167 | 128.825 | 7.97761 |
2.04807 | 1.34766 | 3.16714 | 128.825 | 7.541 |
2.05087 | 10.6888 | 348.024 | 128.825 | 4.01615 |
2.03063 | 1.45055 | 1 | 3.80915 | 7.99896 |
4.48324 | 1.39391 | 338.201 | 18.1749 | 8.00018 |
1.8183 | 1.44086 | 1 | 1 | 1 |
2.18131 | 1.37276 | 1 | 1 | 4.3693 |
2.73249 | 1.48716 | 138.966 | 1.01189 | 7.2744 |
2.16369 | 1.41637 | 1 | 128.825 | 7.77549 |
2.24593 | 1.29825 | 1 | 1.00003 | 7.99566 |
2.49271 | 1 | 1.26242 | 1 | 7.51788 |
1.43529 | 1.17257 | 19.4415 | 128.825 | 8.00016 |
1.85919 | 4.48557 | 51.0752 | 1 | 1 |
1.27045 | 6.94479 | 85.2623 | 128.825 | 2.02529 |
1.29271 | 4.3174 | 41.9469 | 128.825 | 2.71045 |
1.16407 | 6.15871 | 41.2952 | 128.825 | 1 |
1.179 | 4.48587 | 53.2404 | 128.825 | 1 |
5.25998 | 3.96968 | 42.0234 | 1.14234 | 1 |
1.92197 | 3.77989 | 1 | 128.825 | 1.23153 |
1.67657 | 4.6666 | 23.0382 | 3.14412 | 1 |
14.7627 | 5.04117 | 174.806 | 1.52616 | 1 |
1.1712 | 5.66671 | 40.1667 | 128.825 | 1 |
2.09475 | 4.43712 | 16.4887 | 2.62326 | 1 |
1.3684 | 4.60999 | 4.36251 | 128.825 | 1 |
1.87167 | 3.95252 | 1 | 123.594 | 8.00018 |
2.0122 | 4.28938 | 1 | 1 | 1 |
1.72728 | 5.43109 | 4.92454 | 120.619 | 1 |
1.72963 | 3.97031 | 1 | 128.825 | 1 |
3.21394 | 4.18858 | 50.1616 | 123.625 | 8.00018 |
2.30705 | 3.70729 | 1 | 1 | 8.00018 |
1.75668 | 3.58032 | 118.189 | 93.7408 | 8.00018 |
2.83069 | 1.05197 | 78.8487 | 1.12416 | 8.00018 |
3.01681 | 2.96283 | 53.8838 | 126.457 | 8.00018 |
2.69333 | 1 | 12.5461 | 1.04687 | 5.61842 |
6.16394 | 16.0029 | 512.034 | 3.00566 | 7.79508 |
2.50471 | 3.20547 | 1 | 1.00527 | 8.00018 |
2.74393 | 2.86051 | 1 | 1.58627 | 1 |
1.94912 | 3.55173 | 321.378 | 128.825 | 8.00018 |
2.50607 | 3.1362 | 1 | 1.2272 | 7.86776 |
2.50117 | 3.40996 | 1 | 2.59141 | 8.00018 |
2.01436 | 1.9717 | 1 | 1.84658 | 8.00018 |
1.74399 | 2.13574 | 1 | 2.74339 | 1 |
1.55385 | 3.77291 | 19.4764 | 3.40497 | 1 |
1.84827 | 1 | 1 | 1.20563 | 1.00026 |
1.48099 | 16.0029 | 305.927 | 116.831 | 8.00018 |
3.25906 | 4.1078 | 218.305 | 128.825 | 8.00018 |
1.88984 | 3.52137 | 1 | 1.01016 | 1.00719 |
1.95496 | 3.10587 | 205.784 | 128.825 | 8.00018 |
2.53949 | 6.43489 | 24.0808 | 10.4327 | 5.12246 |
1.18219 | 16.0029 | 61.1378 | 128.735 | 2.79494 |
1.62688 | 1.01011 | 144.572 | 128.825 | 8.00018 |
1.19234 | 10.6007 | 30.2878 | 128.825 | 1.90145 |
1.89839 | 16.0029 | 501.778 | 115.486 | 3.64192 |
3.02298 | 6.336 | 48.863 | 1 | 1.01751 |
2.94935 | 4.87841 | 65.0807 | 1.58859 | 1.03422 |
1.48481 | 1 | 32.8238 | 5.21677 | 6.51185 |
2.57488 | 1 | 196.637 | 99.4593 | 8.00018 |
1.18709 | 2.31219 | 29.1906 | 16.6164 | 1 |
3.3022 | 1.00246 | 126.95 | 1 | 8.00018 |
1.35802 | 1.20229 | 35.7395 | 128.825 | 8.00018 |
1.17939 | 2.98678 | 21.6642 | 111.966 | 1 |
2.22157 | 1.37443 | 17.3358 | 4.96489 | 1.0013 |
1.45561 | 16.0029 | 18.6679 | 126.261 | 4.40354 |
1.33853 | 16.0029 | 269.284 | 128.097 | 1.01187 |
1.36142 | 15.922 | 42.1897 | 7.82053 | 4.28042 |
1.58634 | 3.79205 | 14.2155 | 7.52816 | 1.00302 |
2.07173 | 4.96631 | 35.1168 | 1 | 1.06558 |
2.16536 | 1.02073 | 60.2927 | 2.89752 | 7.69905 |
1.19761 | 4.99027 | 20.8433 | 128.825 | 1.00068 |
1.21183 | 3.25201 | 31.8954 | 121.463 | 4.74495 |
1.15955 | 7.68755 | 52.693 | 128.825 | 1 |
1.19747 | 15.9547 | 40.3288 | 106.232 | 4.79994 |
1.14678 | 14.3189 | 105.651 | 128.825 | 1 |
1.31329 | 4.75247 | 39.5341 | 6.5095 | 1 |
1.28955 | 3.83953 | 12.3442 | 128.825 | 6.88565 |
1.25079 | 3.35264 | 31.9617 | 128.499 | 7.30232 |
3.54208 | 2.79402 | 24.04 | 116.388 | 6.24677 |
1.23928 | 3.70084 | 38.0085 | 121.191 | 6.69755 |
1.18921 | 2.31037 | 7.9934 | 128.825 | 1 |
1.16043 | 3.37842 | 33.9173 | 128.253 | 1 |
1.55291 | 3.4025 | 90.3102 | 128.825 | 5.64659 |
1.19088 | 3.09136 | 47.4656 | 116.573 | 1 |
1.31964 | 3.57266 | 16.7673 | 127.438 | 8.00018 |
1.13287 | 2.12181 | 16.7073 | 127.608 | 1 |
1.29983 | 3.42929 | 54.3889 | 124.45 | 5.2666 |
1.89918 | 3.58951 | 1 | 2.99075 | 4.84539 |
1.15648 | 4.54977 | 24.7907 | 114.179 | 1 |
1.18002 | 3.16433 | 26.1107 | 109.408 | 1 |
1.28697 | 2.23085 | 11.4699 | 6.17508 | 1 |
1.47203 | 16.0029 | 424.294 | 96.665 | 8.00018 |
1.35923 | 15.7712 | 180.292 | 108.152 | 8.00018 |
1.22905 | 3.56232 | 23.4472 | 116.798 | 6.1748 |
1.55541 | 1.72011 | 93.4516 | 101.304 | 8.00018 |
2.04159 | 16.0029 | 40.1776 | 1 | 1.00011 |
1.1952 | 2.49004 | 20.0365 | 114.219 | 1 |
2.77794 | 15.5213 | 512.035 | 1 | 1.02812 |
1.61996 | 14.5473 | 512.035 | 4.56864 | 1 |
1.21182 | 3.58192 | 30.8165 | 128.573 | 7.61866 |
1.46842 | 16.0029 | 212.909 | 126.77 | 7.25935 |
1.23573 | 3.89539 | 16.4708 | 123.302 | 4.63252 |
1.68854 | 16.0029 | 93.8211 | 2.23815 | 1.01339 |
1.24737 | 15.7682 | 51.6474 | 117.842 | 1 |
1.18779 | 5.80308 | 23.8137 | 114.688 | 1 |
1.519 | 3.55924 | 80.2821 | 114.28 | 7.09611 |
1.26767 | 15.9138 | 163.389 | 128.463 | 1 |
1.58908 | 8.73408 | 60.4034 | 3.12543 | 1 |
1.13967 | 6.94253 | 29.5404 | 126.366 | 1.00082 |
1.47346 | 1.02767 | 15.9882 | 3.0994 | 6.65049 |
1.19142 | 3.78845 | 111.306 | 128.825 | 1.00895 |
6.06288 | 15.3437 | 490.379 | 1 | 8.00018 |
1.87639 | 4.9916 | 491.995 | 128.825 | 1 |
1.93319 | 4.25991 | 508.601 | 128.825 | 1 |
1.68181 | 5.70063 | 509.011 | 128.825 | 1 |
3.56743 | 5.02613 | 504.939 | 128.825 | 8.00018 |
1.88407 | 4.5674 | 511.515 | 128.825 | 1 |
1.97464 | 13.1559 | 510.938 | 1.60573 | 1.10128 |
1.53288 | 16.0029 | 454.708 | 128.478 | 1 |
3.16029 | 4.69018 | 509.507 | 1.20061 | 1 |
2.17807 | 16.0029 | 499.117 | 1.5671 | 1 |
4.56358 | 5.62898 | 456.094 | 120.225 | 8.00018 |
1.57462 | 16.0029 | 487.641 | 128.825 | 1.02787 |
1.76454 | 5.19752 | 17.6369 | 1.58856 | 1.01913 |
4.76278 | 1 | 511.678 | 1 | 8.00018 |
1.93808 | 2.43203 | 511.584 | 128.825 | 1 |
1.71001 | 2.74913 | 509.772 | 128.825 | 1 |
17.6859 | 3.12348 | 511.251 | 1.74208 | 1.23072 |
2.75805 | 4.46449 | 22.5683 | 1.67204 | 1 |
1.79666 | 16.0029 | 379.387 | 2.68668 | 1.00074 |
2.28637 | 5.92394 | 1 | 1.62136 | 1.46851 |
1.5856 | 2.8623 | 356.665 | 2.92364 | 1 |
1.75186 | 2.21219 | 1 | 2.15511 | 1.21592 |
18.9152 | 2.12912 | 168.995 | 1.21215 | 1 |
3.93017 | 4.75664 | 1 | 1.2334 | 6.75532 |
2.02585 | 1 | 1 | 1.65368 | 8.00018 |
1.82365 | 1 | 1 | 1.18676 | 2.68117 |
2.45338 | 16.0029 | 324.993 | 1.64182 | 1.00067 |
3.94181 | 2.77051 | 371.001 | 1.01043 | 1 |
2.58022 | 5.28293 | 1 | 1.00899 | 8.00018 |
2.32257 | 2.42847 | 488.999 | 75.7214 | 1 |
2.48066 | 5.19708 | 1 | 1.01857 | 1 |
1.83206 | 1 | 1 | 2.5297 | 1 |
1.98198 | 16.0029 | 239.976 | 1.80369 | 1 |
3.59905 | 2.21228 | 1 | 1.00908 | 8.00018 |
1.90486 | 5.60672 | 1 | 1.01175 | 1.68969 |
4.08377 | 11.4198 | 1 | 1.01033 | 1.99797 |
2.19399 | 3.45414 | 1 | 1.01273 | 6.39969 |
2.12843 | 5.91437 | 1 | 1.01445 | 1 |
5.06573 | 12.2589 | 252.052 | 1.01313 | 2.42166 |
1.89684 | 3.38058 | 1 | 1.01567 | 1.00369 |
1.82683 | 1.79312 | 1 | 1.61672 | 1 |
2.09055 | 1.99713 | 7.66155 | 1.02651 | 1.0045 |
3.26499 | 2.60936 | 298.33 | 1.03542 | 1 |
38.0935 | 1 | 387.262 | 1.02453 | 1 |
1.79081 | 2.47266 | 1 | 1.02802 | 1.30136 |
2.70179 | 4.08058 | 1 | 1.00678 | 5.66322 |
3.70049 | 5.01941 | 125.481 | 1.01641 | 8.00018 |
2.52746 | 2.26491 | 1 | 1.04334 | 8.00018 |
2.6144 | 1.34182 | 109.675 | 1.00672 | 7.98652 |
1.89099 | 1.51636 | 1.02817 | 1.06104 | 3.43684 |
5.01127 | 3.55865 | 188.044 | 1.04538 | 1 |
3.43548 | 1 | 69.1306 | 1.07355 | 8.00018 |
7.10486 | 6.85572 | 512.035 | 1.17991 | 2.29812 |
1.87445 | 16.0029 | 507.561 | 1.13841 | 1 |
1.63515 | 2.55714 | 340.268 | 105.369 | 1.88211 |
6.87511 | 1.59641 | 510.92 | 65.2621 | 1 |
7.40836 | 3.42253 | 329.605 | 1.02056 | 5.19144 |
1.44172 | 3.06819 | 436.638 | 124.987 | 1.04117 |
1.85288 | 1.63065 | 53.3601 | 1.0173 | 8.00018 |
1.97156 | 4.89706 | 70.355 | 1.01513 | 1 |
1.54226 | 16.0029 | 511.366 | 73.8461 | 1.01515 |
3.87939 | 5.92443 | 510.453 | 1.05408 | 1.48474 |
1.51564 | 3.17769 | 500.43 | 128.825 | 1.04849 |
2.05485 | 1 | 4.23242 | 1.12558 | 8.00018 |
1.97433 | 15.804 | 10.7363 | 1.35953 | 1.0445 |
1.42677 | 3.22435 | 343.821 | 128.825 | 1.10866 |
2.69979 | 2.62094 | 1 | 1 | 8.00018 |
2.80693 | 16.0029 | 508.23 | 1.2509 | 7.44169 |
5.56769 | 1.75418 | 115.663 | 82.5968 | 1.02226 |
10.0386 | 2.70432 | 376.212 | 90.4778 | 8.00018 |
1.75714 | 5.97219 | 511.494 | 128.825 | 1 |
3.61418 | 5.74625 | 501.833 | 1.16609 | 1 |
1.58739 | 3.34706 | 476.397 | 85.2811 | 1 |
1.54583 | 2.99822 | 228.601 | 108.462 | 1 |
3.28737 | 16.0029 | 54.0864 | 1.08781 | 8.00018 |
1.61078 | 16.0029 | 510.841 | 113.284 | 8.00018 |
7.69706 | 1 | 512.035 | 1 | 7.44544 |
2.67493 | 16.0029 | 441.365 | 1.77608 | 8.00018 |
1.74966 | 16.0029 | 50.4464 | 1.31085 | 1.07266 |
3.7585 | 16.0029 | 189.338 | 1.04997 | 8.00018 |
1.81978 | 2.38928 | 509.248 | 127.419 | 1 |
1.35456 | 3.49442 | 92.3896 | 126.596 | 2.37375 |
1.66493 | 2.31178 | 1 | 37.7945 | 3.80823 |
1.82544 | 2.72484 | 502.978 | 128.731 | 1 |
1.75726 | 2.87181 | 479.845 | 128.825 | 1 |
1.68781 | 15.0497 | 509.078 | 128.825 | 2.46445 |
1.46165 | 3.95094 | 496.274 | 128.056 | 1 |
1.62109 | 2.77635 | 325.136 | 128.444 | 1 |
9.84507 | 2.36965 | 1 | 1.22999 | 3.43455 |
2.12618 | 8.06339 | 37.5034 | 1.55497 | 1.03319 |
19.9971 | 3.70609 | 512.035 | 1 | 1.22997 |
2.50406 | 3.6462 | 1 | 1.07039 | 8.00018 |
1.74508 | 2.69366 | 126.026 | 1.01994 | 1.0059 |
1.84615 | 1.3313 | 1 | 1.006 | 1 |
6.35803 | 3.40397 | 70.1067 | 1.006 | 1 |
18.1778 | 2.2708 | 125.054 | 1.02172 | 8.00018 |
1.8263 | 1.6561 | 1 | 2.18745 | 2.95551 |
1.6426 | 1.85016 | 7.80418 | 107.429 | 3.65741 |
1.99685 | 2.53031 | 1 | 1 | 1.00799 |
1.82727 | 1.8859 | 1 | 1.01417 | 1.07867 |
10.267 | 2.8795 | 217.669 | 1.03202 | 1.09872 |
1.73753 | 2.7155 | 23.3139 | 1.00842 | 1 |
3.24402 | 3.02396 | 441.701 | 1.26887 | 1 |
1.34272 | 3.44249 | 238.292 | 128.825 | 1 |
1.93469 | 3.04653 | 1 | 1 | 1 |
1.36159 | 2.59927 | 163.243 | 124.008 | 1 |
1.81301 | 2.57622 | 510.034 | 128.825 | 1.00085 |
1.52424 | 3.09678 | 488.866 | 93.7029 | 1 |
2.34485 | 3.53712 | 492.5 | 105.515 | 8.00018 |
3.94889 | 1.54106 | 61.0157 | 1.00655 | 1 |
2.03146 | 3.30615 | 38.6263 | 1.53452 | 1 |
2.21316 | 4.7681 | 44.6425 | 1.00104 | 1 |
2.57991 | 1.8712 | 1 | 1.04351 | 8.00018 |
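Spearmint proposes real-valued settings, and the notes say the values were floored before use. A small sketch of that step applied to the best row above (`val_loss=1.13967`); the helper name `floor_params` is an assumption, and how the floored values get passed to the training script is not shown here.

```python
import math


def floor_params(proposal):
    """Floor every real-valued hyperparameter to an integer setting."""
    return {name: int(math.floor(value)) for name, value in proposal.items()}


# Best row from the table above.
best = {
    "seq_length": 6.94253,
    "rnn_size": 29.5404,
    "wordvec_size": 126.366,
    "num_layers": 1.00082,
}
print(floor_params(best))
```

Flooring also explains why many rows with slightly different real values are effectively the same model.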
- Will try:
- Train on all voices
- Split major/minor pieces apart
- Model only the duration
- Validation loss does not track perceptual quality: overfit models tended to yield more realistic samples
- Subsetting to only major or only minor pieces significantly improves sample quality
- Training on all four parts significantly improves performance over using just Soprano, but introduces obvious non-melodic parts (e.g. periods of rest)
- Improved preprocessing using `bachbot get_chorales`
  - Get corpus with `music21`
  - Transpose to Cmaj/Amin (is there a standard way to do this?)
  - Strip all information except `(Note+Octave|Rest, Duration)`
  - Write processed data to `bachbot/scratch/{bwv_id}-mono.txt`
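On the transposition question: one possible approach is to compute the semitone shift from the piece's tonic to C (major) or A (minor) and move every pitch by that amount. A pure-Python sketch of the interval arithmetic only, assuming the tonic name and mode are already known; the function names are hypothetical and this is not what `bachbot get_chorales` necessarily does.

```python
PITCH_CLASS = {"C": 0, "D": 2, "E": 4, "F": 5, "G": 7, "A": 9, "B": 11}


def tonic_pitch_class(name):
    """Convert a tonic name such as 'F#' or 'B-' to a pitch class 0..11.

    Assumes '#' for sharp and '-' (or 'b') for flat.
    """
    pc = PITCH_CLASS[name[0].upper()]
    for accidental in name[1:]:
        if accidental == "#":
            pc += 1
        elif accidental in "-b":
            pc -= 1
        else:
            raise ValueError(f"unknown accidental {accidental!r}")
    return pc % 12


def semitones_to_c_or_a(tonic, mode):
    """Smallest shift (in semitones) taking the tonic to C major / A minor."""
    target = 0 if mode == "major" else 9
    shift = (target - tonic_pitch_class(tonic)) % 12
    # Prefer moving down when that is the shorter direction.
    return shift - 12 if shift > 6 else shift


print(semitones_to_c_or_a("D", "major"), semitones_to_c_or_a("E", "minor"))
```

Choosing the smaller of the up/down shifts keeps the transposed voices close to their original register.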
seq length | wordvec size | num layers | rnn size | dropout | batchnorm | lr | nepoch | final train loss | final val loss |
---|---|---|---|---|---|---|---|---|---|
8 | 64 | 2 | 256 | 0 | 1 | 2e-3 | 30 | 0.238247 | 1.5794 |
8 | 64 | 2 | 128 | 0 | 1 | 2e-3 | 50 | 0.349 | 1.367 |
4 | 64 | 2 | 128 | 0 | 1 | 2e-3 | 50 | 0.288 | 1.434 |
4 | 32 | 2 | 128 | 0 | 1 | 2e-3 | 50 | 0.2527 | 1.8538 |
8 | 32 | 2 | 32 | 0 | 1 | 2e-3 | 50 | 1.044 | 1.191 |
8 | 32 | 2 | 64 | 0 | 1 | 2e-3 | 50 | 0.7539 | 1.236 |
8 | 64 | 2 | 32 | 0 | 1 | 2e-3 | 50 | 1.027 | 1.190 |
2 | 64 | 2 | 32 | 0 | 1 | 2e-3 | 50 | 0.783344 | 1.25899 |
4 | 64 | 2 | 32 | 0 | 1 | 2e-3 | 50 | 1.064 | 1.197 |
8 | 64 | 1 | 32 | 0 | 1 | 2e-3 | 50 | 1.022 | 1.188 |
8 | 64 | 1 | 32 | 0 | 1 | 2e-3 | 50 | 1.096 | 1.186 |
8 | 64 | 3 | 32 | 0 | 1 | 2e-3 | 50 | 0.989 | 1.186 |
8 | 64 | 3 | 32 | 0 | 1 | 2e-3 | 50 | 0.953 | 1.183 |
8 | 64 | 4 | 32 | 0 | 1 | 2e-3 | 50 | 1.0104 | 1.2274 |
8 | 64 | 4 | 64 | 0 | 1 | 2e-3 | 50 | 1.0165 | 1.2038 |
8 | 64 | 4 | 64 | 0.5 | 1 | 2e-3 | 27.51 | 1.392 | 1.4355 |
8 | 64 | 4 | 64 | 0.5 | 0 | 2e-3 | 25.10 | 1.807 | 1.851 |
6 | 64 | 3 | 32 | 0 | 1 | 2e-3 | 50 | 0.9304 | 1.2137 |
8 | 64 | 3 | 16 | 0 | 1 | 2e-3 | 50 | 1.264 | 1.2311 |
12 | 64 | 3 | 32 | 0 | 1 | 2e-3 | 50 | 1.030 | 1.1909 |
Generative results don't sound too realistic... `seq_length=8,wordvec=128,num_layers=2,rnn_size=256,dropout=0,batchnorm=1,lr=2e-3`
- Sounds much better with an overfit LSTM and `temperature=0.98`...
- Maybe generalizable modeling isn't a good criterion...
- Added `extract_melody`, which extracts the 0th part from a `music21.stream.Score` and assumes it is the melody
- Music representation:
  - Since music21 cannot output kern, use musicXML output
  - We currently include all header and dynamics info; should we strip that?
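Sampling temperature (these notes use `temp=0.8` and `temperature=0.98`) rescales the model's output scores before a token is drawn. A minimal sketch of the standard formulation; the notes do not show the actual sampling code, so the function names and the toy logits here are assumptions.

```python
import math
import random


def apply_temperature(logits, temperature):
    """Softmax of logits / temperature.

    Temperatures below 1 sharpen the distribution; above 1 flatten it.
    """
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sample(logits, temperature, rng=random):
    """Draw one token index from the temperature-adjusted distribution."""
    probs = apply_temperature(logits, temperature)
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


toy_logits = [2.0, 1.0, 0.0]  # made-up scores for three tokens
print(apply_temperature(toy_logits, 0.8))
print(sample(toy_logits, 0.8))
```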
seq length | wordvec size | num layers | rnn size | dropout | batchnorm | lr | nepoch | final train loss | final val loss |
---|---|---|---|---|---|---|---|---|---|
500 | 64 | 2 | 256 | 0 | 1 | 2e-3 | 16.19 | 0.022378 | 0.029262 |
50 | 64 | 2 | 256 | 0 | 1 | 2e-3 | 13.41 | 0.028490 | 0.032692 |
100 | 64 | 2 | 256 | 0 | 1 | 2e-3 | 13.41 | 0.028490 | 0.032692 |
seq length | wordvec size | num layers | rnn size | dropout | batchnorm | lr | nepoch | final train loss | final val loss |
---|---|---|---|---|---|---|---|---|---|
50 | 64 | 2 | 256 | 0 | 0 | 2e-3 | 51 | 0.443295 | 0.619 |
500 | 64 | 2 | 256 | 0 | 1 | 2e-3 | 21.45 | 0.4094 | 0.5779 |
500 | 64 | 2 | 256 | 0 | 1 | 2e-3 | 31.00 | 0.440350 | 0.572764 |
500 | 64 | 2 | 256 | 0 | 1 | 1e-2 | 28.73 | 0.287570 | 0.6176 |
50 | 64 | 2 | 256 | 0 | 1 | 1e-2 | 13.65 | 0.390861 | 0.6316 |
`wordvec_size=64` appears to perform best; defaults to use in future: `rnn_size=256`, `num_layers=2`, `wordvec_size=64`
- Training interrupted by `cudnn` recompilation
- Results suggest `val_loss` does best with `rnn_size=256`, `num_layers=2`
- Training on entire corpus
  - BAD: kern format has K voices => each line has K space-delimited notes
  - This suggests output should be a K-dimensional vector rather than character-by-character
- Training on just chorales
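The K-dimensional idea could be sketched like this: read each data line as one time step with K per-voice tokens, instead of as a stream of characters. The token strings below are only illustrative kern-like values, and the function names and the shared growing vocabulary are assumptions.

```python
def parse_kern_line(line, num_voices):
    """Split one data line into a tuple of K per-voice tokens."""
    tokens = tuple(line.split())
    if len(tokens) != num_voices:
        raise ValueError(f"expected {num_voices} voices, got {len(tokens)}")
    return tokens


def to_index_vector(tokens, vocab):
    """Map each voice's token to an integer id, growing `vocab` as needed."""
    return [vocab.setdefault(tok, len(vocab)) for tok in tokens]


vocab = {}
for line in ["4C 4e 4g 4cc", "4C 4f 4a 4cc"]:  # illustrative lines, K = 4
    print(to_index_vector(parse_kern_line(line, 4), vocab))
```

Each time step then becomes a length-K vector of token ids, so the model predicts all voices at once rather than one character at a time.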