speech.txt


<Xindi>

Hello everyone, the project we are presenting is understanding music with higher order network.

Our general question is whether we can use machine to understand how music is composed and structured, and further understand what makes music different between music eras, genres, and composers.

The data we are using is “The largest MIDI collection on the Internet” from a Reddit post. In MIDI coding, there are in total 128 codes for 12 notes across 11 octaves. However, if a music piece is performed under a different key, it should be regarded as the same piece. Therefore, we use a python package music21 to detect the tonal note and re-index each note relative to the tonal note. For example, for the snippet twinkle twinkle little star, we have the original coding as 60, 60, 67, etc., and after reindex it we end up having the sequence as 0, 0, 7, etc.

The framework we are using is Higher Order Network. Given a piece of music, if we form a simple network, we would simply connect two adjacent notes together and weight the edges if that note pair happens multiple times. However, under this setting, we are losing the “memory” in the music, which may be very important. Instead, higher order network will keep the memory information. For example, for a two order network, we would preserve one step back information and end up with a chain. However, this setting will become very complex when we have 3rd, 4th, 5th order. Here we use the higher order network algorithm by Xu et al, where they would automatically extract the rules with different orders, and end up with a network containing several different orders. In this network, rules will become nodes, and edges connecting them denotes the transition probability between different rules.
</Xindi>


<Ricky>
We came up with some features from the higher order network that we think may capture some characteristics of the music. We have abruptness, branching and melodic.

Abruptness capture whether there are dramatic change in the music piece. What we do is, after removing edges with very low transition probabilities, find edges that have high betweenness centrality but low transition probability. We then, look at the pitch difference between the two end node of this edge. What we found is that Jazz, pop and rock are more abrupt than american folk and classical music.

Branching sort of capture how complex the music is. If the network is more chainly, the music will sound more smooth and probably more memorable. If the network have more branches, there will be more variation in the music piece. To measure this, we simply take the average degree of the network. We observe that classical music have more branches, followed by american folk. Jazz is somewhat in between; rock and pop tend to be more “chainly”.

Melodic capture whether there are a clear melody line that occurs a lot of times. We expect that if there is a melody line that repeats a lot of times, we would observe them in the extracted rules. For this, we simply take the length of extracted rules and see the distribution of this number for different pieces. Interestingly, we observe bimodal distribution for most of the genres, which suggests that the melodic is hybrid within most genres: we have pieces that have a long and clear melody and some pieces that don’t.

After extracting features, we tried a little reverse engineering to see whether we can predict the piece genre from the extracting features. Here we use the multilayer perceptron as the classifer and try to classify classical vs american folk and rock vs jazz.

</Ricky>

Just to give more context, let’s look at a case study of XXX

Finally, we look at the eigenvalue distribution of the higher order network.

There are a lot of future directions we would like to pursue. First, for now we didn’t take all the  instruments into account and one way to do it is using multilayer network and see how different layers corporate with each other. Also, temporal information is highly important in music which we haven’t take into consideration fully.