
Human Imagination or AI?

Presented by: Catherine Rollin

Music Teachers National Association Conference – March 20, 2024 Atlanta, GA

 

This session will focus on the use of artwork imagery as inspiration for musical creativity. It will explore how teachers can significantly contribute to developing human potential and further the best of the human spirit.

How I decided on this topic:

A. My reluctant curiosity about AI in relation to creative work.

B. Facing and even embracing what is happening in the world and what might be inevitable

 

Exploring the Subject:

A. A simple experience trying ChatGPT

B. Discovering deductive and inductive machine learning

C. Experiment proposed to try to teach the machine the compositional style of Catherine Rollin

D. What I learned from this and the uses of AI.

Creativity, Composing, Interpretation:

A. The human elements: intuition and inspiration

B. Collaboration with visual arts to help unlock the intuitive part of our being. Artworks help us to connect, empathize, understand history, culture and appreciate the shared human experience.

 

Conclusions

A. I will embrace and collaborate with AI in situations where it is clear that machines are faster and better at solving problems than human beings.

B. Integrating artwork imagery into our teaching is a great catalyst for inspiring the imagination and leading our students to experience music “in the moment.” Most importantly, at this juncture in history, we music teachers play a crucial role in the survival and flourishing of human creativity and ultimately the human spirit.

 

Playing examples: Debussy: Arabesque I; Rollin: *Woman Waiting for the Moon to Rise, *Spring Sale at Bendel’s, *Self-Portrait with Her Daughter, *The Pianist, **Woman With Bouquet, **It’s Touch and Go, To Laugh or No, **The Roll Call, **Lady with a Bowl of Violets, ***Van Gogh Self-Portrait, ***Girl Seated by the Sea, +Bal du moulin de la Galette

You can contact Catherine at: catherinerollinmusic.com

Museum Masterpieces: Celebrating Women Composers Through the Ages Bk 1* & Bk 2** & Museum Masterpieces: The Premier Exhibition. +Digital Sheet

Intro:

I am training a machine learning model on the classical piano compositional works of Catherine Rollin with the goal of having it create music in her writing style. This research will be used in the presentation of the lecture Human Imagination or AI? at the 2024 Music Teachers National Association conference.

 

Step 0: Setup

 

The important libraries in this project are TensorFlow, Keras, and music21. Getting TensorFlow/Keras to run on an M1 chip in Python was a very tedious process. Only certain versions work on the M1, and even then, some versions of TensorFlow will not work with some versions of Python. It will save you a million headaches to use Python 3.8 and TensorFlow 2.13.


Finally, even with the correct versions of everything, you still need to set up a virtual environment using conda for everything to work properly on the M1. This video will help navigate that task.
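For reference, the environment setup can be sketched roughly as follows. The environment name is made up, and the exact package names (`tensorflow-macos`, `tensorflow-metal` for Apple Silicon GPU support) are an assumption on my part; only the Python 3.8 / TensorFlow 2.13 pairing comes from my testing above.

```shell
# Hypothetical conda setup for an M1 Mac; env name and package choices
# are illustrative. Versions follow the recommendation above.
conda create -n rollin-ai python=3.8
conda activate rollin-ai
pip install tensorflow-macos==2.13.0 tensorflow-metal   # Metal GPU plugin
pip install music21
```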

 

I began with a foundation of code based on this GitHub repository. However, in my experience Skuldur’s model could not learn without tweaks: the loss function does not decrease, accuracy does not rise, and thus the model it produces predicts the same note over and over. So I had to experiment with changing the layers/dropout/batch normalization/etc. This blog helped me understand what to tweak, and Softology adds matplotlib plotting, which makes everything a lot more readable. I still did things differently in setting up my model, but these sources were super useful for getting started. More details on my methods below!

 

Step 1: Organize the data

There were 22 pieces already written in Finale, which can easily be exported to MIDI. I transposed all of these pieces to the key of C major or A minor (depending on whether they were major or minor to start with). This seemed like a logical organizational step so that the model learns from broader musical characteristics rather than from key signature. The first runs involved just those pieces, but it became clear that overfitting would be an issue without a larger data set.

 

I then obtained 187 PDFs of other pieces by Catherine Rollin courtesy of Alfred Publishing. Using PlayScore2, I translated those into midi and then transposed them. With 209 examples in the training data set I began running tests.

 

Step 2: Structuring the Model

My initial goal was simply to get the model to learn. This involved removing the batch normalization layers, adding a dropout layer, and removing an activation layer (some of these might get added back in later). The “lstm” and “predict” code in this GitHub repository is the first version of the model that would appear to learn.

One thing to note is that the Skuldur base code considers only one parameter for each instance in the sequence: note/chord. This means the model does not consider velocity, rhythm, meter, ADSR, resonance, portamento, etc. Later in testing I will try adding parameters. Another thing to note is that Skuldur’s model takes training data 100-note sequences at a time. This means that longer musical works are broken into more training sequences, so longer pieces have an outsized effect on the training data.

First I began experimenting with different batch sizes, using a window of 200 epochs as a starting point. Here are the results for batch sizes of 64, 128, and 256:
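A model along the lines described above can be sketched in Keras like this. The layer sizes and optimizer are illustrative of the Skuldur-style architecture with the tweaks I describe (batch normalization removed, dropout added), not my exact configuration.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, Dropout, Dense

def build_model(sequence_length: int, n_vocab: int) -> Sequential:
    """LSTM next-token model in the spirit of the Skuldur base code,
    with batch normalization removed and dropout added. Layer sizes
    here are illustrative, not the exact values I used."""
    model = Sequential([
        LSTM(512, input_shape=(sequence_length, 1), return_sequences=True),
        Dropout(0.3),
        LSTM(512),
        Dense(256, activation='relu'),
        Dropout(0.3),
        Dense(n_vocab, activation='softmax'),  # one class per note/chord token
    ])
    model.compile(loss='categorical_crossentropy', optimizer='rmsprop',
                  metrics=['accuracy'])
    return model
```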

 

Listen to three examples of this model HERE.

Listen to three examples of this model HERE.

 

Figure 3

Listen to three examples of this model HERE.

I want to use a model with the epoch count corresponding to when the accuracy curve first hits its limit. This is what I call the epoch of “convergence” (as opposed to the epoch when training finished).
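One way to pick that epoch programmatically: treat convergence as the first epoch after which accuracy stops improving by more than a small tolerance. This helper is hypothetical (I eyeballed the graphs), and the `window` and `eps` defaults are illustrative.

```python
def convergence_epoch(accuracy, window=10, eps=0.002):
    """Return the first epoch index after which accuracy improves by
    less than `eps` over the following `window` epochs, i.e. where the
    curve first plateaus."""
    for i in range(len(accuracy) - window):
        if max(accuracy[i:i + window + 1]) - accuracy[i] < eps:
            return i
    return len(accuracy) - 1  # never plateaued within the run
```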

 

Step 3: Adding a parameter

Adding velocity seems like the most straightforward way to squeeze a little more “musicality” out of the machine. If the model learns correctly, this parameter should let us begin to hear “phrasing”: the way a musician shapes a sequence of notes in a passage with expressive articulation.

 

 

The model currently encodes each instance of a note as note+octave, and each interval/chord as note+octave.note+octave. This means a middle C would be C4 and a major chord built on it would be C4.E4.G4. So to add velocity I decided to append it after an underscore. This would look like C4_60, or C4_60.E4_80.G4_70.
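The token scheme can be sketched as a pair of small encode/decode functions. These are a hypothetical reconstruction of the format described, not the project’s actual parsing code.

```python
def encode_event(notes):
    """Encode [(pitch, velocity), ...] as a token like 'C4_60.E4_80.G4_70'."""
    return '.'.join(f'{p}_{v}' for p, v in notes)

def decode_event(token):
    """Invert encode_event back to [(pitch, velocity), ...]."""
    return [(p, int(v))
            for p, v in (part.split('_') for part in token.split('.'))]
```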

 

 

Upon first run the loss function did not decrease at all.

 

Investigating further, I realized that the model is creating a possible output class for every single note+velocity combination. That means 128 possible notes (not even including chords) times 128 possible velocities. Big number! Too big for this little GPU, I fear, so I experimented with quantizing the velocities of the training data to the nearest 20. This means there are 6 possible velocities (20, 40, 60, 80, 100, 120). This code can be found under “lstmTake2” and “predictTake2”.
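The quantization step can be sketched like this. The clamping at the bottom and top of the range is my assumption about how MIDI velocities 0–127 get mapped into the six bins; the actual “lstmTake2” code may differ.

```python
def quantize_velocity(velocity, step=20):
    """Quantize a MIDI velocity (0-127) to the nearest multiple of `step`,
    clamped into {step, 2*step, ..., 120}: six bins when step=20,
    three bins when step=40."""
    q = round(velocity / step) * step
    return max(step, min(q, 120))
```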

 

Figure 4

Listen to three examples of this model HERE.

 

The first two batch sizes still don’t work, but we begin to see convergence with the 256 batch! This model reaches only 67% accuracy; before velocity was added, the 256-batch model reached 97% accuracy. To try to increase accuracy I ran a version quantizing velocity to the nearest 40, giving 3 possible velocities (40, 80, 120). This brings accuracy up to 75%; however, in my opinion the output of the model is noticeably less musical with only three shades of volume. Example 2 in this set is especially repetitive, too.

Listen to three examples of this model HERE.

 


Step 4: Experimenting with batch normalization

 

Next I tried adding several batch normalization layers back in, but in different places than they were originally located. The model appears to train faster and reach higher accuracy, but the output is definitely the most meandering, and I wonder if it has become too generalized. I tried this with both the 40 and 20 velocity quantizations.

Figure 8

 Listen to three examples of this model HERE.

Listen to three examples of this model HERE.

Batch normalization brings training times down but creates over-generalized results. I try decreasing the amount of normalization happening on those layers with this line, setting the momentum (originally 0.9) lower:

model.add(BatchNormalization(momentum=0.5, epsilon=1e-5))
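In context, the reduced-momentum layers sit between the main layers of the stack. This is a sketch, not my exact model: the placement, layer sizes, and vocabulary size are illustrative; only the momentum/epsilon values come from the line above.

```python
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import LSTM, BatchNormalization, Dropout, Dense

# Illustrative batch-normalized variant, with momentum lowered from 0.9
# to 0.5 so each layer's running statistics adapt faster.
model = Sequential([
    LSTM(512, input_shape=(100, 1), return_sequences=True),
    BatchNormalization(momentum=0.5, epsilon=1e-5),
    LSTM(512),
    BatchNormalization(momentum=0.5, epsilon=1e-5),
    Dense(256, activation='relu'),
    Dropout(0.3),
    Dense(358, activation='softmax'),  # vocabulary size here is hypothetical
])
```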

I want to see if this version of the model can train with a 128 batch size (which did not work before, in Fig 5). It also does not converge:

 

 

Step 5: Experimenting with length of sequence

 

I tried reducing the length of input sequences from 100 to 50. This decreased training time and worked for both 128 and 256 batch sizes. I also had to comment out the batch normalization for it to train. Compare the 128 batch size in Fig 11 to Fig 5, and the 256 batch size in Fig 12 to Fig 6, to see the difference sequence length makes (all other aspects of these two models are identical). The 128 batch seems far too overgeneralized; it is probably the least musical in terms of note choice of any of the models. The 256 batch is a little better, but is still pretty meandering.
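The windowing step being varied here can be sketched as follows. This is a hypothetical re-creation of the Skuldur-style sequence preparation, with the length pulled out as a parameter; it also makes concrete why longer pieces produce more training sequences (a piece of N tokens yields N − sequence_length windows).

```python
def prepare_sequences(notes, sequence_length=50):
    """Slide a window over a token list: each input is `sequence_length`
    consecutive tokens and the target is the token that follows."""
    inputs, targets = [], []
    for i in range(len(notes) - sequence_length):
        inputs.append(notes[i:i + sequence_length])
        targets.append(notes[i + sequence_length])
    return inputs, targets
```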

 Listen to three examples of this model HERE.

Listen to three examples of this model HERE.

About Catherine Rollin

Catherine Rollin is a pianist, composer, clinician, author and teacher of prize-winning students. Her more than 400 published pedagogical compositions are recognized worldwide for their combination of musicality and “teachability.” As a clinician, Catherine has given over 300 workshops and masterclasses throughout the U.S., Canada, Japan and Taiwan. Her efforts as a composer, clinician and author are an outgrowth of her work as a teacher. Finding the answers to help her students develop artistic, relaxed and confident technical skills led Catherine to insights featured in her groundbreaking series, “Pathways to Artistry.” Catherine’s ASCAP award-winning books, “Museum Masterpieces,” feature her original piano works inspired by great art masterpieces from museums around the world. Catherine wrote this series after seeing how her students’ creativity was sparked by artwork imagery. She is also the co-author of the 2011 edition of the highly regarded college text, “Creative Piano Teaching.” All of the many facets of Catherine’s work emanate from her central passion for teaching.

A few fun facts: Catherine is a chocoholic. She thinks most pianists have this in common. She is left-handed and has perfect pitch. (She thinks that there is a disproportionate number of lefties with perfect pitch in the world and hopes someone might research this someday for their doctoral thesis!) She wrote her “Summer” pieces in honor of her daughter “Summer.” Catherine is married and her husband is a piano teacher and tuner. Catherine loves meeting piano teachers, piano students and piano music aficionados and feels very blessed that her work has brought her in touch with incredibly interesting and passionate music lovers from all over the world.