A part-time post-doc. While it’s still going, see the diary posts for the story so far…
Folk music is part of a rich cultural context that stretches back into the past, encompassing the real and the mythical, bound to the traditions of the culture in which it arises. Artificial intelligence, on the other hand, has no culture, no traditions. But it has shown great ability: beating grand masters at chess and Go, for example, or demonstrating uncanny wordplay skills when IBM Watson beat human competitors at Jeopardy. Could the power of AI be put to use to create music?
I’m now helping them, on a part-time contract. The idea is it’s part UI design for composing with deep learning, part community engagement (read: website), and part production–reception research.
off the digital anvil: a bare-bones web-app adaptation of the folk-rnn command line tool, the first step in making it a tool anyone can use. happily, folk-rnn is written in python – good in and of itself as far as i’m concerned – which makes using the django web framework a no-brainer.
- created a managed ubuntu virtual machine.
- wrangled clashing folk-rnn dependencies.
- refactored the folk-rnn code to expose the tune generation functionality through an API.
- packaged folk-rnn for deployment.
- created a basic django webapp:
  - UI to allow the user to change the rnn parameters and hit go.
  - UI to show a generated tune in staff notation.
  - URL scheme that can show previously generated tunes.
  - folk-rnn-task process that polls the database (python2, as per folk-rnn).
  - unit tests.
  - functional test with headless chrome test rig.
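the folk-rnn-task process at the bottom of that list can be sketched as a simple polling loop. a minimal sketch, with hypothetical callables standing in for the django ORM queries and the model invocation:

```python
import time

def poll_once(fetch_pending, generate_tune, save_result):
    """One iteration of the folk-rnn-task loop: claim a pending job,
    generate a tune from its parameters, store the resulting ABC.
    Returns True if a job was processed, False if the queue was empty."""
    job = fetch_pending()
    if job is None:
        return False
    abc = generate_tune(job["params"])
    save_result(job["id"], abc)
    return True

def run_task(fetch_pending, generate_tune, save_result, interval=1.0):
    """The long-running process simply repeats poll_once, sleeping
    between polls when the database has no pending work."""
    while True:
        if not poll_once(fetch_pending, generate_tune, save_result):
            time.sleep(interval)
```

the real task talks to the database via the django models; the callables here just make the shape of the loop visible.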
the web-app adaptation of the folk-rnn command line tool now has the start of community and research features. You can archive generated tunes you like, tweak them and save the results as settings of the generated tune, and comment. Plus dataset export.
Redesign: Tunes with Settings. Feature: ABC validation
Feature: Dataset Export
Feature: Publish to Archive, Comment on Archived Tunes.
Feature: Swap between RNN original composition and your own editable version.
the web-app adaptation of the folk-rnn command line tool is now online, and generating tunes 50x faster – from ~1min to 1-2s. still bare-bones, an ongoing project, but at least playable with.
Feature: Async architecture, with folk-rnn-fast now a worker within the app
now we’re talking: the folk-rnn webapp finally feels like a proper website: it’s styled, the UI has instant feedback of e.g. invalid ABC being input, and the rnn generation appears note-by-note as it’s generated. there’s a definite instant hit of satisfaction in pressing ‘go’ and seeing it stream across the page.
or rather, one of the apps feels like a proper website, as it’s actually two websites now. the composer app runs folkrnn.org, and is focussed entirely on generating tunes. the archiver app runs themachinefolksession.org, and is focussed on the community aspect: archiving, human edits, and so on.
Fixes, refactors and tweaks for PR 9 and 10
Feature: ABC generation displayed live
Feature: Multiple models with compose client side logic
Feature: Composer styling first pass
Feature: Archive as app (i.e. themachinefolksession.org)
Feature: Wildcard in Key, Meter, Initial ABC
Feature: State and URL management
Tweaks: ABC validation and UI
Feature: Single-page app
not just a lick of paint: secure connection, a nifty piece of UX around new vs. deterministic tunes, and a rat-hole of automated backups.
Feature: Composer styling second pass
Feature: abcjs 5.0
We demonstrate 1) a web-based implementation of a generative machine learning model trained on transcriptions of folk music from Ireland and the UK (http://folkrnn.org, live since March 2018); 2) an online repository of work created by machines (https://themachinefolksession.org/, live since June 2018). These two websites provide a way for the public to engage with some of the outcomes of our research investigating the application of machine learning to music practice, as well as the evaluation of machine learning applied in such contexts. Our machine learning model is built around a text-based vocabulary, which provides a very compact but expressive representation of melody-focused music. The specific kind of model we use consists of three hidden layers of long short-term memory (LSTM) units. We trained this model on over 23,000 transcriptions crowd-sourced from an online community devoted to these kinds of folk music. Several compositions created with our application have been performed, recorded and posted online. We are also organising a composition competition using our web-based implementation, the winning piece of which will be performed at the 2018 O’Reilly AI conference in London in October.
Matthew Tobias Harris
Queen Mary University of London
London E1 4NS, UK
Bob L. Sturm
Royal Institute of Technology KTH
Lindstedtsvägen 24, SE-100 44 Stockholm, Sweden
Kingston Hill, Kingston upon Thames, Surrey KT2 7LB, UK
the community site. straight-up django (“the web framework for perfectionists with deadlines”), but there’s a lot going on.
Tweak: Search includes author names
Feature: Tune of the month
Tweaks: Tempo, Attributions, Add settings, ABC
Feature: Tunes page has order by added or popularity
Tweaks and fixes
Feature: Surface interesting tunes
Feature: User files upload and serve configuration
the folk-rnn webapp was selected for the 19th international society for music information retrieval conference, as part of their interactive machine-learning for music exhibition. so poster time! nice to not have to futz around with html+css, instead just draw directly… i’m pretty happy with how it turned out (PDF).
having been selected for the interactive machine-learning for music exhibition at the 19th international society for music information retrieval conference, the time had come. nice to see a photo back from set-up, with the poster (PDF) commanding attention in the room.
for any given tune, how much activity surrounded it?
for any given session, what happened?
what are the usage trends of folkrnn.org and of themachinefolksession.org?
to answer these kinds of questions, enter stats, a django management command for processing the use data of the composer and archiver apps for insight / write-up in academic papers.
i used this to write the following, with input from bob and oded. it will surely be edited further for publication, but this is as it stands right now.
During the first 235 days of activity at folkrnn.org, 24,562 tunes were generated by – our heuristics suggest – 5,700 users. Activity in the first 18 weeks shows a median of 155 tunes generated weekly. In the subsequent 15 weeks to the time of writing, overall use increased, with a median of 665 tunes generated weekly. This period also features usage spikes. One week, correlating to an interview in Swedish media, shows 2.7x the median tunes generated. The largest, correlating to a mention in German media, shows an 18.4x increase. These results show that making our tool available on the web has translated into actual use, and that use is increasing. Further, media attention brings increased use, and this use is similarly engaged, judged by similar patterns of downloading MIDI and archiving tunes to themachinefolksession.org.
Of the fields available for users to influence the generation process on folkrnn.org, the temperature was used more often than the others (key, meter, initial ABC, and random seed). Perhaps this is because changing temperature results in more obviously dramatic changes in the generated material. Increasing the temperature from 1 to 2 will often yield tunes that do not sound traditional at all. If changes were made to the generation parameters, the frequency of the resulting tune being played, downloaded or archived increased from 0.78 to 0.87.
Over the same period since launch, themachinefolksession.org has seen 551 tunes contributed. Of these tunes, 82 have had further iterations contributed in the form of ‘settings’; the site currently hosts 92 settings in total. 69 tunes have live recordings contributed; the site currently hosts 64 recordings in total (a single performance may encompass many tunes). These results show around 100 concrete examples of machine–human co-creation have been documented.
Of the 551 contributed tunes, 406 were generated on, and archived from, folkrnn.org. Of these entirely machine-generated tunes, 32 have had human edits contributed; themachinefolksession.org currently hosts 37 settings of folkrnn-generated tunes in total. These examples in particular demonstrate attributable human iteration of, or inspiration by, machine-produced scores.
Further value of machine-produced scores can be seen in the 30 registered users who have selected 136 tunes or settings as noteworthy enough to add to their tunebooks. Per the algorithm used by the home page of themachinefolksession.org to surface ‘interesting’ tunes, “Why are you and your 5,599,881 parameters so hard to understand?” is the most interesting, with 4 settings and 5 recordings.
While these results are encouraging, most content-affecting activity on themachinefolksession.org has been from the administrators; co-author Sturm accounts for 70% of such activity. To motivate the use of the websites, we are experimenting with e.g. ‘tune of the month’, see above, and have organised a composition competition.
The composition competition was open to everyone but targeted primarily at music students. Submissions included both a score for a set ensemble and an accompanying text describing how the composer used a folkrnn model in the composition of the piece. The judging panel – the first author was joined by Profs. Elaine Chew and Sageev Oore – considered the musical quality of the piece as well as the creative use of the model. The winning piece, Gwyl Werin by Derri Lewis, was performed by the New Music Players at a concert organised in partnership with the 2018 O’Reilly AI Conference in London. Lewis said he didn’t want to be ‘too picky’ about the tunes; rather, he selected a tune to work from after only a few trials. He describes using the tune as a tone row, generating harmonic, melodic and motivic material from it, though the tune itself, as generated by the system, does not appear directly in the piece.
folkrnn.org can now generate tunes in a swedish folk idiom. bob, having moved to KTH in sweden, got some new students to create a folkrnn model trained on a corpus of swedish folk tunes. and herein lies a tale of how things are never as simple as they seem.
the tale goes something like this: here we have a model that already works with the command-line version of folkrnn. and the webapp folkrnn.org parses models and automatically configures itself. a simple drop-in, right?
first of all, this model is sufficiently different that the defaults for the meter, key and tempo are no longer appropriate. so a per-model defaults system was needed.
then, those meter and key composition parameters are differently formatted in this corpus, which pretty much broke everything. piecemeal hacks weren’t cutting it, so a system was needed that standardised on one format and cleanly bridged to the raw tokens of each model.
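in code, that bridging amounts to little more than per-model lookup tables behind one canonical format. a minimal sketch, with made-up model names and token spellings – the real vocabularies differ:

```python
# Canonical meter/key values mapped to each model's raw token spelling.
# Model names and token strings here are illustrative only.
TOKEN_MAPS = {
    "thesession": {"meter": {"4/4": "M:4/4"}, "key": {"Cmaj": "K:Cmaj"}},
    "swedish":    {"meter": {"4/4": "M:4/4"}, "key": {"Cmaj": "K:CMaj"}},
}

def to_model_tokens(model, meter, key):
    """Translate canonical meter/key values into the raw tokens a given
    model was trained on; a KeyError means the combination isn't in
    that model's vocabulary."""
    m = TOKEN_MAPS[model]
    return [m["meter"][meter], m["key"][key]]
```

the point is that the webapp reasons in one format everywhere, and only touches a model’s own token spellings at the boundary.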
after the satisfaction of seeing it working, bob noticed that the generated tunes were of poor quality. when a user of folkrnn.org generates a tune with a previous model, setting the meter and key to the first two tokens is exactly what the model expects, and it can then fill in the rest drawing from the countless examples of that combination found in the corpus. but with this new model, or rather the corpus it was trained on, a new parameter precedes these two. so the mechanics that kicks off each tune needed to now cope with an extra, optional term.
so expose this value in the composition panel? that seems undesirable, as this parameter is effectively a technical option subsumed by the musical choice of meter. and manually choosing it doesn’t guarantee you’re choosing a combination found in the corpus, so the generated tunes are still mostly of poor quality.
at this point, one might think that exactly what RNNs do is choose appropriate values. but it’s not that simple, as the RNN encounters this preceding value first, before the meter value set by the user. it can choose an appropriate meter from the unit-note-length, but not the other way round. so a thousand runs of the RNN and a resulting frequency table later, folkrnn.org is now wired to generate an appropriate pairing akin to the RNN running backwards. those thousand runs also showed that only a subset of the meters and keys found in the corpus are used to start the tune, so now the compose panel only shows those, which makes for a much less daunting drop-down, and fewer misses for generated tune quality.
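that pairing mechanism is essentially a conditional frequency table: given the user’s meter, pick a unit-note-length weighted by how often that pairing started a tune in the sample runs. a sketch with made-up counts:

```python
import random

# Illustrative tallies of (meter -> unit note length) pairings, as might
# come from a thousand sample runs of the RNN; the numbers are invented.
PAIR_COUNTS = {
    "M:4/4": {"L:1/8": 700, "L:1/16": 300},
    "M:3/4": {"L:1/8": 900, "L:1/4": 100},
}

def unit_note_length_for(meter, rng=random):
    """Choose a unit-note-length token for the given meter, weighted by
    how often that pairing started a tune in the sampled runs -- the
    'RNN running backwards' trick described above."""
    counts = PAIR_COUNTS[meter]
    tokens = list(counts)
    return rng.choices(tokens, weights=[counts[t] for t in tokens])[0]
```

only meters present as keys in the table get offered in the compose panel, which is what shrinks the drop-down.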
unit-note-length is now effectively a hidden variable, which does the right thing… providing you don’t want to iteratively refine a composition, as it may vary from tune generated to tune generated. rather than exposing the parameter after all, and then having to implement pinning as per the seed parameter’s control, a better idea was had: make the initial ABC field also handle this header part of the tune. so rather than just copying-and-pasting-in snippets of a tune, you could paste in the tune from the start, including this unit note length header. this is neat because as well as providing the advanced feature of being able to specify the unit note length value, it makes the UI work better for naïve users: why couldn’t you copy and paste in a whole tune before?
as per the theme there, implementing this wasn’t just a neat few lines of back-end python, as now the interface code that is loaded into the browser needs to be able to parse out and verify these header lines, and so on.
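the header-splitting itself is simple enough. here’s a sketch of the idea in python – the real implementation is client-side javascript, and these names are illustrative:

```python
import re

# A leading ABC information field we recognise: M: (meter), K: (key),
# L: (unit note length).
HEADER_RE = re.compile(r"^([MKL]):\s*(\S+)\s*$")

def split_abc_headers(text):
    """Split pasted ABC into recognised leading header fields and the
    remaining tune body. Headers only count while they appear before
    any body line, matching how a tune transcription starts."""
    headers, body_lines = {}, []
    for line in text.strip().splitlines():
        m = HEADER_RE.match(line.strip())
        if m and not body_lines:
            headers[m.group(1)] = m.group(2)
        else:
            body_lines.append(line)
    return headers, "\n".join(body_lines)
```

the parsed headers then set the corresponding composition parameters, and the body becomes the initial ABC proper.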
Feature: composer models have defaults for meter, mode and tempo
Feature: set unit note length within prime tokens, for models with L tokens
Feature: Headers in start ABC