Detecting Musical Rhetoric Figures with LSTM using Procedurally Generated Synthetic Data

Publication date

DOI

Document Type

Master Thesis

Collections

License

CC-BY-NC-ND

Abstract

Musicologists have studied the rhetorical techniques applied in Baroque music, including the works of Johann Sebastian Bach. These musical rhetoric figures take the form of rhythmic and melodic patterns, each intended to evoke a particular emotion or to convey Christian symbolism. This thesis presents a machine learning approach to pattern recognition in symbolic music. A Long Short-Term Memory (LSTM) model is trained to detect the figura corta, a rhythmic figure, in the cantatas of J.S. Bach. Since no labeled dataset of Bach’s cantatas exists and labeling the data manually would be extremely time-consuming, this thesis approaches the problem by generating a dataset from scratch. Laws and rules by which the data should abide are drawn up, and an algorithm is presented that generates plausible musical fragments as training input based on these criteria. Twelve different parameter settings were then explored by training an LSTM on the resulting datasets. The best-performing model was subsequently tested on six real cantata movements. On average, the LSTM model achieved a precision of 78.53%, a recall of 83.12% and an accuracy of 94.46%. From these results it is concluded that synthetic data can provide a reliable dataset for training an LSTM that successfully classifies whether real data contains a figura corta. Furthermore, this result implies that the presented method of procedurally generating data produces a varied and correct dataset. By extension, it implies that the proposed laws and rules, as well as the representation of the musical data, allow the LSTM to correctly apply its learned features to real data.
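The three reported metrics follow the standard definitions for binary classification. As a minimal sketch, assuming each musical fragment is labeled 1 if it contains a figura corta and 0 otherwise, they could be computed as below. The names `Confusion`, `count_confusion` and the toy labels are illustrative assumptions and are not taken from the thesis.

```python
from dataclasses import dataclass


@dataclass
class Confusion:
    tp: int  # figure present, predicted present
    fp: int  # figure absent, predicted present
    fn: int  # figure present, predicted absent
    tn: int  # figure absent, predicted absent


def count_confusion(y_true, y_pred):
    """Tally the four outcome counts for binary labels (1 = figura corta)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    return Confusion(tp, fp, fn, tn)


def precision(c):
    # Share of predicted figures that are real figures.
    return c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0


def recall(c):
    # Share of real figures that the model found.
    return c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0


def accuracy(c):
    # Share of all fragments classified correctly.
    total = c.tp + c.fp + c.fn + c.tn
    return (c.tp + c.tn) / total if total else 0.0


# Hypothetical labels for ten fragments (not data from the thesis).
y_true = [1, 1, 0, 0, 0, 0, 0, 0, 1, 0]
y_pred = [1, 0, 0, 0, 1, 0, 0, 0, 1, 0]
c = count_confusion(y_true, y_pred)
print(precision(c), recall(c), accuracy(c))
```

True negatives count only towards accuracy, so when most fragments contain no figure, accuracy can exceed both precision and recall, as it does in this toy example.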

Keywords

LSTM; Synthetic Data; Figuren; Bach

Citation