A novel method for modeling musical data using deep learning is described. Musical data such as MIDI files are used to train the model. A sequence of vectors x is extracted from the musical data, each vector comprising a pitch P, a time difference T, and a duration D, where the time difference T is the time lapse in beats between the onset of the preceding event and the current event. The vectors x are mapped into a latent space with prior p(z) by constructing a Variational Autoencoder (VAE) with approximate posterior q(z|x) and loss function L_V. The model consists of a content latent space and a style embedding, so that the content and the style of the music are encoded separately. As a result, the model is able to model and reconstruct two different musical styles.
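The following is a minimal sketch of how such a model could be set up, assuming each note vector is a real-valued triple (P, T, D), a standard-normal prior p(z), a learned embedding per style, and a loss L_V composed of a reconstruction term plus a KL term. All class, parameter, and dimension names (StyleContentVAE, z_dim, style_dim, etc.) are illustrative assumptions, not taken from the source.

import torch
import torch.nn as nn
import torch.nn.functional as F

class StyleContentVAE(nn.Module):
    """Sketch of a VAE that encodes note vectors x = (P, T, D) into a
    content latent z while conditioning the decoder on a learned style
    embedding, so content and style are represented separately."""

    def __init__(self, x_dim=3, z_dim=16, style_count=2, style_dim=8, hidden=64):
        super().__init__()
        # Encoder q(z|x): maps a note vector to the parameters of a Gaussian.
        self.encoder = nn.Sequential(nn.Linear(x_dim, hidden), nn.ReLU())
        self.mu = nn.Linear(hidden, z_dim)
        self.logvar = nn.Linear(hidden, z_dim)
        # Style embedding: one learned vector per musical style.
        self.style = nn.Embedding(style_count, style_dim)
        # Decoder p(x|z, style): reconstructs the note vector from the
        # content latent concatenated with the style embedding.
        self.decoder = nn.Sequential(
            nn.Linear(z_dim + style_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, x_dim),
        )

    def forward(self, x, style_id):
        h = self.encoder(x)
        mu, logvar = self.mu(h), self.logvar(h)
        # Reparameterisation trick: sample z from q(z|x).
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)
        x_hat = self.decoder(torch.cat([z, self.style(style_id)], dim=-1))
        return x_hat, mu, logvar

def vae_loss(x, x_hat, mu, logvar):
    # L_V (assumed form): reconstruction error plus the KL divergence
    # between q(z|x) and the standard-normal prior p(z).
    recon = F.mse_loss(x_hat, x, reduction="sum")
    kl = -0.5 * torch.sum(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl

# Usage: a batch of 4 note vectors (pitch, time difference, duration),
# each labelled with one of two styles.
x = torch.randn(4, 3)
style_id = torch.tensor([0, 1, 0, 1])
model = StyleContentVAE()
x_hat, mu, logvar = model(x, style_id)
loss = vae_loss(x, x_hat, mu, logvar)

Under these assumptions, swapping style_id at decoding time while keeping z fixed is one way the same musical content could be rendered in either of the two styles, in the spirit of the content/style separation described above.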