PiCoGen: Generate Piano Covers with a Two-stage Approach

National Taiwan University, KKCompany
ICMR 2024

PiCoGen generates a piano cover in two stages: it first extracts a lead sheet (i.e., melody line and chord progression) from an audio recording of the original song via audio analysis (i.e., transcription), and then turns the extracted lead sheet into a piano performance via conditional symbolic-domain music generation.

Abstract

Cover song generation stands out as a popular way of music making in the music-creative community. In this study, we introduce Piano Cover Generation (PiCoGen), a two-stage approach for automatic cover song generation that transcribes the melody line and chord progression of a song given its audio recording, and then uses the resulting lead sheet as the condition to generate a piano cover in the symbolic domain. This approach is advantageous in that it does not require paired data of covers and their original songs for training. Compared to an existing approach that demands such paired data, our evaluation shows that PiCoGen demonstrates competitive or even superior performance across songs of different musical genres.

PiCoGen

PiCoGen consists of two core modules: Extractor and Performer. For each bar (musical measure) k of the input, the Extractor transcribes the lead sheet L_k (a token sequence) from the input audio, and the Performer autoregressively generates the piano performance S_k (also a token sequence) for the same bar, given the current and preceding lead sheets [L_1, L_2, …, L_k] and the preceding piano performances [S_1, S_2, …, S_{k-1}], organized in an interleaving fashion.
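
The bar-by-bar conditioning described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `interleave` and `generate_cover`, the string token type, and the `performer` callable are hypothetical, and the exact interleaving order (lead sheet of bar 1, performance of bar 1, lead sheet of bar 2, and so on) is an assumption about what "interleaving" means here.

```python
from typing import Callable, List, Sequence

Tokens = List[str]  # assumption: tokens are shown as plain strings for illustration


def interleave(lead_sheets: Sequence[Tokens], performances: Sequence[Tokens]) -> Tokens:
    """Build the context for bar k: lead sheets of bars 1..k interleaved with
    performances of bars 1..k-1, ending with the lead sheet of bar k."""
    if len(lead_sheets) != len(performances) + 1:
        raise ValueError("expected exactly one more lead sheet than performances")
    context: Tokens = []
    for i, lead in enumerate(lead_sheets):
        context.extend(lead)
        if i < len(performances):
            context.extend(performances[i])
    return context


def generate_cover(
    lead_sheets: Sequence[Tokens],
    performer: Callable[[Tokens], Tokens],
) -> List[Tokens]:
    """Generate one performance per bar, feeding each result back as context.

    `performer` is a hypothetical stand-in for the Performer module: it maps
    the interleaved context to the token sequence of the current bar.
    """
    performances: List[Tokens] = []
    for k in range(len(lead_sheets)):
        context = interleave(lead_sheets[: k + 1], performances)
        performances.append(performer(context))
    return performances


if __name__ == "__main__":
    # Toy performer that only reports how many context tokens it saw.
    toy_performer = lambda ctx: [f"perf_after_{len(ctx)}_tokens"]
    bars = [["L1_melody", "L1_chord"], ["L2_melody"]]
    print(generate_cover(bars, toy_performer))
```

In this sketch the context grows with every bar, so the generation of each bar sees all earlier lead sheets and all earlier generated performances, which is the property the description above relies on.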

Video Presentation

Poster

Citation


@inproceedings{tan2024picogen,
    author = {Tan, Chih-Pin and Guan, Shuen-Huei and Yang, Yi-Hsuan},
    title = {PiCoGen: Generate Piano Covers with a Two-stage Approach},
    year = 2024,
    booktitle = {Proceedings of the 2024 International Conference on Multimedia Retrieval (ICMR)},
    location = {Phuket, Thailand},
}