Formalizing nuance in classical music

David P. Anderson (with input from Rich Kraft)

1 May 2022

Introduction

This essay is concerned with nuance in western classical piano music. In this context, nuance can be loosely defined as the differences between the score for a piece and an audio rendition of the piece, such as a human performance. These differences can be roughly divided into

  • Timing: tempo change, rubato, pauses, rolled or other non-simultaneous chords, etc.
  • Dynamics: crescendos and diminuendos, accents, chord voicing, etc.
  • Articulation: legato, staccato, portato, etc.
  • The use of pedal (sustain, soft, sostenuto).

We focus on piano. Other instruments (and voice) are more complex because a) their notes have additional parameters (such as attack and timbre), and b) the parameters may change continuously. The ideas presented here apply to such instruments, but would have to be extended to encompass these factors.

For most classical music, nuance is a critical component of rendition. An example: "Wasserklavier" by Luciano Berio. The score is here.

Notice that in Grimaud's performance, no two beats have exactly the same duration, and no two notes have exactly the same volume. The nuance of the performance unlocks the beauty and expression of the piece.

Where does nuance come from? Some scores have indications of nuance: tempo markings, slurs, crescendo marks, fermatas, pedal markings, etc. However:

  • These indications are imprecise: e.g. a fermata mark doesn't say how long the sound lasts, or how much silence there is afterward.
  • These indications are incomplete: they describe the broad strokes of the composer's intended nuance, but not the details. Indeed, western music notation is unable to express basic aspects of nuance such as chord voicing. A computer rendition of a score with conventional nuance indications still sounds sterile.

Some musical styles have associated conventions for nuance. In many styles, for example, upbeats are softer, and pieces end with a ritardando. Performers learn these conventions by osmosis (and in many cases the modern conventions differ from those of the composition's period).

But score markings and stylistic conventions are just guidelines. In the end, nuance is left up to the performer(s). Some nuance may be planned in advance. Some may be spontaneous during a particular performance. Some may be unintentional artifacts of the performer's technique.

I think we need a "formalism" to describe nuance: a language with precisely-defined syntax and semantics. This formalism should have these properties:

  • It can describe nuance at arbitrary levels of detail.
  • It can do so compactly - for example, a crescendo is represented by a single "primitive" rather than lots of per-note volume adjustments.
  • It allows nuance to be "layered" - for example, a long accelerando can be superimposed on measure-level rubato.

I have developed a formalism with these properties: Music Nuance Specification (MNS). Its details are described here, and a Python implementation of it is here.
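
To make the layering property concrete, here is a minimal sketch, in Python, of what a layered nuance specification might look like. The names and structure are illustrative only - they are not MNS's actual API - but they show how a long accelerando can be superimposed on a measure-level rubato pattern, with both expressed compactly and the per-note timing derived rather than entered by hand.

    # Illustrative sketch only; these names are not MNS's actual API.
    # A nuance spec here is a list of tempo "layers". Each layer maps a
    # score position (in beats) to a multiplicative stretch factor on
    # seconds-per-beat: factors > 1 slow down, factors < 1 speed up.

    def linear_ramp(t0, t1, f0, f1):
        # Stretch factor that changes linearly from f0 to f1 over [t0, t1].
        def f(t):
            if t <= t0:
                return f0
            if t >= t1:
                return f1
            return f0 + (f1 - f0) * (t - t0) / (t1 - t0)
        return f

    def measure_rubato(beats_per_measure, shape):
        # Per-measure stretch pattern, repeated: shape[i] applies to beat i.
        def f(t):
            return shape[int(t) % beats_per_measure]
        return f

    def perf_time(t, sec_per_beat, layers, dt=0.01):
        # Performance time of score position t: integrate seconds-per-beat,
        # scaled by the product of all layers (this is where layering happens).
        perf, s = 0.0, 0.0
        while s < t:
            factor = 1.0
            for layer in layers:
                factor *= layer(s)
            perf += sec_per_beat * factor * dt
            s += dt
        return perf

    # A 32-beat accelerando (stretch 1.0 -> 0.7) layered on a rubato that
    # lingers on beat 1 of each 4/4 measure and pushes through beat 3.
    layers = [linear_ramp(0, 32, 1.0, 0.7),
              measure_rubato(4, [1.15, 1.0, 0.9, 1.0])]
    print(perf_time(8, sec_per_beat=0.75, layers=layers))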

Why formalize nuance?

To many musicians, nuance is ineffable - it's something magical that happens during performances, and to analyze or formalize it would break that magic spell.

This viewpoint is understandable. But as music evolves, and as computers are increasingly important tools for composition, pedagogy, and performance, there are reasons to expand our ability to represent and manipulate nuance: to make it a first-class citizen, along with scores and sounds. Doing so will not replace the human component of nuance, or the spontaneity of performance; rather, it will provide tools that enhance these processes, and that enable new ways of making music.

Let's assume that we have a formalism describing nuance, and that we have software tools that make it easy to create and edit "nuance specifications" for pieces. These capabilities would have several applications:

Composition

As a composer writes a piece, using a score editor such as MuseScore or Sibelius, they could also develop a nuance specification for the piece. The audio rendering function of the score editor could use this to produce nuanced renditions of the piece. This would facilitate the composition process and would convey the composer's intentions more clearly to prospective performers.

Virtual performance

Performers could create virtual performances of pieces, in which they play the piece using a computer rather than a traditional instrument.

Pedagogy

A teacher's instruction to a student could be represented as a nuance specification which guides the student's practice. This could be done in various ways. For example, as a student practices a piece, they could be shown a "virtual conductor" that expresses (graphically, on a computer display) a simplified representation of the target nuance.

Ensemble rehearsal and practice

When an ensemble (say, a piano duo) rehearses together, they could record their interpretive decisions as a nuance specification. They could then use this to guide their individual practice (perhaps using the "virtual conductor" described above).

Sharing and archival

IMSLP lets people share musical scores and recordings of renditions. It could also include nuance descriptions for pieces. This would provide a framework for sharing and discussing the interpretation of pieces.

User interfaces for editing nuance

What kind of UI (user interface) would facilitate creating and editing a nuance specification - in particular, for transcribing one's mental model of a performance?

This generally involves changing every parameter - start time, duration, volume - of every note. We can imagine a GUI that shows a piano-roll representation of the score and lets you click on notes to change their parameters. This low-level approach would let you do whatever you want, but it would be impossibly tedious.

Desirable properties of a UI for editing nuance:

  • You can describe nuance at a high level: if you want an accelerando from 80 to 120 bpm over measures 8 to 13, you can express this directly rather than moving individual notes.
  • You can express repetition. E.g., if you want to emphasize the strong beats in each measure, you can define a pattern of emphases, and then apply it to multiple measures (see the sketch after this list).
  • You can make an adjustment and hear the effect quickly and with a minimum of clicks.
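
As a sketch of the second property, here is how a single "apply emphasis pattern" action might expand into per-note volume changes behind the scenes. The helper and the note format are hypothetical - they are not taken from any existing tool - but they show that the user specifies one pattern and a measure range and never touches individual notes.

    # Hypothetical helper: scale note volumes by a per-beat emphasis
    # pattern over a range of measures. Notes are simple dicts with a
    # start time in beats and a volume in 0..1.

    def apply_emphasis(notes, pattern, beats_per_measure, first_meas, last_meas):
        for n in notes:
            meas, beat = divmod(n["start"], beats_per_measure)
            if first_meas <= meas < last_meas:
                n["vol"] *= pattern[int(beat)]
        return notes

    # Emphasize beats 1 and 3 of each 4/4 measure, in measures 0-7.
    notes = [{"start": float(b), "vol": 0.6} for b in range(32)]
    apply_emphasis(notes, pattern=[1.2, 0.9, 1.1, 0.9],
                   beats_per_measure=4, first_meas=0, last_meas=8)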

Some general approaches:

  • Integrate nuance editing with score editing. I discuss this here. You'd need to devise ways of graphically representing nuance in scores (color, note-head size, etc.). This would be good from the user experience point of view, but a) it's not clear how to represent layered nuance graphically, and b) it would be a lot of work to implement (e.g. in MuseScore).
  • A special-purpose GUI where you can use the mouse to drag and drop nuance primitives, adjust their parameters, and hear the results. This GUI could sit alongside (or above) the score-editing GUI. I think this might be the best approach.
  • Express nuance in a programming language. I've done this in a Python-based system called Numula, and it's quite powerful. But the user experience isn't great. You specify nuance by writing a Python program. The process of developing and refining nuance involves editing code, re-running the program, then (in the synthesizer program) advancing to the relevant part of the piece. It's slow and awkward: the cycle of changing something and hearing the result involves lots of clicks and takes tens of seconds. It's nonintuitive: you're forced to think in terms of numbers. And it involves programming; not all musicians know Python.

A research agenda for musical nuance

Having a formalism for nuance opens up a huge range of possible research in musicology.

The most basic issue is what I'll call the "primitive selection problem". A nuance formalism (like MNS) provides a set of "primitives". Some of these define fluctuations in tempo or volume that affect lots of notes. Others apply random "jitter" to timing and volume. Others affect sets of notes based on attributes such as position within a measure. Others apply to individual notes.

The goal in designing the set of primitives is to find a small "basis set" of transformations, each with a small number of parameters, that together can express the nuance we care about - for example, that can closely approximate typical human performances.

MNS, for example, has a primitive for linear tempo change. This was easy to implement - but is it a good approximation of ritardandos and accelerandos in practice? There may be better choices.
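
For concreteness, here is what a linear tempo change implies under one reading (my own arithmetic, not taken from the MNS docs): if tempo varies linearly with score position, the performance time of a score position is the integral of seconds-per-beat, which has a closed form. Whether this logarithmic time map matches how performers actually shape ritardandos and accelerandos is exactly the open question.

    import math

    def linear_tempo_time(t, total_beats, bpm0, bpm1):
        # Elapsed performance time (seconds) at score position t (beats),
        # when tempo ramps linearly from bpm0 to bpm1 over total_beats beats:
        #   time(t) = 60 * total_beats / (bpm1 - bpm0) * ln(bpm(t) / bpm0)
        if bpm0 == bpm1:
            return 60.0 * t / bpm0
        bpm_t = bpm0 + (bpm1 - bpm0) * t / total_beats
        return 60.0 * total_beats / (bpm1 - bpm0) * math.log(bpm_t / bpm0)

    # An accelerando from 80 to 120 bpm over 24 beats (6 measures of 4/4):
    print(linear_tempo_time(24, 24, 80, 120))  # ~14.6 s total
    print(linear_tempo_time(12, 24, 80, 120))  # ~8.0 s elapsed at the midpoint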

I can imagine a research program to study this, by calculating the nuance in human performances, and finding the primitives that approximate it best.

The first step is to automatically extract nuance from human performances:

  • Get a corpus of performances as MIDI files. Audio recordings could be converted to MIDI files by software (though I'm not sure how well this works). For each performance you'd also need a representation of the score, e.g. as MusicXML or MIDI.
  • Write software that finds the correspondence of notes between performance and score (the performance may contain wrong notes, extra notes, or other noise); one approach is sketched after this list.
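
One standard way to approach the alignment step is an edit-distance-style dynamic program over pitch sequences, which tolerates wrong, missing, and extra notes. The sketch below is a minimal version of that idea, not how any particular system does it; real performances (chords, dense polyphony, repeated notes) need something more robust.

    # Minimal sketch: align score and performance note sequences by pitch
    # with an edit-distance dynamic program, allowing substitutions
    # (wrong notes) and gaps (missed or extra notes).

    def align(score_pitches, perf_pitches, gap=1, sub=1):
        S, P = len(score_pitches), len(perf_pitches)
        cost = [[0] * (P + 1) for _ in range(S + 1)]
        for i in range(1, S + 1):
            cost[i][0] = i * gap
        for j in range(1, P + 1):
            cost[0][j] = j * gap
        for i in range(1, S + 1):
            for j in range(1, P + 1):
                m = 0 if score_pitches[i-1] == perf_pitches[j-1] else sub
                cost[i][j] = min(cost[i-1][j-1] + m,    # match or wrong note
                                 cost[i-1][j] + gap,    # score note not played
                                 cost[i][j-1] + gap)    # extra performed note
        # Trace back to recover matched (score_index, perf_index) pairs.
        pairs, i, j = [], S, P
        while i > 0 and j > 0:
            m = 0 if score_pitches[i-1] == perf_pitches[j-1] else sub
            if cost[i][j] == cost[i-1][j-1] + m:
                pairs.append((i - 1, j - 1))
                i, j = i - 1, j - 1
            elif cost[i][j] == cost[i-1][j] + gap:
                i -= 1
            else:
                j -= 1
        return list(reversed(pairs))

    # MIDI pitches: the performer adds an extra note (61) and drops one (67).
    print(align([60, 62, 64, 65, 67], [60, 61, 62, 64, 65]))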

We can then use software to find a transformation that maps the score to the performance. This transformation would typically have multiple levels. A first level would model large-scale fluctuations. The second level would take the residue from this and fit it, possibly with different types of primitives. At some point the residue presumably would be noise-like, and its statistical properties could be measured.

Each level would consist of a set of primitives. The software would consider various families of primitives: in the case of continuous fluctuations this might include linear, polynomial, exponential, logarithmic, etc. The software would use data-fitting techniques to find an optimal basis set.
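
As a sketch of the first level of such a fit: given per-beat durations measured from an aligned performance, fit a large-scale trend - here just a straight line via ordinary least squares, though real work would compare the families listed above - and pass the residue to the next level. The data below are placeholder values, not from a real performance.

    # beat_durs[i] = measured seconds-per-beat for beat i, taken from the
    # aligned MIDI onsets (placeholder values for illustration).
    beat_durs = [0.74, 0.75, 0.73, 0.78, 0.76, 0.77, 0.75, 0.80,
                 0.79, 0.81, 0.80, 0.84, 0.83, 0.86, 0.88, 0.95]

    def fit_line(xs, ys):
        # Ordinary least squares for y = a*x + b.
        n = len(xs)
        mx, my = sum(xs) / n, sum(ys) / n
        a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
            / sum((x - mx) ** 2 for x in xs)
        return a, my - a * mx

    beats = list(range(len(beat_durs)))
    a, b = fit_line(beats, beat_durs)                   # large-scale trend
    residue = [d - (a * x + b) for x, d in zip(beats, beat_durs)]
    # The next level (e.g. per-measure rubato patterns, then note-level
    # jitter) would be fit to 'residue'; when what remains looks like
    # noise, its statistical properties can be measured.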

It may turn out that the optimal set of primitives depends on

  • the performance period;
  • the period and style of music being played;
  • the individual performer;
and so on.

There has been some research in this general area. I've looked at some of it, but haven't found anything usable.

Copyright © 2024 David P. Anderson