And how does that sound in Japanese? The mora versus the syllable in Japanese songs

February 12, 2015

Have you ever tried translating a song, poem, or nursery rhyme from one language to another while faithfully preserving both the content and the exact rhythm of the original? You would almost certainly end up changing the syntax, replacing words with near-synonyms, or even simplifying some of the content along the way. Add the syllable-based phonological system of most languages and the moraic Japanese language into the mix, and some interesting questions arise. What happens when a song from a syllable-based language like English gets translated into Japanese? Does the syllable actually exist in Japanese? And if a syllabic setting is used, would native Japanese speakers find that weird? What is a mora anyway? I’ll attempt to answer some of these questions by exploring the presence of the mora and the syllable in text-setting (the pairing of language and music in song) in Japanese songs, with a focus on recent research conducted in this field.

Moras, syllables, and effects on text-setting

The syllable is familiar to most English speakers as a unit of pronunciation with one vowel sound in the centre, with or without surrounding consonants. “Music”, for example, has two syllables, while “musical” has three. In English, and most other languages, the syllable is the most prominent rhythmic unit. For example, if you clap in time to the word “music”, you will end up with two claps, one for each syllable.

The mora, on the other hand, is a smaller rhythmic timing unit within the syllable. Japanese is often described as a “mora-based” language. According to the linguist Seiichiro Inaba, this means that the mora, rather than the syllable, serves as the basic unit of Japanese rhythm and phonology (i.e., sound system). So while Japanese words can be divided into syllables, as in dividing the word sensei (‘teacher’) into two syllables, sen-sei, for Japanese listeners the most prominent rhythmic unit is the mora, which divides sensei into four units, se-n-se-i.

If you are familiar with Japanese writing, the mora will be a familiar unit, as it forms the basis of the two kana writing systems, hiragana and katakana. Below are two examples of how Japanese words are divided into syllables and moras, and how this corresponds to the hiragana writing system. Romaji (literally ‘roman letters’), a way of representing Japanese using the Latin script, is also provided to help with the pronunciation of the words.
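To make the link between kana and moras concrete, here is a minimal Python sketch, which is not part of the research discussed in this post. It assumes a simplified rule: every hiragana character counts as one mora, except the small kana (ゃ, ゅ, ょ and the small vowels), which merge with the character before them. The function name split_moras is made up for illustration.

```python
# Small kana that merge with the preceding kana into one mora (e.g. きょ).
# Note that small っ is NOT in this set: it counts as a mora of its own.
SMALL_KANA = set("ゃゅょぁぃぅぇぉ")


def split_moras(hiragana: str) -> list[str]:
    """Split a hiragana word into moras using the simplified rule above."""
    moras: list[str] = []
    for ch in hiragana:
        if ch in SMALL_KANA and moras:
            # Attach the small kana to the mora that precedes it.
            moras[-1] += ch
        else:
            moras.append(ch)
    return moras


print(split_moras("せんせい"))  # ['せ', 'ん', 'せ', 'い']  (sensei: 4 moras)
print(split_moras("きょう"))    # ['きょ', 'う']            (kyou: 2 moras)
```

Under this rule, counting the moras of a word amounts to counting its kana while skipping the small combining ones, which is why sensei comes out as four units rather than two.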



The units of syllable and mora are important in the study of Japanese text-setting, the way in which words are arranged to a melody. The study of text-setting is not only interesting from a musical standpoint, but also from a linguistic perspective, because the way that language is arranged across various notes gives us clues about the sound system of that language. To better understand how differences in the syllable and mora affect text-setting, listen to these two lines from the Japanese version of “I Saw Mommy Kissing Santa Claus”, with particular attention to how santa is sung in each example.

Example (c): de mo so no san ta wa

Example (d): sa n ta no o ji sa n ga

In Example (c), santa is sung syllabically (san-ta) over two notes, while in Example (d), santa is broken down into its three moras (sa-n-ta) and spread over three notes. As this Christmas song illustrates, there are two options for how words may be arranged to a melody in Japanese: according to syllable, or according to mora.
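The two options can be sketched in Python as a way of counting how many notes a word needs under each setting. This is a simplified illustration, not the method used in any of the studies mentioned here. It assumes hiragana input (loanwords such as santa are normally written in katakana), treats ん, っ, and ー as moras that attach to the preceding mora to form a syllable, and ignores long vowels and diphthongs written with ordinary vowel kana, so it would overcount the syllables in a word such as sensei. All function names are hypothetical.

```python
# Small kana that merge with the preceding kana into one mora (e.g. きょ).
SMALL_KANA = set("ゃゅょぁぃぅぇぉ")
# Moras that cannot start a syllable in this simplified model.
DEPENDENT_MORAS = {"ん", "っ", "ー"}


def split_moras(kana: str) -> list[str]:
    """Split a hiragana word into moras."""
    moras: list[str] = []
    for ch in kana:
        if ch in SMALL_KANA and moras:
            moras[-1] += ch
        else:
            moras.append(ch)
    return moras


def split_syllables(kana: str) -> list[str]:
    """Group moras into syllables by attaching dependent moras leftward."""
    syllables: list[str] = []
    for mora in split_moras(kana):
        if mora in DEPENDENT_MORAS and syllables:
            syllables[-1] += mora
        else:
            syllables.append(mora)
    return syllables


def notes_needed(kana: str, setting: str) -> int:
    """Number of notes a word takes up: one per mora or one per syllable."""
    if setting == "moraic":
        return len(split_moras(kana))
    if setting == "syllabic":
        return len(split_syllables(kana))
    raise ValueError("setting must be 'moraic' or 'syllabic'")


print(split_syllables("さんた"))           # ['さん', 'た']
print(notes_needed("さんた", "moraic"))    # 3
print(notes_needed("さんた", "syllabic"))  # 2
```

The one-note difference for santa is small, but across a whole line of lyrics it adds up to the extra room that a syllabic setting can free.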

Exploring the existence of the syllable in Japanese songs

Observing this variation in text-setting patterns in Japanese songs, linguists Rebecca Starr and Stephanie Shih found it surprising that much of the existing literature on Japanese phonology claims that Japanese text-setting is exclusively mora-based. The song data, however, made it clear that the picture was not so simple.

Starr and Shih also noticed several patterns in the types of words and songs that were more likely to have syllable-based text-setting. Songs that had been translated into Japanese, for example, seemed to contain more syllable-based settings than songs that had been originally written in Japanese. This pattern can be explained by the low “information density” of Japanese. As researchers François Pellegrino, Christophe Coupé, and Egidio Marsico suggest, due to the small number of sounds in Japanese, it takes longer to say the same thing in Japanese than it does in English and many other languages. As such, when attempting to translate an English song into Japanese, there often isn’t enough room in the metrical setting to convey all of the content of the original version. Because syllable-based text-setting uses fewer notes than mora-based text-setting (remember san-ta vs. sa-n-ta), syllable-based setting is a tool that translators can use in their attempt to fit more content into the Japanese version of a song.

To explore these patterns in Japanese songs, three corpora of Japanese songs were compared: translated Disney songs, translated Christmas songs, and Japanese anime theme songs. Disney songs present a particularly challenging translation situation: the translators must faithfully convey a great deal of content while matching the mouth movements of the animated characters to avoid an audio-visual mismatch onscreen. (Imagine the voice of The Little Mermaid’s Sebastian still happily singing Under the Sea, while on your TV screen, the crustacean’s mouth is in fact closed.)

Examining these three corpora revealed that syllabic settings were found in both translated songs and native Japanese songs. However, as predicted, they were more frequent in translated songs. The researchers also discovered that Japanese words that are actually borrowings from European languages, such as santa, were more likely to receive syllable-based settings than words of Chinese and Japanese origin. They proposed that this pattern may result from Japanese lyricists’ familiarity with English.

While the corpus study established that the syllable is seen as an acceptable text-setting unit among professional Japanese lyricists, it was uncertain whether the same could be said for the average Japanese listener. To test the judgments of ordinary listeners, the researchers created a series of synthesised sung melodies using the software Vocaloid by Yamaha. Using this software, it was possible to create sung melodies that were completely identical except for the rhythmic unit used for text-setting. Five different text-setting styles were applied to these melodies, including mora-based (sa-n-ta), syllable-based (san-ta), and settings not corresponding to either unit (s-an-ta). Listen to the following voices synthesised using Vocaloid. The first is an example of moraic text-setting, while the second is syllabic.

Example (e): moraic text-setting

Example (f): syllabic text-setting

It turned out that native Japanese listeners rated syllable-based settings just as highly as mora-based settings. This indicates that the syllable not only exists in Japanese, but is also highly accepted by native speakers. In spite of this pattern, though, Japanese listeners did give lower ratings to syllable-based settings in situations where a mora-based solution would have been possible.

All in all, this series of studies demonstrates that although the mora is the most prominent rhythmic unit of Japanese, both the mora and the syllable are in fact active units in Japanese phonological structure.

So the next time you watch Spirited Away or sing along to Disney’s Let It Go in Japanese, take a moment to figure out the types of text-setting in the song, and decide if the words and music belong together!

