Research on corpus phonetics

Image of Vowels in the Kera language of Chad

Dr. Tim Kempton, a linguist with SIL Nigeria, describes how developing new techniques can support language development:

"During the last few months I have completed a research paper on corpus phonetics - a new technique for analysing the sounds in a language. In this technique, we start with a speaker of the language writing down a story that has been recorded. Even if this is a rough writing attempt in a language that has no established writing system, the computer can automatically align their writing with the audio recording. We can then make measurements of the recordings to help us understand how many sounds there are and how they are delineated. This is quicker and more accurate than trying to do this process by hand and helps us ensure that the alphabet and the spelling of words match the intuition of the speakers."

The graph above, from Dr. Tim's recent paper, shows vowels in the Kera language of Chad. The automatic analysis helps us understand why some of the vowels were confused with each other in the original writing system.

We are now applying this technique to Nigerian languages such as Ishe. In the Ishe language, we suspect that certain changes in the length or tone of a vowel can change the meaning of a word. The technique of corpus phonetics should help confirm this and assist the Ashe community in improving their writing system.

Dr. Tim's full paper, as presented in November 2019, is available at the link below.

Kempton, T., & Pearce, M. (2020). Corpus Phonetics for Under-Documented Languages: A Vowel Harmony Example. Proceedings of the 2019 Annual Meeting on Phonology.