Further Information
This CD-ROM and website are intended to provide an overview of how the sounds of the English language vary on several levels:
• in time, with our transcriptions of historical ancestor forms of English, from present-day back to Late Modern English, Early Modern, Middle and Old English, as far back even as Proto-Germanic;
• over geographical space;
• by sociolinguistic context.
For a brief explanation of our approach to each of these levels of variation, read on...
History
Obviously, we cannot provide recordings of the various historical stages in the development of English, but linguistic analysis does make it possible to work out the likely pronunciations to a reasonable degree of accuracy. The further back into the past we go, however, the more we have to ‘reconstruct’, and the less confident linguists can be in the real phonetic accuracy of our transcriptions. All our transcriptions for historical varieties should therefore be read with this caveat in mind. Naturally we do not claim for them any absolute or guaranteed precision; nonetheless, the transcriptions have been produced for us by experts in the phonetics of these periods, and form at the very least a useful guide to how the pronunciation of English developed through the ages. We have also tried to make our historical varieties as comparable as possible to our modern ones, by basing each transcription mainly on a single manuscript, wherever possible one written at a known location and in a single hand.
Geographical Variation
We have aimed to give a broad geographical sample of varieties of English and Scots, and of other Germanic languages. The choice of locations depended very much on what was available to us within the time-frame of the project. The particular concentration of varieties in Northern Ireland, northeast England and southern Scotland reflects the origin, place of residence, and place of employment (respectively) of the chief data collector, Warren Maguire.
Some areas remain under- or unrepresented, particularly those further away from the data collectors’ home turf. In particular, the Republic of Ireland and central and south-eastern England are areas where further investigation is desirable. Varieties of English outside of the British Isles are covered in less detail, although an attempt has been made to provide a range of accent types from the linguistically diverse eastern half of the United States of America.
Social Variation
At any one place and time, language is characterised by variation which is often dependent on social factors such as gender, age, social class, social context and identity. Although it is desirable to investigate “language within the social context of the community in which it is spoken” (Labov 1966: 3), detailed coverage of the variation present at each location sampled is beyond the scope of this project. Nevertheless, we felt it important to factor in some sociolinguistic variation, and we have done this by recording data for more than one ‘sub-variety’ at many locations. These sub-varieties aim to give representative (and comparable) samples of the kinds of speech present at each location, but it must always be remembered that this is a considerable idealisation of the variation present in each and every speech community. We use the following terms for these sub-varieties.
• As its name suggests, the ‘Typical’ sub-variety is a representative sample of the dialect concerned, usually characteristic of native working-class speakers between the ages of 30 and 60.
• The ‘Traditional’ sub-variety comprises the still extant traditional dialect pronunciations of the location, most characteristic of older working-class males. This Traditional sub-variety does not necessarily represent the typical pronunciations of such speakers, but rather represents their ‘broadest’ usage.
• The ‘Emergent’ sub-variety gives a representative sample of the local pronunciations of younger speakers (typically working-class speakers between the ages of 16 and 25). Although ‘Emergent’ sub-varieties may display features which are new to the dialect concerned, speakers were not chosen on linguistic criteria.
• AAVE stands for African American Vernacular English, which is often rather different from the European American Vernacular English spoken in the same locations.
Transcription
To ensure as much consistency as possible, all the recordings in this database were phonetically transcribed by Warren Maguire. A few of the transcriptions are based on the speech of more than one speaker from the given location, so that not all transcriptions correspond exactly to the associated recording.
The Wordlist
The selection of the words in our wordlist depended upon two (potentially conflicting) factors.
• Firstly, since our method depends upon the comparison of cognates, it was necessary for the words to be cognate in all the varieties compared. This means that in addition to being present in all of the English and Scots varieties in our sample, they must also be present in the varieties from the other Germanic languages included in our study.
• Secondly, we were keen not to select words purely on the basis that they were likely to give us interesting results. Rather, we wanted to give a representative sample of the range of phonetic types present in the English (and Germanic) lexicon.
The interaction of these two factors results in certain (unavoidable) phonetic lacunae in our wordlist. In particular, phonetic patterns which are largely or wholly characteristic of the Romance and/or Latinate part of the English lexicon are not represented (for example, word-initial [p] and the vowel in words such as choice).
Key to Abbreviations and Symbols
The symbols used in our phonetic transcriptions are those of the full chart of the International Phonetic Alphabet (IPA), published by the International Phonetic Association.
To view them properly you must have the Arial Unicode MS font installed on your computer (already built into all modern versions of Windows).
Other abbreviations we use with our transcriptions are:
* means that this phonetic transcription is not an actual known phonetic form but only a linguistic reconstruction. On these pages this applies to all words in Proto-Germanic, the reconstructed ancestor language of the Germanic family.
˛ in superscript, with Proto-Germanic transcriptions, means that the reconstructed transcription is only one of a number of proposed reconstructions for this word.
!NC in superscript means that this language does not have a cognate (directly related) word with this meaning. Luxembourgish, for example, no longer uses a word cognate with father, but just the word Papp, unrelated phonetically to father.
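For readers who wish to process transcriptions automatically, the three markers above could be separated from the transcription itself along the following lines. This is only a minimal sketch: it assumes that each entry is a plain text string with * at the start and the other two markers at the end, which is our illustrative assumption and not a description of how the data are actually stored. The names Entry and parse_entry are hypothetical, and "*abc" is a placeholder string, not a real reconstruction.

```python
from dataclasses import dataclass

# Marker characters as listed in the key above. Their position in the
# string (prefix or suffix) is an assumption made for this sketch.
RECONSTRUCTED = "*"   # reconstructed form, not an actual known one
ALTERNATIVES = "˛"    # one of a number of proposed reconstructions
NOT_COGNATE = "!NC"   # no cognate word with this meaning


@dataclass
class Entry:
    form: str               # the transcription with all markers removed
    reconstructed: bool
    has_alternatives: bool
    not_cognate: bool


def parse_entry(raw: str) -> Entry:
    """Split a raw entry into its transcription and its marker flags."""
    text = raw.strip()

    # Assumed position: the asterisk comes before the transcription.
    reconstructed = text.startswith(RECONSTRUCTED)
    if reconstructed:
        text = text[len(RECONSTRUCTED):]

    # Assumed position: the superscript markers follow the transcription.
    not_cognate = text.endswith(NOT_COGNATE)
    if not_cognate:
        text = text[:-len(NOT_COGNATE)]

    has_alternatives = text.endswith(ALTERNATIVES)
    if has_alternatives:
        text = text[:-len(ALTERNATIVES)]

    return Entry(text.strip(), reconstructed, has_alternatives, not_cognate)


if __name__ == "__main__":
    # "Papp" is the Luxembourgish example from the key; "*abc" is a placeholder.
    print(parse_entry("Papp" + NOT_COGNATE))
    print(parse_entry("*abc" + ALTERNATIVES))
```

Keeping the flags separate from the transcription means that any later comparison works on the phonetic form alone, while entries marked !NC can be left out of a cognate-based comparison if desired.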
What Do We Do With These Data?
For more on what further analysis and use we make of these transcriptions, see our Sound Comparisons research project website.