Interlinear gloss
In linguistics and pedagogy, an interlinear gloss is a gloss (series of brief explanations, such as definitions or pronunciations) placed between lines (inter- + linear), such as between a line of original text and its translation into another language. When glossed, each line of the original text acquires one or more lines of transcription known as an interlinear text or interlinear glossed text (IGT), or interlinear for short. Such glosses help the reader follow the relationship between the source text and its translation, and the structure of the original language. In its simplest form, an interlinear gloss is a literal, word-for-word translation of the source text.
History
Interlinear glosses have long been used for a variety of purposes. One common use has been to annotate bilingual textbooks for language education. This sort of interlinearization helps make the meaning of a source text explicit without attempting to formally model the structural characteristics of the source language.
Such annotations have occasionally been expressed not through interlinear layout but through enumeration of words in the object language and the metalanguage. One such example is Wilhelm von Humboldt's annotation of Classical Nahuatl:[1]
1   | 2     | 3      | 4    | 5   | 6    | 7       | 8   | 9
ni- | c-    | chihui | -lia | in  | no-  | piltzin | ce  | calli
1   | 3     | 2      | 4    | 5   | 6    | 7       | 8   | 9
ich | mache | es     | für  | der | mein | Sohn    | ein | Haus
This "inline" style allows examples to be included within the flow of text, and allows the gloss to follow a word order that approximates the syntax of the target language. (In the gloss here, mache es is reordered from the corresponding source order to approximate German syntax more naturally.) Even so, this approach requires readers to "re-align" the correspondences between source and target forms.
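The re-alignment that this style leaves to the reader can be sketched in a few lines of Python. This is only an illustration: the function name `realign` and the list layout are assumptions made here, not part of Humboldt's notation. The index row from the table above tells us which source position each German word belongs to.

```python
# Source forms, numbered 1..9 in source order (first two table rows).
source = ["ni-", "c-", "chihui", "-lia", "in", "no-", "piltzin", "ce", "calli"]

# Target words in German order, each tagged with its source index (last two rows).
target_order = [1, 3, 2, 4, 5, 6, 7, 8, 9]
target = ["ich", "mache", "es", "für", "der", "mein", "Sohn", "ein", "Haus"]


def realign(source, target_order, target):
    """Pair each source form with the target word that carries its index."""
    by_index = dict(zip(target_order, target))
    return [(form, by_index[i]) for i, form in enumerate(source, start=1)]


for form, word in realign(source, target_order, target):
    print(f"{form}\t{word}")
```

Here `chihui` is paired with `mache` and `c-` with `es`, undoing the reordering that was made for the sake of German syntax.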
Later approaches of the 19th and 20th centuries took to glossing vertically, aligning the same sort of word-by-word content so that each metalanguage term sits directly below the corresponding source-language term. In this style, the given example might be rendered thus (here with an English gloss):
ni-  c-  chihui  -lia  in      no-  piltzin  ce  calli
I    it  make    for   to-the  my   son      a   house
"I made my son a house."
Note that here word ordering is determined by the syntax of the object language.
Finally, modern linguists have adopted the practice of using abbreviated grammatical category labels. A 2008 publication which repeats this example labels it as follows:[2]
ni-c-chihui-lia             in   no-piltzin     ce   calli
1SG.SUBJ-3SG.OBJ-mach-APPL  DET  1SG.POSS-Sohn  ein  Haus
This approach is denser and also requires effort to read, but it is less reliant on the grammatical structure of the metalanguage for expressing the semantics of the target forms.
In computing, the Unicode Specials block provides special text markers to indicate the start and end of interlinear glosses.
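As a rough illustration of these markers, the sketch below wraps a base form and its gloss in the three interlinear annotation characters of the Specials block (U+FFF9, U+FFFA, U+FFFB). The helper names `annotate` and `strip_annotations` are hypothetical, chosen for this example only, and the sketch does not handle nested annotations.

```python
import unicodedata

# Interlinear annotation characters from the Unicode Specials block.
ANCHOR = "\ufff9"      # INTERLINEAR ANNOTATION ANCHOR: starts the annotated text
SEPARATOR = "\ufffa"   # INTERLINEAR ANNOTATION SEPARATOR: starts the annotation
TERMINATOR = "\ufffb"  # INTERLINEAR ANNOTATION TERMINATOR: ends the annotation


def annotate(base: str, gloss: str) -> str:
    """Wrap a base form and its gloss in interlinear annotation characters."""
    return f"{ANCHOR}{base}{SEPARATOR}{gloss}{TERMINATOR}"


def strip_annotations(text: str) -> str:
    """Drop the markers and the annotations, keeping only the base text."""
    out = []
    in_annotation = False
    for ch in text:
        if ch == ANCHOR:
            in_annotation = False
        elif ch == SEPARATOR:
            in_annotation = True
        elif ch == TERMINATOR:
            in_annotation = False
        elif not in_annotation:
            out.append(ch)
    return "".join(out)


marked = " ".join(annotate(b, g) for b, g in [("ce", "a"), ("calli", "house")])
for ch in (ANCHOR, SEPARATOR, TERMINATOR):
    print(f"U+{ord(ch):04X} {unicodedata.name(ch)}")
print(strip_annotations(marked))
```

Stripping the annotations from the marked string recovers the plain base text `ce calli`.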
Structure
A semi-standardized set of parsing conventions and grammatical abbreviations is explained in the Leipzig Glossing Rules.[3]
An interlinear text will commonly consist of some or all of the following, usually in this order, from top to bottom:
- The original orthography (typically in italic or bold italic),
- a conventional transliteration into the Latin alphabet,
- a phonetic transcription,
- a morphophonemic transliteration,
- a word-by-word or morpheme-by-morpheme gloss, where morphemes within a word are separated by hyphens or other punctuation, and finally
- a free translation, which may be placed in a separate paragraph or on the facing page if the structures of the languages are too different for it to follow the text line by line.
As an example, the following Taiwanese clause has been transcribed with five lines of text:
- 1. the standard pe̍h-ōe-jī transliteration,
- 2. a gloss using tone numbers for the surface tones,
- 3. a gloss showing the underlying tones in citation form (before undergoing tone sandhi),
- 4. a morpheme-by-morpheme gloss in English, and
- 5. an English translation:[4]
(1.) goá   iáu-boē    koat-tēng    tang-sî    boeh   tńg-khì
(2.) goa1  iau1-boe3  koat2-teng3  tang7-si5  boeh2  tng1-khi3.
(3.) goa2  iau2-boe7  koat4-teng7  tang1-si5  boeh4  tng2-khi3.
(4.) I     not-yet    decide       when       want   return.
(5.) "I have not yet decided when I shall return."
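The column layout of such an example can be produced mechanically. The following is a minimal sketch, assuming each tier is a whitespace-separated string with the same number of words; the names `display_width` and `align_tiers` are hypothetical. Combining marks are ignored when measuring width so that tone diacritics do not push columns out of line.

```python
import unicodedata


def display_width(token: str) -> int:
    """Count characters, ignoring combining marks such as tone diacritics."""
    return sum(1 for ch in token if not unicodedata.combining(ch))


def align_tiers(tiers: list[str]) -> list[str]:
    """Pad whitespace-separated tokens so each column lines up across tiers."""
    rows = [tier.split() for tier in tiers]
    if len({len(row) for row in rows}) != 1:
        raise ValueError("every tier must have the same number of words")
    # Width of each column is the widest token found in that column.
    widths = [max(display_width(tok) for tok in col) for col in zip(*rows)]
    lines = []
    for row in rows:
        cells = [tok + " " * (w - display_width(tok)) for tok, w in zip(row, widths)]
        lines.append("  ".join(cells).rstrip())
    return lines


tiers = [
    "goa1 iau1-boe3 koat2-teng3 tang7-si5 boeh2 tng1-khi3.",
    "I not-yet decide when want return.",
]
for line in align_tiers(tiers):
    print(line)
```

The free translation is left out of the input on purpose, since it does not align word by word with the other tiers.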
In linguistics, it has become standard to align the words and to gloss each transcribed morpheme separately. That is, koat-tēng in line 1 above would either require a hyphenated two-word gloss, or be transcribed without a hyphen, for example as koattēng. Grammatical terms are commonly abbreviated and printed in SMALL CAPITALS to keep them distinct from translations, especially when they are frequent or important for analysis. Varying levels of analysis may be detailed. For example, in a Lezgian text using standard romanization,[5]
Gila  abur-u-n      ferma  hamišaluǧ  güǧüna  amuqʼ-da-č
now   they-OBL-GEN  farm   forever    behind  stay-FUT-NEG
'Now their farm will not stay behind forever.'
Here every Lezgian morpheme is set off with hyphens and glossed separately. Since many of these are difficult to gloss in English, the roots are translated, but the grammatical suffixes are glossed with three-letter grammatical abbreviations.
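The split between translated roots and abbreviated grammatical labels can be recovered from the gloss line with a simple heuristic. The sketch below assumes that grammatical abbreviations are written entirely in capitals, which holds for this example but would misclassify an English gloss such as "I". The function name `gloss_kinds` and the labels "gram" and "lex" are made up for this illustration.

```python
def gloss_kinds(gloss_line: str) -> list[tuple[str, str]]:
    """Label each hyphen-separated gloss element as 'gram' or 'lex'."""
    kinds = []
    for word in gloss_line.split():
        for element in word.split("-"):
            # Assumption: all-capital elements are grammatical abbreviations.
            kind = "gram" if element.isupper() else "lex"
            kinds.append((element, kind))
    return kinds


gloss = "now they-OBL-GEN farm forever behind stay-FUT-NEG"
for element, kind in gloss_kinds(gloss):
    print(f"{element}\t{kind}")
```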
The same text may be glossed at a different level of analysis:
Gila  aburun     ferma  hamišaluǧ  güǧüna  amuqʼ-da-č
now   their.OBL  farm   forever    behind  stay-will-not
'Now their farm will not stay behind forever.'
Here the Lezgian morphemes are translated into English as much as possible; only those which correspond to English are set off with hyphens.
A more colloquial gloss would be:
Gila  aburun  ferma  hamišaluǧ  güǧüna  amuqʼdač
now   their   farm   forever    behind  won't.stay
'Now their farm will not stay behind forever.'
Here the gloss is word for word; rather than setting off Lezgian morphemes with hyphens, the English words in the gloss are joined with periods when more than one is required to translate a Lezgian word.
Punctuation
In interlinear morphological glosses, various forms of punctuation separate the glosses. Typically, the words are aligned with their glosses; within words, a hyphen is used when a boundary is marked in both the text and its gloss, a period when a boundary appears in only one. That is, there should be the same number of words separated with spaces in the text and its gloss, as well as the same number of hyphenated morphemes within a word and its gloss. This is the basic system, and can be applied universally. For example,
Odadan hızla çıktım. (Turkish)
oda-dan    hız-la      çık-tı-m
room-ABL   speed-COM   go.out-PFV-1sg
room-from  speed-with  go_out-perfective-I
'I left the room quickly.'
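The basic rule, the same number of space-separated words and the same number of hyphenated morphemes within each word, lends itself to an automatic check. The following is a minimal sketch under that rule only; `check_alignment` is a hypothetical helper, and periods and underscores are deliberately not counted because they mark boundaries that appear in just one of the two lines.

```python
def check_alignment(text: str, gloss: str) -> list[str]:
    """Return a list of problems; an empty list means the two lines align."""
    words, glosses = text.split(), gloss.split()
    if len(words) != len(glosses):
        return [f"word count differs: {len(words)} vs {len(glosses)}"]
    problems = []
    for word, item in zip(words, glosses):
        # Only hyphens count: they must match one to one in text and gloss.
        n_text, n_gloss = len(word.split("-")), len(item.split("-"))
        if n_text != n_gloss:
            problems.append(
                f"{word!r} has {n_text} morphemes but {item!r} has {n_gloss}"
            )
    return problems


text = "oda-dan hız-la çık-tı-m"
print(check_alignment(text, "room-ABL speed-COM go.out-PFV-1sg"))
print(check_alignment(text, "room-ABL speed-COM go-out-PFV-1sg"))
```

The first call reports no problems. The second flags the last word, because writing go-out with a hyphen gives the gloss four elements for a word with three morphemes.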
An underscore may be used instead of a period, as in go_out-PFV, when a single word in the source language happens to correspond to a phrase in the glossing language, though a period would still be used for other situations, such as Greek oikíais house.FEM.PL.DAT 'to the houses'.
However, sometimes finer distinctions may be made. For example, clitics may be separated with a double hyphen (or, for ease of typing, an equal sign) rather than a hyphen:
Je t'aime. (French)
je=te=aime
I=you=love
'I love you.'
Affixes which cause discontinuity (infixes, circumfixes, transfixes, etc.) may be set off by angle brackets, and reduplication with tildes, rather than with hyphens:
sulat, susulat, sumulat, sumusulat (verbal declensions) (Tagalog)
sulat  su~sulat                  s⟨um⟩ulat                  s⟨um⟩u~sulat
write  contemplative mood~write  ⟨agent trigger.past⟩write  ⟨agent trigger⟩contemplative~write
(See affix for other examples.)
Morphemes which cannot be easily separated out, such as umlaut, may be marked with a backslash rather than a period:
(German)
unser-n     Väter-n
our-DAT.PL  father\PL-DAT.PL
'to our fathers' (the singular of Väter 'fathers' is Vater)
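The finer boundary markers used in the object-language line can be told apart with a small tokenizer. This is a sketch only: the function `segment` and the kind names ("segment", "affix", "clitic", "reduplication", "infix") are assumptions made for illustration, and the backslash is not handled because in the example above it occurs only in the gloss line.

```python
import re

# Boundary symbols described above and the kind of boundary each one marks.
BOUNDARY_NAMES = {"-": "affix", "=": "clitic", "~": "reduplication"}

# An infix in angle brackets, a single boundary symbol, or a run of other text.
TOKEN = re.compile(r"⟨[^⟩]*⟩|[-=~]|[^-=~⟨⟩]+")


def segment(word: str) -> list[tuple[str, str]]:
    """Split an object-language word into (kind, text) pieces."""
    pieces = []
    for token in TOKEN.findall(word):
        if token in BOUNDARY_NAMES:
            pieces.append((BOUNDARY_NAMES[token], token))
        elif token.startswith("⟨"):
            pieces.append(("infix", token[1:-1]))
        else:
            pieces.append(("segment", token))
    return pieces


for word in ["çık-tı-m", "je=te=aime", "s⟨um⟩u~sulat"]:
    print(word, segment(word))
```

For s⟨um⟩u~sulat this yields the pieces s, the infix um, u, a reduplication boundary, and sulat, so the infix and the reduplicated syllable remain distinguishable from ordinary affixes.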
A few other conventions which are sometimes seen are illustrated in the Leipzig Glossing Rules.[3]
See also
Kanbun – Japanese tradition of glossing Classical Chinese texts
Ruby text – a gloss sometimes used with Chinese or Japanese to show the pronunciation
Collection of texts in the EOPAS system, many in interlinear formats and linked back to the source media
References
1. Lehmann, Christian (2004-01-23). "Directions for interlinear morphemic translations". In Geert Booij, Christian Lehmann, Joachim Mugdan, Stavros Skopeteas. Morphologie. Ein internationales Handbuch zur Flexion und Wortbildung. Handbücher der Sprach- und Kommunikationswissenschaft. 2. Berlin: W. de Gruyter. pp. 1834–1857.
2. Haspelmath, Martin (2008). Language typology and language universals: an international handbook. Walter de Gruyter. p. 715. ISBN 978-3-11-011423-2.
3. Bickel, Balthasar; Bernard Comrie; Martin Haspelmath (February 2008). "The Leipzig Glossing Rules. Conventions for Interlinear Morpheme by Morpheme Glosses". Dept. of Linguistics – Resources – Glossing Rules. Retrieved 2010-06-30.
4. Example from A Basic Vocabulary for a Beginner in Taiwanese by Ko Chek Hoan and Tan Pang Tin
5. Haspelmath 1993:207
External links
- The Leipzig Glossing Rules: Conventions for interlinear morpheme-by-morpheme glosses
- Interlinear Glossed Text Standards (E-MELD)
- Interlinear Glossed Text Levels (E-MELD)
- Towards a General Model of Interlinear Text (E-MELD)
- Interlinear Morphemic Glosses
- Glossing Ancient Languages and Texts. A forum for recommendations on the Interlinear Morphemic Glossing of ancient languages as attested in ancient manuscripts.
- Online Interlinear of Biblical Greek Scriptures (New Testament) text - Requires Adobe Acrobat
- ODIN - The Online Database of INterlinear text