This repository contains hierarchical (i.e., tree format) context annotations derived from OpenText 2.0 data (forthcoming).
In OpenText's functional-linguistic model, context is represented by discourses. More specifically, each discourse includes at least one turn
node (at least one speaker must take a turn, or else there would be no discourse) and zero or more turn stages (these mostly correspond to traditional pericopes, but see caveats below regarding projection data). The graphological expressions (i.e., tokens) are included for human-readability and to anchor this contextual representation to the correct tokens.
The XML files for each book of the New Testament can be downloaded and opened directly in any XML or text editor.
In order to view a rendered display of the context trees, ensure the context.css
file is in the same directory as the XML file you are trying to view, and open the XML file in your internet browser (e.g., Safari or Chrome).
text
nodes represent texts. Currently, these are only the books of the New Testament.c
nodes represent contextual units. There are three types of contextual unit:turn
units represent each time someone speaksseg
units represent turn segments, and these include a title for reference.move
e
nodes represent graphological expressions (I.e., whitespace-separated tokens).- We include three identifiers on expression nodes.
- An easily-readable
@id
(similar to an OSIS identifier) attribute includes the base-text version (e.g.,N1904
for the Nestle 1904 text). The format is[base-text-version].[book].[chapter].[verse].[word]
@clearId
matches the ids in macula-greek, and is alphabetically sortable. The format is documented on the macula-greek repository.@usfm
follows the USFM/Paratext format, and is (in theory) more universal. See USFM documentation for details.
- An easily-readable
- In addition, option
@before
and@after
attributes encode any associated punctuation.
- We include three identifiers on expression nodes.
<!-- Each book of the New Testament comprises a text node -->
<text id="Gal">
<!-- Each text comprises at least one turn -->
<c unit="turn">
<!-- Each turn is broken down into zero or more turn segments -->
<c unit="seg" title="Introductory Greeting and Doxology">
<!-- Each turn (or turn segment) comprises one or more moves -->
<c unit="move" id="N1904.Gal.1.1.1-2.10">
<!-- Each move includes any expressions that realize that move -->
<e id="N1904.Gal.1.1.1">Παῦλος</e>
<e after="," id="N1904.Gal.1.1.2">ἀπόστολος</e>
...
Because projected speech realizes a distinct order of discourse from its projecting matrix (view example rendering by turn segment), projected speech is enclosed in a distinct turn element.
In other words, a turn nested within a turn represented a new, embedded order of discourse. In the New Testament, there are up to 4 levels of discourse.
In the future, we hope to release turn-segment annotations for all nested discourses. Currently, there are only turn segments for top-level turns (e.g., the narrator of a text).
Orders of discourse are noted in the comment elements in this example:
<text id="Mark">
<!-- First-order discourse: "Mark", the narrator, speaks -->
<c unit="turn">
<c unit="seg" title="The Ministry of John the Baptist">
<c unit="move" id="N1904.Mark.1.1.1-7">
<e id="N1904.Mark.1.1.1">Ἀρχὴ</e>
<e id="N1904.Mark.1.1.2">τοῦ</e>
<e id="N1904.Mark.1.1.3">εὐαγγελίου</e>
<e id="N1904.Mark.1.1.4">Ἰησοῦ</e>
<e id="N1904.Mark.1.1.5">Χριστοῦ</e>
<e before="(" id="N1904.Mark.1.1.6">Υἱοῦ</e>
<e after=")." id="N1904.Mark.1.1.7">Θεοῦ</e>
</c>
<c unit="move" id="N1904.Mark.1.2.1-4.13">
<e id="N1904.Mark.1.2.1">Καθὼς</e>
<e id="N1904.Mark.1.2.2">γέγραπται</e>
<e id="N1904.Mark.1.2.3">ἐν</e>
<e id="N1904.Mark.1.2.4">τῷ</e>
<e id="N1904.Mark.1.2.5">Ἠσαΐᾳ</e>
<e id="N1904.Mark.1.2.6">τῷ</e>
<e id="N1904.Mark.1.2.7">προφήτῃ</e>
<!-- Second-order discourse: "Isaiah the prophet" speaks -->
<c unit="turn">
<c unit="move" id="N1904.Mark.1.2.8-3.14">
<e id="N1904.Mark.1.2.8">Ἰδοὺ</e>
<e id="N1904.Mark.1.2.9">ἀποστέλλω</e>
<e id="N1904.Mark.1.2.10">τὸν</e>
<e id="N1904.Mark.1.2.11">ἄγγελόν</e>
<e id="N1904.Mark.1.2.12">μου</e>
<e id="N1904.Mark.1.2.13">πρὸ</e>
<e id="N1904.Mark.1.2.14">προσώπου</e>
<e after="," id="N1904.Mark.1.2.15">σου</e>
<e id="N1904.Mark.1.2.16">ὃς</e>
<e id="N1904.Mark.1.2.17">κατασκευάσει</e>
<e id="N1904.Mark.1.2.18">τὴν</e>
<e id="N1904.Mark.1.2.19">ὁδόν</e>
<e after="·" id="N1904.Mark.1.2.20">σου</e>
<e id="N1904.Mark.1.3.1">φωνὴ</e>
<e id="N1904.Mark.1.3.2">βοῶντος</e>
<e id="N1904.Mark.1.3.3">ἐν</e>
<e id="N1904.Mark.1.3.4">τῇ</e>
<e id="N1904.Mark.1.3.5">ἐρήμῳ</e>
<!-- Third-order discourse: "The voice in the wilderness" speaks -->
<c unit="turn">
<c unit="move" id="N1904.Mark.1.3.6-14">
<e id="N1904.Mark.1.3.6">Ἑτοιμάσατε</e>
<e id="N1904.Mark.1.3.7">τὴν</e>
<e id="N1904.Mark.1.3.8">ὁδὸν</e>
<e after="," id="N1904.Mark.1.3.9">Κυρίου</e>
<e id="N1904.Mark.1.3.10">εὐθείας</e>
<e id="N1904.Mark.1.3.12">τὰς</e>
<e id="N1904.Mark.1.3.13">τρίβους</e>
<e after="," id="N1904.Mark.1.3.14">αὐτοῦ</e>
<e id="N1904.Mark.1.3.11">ποιεῖτε</e>
</c>
</c>
</c>
</c>
</c>
</c>
</c>
</text>
The projection data has been manually reviewed at least once, but some errors likely remain.
If you catch any errors, please create an issue on this repository and add the data-error
tag. Alternatively, you can email this repository's contributors.