TEI Documentation

This page describes the TEI text files created for the Exploring Medieval Mary Magdalene project and the encoding conventions used for them. In this way, it contrasts with the User Guidelines page, which describes the viewing options available through the website’s online interface. Should viewers wish to download the XML files, they are available [somewhere–will update once I know where they’ll end up being].

The encoding conventions of the Exploring Medieval Mary Magdalene project follow the most current guidelines for electronic text encoding and interchange as laid out by the Text Encoding Initiative (TEI) at http://www.tei-c.org/release/doc/tei-p5-doc/en/html/index.html. The editions do not adhere to any customization of TEI but instead draw on the entire standard. This documentation provides an overview of the mark-up conventions used in the encoding of each textual witness, outlining the basic structure common to each XML-based TEI document and accounting for all the elements and attributes used.

The TEI tagging of each text follows a principle of parsimony. Only elements that are necessary, either from the perspective of the editorial principles for this project or from the perspective of the end-user version of each text, occur. For this reason, the tagging is not exhaustive and could certainly be modified by others to suit their own aims. In accordance with the TEI P5 Guidelines, elements discussed in this document are referred to by their start-tag in standard TEI notation (e.g. <element>). Elements that are usually empty are referred to by their simplified tag (e.g. <element/>). Attributes that occur within element tags are marked with the @ symbol (e.g. @attribute).

Table of Contents

Click the links below to jump to a specific portion of the document.

TEI Document Structure
1 TEI Header
2 Text Body
    2.1 Basic Structure
        2.1.1 Body
            2.1.1.1 Folio and Column Beginnings
            2.1.1.2 Line Beginnings
    2.2 Encoding Manuscript Features
        2.2.1 Abbreviations
        2.2.2 Decoration
        2.2.3 Proper Names
        2.2.4 Punctuation
        2.2.5 Metamarks
        2.2.6 Uncertain Readings
        2.2.7 Modifications of the Text
            2.2.7.1 Original
                2.2.7.1.1 Deletions
                2.2.7.1.2 Additions
                2.2.7.1.3 Hand
            2.2.7.2 Editor
                2.2.7.2.1 Editorial Emendations
                2.2.7.2.2 Corrections
                2.2.7.2.3 Supplied Omissions
3 Comparisons and Collation
    3.1 Variants
    3.2 Missing in Other Witnesses
    3.3 Associating the Critical Apparatus to the Text


TEI Document Structure

Each TEI document in this project begins with lines that specify the files’ identities as XML-based TEI documents.  They form part of the basic structure of a TEI file and include basic processing instructions.

To see these opening lines, click here.
<?xml version="1.0" encoding="UTF-8"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_lite.rng" type="application/xml" schematypens="http://relaxng.org/ns/structure/1.0"?>
<?xml-model href="http://www.tei-c.org/release/xml/tei/custom/schema/relaxng/tei_lite.rng" type="application/xml" schematypens="http://purl.oclc.org/dsdl/schematron"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">

 

The root element for each TEI document is <TEI xmlns="http://www.tei-c.org/ns/1.0">. It contains the two high-level elements <teiHeader> and <text>, which we address in Sections 1 and 2 respectively.
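As a minimal sketch of how this top-level structure can be checked programmatically, the following Python snippet parses a document with the standard library's xml.etree.ElementTree and lists the direct children of the root. The sample string is a hypothetical, heavily reduced stand-in, not one of the project's files, and the function name is our own.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical, heavily reduced stand-in for a project file.
SAMPLE = f"""<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="{TEI_NS}">
  <teiHeader><fileDesc/></teiHeader>
  <text><body/></text>
</TEI>"""

def top_level_elements(xml_source: str) -> list[str]:
    """Return the local names of the root's direct children."""
    # Parse from bytes so the encoding declaration is honoured.
    root = ET.fromstring(xml_source.encode("utf-8"))
    if root.tag != f"{{{TEI_NS}}}TEI":
        raise ValueError("root element is not <TEI> in the TEI namespace")
    # ElementTree writes qualified names as "{namespace}local".
    return [child.tag.split("}", 1)[1] for child in root]

print(top_level_elements(SAMPLE))  # ['teiHeader', 'text']
```

Because every element lives in the TEI namespace, any lookup in ElementTree has to use the qualified form, as the check on the root tag illustrates.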

1 TEI Header

The TEI header element <teiHeader> contains all the metadata relevant to each witness and its TEI encoding.

Each TEI header <teiHeader> contains three elements: <fileDesc>, <encodingDesc>, and <revisionDesc>.

  • <fileDesc> contains the bibliographic description of the TEI document, including publication statement and source description.
  • <encodingDesc> provides declarations about the relationship between encoding and the source document.
  • <revisionDesc> serves to summarize the revision history of each TEI document.

These elements each contain a number of hierarchically-structured elements that contain the specific information in question. Some of this information is identical in all TEI documents within the project.

Click to view an example header that contains the information shared by all TEI documents in this project.

This example header indicates the type and location of the information recorded in the header. It is schematic: it does not appear as-is in any of the project files. Rather, it shows the kind of information (e.g. encoder’s name, place of production of the manuscript) that occurs at each location in the files. For ease of reading, these schematic information types are set in italics here. In the individual files, they are replaced by the particulars of the encoded manuscripts and are not italicized.


<teiHeader>
<fileDesc>
    <titleStmt>
        <title>Mary Magdalene Conversion Legend</title>
        <author ref="#PseudoIsidore">Pseudo Isidore</author>
        <respStmt>
            <name xml:id="NS">Name of student</name>
            <resp>TEI encoder</resp>
        </respStmt>
    </titleStmt>
    <editionStmt>
        <edition>
            <title>Mary Magdalene Conversion Legend - Digital Edition</title>
            <date when="YYYY-MM-DD">Creation date in Month DD, YYYY format</date>
        </edition>
    </editionStmt>
    <publicationStmt>
        <authority>
            <ref target=""></ref>
        </authority>
        <availability status="free"><p>Published under <ref target=""></ref></p></availability>
    </publicationStmt>
    <sourceDesc>
        <msDesc>
            <msIdentifier>
                <settlement>Source manuscript city location</settlement>
                <repository>Source manuscript holding institution</repository>
                <idno>Source manuscript shelfmark</idno>
            </msIdentifier>
            <history>
                <origin>
                    <p>Institution where manuscript was produced
                        <origPlace>City where manuscript was produced</origPlace> in the
                        <origDate  notAfter="YYYY" notBefore="YYYY">Century when created</origDate>
                    </p>
                </origin>
            </history>
        </msDesc>
        <listWit>
            <witness xml:id="Alphanumerical identifier">Alphanumerical identifier</witness>
        </listWit>
    </sourceDesc>
</fileDesc>
<encodingDesc>
    <schemaRef n="lbp-critical-1.0.0" url="https://raw.githubusercontent.com/lombardpress/lombardpress-schema/develop/src/out/critical.rng"/>
    <editorialDecl>
        <p>Encoding of this text has followed the published guidelines on...</p>
    </editorialDecl>
</encodingDesc>
<revisionDesc status="draft">
    <listChange>
        <change when="YYYY-MM-DD" status="draft" n="0.0.0">
            <p>Change date when editing file for the first time</p>
        </change>
        <change when="YYYY-MM-DD" status="draft" n="0.0.0">
            <p>Created file for the first time.</p>
        </change>
    </listChange>
</revisionDesc>
</teiHeader>

 

2 Text Body

2.1 Basic Structure

The <text> element contains all of the file’s manuscript content. Since each document contains the transcription of a single text from one witness, <text> only contains within it one <body> element.

2.1.1 Body

Within the <body> element, the so-called “anonymous block” element <ab> is used as the general container for semantically-grouped text segments, which may be understood as distinct episodes within the legend. Since the witnesses themselves do not display their respective text divided into graphical sections like paragraphs, the paragraph element <p> would be too semantically loaded.

We have identified a total of eighteen distinct episodes in the legend and added <ab> elements accordingly. Each <ab> carries a unique @xml:id, as can be seen in the list below. As represented by XX in the list, each @xml:id begins with the witness’s shorthand label (e.g. B1, Latin-C). Note that not every manuscript contains each episode: for instance, the final commentary occurs only in the Latin C manuscript (Copenhagen, Det Kongelige Bibliotek, Gl. kgl. S. 205 fol., fols. 89v–90r).

  1. <ab xml:id="XX_legend_title">
  2. <ab xml:id="XX_genealogy">
  3. <ab xml:id="XX_heritage">
  4. <ab xml:id="XX_MaryMagdalene_mismanagement">
  5. <ab xml:id="XX_Martha_confronts_Mary">
  6. <ab xml:id="XX_MaryMagdalene_lovers">
  7. <ab xml:id="XX_market_scene">
  8. <ab xml:id="XX_Mary_at_Marthas_house">
  9. <ab xml:id="XX_Jesus_enters">
  10. <ab xml:id="XX_Mary_retreats">
  11. <ab xml:id="XX_Martha_encourages_Mary">
  12. <ab xml:id="XX_Martha_in_Simons_house">
  13. <ab xml:id="XX_Gregory_comment">
  14. <ab xml:id="XX_Mary_in_Simons_house">
  15. <ab xml:id="XX_VirginMary_visits">
  16. <ab xml:id="XX_Mary_rest_of_life">
  17. <ab xml:id="XX_epilogue">
  18. <ab xml:id="XX_commentary">
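The naming scheme above lends itself to simple processing. The following Python sketch lists the episodes present in a witness by splitting each @xml:id at its first underscore. The fragment is hypothetical, the function name is our own, and the sketch assumes that witness labels themselves never contain an underscore.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"  # how ElementTree names xml:id

# Hypothetical fragment; real files carry the episode text inside each <ab>.
SAMPLE = f"""<body xmlns="{TEI_NS}">
  <ab xml:id="B1_legend_title"/>
  <ab xml:id="B1_genealogy"/>
  <ab xml:id="B1_Martha_confronts_Mary"/>
</body>"""

def episodes(body_xml: str) -> list[tuple[str, str]]:
    """Return (witness label, episode name) for every <ab>, in document order."""
    body = ET.fromstring(body_xml)
    result = []
    for ab in body.iter(f"{{{TEI_NS}}}ab"):
        # Assumption: the witness label contains no underscore, so the
        # first underscore separates the label from the episode name.
        witness, episode = ab.get(XML_ID).split("_", 1)
        result.append((witness, episode))
    return result

print(episodes(SAMPLE))
```

Comparing the episode names returned for two witnesses is one way to see which episodes a given manuscript lacks.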

The following elements are used within the <ab> element: <head>, <pb/>, <cb/>, <lb/>, and <app>.

The element <head>, used to mark the title of the text in each witness, occurs nested within the <ab xml:id="XX_legend_title"> element.

Click here to see an example of the <head> element.

Witness B1 (Berlin, Staatsbibliothek Preußischer Kulturbesitz, Ms. germ. quart. 261, fols. 186r-190v) contains the title text “Dyt is dat leven der seliger Marien Magdalenen”.  We encode this as follows.  Note that for ease of reading in this example, we have removed references to manuscript abbreviations; for details of these, see Section 2.2.1. For further details on the <lb/> element, see Section 2.1.1.2.

<ab xml:id="B1_legend_title">
    <lb facs="#B1_186r_a_01"/>
    <head>Dyt is dat leven der seliger Marien Magdalenen</head>
</ab>

 

The next two subsections describe the <pb/>, <cb/>, and <lb/> elements; we return to the <app> element in Section 3.

2.1.1.1 Folio and Column Beginnings

The <pb/> element indicates folio beginnings. The <cb/> element indicates column beginnings on each folio, if applicable.

Every <pb/> and <cb/> contains the attributes @edRef, @facs, and @n.

  • @edRef notes the source of the folio and column beginning by pointing to the respective manuscript ID. For instance, edRef="#B1" points to the manuscript with the abbreviated shelf mark B1.
  • @facs is used to point to the manuscript image of the corresponding folio.
  • @n provides a label for each folio and column beginning. This is used to display the folio and column number within the end-user version of the text.
To see examples of these elements, click here.

The element marking the beginning of fol. 186r in manuscript B1 appears as follows.

<pb edRef="#B1" facs="#B1_186r" n="fol. 186r"/>

The element marking the beginning of column a (the left-hand column) of fol. 147r in manuscript Latin-A appears as follows.

<cb edRef="#Latin-A" facs="#Latin-A_147ra" n="fol. 147ra"/>

These <pb/> and <cb/> elements are very similar; the only difference is that <cb/> additionally encodes the column (here the a in 147ra), while <pb/> does not.

 

Even if a witness does not have columns, each <pb/> element is followed by a <cb/> element with an @n attribute containing the folio number. This solution has been chosen to provide a unified way of fetching the folio (or column) numbers for display in the end-user version.
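The benefit of this convention can be sketched in Python: a display routine only ever needs to read <cb/>, whether or not the witness has columns. The fragment below is invented for illustration (witness label XX, folios 1r and 1v); in particular, the @facs values on the <cb/> elements are an assumption.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical single-column witness "XX": each <cb/> repeats the folio
# number of the preceding <pb/> in its @n attribute.
SAMPLE = f"""<ab xmlns="{TEI_NS}">
  <pb edRef="#XX" facs="#XX_1r" n="fol. 1r"/>
  <cb edRef="#XX" facs="#XX_1r" n="fol. 1r"/>
  <pb edRef="#XX" facs="#XX_1v" n="fol. 1v"/>
  <cb edRef="#XX" facs="#XX_1v" n="fol. 1v"/>
</ab>"""

def display_labels(xml_source: str) -> list[str]:
    """Collect the labels to display, reading only the <cb/> elements."""
    root = ET.fromstring(xml_source)
    return [cb.get("n") for cb in root.iter(f"{{{TEI_NS}}}cb")]

print(display_labels(SAMPLE))  # ['fol. 1r', 'fol. 1v']
```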

2.1.1.2 Line Beginnings

Each line beginning is marked by the <lb/> element.

Every <lb/> contains a @facs attribute, which identifies the specific line in each manuscript image. The value systematically follows the pattern manuscript_folio_column_line. For instance, <lb facs="#B2_159r_a_13"/> refers to line 13, column a, folio 159r of manuscript B2.
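As an illustration of this pattern, the following Python sketch splits a @facs value into its four components. The names LineRef and parse_line_facs are our own, and the sketch assumes that every value has exactly the four underscore-separated parts described above.

```python
from typing import NamedTuple

class LineRef(NamedTuple):
    manuscript: str
    folio: str
    column: str
    line: int

def parse_line_facs(facs: str) -> LineRef:
    """Split a value such as '#B2_159r_a_13' into its four components."""
    # Split from the right so that only the last three underscores count.
    manuscript, folio, column, line = facs.lstrip("#").rsplit("_", 3)
    return LineRef(manuscript, folio, column, int(line))

print(parse_line_facs("#B2_159r_a_13"))
# LineRef(manuscript='B2', folio='159r', column='a', line=13)
```

Converting the line number to an integer drops the leading zero seen in values such as #B1_186r_a_01, which is convenient for sorting but should be undone if the original string is needed again.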

 

2.2 Encoding Manuscript Features

2.2.1 Abbreviations

All abbreviations found in each witness are tagged with the <abbr> element. Special characters used within abbreviations are represented using decimal (&#…;) or hexadecimal (&#x…;) Unicode character references. Below is a list of all special characters occurring in the edition.

Symbol  Decimal (&#...;)  Hexadecimal (&#x...;)  Description
◌̃       771               303                    Combining tilde
◌̄       772               304                    Combining macron
◌̅       773               305                    Combining overline
◌̇       775               307                    Combining single dot above
◌̈       776               308                    Combining double dots above
◌̌       780               30C                    Combining caron
◌̒       786               312                    Combining turned comma above
◌̛       795               31B                    Combining horn
ẜ       7836              1E9C                   Latin small letter long s with diagonal stroke
⁊       8266              204A                   Tironian sign et
ꝑ       42833             A751                   p with stroke through descender
Ꝓ       42834             A752                   Latin capital letter p with flourish
ꝓ       42835             A753                   Latin small letter p with flourish
ꝙ       42841             A759                   Latin small letter q with diagonal stroke
ƚ       410               19A                    Latin small letter l with bar
Ꝝ       42844             A75C                   Latin capital letter rum rotunda
ꝝ       42845             A75D                   Latin small letter rum rotunda
Ꝫ       42858             A76A                   Upper case Latin et
ꝭ       42861             A76D                   Latin small letter is
ꝯ       42863             A76F                   Latin small letter con
|       124               7C                     Vertical line
¶       182               B6                     Paragraph sign
·       183               B7                     Middle dot
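The decimal and hexadecimal notations are interchangeable: an XML parser resolves both to the same Unicode character. The short Python check below illustrates this for the combining tilde, using the standard library's xml.etree.ElementTree; the variable names are our own.

```python
import xml.etree.ElementTree as ET

# The same abbreviation mark written with a decimal and with a hexadecimal
# character reference; both denote U+0303 (combining tilde).
decimal_form = ET.fromstring("<am>o&#771;</am>").text
hexadecimal_form = ET.fromstring("<am>o&#x303;</am>").text

print(decimal_form == hexadecimal_form)              # True
print([f"U+{ord(ch):04X}" for ch in decimal_form])   # ['U+006F', 'U+0303']
```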

The resolution to an abbreviation is tagged with the <expan> element.

Each pair of <abbr> and <expan> elements is embedded in a <choice> element so that the end-user text can be displayed either with abbreviations or with resolved spelling. The <choice> element surrounds the entire word, not just the abbreviated portion.
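How a display layer might switch between the two views can be sketched in Python. This is not the project's own rendering code; the function render is hypothetical, and the sample combines two abbreviations documented on this page (the Tironian et and h with a tilde-marked o) in an invented order, written inline without pretty-printing whitespace.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def tei(name: str) -> str:
    """Qualify a local element name with the TEI namespace."""
    return f"{{{TEI_NS}}}{name}"

def render(node: ET.Element, expand: bool) -> str:
    """Flatten an element to text, keeping one side of each abbr/expan pair."""
    skipped = tei("abbr") if expand else tei("expan")
    parts = [node.text or ""]
    for child in node:
        if child.tag != skipped:
            parts.append(render(child, expand))
        # The tail follows the child's end-tag and belongs to the parent,
        # so it is kept even when the child itself is skipped.
        parts.append(child.tail or "")
    return "".join(parts)

# Two abbreviations from the examples on this page, in an invented order.
SAMPLE = (
    f'<ab xmlns="{TEI_NS}">'
    "<choice><abbr><am>&#8266;</am></abbr><expan><ex>et</ex></expan></choice> "
    "<choice><abbr>h<am>o&#x303;</am></abbr><expan>h<ex>ora</ex></expan></choice>"
    "</ab>"
)

root = ET.fromstring(SAMPLE)
print(render(root, expand=True))   # et hora
print(render(root, expand=False))  # the abbreviated forms, with special characters
```

Because <choice> wraps the whole word, either view yields complete words without any further stitching.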

Click to view examples of each symbol occurring in the text, showing how each is encoded.

In this section, we focus only on abbreviation resolution.  To see how this interacts with capitalization emendation, see Section 2.2.7.2.1.

Symbol (Hex)
Example Image
Example Code
◌̃ (303)
<choice>
    <abbr>
        h
        <am>o&#x303;</am>
    </abbr>
    <expan>
        h
        <ex>ora</ex>
    </expan>
</choice>
◌̄ (304)
<choice>
    <abbr>
        sorror
        <am>e&#x304;</am>
    </abbr>
    <expan>
        sorror
        <ex>em</ex>
    </expan>
</choice>
◌̅ (305)
<choice>
    <abbr>
        i
        <am>h&#x305;</am>
        s
    </abbr>
    <expan>
        i
        <ex>hesu</ex>
        s
    </expan>
</choice>
◌̌ (030C)
<choice>
    <abbr>
        <am>q&#x030C;</am>
    </abbr>
    <expan>
        <ex>qui</ex>
    </expan>
</choice>
 ◌̒ (312)
<choice>
    <abbr>
        hono
        <am>i&#x312;</am>
        s
    </abbr>
    <expan>
        hono
        <ex>ri</ex>
        s
    </expan>
</choice>
◌̛ (31B)
<choice>
    <abbr>
        viu
        <am>e&#x31B;</am>
    </abbr>
    <expan>
        viu
        <ex>ere</ex>
    </expan>
</choice>
ẜ (1E9C)
<choice>
    <abbr>
        <am>&#7836;</am>
    </abbr>
    <expan>
        <ex>ser</ex>
    </expan>
</choice>
⁊ (204A)
<choice>
    <abbr>
        <am>&#8266;</am>
    </abbr>
    <expan>
        <ex>et</ex>
    </expan>
</choice>
ꝑ (A751)
<choice>
    <abbr>
        <am>&#42833;</am>
        sonen
    </abbr>
    <expan>
        <ex>per</ex>
        sonen
    </expan>
</choice>
 ꝓ (A753)

Ꝓ (A752)

<choice>
    <abbr>
        <am>&#xA753;</am>
        grediens
    </abbr>
    <expan>
        <ex>pro</ex>
        grediens
    </expan>
</choice>
ꝙ (A759)
<choice>
    <abbr>
        <am>&#xA759;</am>
    </abbr>
    <expan>
        <ex>quam</ex>
    </expan>
</choice>
ꝝ (A75D)

Ꝝ (A75C)

<choice>
    <abbr>
        filio
        <am>&#42845;</am>
    </abbr>
    <expan>
        filio
        <ex>rum</ex>
    </expan>
</choice>
Ꝫ (A76A)
<choice>
    <abbr>
        catricib
        <am>&#xA76A;</am>
    </abbr>
    <expan>
        catricib
        <ex>us</ex>
    </expan>
</choice>
ꝭ (A76D)
<choice>
    <abbr>
        vir
        <am>&#42861;</am>
    </abbr>
    <expan>
        vir
        <ex>is</ex>
    </expan>
</choice>
ꝯ (A76F)
<choice>
    <abbr>
        <am>&#42863;</am>
        tigit
    </abbr>
    <expan>
        <ex>con</ex>
        tigit
    </expan>
</choice>
· (00B7)
<choice>
    <abbr>
        <am>&#183;i&#183;</am>
    </abbr>
    <expan>
        <ex>et</ex>
    </expan>
</choice>
ƚ (19A)
<choice>
    <abbr>
        m
        <am>&#410;</am>
        t
    </abbr>
    <expan>
        m
        <ex>ul</ex>
        t
    </expan>
</choice>

 

2.2.2 Decoration

Certain words and initials occur in the manuscripts with special decoration, including but not limited to rubrication and underlining. These decorations are tagged using the <hi> (“highlighting”) element. Each <hi> element contains a @rend attribute, indicating the specific kind of decoration present. Possible values are: 

  • rend="init_lombard" (Lombardic capital)
  • rend="rubr" (rubrication)
  • rend="underline" (underlined text)
  • rend="decor" (non-specific but special decoration)
  • rend="rubr_underline" (rubricated and underlined text)

The text contained within the <hi> element is exactly the text which is specially decorated. For instance, if a word’s first letter is rubricated but the rest of the word is not, then the <hi rend="rubr"> element will contain only that first letter. By contrast, if an entire word is underlined, then the <hi rend="underline"> element will contain the entire word.

Note that, as is mentioned in Section 2.2.3 below, these @rend attributes can be used within <name> elements to refer to decoration that persists across an occurrence of a name.

Click to view examples of each type of decoration, showing how each is encoded.
Decoration Type (@rend Value)
Example Image
Example Code
“init_lombard”
<hi rend="init_lombard">
    <hi rend="rubr">M</hi>
</hi>
aria

Here, we have two embedded <hi> tags. The first indicates that the initial letter is a Lombardic capital. The second is used to indicate a further type of decoration, because this letter in the source manuscript is red (rubricated), though this is not evident from the scan.

“rubr”
<hi rend="rubr">H</hi>oc
“underline”
<hi rend="underline">engelen</hi>
“decor”
<hi rend="decor">J</hi>herusalem
“rubr_underline”
<hi rend="rubr_underline">hÿmelscher</hi>

 

2.2.3 Proper Names

All proper names found in each witness are tagged with the <name> element. Using these <name> elements allows for standardized reference to and disambiguation between proper names like Mary Magdalene and the Virgin Mary in the XML code, though these references are not visible in the online text edition.

Every <name> contains at least the @type and the @nymRef attribute; the @role, @subtype, and @rend attributes are also used where relevant. These attributes have the following functions.

  • @type declares the type of entity indicated by the name. Values used in this edition are “person”, “org”, “place”, and “event”.
  • @subtype further specifies the entity indicated by the name beyond the characterization given by @type.
  • @nymRef points to the canonical form of the name in question.  This tag serves as a standardized reference that links all occurrences of a particular individual or group’s name across the different manuscript witnesses.  However, it should be noted that these @nymRef tags are not visibly represented in the online text editions and can only be accessed from the XML files themselves.
  • @role further specifies the entity indicated by the name beyond the form denoted in @nymRef.
  • @rend characterizes special visual features of the name in question. The possible values are the ones indicated in Section 2.2.2.

If a name is split across two lines, the <lb/> tag is contained between the opening <name> and closing </name> tags.

Below is a list of all proper names tagged with <name>.

Name                          @type   @subtype/@role  @nymRef
Augustinus                    person                  Augustine
Bethany                       place                   Bethany
Children of Israel            org     religion        Children_of_Israel
Constantinople                place                   Constantinople
David                         person                  David
Disciples (of Jesus)          person  group           Disciples_of_Jesus
Easter                        event                   Easter
Eucharia                      person                  Eucharia
Galilee                       place                   Galilee
God                           person                  God
Gregory                       person                  Gregory
Herod                         person  king            Herod
Holy Spirit                   person                  Holy_Spirit
Isidore                       person                  Pseudo_Isidore
Israhel                       place                   Israel
Jerusalem                     place                   Jerusalem
Jerusalemites                 place   residents       Jerusalemites
Jesus Christ                  person                  Jesus_Christ
Jews                          org     religion        Jews
Joachim                       person                  Joachim
John                          person                  John
Joseph                        person                  Joseph
Judas                         person                  Judas
Judea                         place                   Judea
King of Kings                 person                  King_of_Kings
Lazarus                       person                  Lazarus
Luke                          person                  Luke
Magdalum                      place   residence       Magdalum
Martha                        person                  Martha
Martilla                      person                  Martilla
Mary Magdalene                person                  Mary_Magdalene
Virgin Mary (Mother of God)   person                  Virgin_Mary
Mary, Wife of Zebedee         person                  Mary_of_Zebedee
Mary, Wife of Jacob           person                  Mary_of_Jacob
Matthew                       person                  Matthew
Paul                          person                  Paul
Peter                         person                  Peter
Pharisee                      person  group           Pharisee
Rome                          place                   Rome
Saul                          person                  Saul
Simon the Pharisee            person                  Simon
Syrus                         person                  Syrus
Tabor                         place                   Tabor
Tribe of David                org     religion        Tribe_of_David
Tyberius                      person  emperor         Tiberius

In the code, these attributes all appear within the name tag, with all non-empty attribute values occurring in quotation marks and with those attributes listed left-to-right in accordance with the table.

Click to view examples of <name> tags in context.

Let us consider an example in which “maria” occurs within the text and we encode this as a reference to Mary Magdalene. As per our conventions, we do so as follows. Note that we embed a <choice> element around the initial letter (within the <name> element) to allow for the possibility of capitalizing Mary’s name. For further details of capitalization emendations, see 2.2.7.2.1 below.

<name type="person" nymRef="Mary_Magdalene">
    <choice>
        <orig>m</orig>
        <reg resp="editor" type="capit">M</reg>
    </choice>
    aria
</name>

As a second example, consider an example of “tyberius”, a reference to the emperor Tiberius. We encode this reference as follows. Note here that the order of the attributes within the <name> tag is still consistent with left-to-right movement across the table.

<name type="person" role="emperor" nymRef="Tiberius">
    <choice>
        <orig>t</orig>
        <reg resp="editor" type="capit">T</reg>
    </choice>
    yberius
</name>
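Since @nymRef is only accessible from the XML files, users who download them may want to collect name occurrences themselves. The following Python sketch groups surface forms by their @nymRef value. The function name is our own, and the sample is simplified: the capitalization <choice> is left out and the third entry is invented to show the disambiguation between the two Marys.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Simplified, partly hypothetical fragment (no capitalization <choice>).
SAMPLE = f"""<ab xmlns="{TEI_NS}">
  <name type="person" nymRef="Mary_Magdalene">maria</name>
  <name type="person" role="emperor" nymRef="Tiberius">tyberius</name>
  <name type="person" nymRef="Virgin_Mary">maria</name>
</ab>"""

def forms_by_nym(xml_source: str) -> dict[str, list[str]]:
    """Group the surface forms of all <name> elements by their @nymRef."""
    root = ET.fromstring(xml_source)
    index = defaultdict(list)
    for name in root.iter(f"{{{TEI_NS}}}name"):
        surface = "".join(name.itertext()).strip()
        index[name.get("nymRef")].append(surface)
    return dict(index)

print(forms_by_nym(SAMPLE))
```

Note that itertext() concatenates all text inside <name>; in real files, where <orig> and <reg> both occur inside a <choice>, one of the two would first have to be filtered out.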

 

2.2.4 Punctuation

All punctuation, either original or supplied, is tagged with the <pc> element. Between manuscript and edition, four punctuation scenarios are possible:

  1. Punctuation occurs in the witness, and the editor accepts that punctuation into the edition. In this case, the <pc> element contains a @source attribute with the value “manuscript”.
  2. No punctuation occurs in the witness, and the editor supplies the punctuation. In this case, the <pc> element contains a @resp attribute with the value “editor”.
  3. Punctuation occurs in the witness, but the editor emends the punctuation. In this case, the <pc> element contains a <choice> element which itself contains the original punctuation tagged with the <orig> element and the editor’s emended punctuation tagged with the <reg> element; the <reg> element contains a @resp attribute with the value “editor” as well as a @type attribute with the value “punct”.
  4. Punctuation occurs in the witness, but the editor does not accept it into the edition. In this case, the <pc> element contains a <choice> element which itself contains the original punctuation tagged with the <orig> element and an empty <reg> element; the <reg> element contains a @resp attribute with the value “editor” as well as a @type attribute with the value “punct”.
Click to view examples of these four types of punctuation scenarios.

For the first situation, consider an example in which the text contains the words “libro, de” (with a comma between the two words), and we as editors concur with this punctuation. We encode this in the following manner.

libro
<pc source="manuscript">,</pc>
 de

For an example of the second situation, the text contains “sorores scilicet”, but modern punctuation conventions dictate in context that a comma should intervene between the two words. We encode this in the following manner.

sorores
<pc resp="editor">,</pc>
 scilicet

For an example of the third situation, the text contains the words “religiosam: sed” (with a colon between the two words), but modern punctuation conventions call for a comma rather than a colon here. We would encode this as follows.

religiosam
<pc>
    <choice>
        <orig>:</orig>
        <reg resp="editor" type="punct">,</reg>
    </choice>
</pc>
 sed

For an example of the fourth situation, the text contains the words “filium. et” (with a period between the two words), but modern punctuation conventions call for no punctuation between these two words. We encode this as follows.

filium
<pc>
    <choice>
        <orig>.</orig>
        <reg resp="editor" type="punct"></reg>
    </choice>
</pc>
 et

 

Note that no space is used between the word preceding the punctuation and the <pc> element, though a space is used between the </pc> tag and the following word.
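The four scenarios can be told apart mechanically. The following Python sketch returns, for each <pc> element, the punctuation as it stands in the manuscript and as it stands in the edition. The function readings is hypothetical and not part of the project's tooling; the sample wraps the four examples above in a single container element.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def tei(name: str) -> str:
    """Qualify a local element name with the TEI namespace."""
    return f"{{{TEI_NS}}}{name}"

def readings(pc: ET.Element) -> tuple[str, str]:
    """Return (manuscript reading, edition reading) for one <pc> element."""
    choice = pc.find(tei("choice"))
    if choice is not None:                    # scenarios 3 and 4
        orig = choice.findtext(tei("orig"), default="")
        reg = choice.findtext(tei("reg"), default="")
        return orig, reg
    mark = pc.text or ""
    if pc.get("source") == "manuscript":      # scenario 1
        return mark, mark
    return "", mark                           # scenario 2: supplied by the editor

# The four examples from this section in one container element.
SAMPLE = f"""<ab xmlns="{TEI_NS}">libro<pc source="manuscript">,</pc> de
sorores<pc resp="editor">,</pc> scilicet
religiosam<pc><choice><orig>:</orig><reg resp="editor" type="punct">,</reg></choice></pc> sed
filium<pc><choice><orig>.</orig><reg resp="editor" type="punct"></reg></choice></pc> et</ab>"""

root = ET.fromstring(SAMPLE)
pairs = [readings(pc) for pc in root.iter(tei("pc"))]
print(pairs)  # [(',', ','), ('', ','), (':', ','), ('.', '')]
```

An empty string on the manuscript side marks editor-supplied punctuation, and an empty string on the edition side marks punctuation the editor has rejected.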

In the Latin text editions, the punctus elevatus is represented using the Unicode symbol “modifier letter end high tone” (decimal 762, hexadecimal 02FA). With this exception, all punctuation is represented using modernized punctuation marks. For instance, the modern question mark “?” is used when a punctus interrogativus occurs in the manuscript.

2.2.5 Metamarks

Two special kinds of metamarks appear in the manuscripts: line fillers, and hyphens used for syllabification where a word spans a line break. Both are tagged with the <metamark> element. Each <metamark> element contains the @type, @function, and @source attributes. For line fillers, the value of @function is “line_filler”; for syllabification, it is “word_division”. For line fillers, the value of @type varies in accordance with the specific rendition of the line filler; for syllabification, the value of @type is “hyphen” or “double_hyphen”. The value of @source is “manuscript”.

To see some examples, click here.

Below, we see an example of a hyphen used to indicate that the word gebor- (full word geboren) is spread across two lines. We encode this as follows.

gebor
<metamark type="hyphen" source="manuscript" function="word_division">
    -
</metamark>

 

Below, we see an example of a double hyphen used to a similar end to indicate that byd- (full word bydden) is spread across two lines. We encode this as follows. In this instance, because the editor has determined that this is an addition to the original manuscript, we wrap the <metamark> element inside an <add> element following Section 2.2.7.1.2.

byd
<add place="rmargin">
    <metamark type="double_hyphen" source="manuscript" function="word_division">
        =
    </metamark>
</add>
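Tagging these marks explicitly makes it straightforward to produce a running text in which divided words are rejoined. The Python sketch below drops the content of every <metamark> when flattening an element to text. The function running_text is hypothetical, the @facs value is invented, and the placement of <lb/> directly after the hyphen is an assumption about how the word gebor-en is laid out in the file.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def tei(name: str) -> str:
    """Qualify a local element name with the TEI namespace."""
    return f"{{{TEI_NS}}}{name}"

def running_text(node: ET.Element) -> str:
    """Flatten to text, omitting the content of every <metamark>."""
    parts = [node.text or ""]
    for child in node:
        if child.tag != tei("metamark"):
            parts.append(running_text(child))
        # Text after the child's end-tag belongs to the parent and is kept.
        parts.append(child.tail or "")
    return "".join(parts)

# Assumed layout: hyphen metamark, then the line break, then the rest of
# the word. The @facs value is invented for illustration.
SAMPLE = (
    f'<ab xmlns="{TEI_NS}">gebor'
    '<metamark type="hyphen" source="manuscript" function="word_division">-</metamark>'
    '<lb facs="#XX_1r_a_02"/>en</ab>'
)

print(running_text(ET.fromstring(SAMPLE)))  # geboren
```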

 

2.2.6 Uncertain Readings

To represent uncertain readings of the original, the element <unclear> is used, containing the most likely reading. The reason for the uncertainty may be expressed with the @reason attribute. Possible values may include “illegible”, “rubbing” etc. The degree of certitude may be expressed with the @cert attribute which may carry the values “low”, “medium”, and “high”.  Because it reflects an instance in which the editor must supply a reading, the entire <unclear> element is embedded within a <supplied> element.

The <unclear> element only surrounds the portion of the text for which the reading is uncertain.  If, for instance, the last two letters of a word are unclear but the first five are unambiguously readable, the <unclear> tag should only contain the last two letters.

To see an example, click here.

We encounter a situation in which the manuscript contains a word ?en, where the last two letters are unambiguously readable as en but the first letter is uncertain due to illegibility.  We encode this as follows.

<supplied>
    <unclear reason="illegible" cert="medium">
        Z
    </unclear>
</supplied>
en

 

2.2.7 Modifications of the Text

Any modification of the text in a witness is tagged as a specific element in each TEI document. Such modifications include both those that are found in the witness itself, be it through a third party, damage, etc., and those that are based on editorial choices.

2.2.7.1 Original

The following instances of scribal corrections in the witness itself have been encountered in the course of this project and are tagged accordingly.

2.2.7.1.1 Deletions

When text in the manuscript has been deleted in the original by a scribal hand, it is tagged with the <del> element. Each <del> element contains a @rend attribute, indicating the specific rendition of the deletion. Possible values are:

  • rend="rubr_strikethrough"
  • rend="strikethrough"
  • rend="adapted"
2.2.7.1.2 Additions

Additions in the original by a scribal hand are tagged with the <add> element. Each <add> element contains a @place attribute, indicating the specific placement of the addition. Possible values are:

  • place="above"
  • place="below"
  • place="lmargin"
  • place="rmargin"
  • place="interlinear"
2.2.7.1.3 Hand

In the case that several hands can be identified, the <add> element may also include a @hand attribute, identifying the hand responsible for the addition.
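To illustrate how <del>, <add>, and @hand work together, the following Python sketch reads a scribal correction in both states, before and after the intervention. Everything in the sample is invented for illustration: the wording, the pairing of a deletion with an addition, and the @hand value #hand2. The function reading is likewise hypothetical.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

def tei(name: str) -> str:
    """Qualify a local element name with the TEI namespace."""
    return f"{{{TEI_NS}}}{name}"

def reading(node: ET.Element, corrected: bool) -> str:
    """Text before scribal correction (keep <del>) or after it (keep <add>)."""
    skipped = tei("del") if corrected else tei("add")
    parts = [node.text or ""]
    for child in node:
        if child.tag != skipped:
            parts.append(reading(child, corrected))
        parts.append(child.tail or "")
    return "".join(parts)

# Invented example: one word struck through and replaced above the line.
SAMPLE = (
    f'<ab xmlns="{TEI_NS}">der '
    '<del rend="strikethrough">seliger</del>'
    '<add place="above" hand="#hand2">heiliger</add>'
    " Marien</ab>"
)

root = ET.fromstring(SAMPLE)
print(reading(root, corrected=False))  # der seliger Marien
print(reading(root, corrected=True))   # der heiliger Marien
print([add.get("hand") for add in root.iter(tei("add"))])  # ['#hand2']
```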

2.2.7.2 Editor

Each TEI document reflects a number of editorial interventions that were deemed necessary by the respective editor in the course of transcribing and encoding the respective witness.  These various editorial interventions, including emendations and supplied omissions, are detailed below.

2.2.7.2.1 Editorial Emendations

Systematic emendations are tagged with the <reg> element. The original reading is tagged with <orig>. Both are embedded in a <choice> element. Each <reg> element contains a @type attribute, indicating the specific type of emendation, and a @resp attribute with the value “editor”. The most common instance of this is normalization of capitalization; the @type attribute then has the value “capit”.

Click to view examples.

First, we consider a case in which “lazarus” appears in the witness, but the editor wishes to normalize the capitalization to “Lazarus.” We encode this in the following manner. (Note that the entire code block would be contained within a <name> element referring to Lazarus.)

<choice>
    <orig>l</orig>
    <reg type="capit" resp="editor">L</reg>
</choice>
azarus

Next, we consider a case in which we must both normalize capitalization and offer the option of expanding an abbreviation which occurs in the text; for example, we consider a case in which “marthā” (“Martham”) occurs within the text witness. In this case, we must both normalize the capitalization and offer the option of expanding the abbreviation. In this situation, we follow a convention of embedding the <choice> element associated with the capitalization twice—once in the abbreviated spelling and once in the expanded version—as illustrated below.

<choice>
    <abbr>
        <choice>
            <orig>m</orig>
            <reg resp="editor" type="capit">M</reg>
        </choice>
        arth
        <am>a&#x304;</am>
    </abbr>
    <expan>
        <choice>
            <orig>m</orig>
            <reg resp="editor" type="capit">M</reg>
        </choice>
        arth
        <ex>am</ex>
    </expan>
</choice>

 

2.2.7.2.2 Corrections

We have begun to tag non-systematic corrections using the <corr> element. This editorial intervention is still in progress and at this point only occurs in some vernacular text editions. These non-systematic corrections include, for instance, misspelled names (e.g. enttkaria rather than Euckaria) and misused names (e.g. Maria in a place where clearly Martha is meant to be referenced); they are non-systematic insofar as they cannot be predicted from a word’s status as, say, a proper noun or sentence-initial word.  The original reading is tagged with <orig>, and both <corr> and <orig> are embedded within a <choice> element. A <corr> element may optionally contain a @type attribute that indicates the reason for the correction.

If the correction is merely based on a non-standard spelling in the original, the <sic> element is used instead of <orig>.


In Manuscript B1 (Berlin, Staatsbibliothek Preußischer Kulturbesitz, Ms. germ. quart. 261, fols. 186r-190v), we see an occurrence of the spelling Marthi as a non-standard spelling of the name Martha.  We encode this as follows.  (The entire following text would be embedded within a <name> tag, as Martha is a proper name.)

<choice>
    <sic>Marthi</sic>
    <corr>Martha</corr>
</choice>
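
For the other case described above, a misused name, the original reading would be kept in <orig> rather than <sic>. The following is a minimal sketch only: the reading is invented for illustration, and the @type value "name" is a hypothetical label, since this documentation does not list the values in use.

<choice>
    <orig>Maria</orig>
    <corr type="name">Martha</corr>
</choice>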

 

2.2.7.2.3 Supplied Omissions

If the editor identifies an obvious (erroneous) omission of text, the corresponding text is supplied using the <supplied> element.  This element contains a @resp attribute with the value “editor” and a @source attribute that identifies where the supplied text comes from.  If the supplied text comes from other witnesses, the value of @source gives the shorthand labels of those witnesses.  In the case of multiple source witnesses, the abbreviated labels are separated by spaces in the value of @source.

As in previous cases, a <choice> element is used: the <supplied> element is placed inside a <corr> element, which in turn is embedded in <choice>.  An <orig> element is also used, but as the text in this case is omitted in the witness, this <orig> will be empty.


As an example, Manuscript S (Strasbourg, Bibliothèque Nationale et Universitaire, ms. 324, fols. 350r-335v) contains the phrase das duchte.  Here, the editor has determined on the basis of witness B1 that this in fact ought to read das yr duchte.  We encode this as follows.

das 
<choice>
    <orig></orig>
    <corr>
        <supplied resp="editor" source="B1">
            yr
        </supplied>
    </corr>
</choice>
 duchte
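
When the supplied text is attested in more than one witness, the labels are listed in @source separated by spaces. A minimal sketch, assuming for illustration that a second witness with the hypothetical label B2 also carried the reading:

das 
<choice>
    <orig></orig>
    <corr>
        <supplied resp="editor" source="B1 B2">
            yr
        </supplied>
    </corr>
</choice>
 duchte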

 

3 Comparisons and Collation

As part of this project, we have begun to encode a dynamic collation into the manuscript files. It resembles a critical apparatus in spirit but, like much of our editions, leverages the digital medium to offer comparisons and connections even across texts that are in different languages or that contain separate additions and portions. The encoding of this portion of the project is currently in its initial stages; therefore, we give only a basic outline of the tagging conventions here. The collation embeds information connecting and comparing the manuscript witnesses, so these tags appear in the XML files only where applicable.

We are also currently introducing into the text editions a <note> function. This enables the encoding of notes pertaining to the critical comparison of witnesses, translator’s notes, and editorial explanation. This function is still very much in development, and we will offer details of its encoding in future versions of this documentation.

3.1 Variants

A critical apparatus tag has the following basic structure:

<app>
    <rdg wit="#ms_id">reading from other ms</rdg>
</app>

Each <rdg> element contains exactly one reading from another witness, with @wit pointing to its respective ID. Additionally, the <app> element may contain a <note> element providing additional comments by the editor.
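
As a sketch of how these pieces might combine, the entry below records readings from two witnesses together with an editorial comment. The witness IDs X and Y, the readings, and the note text are all hypothetical placeholders.

<app>
    <rdg wit="#X">reading from Manuscript X</rdg>
    <rdg wit="#Y">reading from Manuscript Y</rdg>
    <note>Editorial comment on the variants.</note>
</app>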

3.2 Missing in other witnesses

In a typical <app> element, the <rdg> element indicates a reading that exists in a different manuscript and differs from the current manuscript under discussion. However, a different manuscript might lack altogether a passage that the current manuscript contains. In this case, the <app> element in the current manuscript cannot contain the other manuscript’s reading of this passage, since there is none. This also holds true on the level of TEI syntax, because the <rdg> element should not be empty. Instead, a <note> element stating that the text is missing is embedded within the relevant <app> element.
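
A minimal sketch of such an entry follows. The passage text and the wording of the note are hypothetical, and the use of <lem> to hold the current manuscript’s passage assumes the convention described in section 3.3.

<app>
    <lem>passage present only in this manuscript</lem>
    <note>This passage is missing in Manuscript X.</note>
</app>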

3.3 Associating the Critical Apparatus to the Text

In order to give a precise location reference for a critical apparatus entry within a text, different methods may be used. In the case of a single word reference, the <app> element is placed around it. The word in question is then tagged with the <lem> element. The <rdg> elements follow after the <lem> element.


As an example, let us assume that the manuscript in question contains the phrase this word in a place where another manuscript, Manuscript X, contains the phrase that word. We would encode this information as follows.

<app>
    <lem>this</lem>
    <rdg wit="#X">that</rdg>
</app>
 word

Note that our default viewer will render the resulting text as this word. That is, the <lem> element denotes that the content in question belongs to the original text on which the apparatus is commenting. Note also that, similarly to <pc> tags, a space occurs between the closing </app> tag and the following word.

 

Alternatively, the range of an <app> element may be defined using the attributes @from and @to. If an apparatus entry concerns a single word, only the @from attribute is used, since it extends only to the one entity it points at. If an apparatus entry concerns an (uninterrupted) sequence of words, both @from and @to are used.

Both attributes @from and @to point to unique XML IDs that have to be defined for that purpose. Typically, these IDs are defined within a <w> element, which is used to tag single words. The IDs follow the established logic already mentioned above: manuscript number_folio number_column number_line number_word number.


Let us assume that Manuscript S, folio 350r, column a contains 8 words on line 14:

word1 word2 word3 word4 word5 word6 word7 word8

Let us also assume that, in place of word5 through word8, Manuscript X contains just two words, wordy wordz.

We wish to note in the file for Manuscript S that Manuscript X differs in this way. We would encode this as follows.  (Here, the final number in S_350r_a_14_05 refers to the word’s position in the line in Manuscript S.)

<w xml:id="S_350r_a_14_05">word5</w>
word6
word7
<w xml:id="S_350r_a_14_08">word8</w>
<app from="#S_350r_a_14_05" to="#S_350r_a_14_08">
    <rdg wit="#X">wordy wordz</rdg>
</app>
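
For a single-word entry, only @from would be used. A minimal sketch under the same assumptions as above, in which the replacement reading wordq is hypothetical:

<w xml:id="S_350r_a_14_02">word2</w>
<app from="#S_350r_a_14_02">
    <rdg wit="#X">wordq</rdg>
</app>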

 

As we further develop this portion of the project, we hope that it will provide a powerful way to relate the various text witnesses to each other.