

See the Front Matter page for more detailed information.


Prose includes novels, short stories, essays, etc.

The body will generally take the following structure (with a few exceptions):

  • body
    • div type="chapter"
      • any of the below tags needed to encode the text
    • div
  • body
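As a rough sketch of the structure outlined above, a prose body with two chapters might look like the following. The headings and paragraph text are placeholders, and the use of <head> for chapter titles is an assumption here; see the headings guidelines for the project's actual practice.

```xml
<body>
  <!-- each chapter is a first-level division -->
  <div type="chapter">
    <head>Chapter I</head>
    <p>First paragraph of the chapter.</p>
  </div>
  <div type="chapter">
    <head>Chapter II</head>
    <p>First paragraph of the next chapter.</p>
  </div>
</body>
```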

There are several main tags that we use to mark up the structural elements of prose.

They indicate:


Divisions are often indicated by a chapter, section, etc. of a book. Nest as many divisions as necessary to properly represent the structure of the text (e.g., chapters, sections, etc.). Be sure to maintain consistency among the levels of division within the body (e.g., all chapters occur as first-level divisions, section as second-level, etc.).

All division tags will have a type attribute. The value of the type attribute will be one of the following:

  • chapter
  • section
  • lecture
  • letter
  • essay
  • story (used to demarcate short stories)
  • book
  • pamphlet
  • notes
  • dedication

If none of these values correctly describes the section of text you are encoding, document the nature of the division on the VWWP Encoding Problems page.
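To illustrate nested divisions with consistent levels, here is a minimal sketch in which every chapter is a first-level division and every section is a second-level division. The headings and text are invented placeholders, and <head> is assumed for the titles.

```xml
<body>
  <div type="chapter">                 <!-- first level: chapter -->
    <head>Chapter I</head>
    <div type="section">               <!-- second level: section -->
      <head>Section 1</head>
      <p>Text of the first section.</p>
    </div>
    <div type="section">
      <head>Section 2</head>
      <p>Text of the second section.</p>
    </div>
  </div>
  <div type="chapter">
    <head>Chapter II</head>
    <div type="section">
      <head>Section 1</head>
      <p>Text of the first section.</p>
    </div>
  </div>
</body>
```

Note that once a division contains nested divisions, any paragraphs belonging to it should come before the first nested <div>, not between or after them.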


See VWWP TEI P5 Encoding Guidelines for more information about headings.


Paragraphs are marked with a <p> tag. A paragraph can be used virtually anywhere in the text to mark a block of prose. Paragraphs may contain <pb/> (page breaks), lists, and tables. Paragraphs are extremely versatile and are used in a wide variety of text encoding situations. Generally speaking, if something is written as a paragraph, it can be marked as such. <div> tags cannot come within paragraphs, but <list> tags, <figure> tags, <pb/> tags, <note> tags, and many others can come within <p> tags. For instance, if a paragraph is broken up by a blank page and an image, you do not need to close the paragraph to include these features. This allows you to faithfully represent the text.
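A paragraph interrupted by a blank page and an image could be sketched roughly as follows. The page numbers in the n attribute, the image file name, and the sentence text are all hypothetical; follow the page break and image guidelines for the attributes this project actually requires.

```xml
<p>The first part of the paragraph runs to the bottom of the page
  <pb n="14"/>   <!-- start of the blank page (hypothetical number) -->
  <pb n="15"/>   <!-- start of the page holding the image -->
  <figure>
    <graphic url="illustration.jpg"/>   <!-- hypothetical file name -->
  </figure>
  <pb n="16"/>   <!-- the text resumes on this page -->
  and the same sentence picks up again here.</p>
```

The <p> element stays open across all three page breaks, so the paragraph is recorded as a single unit even though it spans several pages.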

Floating Texts

Often in prose texts you may encounter an "embedded" or floating text in the form of a letter, poem, journal entry, song, etc. Floating texts such as these have a complete structure that interrupts the flow of the main text, and they require the use of the <floatingText> tag. For example, letters and journal entries (see detailed description below) have an opener and body; letters usually have closers; and a poem may be quoted in its entirety, with a title, epigraph, etc.

Floating texts are contained within a division of text (see example below) and may have one of the following division types (e.g., <div type="letter">):

  • article (e.g., journal or newspaper article)
  • letter
  • poem
  • journal
  • song

If you encounter another genre, do not assign a "type" attribute. Please document this in the VWWP Encoding Problems page for review and later designation.

  • Chapter with a letter
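A chapter containing a letter might be encoded along these lines. This is only a sketch: the names and text are invented, and the nesting of <body> and <div type="letter"> inside <floatingText> is one reading of the rule above that floating texts are contained within a typed division.

```xml
<div type="chapter">
  <head>Chapter III</head>
  <p>The next morning a letter arrived for her.</p>
  <floatingText>
    <body>
      <div type="letter">
        <opener>
          <salute>"Dear Margaret,</salute>
        </opener>
        <p>I shall be in town on Thursday and hope to call on you.</p>
        <closer>
          <signed>Edith."</signed>
        </closer>
      </div>
    </body>
  </floatingText>
  <p>She read it twice before putting it away.</p>
</div>
```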


See VWWP TEI P5 Encoding Guidelines for more information about encoding notes (footnote, endnotes, etc.).

Photographs, Graphics, and other Images

See VWWP TEI P5 Encoding Guidelines for more information about photographs, graphics and other images.


See VWWP TEI P5 Encoding Guidelines for more information about lists.


See VWWP TEI P5 Encoding Guidelines for more information about tables.


Quotes are denoted by quotation marks, which will be retained in the text. Only text that comes within quotation marks will be marked as a quotation for the purposes of encoding. There are two types of quotes: quotes that are external to the text and quotes that are internal. The <quote> element is used for passages that are external to the text, like a reference to a study or another book. Internal quotes are quotes that occur inside the text (e.g., character speeches or thoughts, or notes written by characters) and have various TEI elements to represent them.


Quotes that are External to the Text: Outside Sources and Other References

Quotes that come from outside the text are marked by first using a <cit> tag to denote an external citation. Within the <cit> tag there are two smaller parts, <quote> and <bibl>. The <quote> tag encompasses the body of the quote, that is, the actual quoted text. The <bibl> tag encompasses any bibliographic reference given that identifies the source of the text, such as a title or author. For a more comprehensive discussion of the <bibl> tag, please see the <bibl> section of the official TEI P5 guidelines. Other tags can be used inside a quote as well. For instance, inside the <quote> tag you can have an <l> tag to denote a line of poetry.
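As an illustration, an external quotation of one line of verse with its printed attribution could be sketched as follows. The surrounding sentence is invented for the example, and the <author> and <title> elements inside <bibl> are an assumption about how much detail the project wants recorded.

```xml
<p>She had always believed, with the poet, that
  <cit>
    <quote>
      <!-- the printed quotation marks are kept inside the tags -->
      <l>"A thing of beauty is a joy for ever."</l>
    </quote>
    <bibl><author>Keats</author>, <title>Endymion</title></bibl>
  </cit>
</p>
```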

Quotes that are Internal to the Text: Thought, Speech, Writing

Quotations in the text that indicate speech, thought, writing, etc. by one or more characters are marked by various TEI elements.

Specialized tags are provided to indicate the various types of internal quotations, but for this project we will only use a subset of the possible tags:

  • <said>: Use to indicate passages thought or spoken aloud
    • When <said> is used, the who attribute is required. To facilitate the use of the who attribute, be sure you first record the person in the TEI Header following the instructions under the prosopography section. This will generate a pick list for the who attribute (to minimize errors and ensure consistency).
  • <q>: Use when someone is being quoted, but the passage is not an actual <said>. The boundary for <q> is not sharply defined.
  • <foreign>: A word or phrase is in quotation marks, italicized, or set apart in some way because it is not in the predominant language used in the text.
    • Attempt to identify the language using the "xml:lang" attribute and a two-letter (as opposed to the three-letter) code according to the ISO 639 standard. See example below.
  • <distinct>: A word or phrase is in quotes or set apart in some way because it is linguistically distinct such as slang or regional dialect.

Anything else that appears in quotes but is not a <quote>, <said>, <foreign>, or <distinct> does not need to be differentiated in the markup.

Retain the quotation marks printed in the text. Tags should surround the quotation marks when present.
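The internal quotation tags described above might be applied as in this sketch. The sentence is invented, and the who value "#mary" is a hypothetical identifier standing in for a person recorded in the TEI Header; the actual values come from the pick list. The xml:lang value "fr" is the two-letter ISO 639 code for French.

```xml
<p>
  <!-- speech: quotation marks stay inside the tag -->
  <said who="#mary">"I shall not go,"</said> she replied.
  The affair had a certain
  <!-- phrase in a language other than the main language of the text -->
  <foreign xml:lang="fr">"je ne sais quoi"</foreign>
  about it, though the driver only called it a
  <!-- slang set off by quotation marks -->
  <distinct>"rum go."</distinct>
</p>
```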

Quotes can come within quotes, such as when one speaker quotes someone else. If there is an external quote inside an internal quote (for instance, a character quotes the Bible), use the correct tag for each so that the two distinct types of quotes are distinguished.
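A character quoting the Bible in speech could be sketched like this. The speaker identifier "#vicar" and the framing words are hypothetical. Because no source is printed alongside the quotation in this invented passage, the <cit> here holds only a <quote>; whether to keep the <cit> wrapper in that case is an assumption, not a stated project rule.

```xml
<p>
  <said who="#vicar">"Remember, my dear, that
    <cit>
      <!-- external quotation nested inside the character's speech -->
      <quote>'a soft answer turneth away wrath.'</quote>
    </cit>"
  </said>
</p>
```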


Letters commonly appear within prose texts and should be encoded as <floatingText> with <div type="letter">.

  • Use <opener> if the letter contains a dateline, salutation or other opening content.
    • Use <salute>, <dateline>, etc. when present
  • Use <closer> if the letter has closing content like a signature, dateline, etc.
    • Use <signed> if name appears in the closing
  • Use <postscript> to encode P.S. content
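Putting the letter elements listed above together, a complete letter might look roughly like the following. All names, places, dates, and wording are invented for illustration.

```xml
<floatingText>
  <body>
    <div type="letter">
      <opener>
        <dateline>Brighton, 4th of May.</dateline>
        <salute>My dear Sister,</salute>
      </opener>
      <p>We arrived safely last evening and found the house in good order.</p>
      <closer>
        <salute>Your affectionate brother,</salute>
        <signed>Henry</signed>
      </closer>
      <postscript>
        <p>P.S. Do not forget to send the books.</p>
      </postscript>
    </div>
  </body>
</floatingText>
```

In this sketch the <salute> element is used in both the opener and the closer, since TEI allows a salutation in either position.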


See VWWP TEI P5 Encoding Guidelines for more information about letter closers.

Page Breaks

For more information on how to encode page breaks see the page break section of the general guidelines.

Back Matter

See the Back Matter page for more detailed information.


If a part of the prose text that you are trying to encode does not fit one of the above described features, document the problem in the VWWP Encoding Problems page.
