For how to deal with the front matter see the front matter section.
Prose includes novels, short stories, essays, etc.
The body takes the following structure, with a few exceptions:
The chapter or division title, including markers such as Chapter II or Section III.
The title of the list.
The caption title of the image, either below or above.
Headings include the titles of lists, chapters, sections, etc. in a work. When marking prose, you will most commonly use them for chapters, lists and sections. A <head> tag can come only at the beginning of a <div>, <figure> or <list>; it cannot come in the middle of a <div>. To mark a <head> in the text, you must therefore start a new <div>, <figure> or <list>, so a head always indicates a new section of the text. Page breaks can (and should) come before <head> tags, but paragraphs and other tags cannot. More than one <head> tag can follow the opening of a <div>.
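The placement rules for headings can be sketched as follows. This is a minimal illustration, not taken from an encoded text: the type and n values and all of the wording inside the tags are placeholders.

```xml
<!-- Sketch only: attribute values and text are placeholders. -->
<div type="chapter" n="2">
  <!-- A page break may come before the heading. -->
  <pb n="15"/>
  <!-- Headings open the div; more than one is allowed. -->
  <head>Chapter II</head>
  <head>A Placeholder Chapter Title</head>
  <!-- Paragraphs come only after the headings. -->
  <p>First paragraph of the chapter.</p>
  <p>Second paragraph of the chapter.</p>
</div>
```

If a further heading appeared after the second paragraph, it would need its own new <div>, <figure> or <list> rather than a bare <head> in the middle of this one.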
Chapters are designated using <div> tags, each marked with an ID. The ID is formulated by ADD. The <div> tag encloses the whole chapter, the chapter title (heading) is marked with a <head> tag, and page breaks come within the chapter <div>. Generally speaking, chapters are the sections of a text directly below books.
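A chapter nested inside a book might look roughly like the sketch below. The ID value is a placeholder, since real IDs are formulated by ADD, and the use of xml:id assumes TEI P5 (older TEI versions write the attribute as id). The type and n values are likewise assumptions made for illustration.

```xml
<!-- Sketch only: IDs, attribute values and text are placeholders. -->
<div type="book" n="1">
  <head>Book the First</head>
  <!-- The chapter div sits directly below the book div. -->
  <div type="chapter" n="1" xml:id="PLACEHOLDER-ID">
    <head>Chapter I</head>
    <p>Text of the first page of the chapter.</p>
    <!-- Page breaks stay inside the chapter div. -->
    <pb n="2"/>
    <p>Text of the second page of the chapter.</p>
  </div>
</div>
```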
Paragraphs are marked with a <p> tag. Paragraphs can be marked virtually anywhere in the text to mark a prose block, and they can contain <pb/> tags (page breaks), lists and tables. Paragraphs are extremely versatile and are used in a wide variety of text encoding situations. Generally speaking, if something is written as a paragraph, it can be marked as such. <div> tags cannot come within paragraphs, but <list> tags, <figure> tags, <pb/> tags, <note> tags, and many others can come within <p> tags. So, for instance, if a paragraph is broken up by a blank page and an image, you do not need to close the paragraph to include these features. This allows you to maintain bibliographic accuracy.
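A paragraph interrupted by a blank page and an image could be encoded along these lines. The page numbers, the caption and the paragraph text are invented for illustration, and how the image file itself is referenced is left out because it is not covered here.

```xml
<!-- Sketch only: page numbers, caption and text are placeholders. -->
<p>The paragraph begins on one page and runs to the foot of it,
  <!-- Start of a blank page. -->
  <pb n="12"/>
  <!-- Start of the page that carries the image. -->
  <pb n="13"/>
  <figure>
    <head>Caption of the image</head>
  </figure>
  <!-- Start of the page where the prose resumes. -->
  <pb n="14"/>
  and it continues here without the p tag ever having been closed.</p>
```

Keeping everything inside one <p> records that the author wrote a single paragraph, while the <pb/> and <figure> tags record where the printed book interrupted it.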
Figures, Pictures and Images
Lists are ordered, itemized information. They can have headings, marked with <head>, but need not. They can come within paragraphs (<p>) and divisions (<div>), but need not. Lists can include many types of information, including images and charts (<figure>) and financial information. Lists can also come within lists, so sublists in a larger list can be marked. This is done by putting the list, figure or other tag within the list <item>.
You can also indicate whether a list is bulleted, numbered, or otherwise marked, and number its items. The marking is indicated by giving the <list> a type="" attribute: bulleted lists are <list type="bulleted">, numbered lists are <list type="ordered">, and lists that are not marked are <list type="simple">. You can number the items in lists that are given type="ordered" by using the n="" attribute (n for number), as in <item n="3">.
Lists are marked with the <list> tag. Each individual item in a list is marked with an <item> tag within the <list> tag. Lists can have headers, which are marked using <head>. <head> tags can come only at the beginning of a list.
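Putting these rules together, a numbered list with a heading and a bulleted sublist might be marked as in the sketch below. The list contents are made up for illustration.

```xml
<!-- Sketch only: the list contents are placeholders. -->
<list type="ordered">
  <!-- The heading comes first, before any item. -->
  <head>Provisions</head>
  <item n="1">Flour</item>
  <item n="2">Salt pork
    <!-- A sublist goes inside the item it belongs to. -->
    <list type="bulleted">
      <item>Two barrels, smoked</item>
      <item>One barrel, salted</item>
    </list>
  </item>
  <item n="3">Fresh water</item>
</list>
```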
Quick Reference, List Type:
*ordered: numbered or lettered list.
*bulleted: list with bullet points.
*simple: list that does not have numbers or other indicators to show items.
*gloss: list made of labeled terms followed by glosses or definitions.
Lists of Definitions and Terms, Glossaries
Glossaries and other lists that have a term followed by a definition are considered a special type of list in TEI. These lists are marked with the <list type="gloss"> tag, which can be followed by a <head> tag, but need not be. The <label> tag marks the term or phrase being glossed, and the <item> tag that follows it marks the definition.
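A short glossary could be encoded as in the following sketch, in which each <label> is paired with the <item> that defines it. The terms and definitions are invented for illustration.

```xml
<!-- Sketch only: terms and definitions are placeholders. -->
<list type="gloss">
  <head>Glossary</head>
  <!-- Each label is followed by the item that defines it. -->
  <label>Folio</label>
  <item>A sheet of paper folded once.</item>
  <label>Quire</label>
  <item>A gathering of folded sheets.</item>
</list>
```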
Tables are text displayed in tabular form, in other words, text displayed in columns and rows. Tables are marked with the <table> tag. This tag is given the attributes rows= and cols= to specify how many rows and columns are in the table. Tables can have a <head>, but need not.
Each row in the table is marked with a <row> tag, given the attribute role=. This attribute delineates how a row functions within the table. The role= attribute can have two values, label and data. The value label indicates that the row contains information about the values in each column. The value data indicates that the row contains data in each column (the actual values).
Within the <row> tag, there are <cell> tags. These tags indicate the specific units within the table. There should be as many cells as there are columns in the table. The rows are given in order; columns are indicated by the cells within each row rather than by a separate <col> tag. The text from the table is placed within the <cell>.
Tables can come inside of paragraphs, lists and many other units.
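As a sketch of how these pieces fit together, a table of two columns and three rows (one label row and two data rows) might be encoded as shown below. The heading and cell contents are invented for illustration.

```xml
<!-- Sketch only: heading and cell contents are placeholders. -->
<table rows="3" cols="2">
  <head>Monthly Expenses</head>
  <!-- The label row describes what each column holds. -->
  <row role="label">
    <cell>Item</cell>
    <cell>Cost</cell>
  </row>
  <!-- Data rows hold the actual values, one cell per column. -->
  <row role="data">
    <cell>Rent</cell>
    <cell>12 dollars</cell>
  </row>
  <row role="data">
    <cell>Candles</cell>
    <cell>1 dollar</cell>
  </row>
</table>
```

Here rows="3" counts the label row along with the two data rows, and every row has exactly two cells to match cols="2".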
Quick Reference, Table Attributes
*rows: rows="", the number of rows in a table. Goes inside <table>.
*cols: cols="", the number of columns in a table. Goes inside <table>.
*role: role="", can be label or data. Goes inside <row>.
Quotes are denoted by quotation marks. Only text that comes within quotation marks will be marked as a quotation for the purposes of encoding. There are two types of quotes: quotes that are external to the text and quotes that are internal. The quote element is used for passages that are external to the text, like a reference to a study or another book. Internal quotes are quotes that come from inside the text (e.g., character speeches or thoughts, notes written by characters, or terms used in the book) and have various TEI elements to represent them.
Quotes that are External to the Text: Outside Sources and Other References
Quotes that come from outside the text are marked by first using a <cit> tag, to denote an external citation. Within the <cit> tag there are two smaller parts, <quote> and <bibl>. The <quote> tag encompasses the body of the quote, or actual quoted text. The <bibl> tag encompasses any bibliographic reference given that identifies the source of the text, such as a title or author. For a more comprehensive discussion of the <bibl> tag, please see the <bibl> section of the guidelines. The <cit> tag denotes the citation as a unit, and the <quote> and <bibl> tags denote smaller portions of the larger unit. Other tags can also come inside the <quote> tag; for instance, an <l> tag can denote a line of poetry.
Sometimes, citations will occur within the text. In that case, you still use the <cit> tag and mark the quote as you normally would. You must remember, however, that all of the words within the <cit> must be within either a <bibl> or a <quote> tag. You do not need both <quote> and <bibl>, but you do need at least one.
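The two situations described above might be encoded as in the sketch below: first a set-off verse quotation with its source, then a citation that occurs inside a sentence. The quoted words, author and title are placeholders, and nesting <title> inside <bibl> is an assumption about how the <bibl> section wants sources marked.

```xml
<!-- Sketch only: quoted words, author and title are placeholders. -->

<!-- A set-off quotation of verse, followed by its source. -->
<cit>
  <quote>
    <l>First line of the quoted verse,</l>
    <l>Second line of the quoted verse.</l>
  </quote>
  <bibl>Author Name, <title>Title of the Source</title></bibl>
</cit>

<!-- A citation inside running text: every word inside cit
     sits in either quote or bibl. -->
<p>As one writer put it,
  <cit>
    <quote>the quoted words go here</quote>
    <bibl>Author Name</bibl>
  </cit>
  and the sentence then carries on.</p>
```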
Quotes that are Internal to the Text: Thought, Speech, Writing
Quotations in the text that indicate speech, thought, writing, etc. by one or more characters are marked by various TEI elements. For instance, dialogue or notes written from one character to another would be indicated using the <q> element. The <q> tag will generally come inside of a set of <p> tags, since most dialogue is denoted within the text by setting it apart as a separate paragraph. Quotes can come within quotes, such as when one speaker quotes someone else. If there is an external quote inside an internal quote, for instance, when a character quotes the Bible, the correct tags are used for each so that the two distinct types of quotes stay distinguishable. Sometimes, quotation marks
The emph, foreign, distinct, mentioned, term and soCalled elements indicate that a quote is linguistically set apart. For instance, emph is used to denote special emphasis placed on a word via quotation marks. The foreign element indicates that quotation marks were used because the word is in a foreign language. The distinct element signifies that the quote is in quotation marks to set it apart from the rest of the text due to some linguistic peculiarity, such as slang or regional dialect. The mentioned element indicates that the writer is talking about the word itself rather than using the word, for instance, when discussing the part of speech of the word "canary." The term element indicates that the word was put in quotation marks because it is a discipline-specific or subject-specific term. For example, if the author uses quotation marks to demarcate medical terminology, then term is used. Finally, soCalled is used to indicate scare quotes. If the author distances himself or herself from the word via quotation marks, then you mark the term as soCalled. Below is a reference list of the different TEI elements used to mark up internal quotes:
Quick Reference, Quote Type
*spoken: A quote is spoken out loud by a character in the text. Use said.
*thought: A character thinks a quote, rather than saying it out loud. Use said.
*written: A character internal to the text has written something that is quoted within the text. Use quotation.
*emph: A word or phrase is in quotation marks in order to emphasize it. Use emph.
*distinct: A word or phrase is in quotes because it is linguistically distinct, so, for instance, it is slang or regional dialect. Use distinct.
*mentioned: A word or phrase is in quotes because the author refers to the word itself, such as in a discussion of its part of speech, rather than using the word. Use mentioned.
*term: A word or phrase is in quotes because it is terminology. For instance, computer terms or scientific language. Use term.
*foreign: A word or phrase is in quotation marks because it does not belong to the predominant language used in the text. Use foreign.
*soCalled: An author uses scare quotes to distance himself or herself from a word. Use soCalled.
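A few of the quote types in the list above are sketched here. The sketch assumes that each name in the list is used as a TEI P5 element (<said>, <soCalled>, <foreign>, <mentioned>), and the aloud="false" attribute on <said> is an assumption about how thought might be told apart from speech, since the list gives said for both. The sentences are invented, and the printed quotation marks are left out of the sketch because their treatment is not settled here.

```xml
<!-- Sketch only: sentences are placeholders, and the use of
     aloud and xml:lang is an assumption. -->

<!-- Speech, followed by thought. -->
<p><said>I will call on Tuesday,</said> she told him, though she
  thought, <said aloud="false">I would rather not go at all.</said></p>

<!-- An external quote nested inside a character's speech. -->
<p><said>Remember what is written:
  <cit><quote>the quoted words go here</quote></cit></said></p>

<!-- Words set apart linguistically. -->
<p>The <soCalled>expert</soCalled> praised her
  <foreign xml:lang="fr">savoir-faire</foreign> and then explained
  that <mentioned>canary</mentioned> is a noun.</p>
```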
Closers are most commonly seen at the ends of letters, but they also appear at the end of prefaces, pamphlets and other works. They include the closing salutation, such as "sincerely," the author's name, and sometimes information about the date of publication or the place. Closers are grouped inside the <closer> tag, which can contain <salute>, <signed>, or <dateline>.
<salute> is used to indicate the closing salutation, such as "sincerely" at the end of a letter.
<signed> is used to mark the signature of the document's author.
<dateline> marks any information about the text's creation, such as the date, place or publisher.
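The closer of a letter might therefore be encoded as in this sketch. The name, place and date are invented for illustration.

```xml
<!-- Sketch only: name, place and date are placeholders. -->
<closer>
  <salute>Sincerely,</salute>
  <signed>Jane Doe</signed>
  <dateline>Boston, March 3, 1850</dateline>
</closer>
```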
For how to encode the back matter of the text, see the back matter section.