Prose includes novels, short stories, essays, etc. There are several main tags that we use to mark up the structural elements of prose.
Chapters are designated using <div> tags, each marked with an ID. The ID is formulated by ADD. The <div> tag encloses the whole chapter. Chapter titles (headings) are indicated using a <head> tag. Page breaks come within the chapter <div>. Generally speaking, chapters are the sections of a text directly below books.
Headings include the titles of lists, chapters, sections, etc. in a work. Most commonly, you will use them for chapters, lists and sections when marking prose. <head> tags can only come at the beginning of a <div>, <figure> or <list>. A <head> cannot come in the middle of a <div>. If you are going to mark <head>s in the text, you must start a new <div>, <figure> or <list>. Therefore, a head indicates a new section of the text. Page breaks can (and should) come before <head> tags, but paragraphs and other tags cannot. There can be more than one head tag following the <div>.
*<head> in a <div>: the chapter or division title, including markers such as Chapter II or Section III.
*<head> in a <list>: the title of the list.
*<head> in a <figure>: the caption title of the image, whether it appears below or above the image.
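As a minimal sketch of the chapter and heading rules above, the fragment below shows a chapter <div> with a page break before two <head> tags. The chapter title, the page number, the n attribute on <pb/> and the ID value are invented for illustration. The xml:id attribute name assumes TEI P5 (older TEI versions use id), and the ID value is only a placeholder, since the ID format is not given here.

```xml
<!-- Hypothetical chapter: the ID value and all text content are placeholders -->
<div xml:id="ch02">
  <!-- The page break comes inside the div and before the headings -->
  <pb n="15"/>
  <!-- More than one head can follow the opening div tag -->
  <head>Chapter II</head>
  <head>A Journey Begins</head>
  <!-- Paragraphs come only after the headings -->
  <p>First paragraph of the chapter.</p>
</div>
```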
Paragraphs are marked with a <p> tag. A <p> tag can be used virtually anywhere in the text to mark a prose block. Paragraphs can contain <pb/> tags (page breaks), lists and tables. Paragraphs are extremely versatile and are used in a wide variety of text encoding situations. Generally speaking, if something is written as a paragraph, it can be marked as such. <div> tags cannot come within paragraphs, but <list> tags, <figure> tags, <pb/> tags, <note> tags, and many others can come within <p> tags. So, for instance, if a paragraph is broken up by a blank page and an image, you do not need to close the paragraph to include these features. This allows you to maintain bibliographic accuracy.
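The case of a paragraph interrupted by a blank page and an image might be encoded roughly as follows. This is an illustrative sketch: the page numbers, the n attribute on <pb/>, and all of the text are invented, and it assumes that each <pb/> marks the start of a new page.

```xml
<p>The first part of the paragraph runs to the bottom of the page
  <pb n="21"/>
  <!-- Page 21 is blank, so the next page break follows immediately -->
  <pb n="22"/>
  <!-- Page 22 holds only an image, which has no printed caption -->
  <figure>
    <figDesc>Engraving of a woman seated at a writing desk.</figDesc>
  </figure>
  <pb n="23"/>
  and the same paragraph resumes here without having been closed.</p>
```

Keeping everything inside a single <p> records that the printed paragraph was one unit, even though it spans several pages.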
Figures, Pictures and Images
Because most of the texts in VWW will not include images, you need to mark figures, charts, images and other matter within the text so that the reader understands where they fall in the text and what they look like. All of these things are marked using the same tag, <figure>. A figure can include two sub-elements: <figDesc>, the figure description, and <head>. The <head> and <figDesc> tags can be listed in any order. In other words, <head> can come before or after <figDesc>. The figure <head> contains the caption printed in the text. If there is no caption, you do not use the <head> tag. The <figDesc> element holds a summary of the image and what is featured in it. Remember, this tag tells the reader what the image looks like and how it appears bibliographically. Be as detailed and specific as possible, without writing too much.
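A short sketch of the two common cases, a figure with a printed caption and one without. The caption and descriptions are invented for illustration.

```xml
<!-- Figure with a printed caption: head holds the caption, figDesc the description -->
<figure>
  <head>The Old Mill at Dusk</head>
  <figDesc>Full-page black and white engraving of a water mill beside a
    stream, with the caption printed below the image.</figDesc>
</figure>

<!-- Figure with no caption: the head tag is simply left out -->
<figure>
  <figDesc>Small decorative woodcut of a flower basket at the foot of the
    page.</figDesc>
</figure>
```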
Lists are ordered, itemized information. They can have headings, marked with <head>, but need not. They can come within paragraphs, <p>, and divisions, <div>, but need not. Lists can include many types of information, including images and charts, <figure>, and financial information. Lists can also come within lists, so sublists in a larger list can be marked. This is done by putting the list, figure or other tag within the list <item>.
You can also number the items in a list and indicate whether or not they are bulleted, numbered, or otherwise marked. This is done by giving the <list> a type="" attribute. Bulleted lists are <list type="bulleted">, numbered lists are <list type="ordered">, and lists that are not marked are given the attribute <list type="simple">. You can number the items in lists that are given type="ordered" by using the n="" attribute, or number equals. This looks like <item n="3">.
Lists are marked with the <list> tag. Each individual item in a list is marked with an <item> tag within the <list> tag. Lists can have headers, which are marked using <head>. A <head> tag can come only at the beginning of a list.
Quick Reference, List Type:
*ordered: numbered or lettered list.
*bulleted: list with bullet points.
*simple: list that does not have numbers or other indicators to show items.
*gloss: list made of labeled terms followed by glosses or definitions.
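The list rules above can be sketched as a numbered list with a heading and a nested, unmarked sublist. The list contents are invented for illustration.

```xml
<list type="ordered">
  <!-- The head, if present, comes first -->
  <head>Duties of the Secretary</head>
  <item n="1">Keep the minutes of each meeting.</item>
  <item n="2">Answer correspondence, including:
    <!-- A sublist goes inside the item it belongs to -->
    <list type="simple">
      <item>letters from members</item>
      <item>inquiries from the press</item>
    </list>
  </item>
  <item n="3">Prepare the annual report.</item>
</list>
```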
Lists of Definitions and Terms, Glossaries
Glossaries and other lists that have a term followed by a definition are considered a special type of list, type="gloss", in TEI. These lists are marked with the <list type="gloss"> tag. The list can begin with a <head> tag, but need not. The <label> tag marks the term or phrase being glossed. The <item> tag that follows it marks the definition.
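A minimal sketch of a glossary list, with each <label> followed by the <item> that defines it. The heading is invented for illustration.

```xml
<list type="gloss">
  <head>Terms Used in This Book</head>
  <!-- Each label marks a term; the item after it holds the definition -->
  <label>Folio</label>
  <item>A book made from sheets folded once.</item>
  <label>Octavo</label>
  <item>A book made from sheets folded three times.</item>
</list>
```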
Tables are text displayed in tabular form, that is, in columns and rows. Tables are marked with the <table> tag. This tag is given the attributes rows= and cols= to specify how many rows and columns are in the table. Tables can have a <head>, but need not.
Each row in the table is marked with a <row> tag, given the attribute role=. This attribute describes how a row functions within the table. The role= attribute can have two values, label and data. Label indicates that the row contains information about the values in each column. Data indicates that the row contains data in each column (the actual values).
Within the <row> tag, there are <cell> tags. These tags indicate the specific units within the table. There should be as many cells as there are columns in the table. The rows are in order, but cells are used to indicate columns, rather than a separate <col> tag. The text from the table is placed within the <cell>.
Tables can come inside of paragraphs, lists and many other units.
Quick Reference, Table Attributes
*rows: rows="", the number of rows in a table. Goes inside <table>.
*cols: cols="", the number of columns in a table. Goes inside <table>.
*role: role="", can be label or data. Goes inside <row>.
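Putting the table tags and attributes together, a small table might look like the sketch below. The table contents are invented, and counting the label row in the rows= total is an assumption.

```xml
<table rows="3" cols="2">
  <head>Household Expenses</head>
  <!-- The label row names what each column contains -->
  <row role="label">
    <cell>Item</cell>
    <cell>Cost</cell>
  </row>
  <!-- Data rows hold the actual values, one cell per column -->
  <row role="data">
    <cell>Flour</cell>
    <cell>2s.</cell>
  </row>
  <row role="data">
    <cell>Tea</cell>
    <cell>4s.</cell>
  </row>
</table>
```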
Quotes are denoted by quotation marks. Only text that comes within quotation marks will be marked as a quotation for the purposes of encoding. There are two types of quotes: quotes that are external to the text and quotes that are internal. External quotes are quotes that come from outside the text, like a reference to a study or another book. Internal quotes are quotes that are from inside the text: character speeches or thoughts, notes written by characters, or terms used in the book.
Quotes that are External to the Text: Outside Sources and Other References
Quotes that come from outside the text are marked by first using a <cit> tag, to denote an external citation. Within the <cit> tag there are two smaller parts, <quote> and <bibl>. <quote> encompasses the body of the quote, or actual quoted text. The <bibl> tag encompasses any bibliographic reference given that identifies the source of the text, such as a title or author. For a more comprehensive discussion of the <bibl> tag, please see the <bibl> section of the guidelines. The <cit> tag denotes the citation as a unit, and the <quote> and <bibl> tags denote smaller portions of the larger unit. Other tags can be used inside a quote. For instance, inside the <quote> tag, you can have an <l> tag to denote a line of poetry.
Sometimes, citations will occur within the text. In that case, you still use the <cit> tag and mark the quote as you normally would. You must remember, however, that all of the words within the <cit> must be within either a <bibl> or a <quote> tag. You do not need both <quote> and <bibl>, but you do need at least one.
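The two situations above, a citation set apart from the text and a citation within a paragraph, can be sketched as follows. The surrounding sentence is invented, and keeping the printed quotation marks inside the <quote> tag is an assumption, since their placement is not specified here.

```xml
<!-- A citation set apart from the text, with a line of poetry and its source -->
<cit>
  <quote>
    <l>"A thing of beauty is a joy for ever"</l>
  </quote>
  <bibl>Keats, Endymion</bibl>
</cit>

<!-- A citation within a paragraph: every word inside cit is in quote or bibl -->
<p>The preface closes with the line
  <cit><quote>"A thing of beauty is a joy for ever"</quote>
  <bibl>(Keats)</bibl></cit> and nothing more.</p>
```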
Quotes that are Internal to the Text: Thought, Speech, Writing
Quotations in the text that indicate speech, thought, writing, etc. by one or more characters are marked by the <q> element. For instance, dialogue or notes written from one character to another would be indicated using this element. Quotations that are external to the text are marked using a different tag. For example, you would not mark a quote from Plato, the Bible or any other external source using the <q> tag. The <q> tag will generally come inside a set of <p> tags, since most dialogue is set apart in the text as a separate paragraph. Quotes can come within quotes, such as when one speaker quotes someone else. If there is an external quote inside an internal quote, for instance when a character quotes the Bible, each quote is marked with its own tags so that the two types remain distinct. The type attribute is used with <q> to indicate the nature of the quote. Acceptable values for the type attribute, in this case, are spoken, thought, emph, distinct, mentioned, term, foreign, soCalled and written.
The attribute type values thought, spoken and written are precisely as they seem. They indicate that a quote is thought, written or spoken by a character in the text.
The emph, foreign, distinct, mentioned, term and soCalled values indicate that a quote is linguistically set apart. For instance, emph is used to denote special emphasis placed on a word via quotation marks. The foreign value indicates that quotation marks were used because the word is in a foreign language. The distinct value signifies that the quote is in quotation marks to set it apart from the rest of the text due to some linguistic peculiarity, such as slang or regional dialect. Mentioned is used to indicate that the writer is talking about the word itself rather than using the word, for instance, when discussing the part of speech of the word "canary." Term indicates that the word was put in quotation marks because it is a discipline-specific or subject-specific term. For instance, if the author uses quotation marks to demarcate medical terminology, then the term type would be indicated. Finally, soCalled is used to indicate scare quotes. If the author distances him or herself from the word via quotation marks, then you mark the term as "soCalled."
Quick Reference, Quote Type
*spoken: A quote is spoken out loud by a character in the text.
*thought: A character thinks a quote, rather than saying it out loud.
*written: A character internal to the text has written something that is quoted within the text.
*emph: A word or phrase is in quotation marks in order to emphasize it.
*distinct: A word or phrase is in quotes because it is linguistically distinct, so, for instance, it is slang or regional dialect.
*mentioned: A word or phrase is in quotes because the author refers to the word itself, such as in a discussion of its part of speech, rather than using the word.
*term: A word or phrase is in quotes because it is terminology. For instance, computer terms or scientific language.
*foreign: A word or phrase is in quotation marks because it does not belong to the predominant language used in the text.
*soCalled: An author uses scare quotes to distance him or herself from a word.
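A few of the type values above, sketched on invented sentences. Keeping the printed quotation marks inside the <q> tag is an assumption, since their placement is not specified here.

```xml
<p><q type="spoken">"I shall write to her tonight,"</q> said Mary.</p>
<p><q type="thought">"He will never agree,"</q> she thought.</p>
<p>The note read only <q type="written">"Come at once."</q></p>
<p>The villagers called the path a <q type="distinct">"snicket."</q></p>
<p>The <q type="soCalled">"improvements"</q> had ruined the garden.</p>

<!-- An external quote inside a character's speech: cit and quote sit inside q -->
<p><q type="spoken">"Remember what we were taught:
  <cit><quote>'Ask, and it shall be given you.'</quote></cit>"</q></p>
```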
Closers are most commonly seen at the ends of letters, but they also appear at the end of prefaces, pamphlets and other works. They include the closing salutation, such as "sincerely," the author's name, and sometimes information about the date of publication or the place. Closers are grouped inside the <closer> tag, which can contain <salute>, <signed>, or <dateline>.
<salute> is used to indicate the closing salutation, such as "sincerely" at the end of a letter.
<signed> is used to mark the signature of the document's author.
<dateline> marks any information about the text's creation, such as the date, place or publisher.
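As a brief sketch, the closer of a letter might be encoded like this. The name, place and date are invented for illustration.

```xml
<closer>
  <!-- The closing salutation -->
  <salute>Sincerely,</salute>
  <!-- The signature of the document's author -->
  <signed>Jane Smith</signed>
  <!-- Where and when the text was written -->
  <dateline>London, 3 May 1852</dateline>
</closer>
```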