


See the Front Matter page for more detailed information.

Each book of poetry contains a table of contents. To encode it, use a <div type="contents"> tag containing a <list type="simple"> with one <item> tag for each poem in the book. Inside each <item>, use a <ref> tag whose target attribute points to the page on which the poem appears: the value of target should correspond to the xml:id given to the page break at that point in the electronic text. This lets readers jump to a poem without scrolling down the page.
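As a minimal sketch of the structure described above, a table of contents with two entries might look like the following. The poem titles, the xml:id values, and the page numbers are hypothetical. The leading # in target follows general TEI pointer syntax and is an assumption about how this project writes the reference.

```xml
<!-- Table of contents: one <item> per poem (titles and ids are hypothetical) -->
<div type="contents">
  <list type="simple">
    <item><ref target="#vwwp-sample-007">Title of the First Poem</ref></item>
    <item><ref target="#vwwp-sample-012">Title of the Second Poem</ref></item>
  </list>
</div>

<!-- Later, in the body: the page break that the first entry points to -->
<pb n="7" xml:id="vwwp-sample-007"/>
```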



The body will always take the following structure (with a few exceptions):

  • body
    • div type="poem"
      • div type="canto" (if applicable)
        • any of the below tags needed to encode the text
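The outline above could be sketched as follows, assuming a long poem divided into cantos. Titles and line text are placeholders, and the rend values are illustrative only.

```xml
<body>
  <div type="poem">
    <head rend="center">Title of the Poem</head>
    <!-- The canto level is present only when the poem is divided into cantos -->
    <div type="canto">
      <head rend="center">Canto I</head>
      <lg type="couplet">
        <l n="1">First line of the canto</l>
        <l n="2">Second line of the canto</l>
      </lg>
    </div>
  </div>
</body>
```

For a poem without cantos, the <head> and <lg> tags would sit directly inside the <div type="poem">.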

There will be 5 TEI tags used when encoding poetry.

  1. <div> for division that marks each poem
  2. <head> for the title and subtitle of the poem
  3. <lg> for stanzas within a poem
  4. <l> for each line of poetry
  5. <pb> for page break

Division (div)

The <div> tag will be used to demarcate each poem or section (e.g., canto, dedication, etc.) in the book.
The value of the type attribute will be one of the following:

  • poem
  • part (verse volumes are sometimes carved up in parts and are usually indicated thusly)
  • canto (a major division of an epic or otherwise long poem)
  • notes
  • dedication
  • dialogue

If none of these values correctly describes the section of text you are encoding, document the problem in the VWWP Encoding Problems page.


On rare occasions, more often in verse texts but possibly in prose as well, a dialogue appears that is not attributed to a particular act or scene, as it would be in a drama. When such a dialogue appears without any context, enclose the section in a division with type "dialogue". To see an example of text with this feature, visit:

The following tags will be found ONLY inside the <div> tag:


The <head> tag is used to encode the title and subtitles of the poem. If the poem does NOT have a title, the <head> tag will not be used. The <head> tag will usually have a rend=" " attribute to denote the layout of the title of the page. Possible values for rend are:

  • center for a title centered on the page
  • left for a title to the left of the page
  • right for a title to the right of the page
  • uc for a title in all uppercase letters
  • sc for a title in small capital letters (small caps)

One rend attribute can contain multiple formatting values as long as those values are separated by a space.


  • Without a title
  • With title and multiple formatting values
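The two cases listed above might be encoded as in this sketch. The title and line text are placeholders, and the combination of rend values is only one possibility.

```xml
<!-- Without a title: no <head>; the first stanza opens the poem -->
<div type="poem">
  <lg type="couplet">
    <l n="1">First line of the poem</l>
    <l n="2">Second line of the poem</l>
  </lg>
</div>

<!-- With a title and multiple formatting values, separated by a space -->
<div type="poem">
  <head rend="center uc">TITLE OF THE POEM</head>
  <lg type="couplet">
    <l n="1">First line of the poem</l>
    <l n="2">Second line of the poem</l>
  </lg>
</div>
```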

For more information on how to encode the head element see the general guidelines.

Line group (lg)

The <lg> tag will be used to separate stanzas of poetry. The <lg> tag will often immediately follow <head>. Each <lg> can contain a type attribute with a number of possible values. Attributes and possible values:

  1. type=" " to denote the type of stanza
    • couplet for 2 line stanza
    • tercet for 3 line stanza
    • quatrain for 4 line stanza
    • quintet for 5 line stanza
    • sestet for 6 line stanza
    • septet for 7 line stanza
    • octave for 8 line stanza
    • indeterminate for longer than 8 lines or of unknown structure
    • verse_paragraph
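As an illustration of the type values above, a poem made up of a tercet followed by a couplet could be sketched like this. The line text is a placeholder.

```xml
<div type="poem">
  <head rend="center">Title of the Poem</head>
  <lg type="tercet">
    <l n="1">First line</l>
    <l n="2">Second line</l>
    <l n="3">Third line</l>
  </lg>
  <!-- Line numbers continue from the previous stanza -->
  <lg type="couplet">
    <l n="4">Fourth line</l>
    <l n="5">Fifth line</l>
  </lg>
</div>
```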

Rhyme Scheme

The rhyme attribute (e.g., <lg rhyme="abab">) indicates the rhyme scheme applicable to a group of verse lines. Indicate the rhyme scheme with lowercase letters. Below are a few common schemes:

  • aa (couplet)
  • abab cdcd efef gg (Shakespearean sonnet)
  • abba (enclosing rhyme)
  • etc.

If the rhyme scheme is irregular, use the following value:

  • irregular
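A short sketch of the rhyme attribute in use, with placeholder line text. The first stanza has a regular scheme; the second assumes a stanza whose rhymes follow no regular pattern.

```xml
<!-- Regular scheme: lines 1 and 3 rhyme, lines 2 and 4 rhyme -->
<lg type="quatrain" rhyme="abab">
  <l n="1">Line ending in rhyme a</l>
  <l n="2">Line ending in rhyme b</l>
  <l n="3">Line ending in rhyme a</l>
  <l n="4">Line ending in rhyme b</l>
</lg>

<!-- No regular pattern: use the value "irregular" -->
<lg type="tercet" rhyme="irregular">
  <l n="5">Fifth line</l>
  <l n="6">Sixth line</l>
  <l n="7">Seventh line</l>
</lg>
```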

Unique Situations: Sonnet

If the poem is a sonnet, be sure to add an extra <lg> tag with a type attribute and value that matches the type of sonnet it is, either:

  • english_sonnet (also known as a Shakespearean sonnet: consisting of three quatrains and a concluding couplet in iambic pentameter with the rhyme pattern abab cdcd efef gg)
  • italian_sonnet (also known as a Petrarchan sonnet: consisting of an octave with the rhyme pattern abbaabba, followed by a sestet with the rhyme pattern cdecde or cdcdcd)

NOTE: Be sure that the words are separated by an underscore (_); otherwise the XML will be invalid.

Then use further <lg> tags to separate the component stanzas (for example, the octave and sestet of an Italian sonnet). Give each of these <lg> tags a type attribute with the value that corresponds to the stanza type, and a rhyme attribute with the proper value.
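A possible encoding of an Italian sonnet is sketched below. It assumes that the extra <lg> wraps the octave and sestet, which is one reading of the instructions above. Line text is a placeholder, and most lines are omitted for brevity.

```xml
<div type="poem">
  <head rend="center">Sonnet</head>
  <!-- Outer <lg> names the kind of sonnet -->
  <lg type="italian_sonnet">
    <lg type="octave" rhyme="abbaabba">
      <l n="1">First line of the octave</l>
      <!-- lines 2 to 7 omitted from this sketch -->
      <l n="8">Last line of the octave</l>
    </lg>
    <lg type="sestet" rhyme="cdecde">
      <l n="9">First line of the sestet</l>
      <!-- lines 10 to 13 omitted from this sketch -->
      <l n="14">Last line of the sestet</l>
    </lg>
  </lg>
</div>
```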


Unique Situations: Stanza Titles

If a poem has stanza titles or numbers, encode them using the <head> tag. In these situations, make sure that the <head> tag is inside the <lg> tag, and use the rend attribute for any formatting values on the <head> tag.
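For instance, numbered stanzas might be encoded as in this sketch. The numerals, rend value, and line text are placeholders.

```xml
<lg type="couplet">
  <!-- The stanza number goes in a <head> inside the <lg> -->
  <head rend="center">I.</head>
  <l n="1">First line of the poem</l>
  <l n="2">Second line of the poem</l>
</lg>
<lg type="couplet">
  <head rend="center">II.</head>
  <l n="3">Third line of the poem</l>
  <l n="4">Fourth line of the poem</l>
</lg>
```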


Line (l)

The <l> tag will be used to mark each line of poetry. EVERY <l> tag will contain an n=" " attribute (for "number") whose value is the line number within the poem. NOTE: Line numbers correspond to the line's position in the poem, not in the stanza, so numbering will NOT restart for each stanza. The other attribute that a line may have is rend=" ", which is again used to denote formatting. Here is a list of possible values for rend inside an <l> tag:

  • ti-1 for text indent of 1 tab space
  • ti-2 for text indent of 2 tab spaces
  • ti-3 for text indent of 3 tab spaces
  • ti-4 for text indent of 4 tab spaces
  • ti-5 for text indent of 5 tab spaces
  • and so on for as many tab spaces as needed
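The numbering and indentation rules above can be sketched as follows, with placeholder line text. Note that the second stanza continues at line 5 rather than restarting at 1.

```xml
<lg type="quatrain">
  <l n="1">Line with no indent</l>
  <l n="2" rend="ti-1">Line indented one tab space</l>
  <l n="3">Line with no indent</l>
  <l n="4" rend="ti-2">Line indented two tab spaces</l>
</lg>
<lg type="couplet">
  <!-- Numbering continues across stanzas -->
  <l n="5">Line with no indent</l>
  <l n="6" rend="ti-1">Line indented one tab space</l>
</lg>
```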

Occasionally, an individual line of verse runs over onto one or more additional printed lines within the stanza; this is typically indicated by a lack of capitalization at the start of each continuation line. In these cases, the lines are separate only typographically and do not represent new, syntactical lines of verse. Add a line break tag <lb/> as needed (a conceptual line of verse could be "broken" over several printed lines) and use the text indent rend attribute with the <hi> tag to convey any additional text indenting if present.
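A sketch of one verse line printed across two lines, with the continuation indented. The line text is a placeholder, and wrapping the continuation in <hi> is one reading of the instruction above.

```xml
<lg type="couplet">
  <!-- One conceptual line of verse, broken over two printed lines -->
  <l n="1">A long line of verse that does not fit the width of the page<lb/>
    <hi rend="ti-2">and so continues here without a capital</hi></l>
  <l n="2">The next line of verse</l>
</lg>
```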

Tags that MAY appear outside the <div> tag

Page Break: <pb/>

Page breaks can appear anywhere in the document, and should correspond to the start (or top) of the page:

  • If a page break comes in the middle of a stanza, place a <pb/> tag between the last line of the previous page and first line of the next page.
  • If a page break happens at the end of a stanza, close the <lg> before the <pb/>.
  • If the page break occurs at the end of a poem, close the <div> before the <pb/>.

For more information about how to encode page breaks and represent page numbering, see the general guidelines.


  • Break in the middle of a stanza
  • Break after stanza end, with page number
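The two cases listed above might look like the following sketch. The page numbers and xml:id values are hypothetical, and recording the page number in the n attribute is an assumption based on common TEI practice; the general guidelines govern the actual form.

```xml
<!-- Break in the middle of a stanza -->
<lg type="quatrain">
  <l n="1">First line</l>
  <l n="2">Last line on the previous page</l>
  <pb xml:id="vwwp-sample-008"/>
  <l n="3">First line on the next page</l>
  <l n="4">Fourth line</l>
</lg>

<!-- Break after stanza end, with page number -->
<lg type="couplet">
  <l n="5">Fifth line</l>
  <l n="6">Sixth line</l>
</lg>
<pb n="9" xml:id="vwwp-sample-009"/>
<lg type="couplet">
  <l n="7">Seventh line</l>
  <l n="8">Eighth line</l>
</lg>
```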

Typographical Separations in Verse

Typographical markers are often used to indicate informal divisions within texts, usually poetry. These often appear as a string of asterisks, dots, symbols, graphics, etc.
The content should mimic the characters printed on the page as much as possible. If a graphic is used that cannot be represented with keyboard characters, use asterisks instead. Capture this information in an anonymous block tag, <ab type="typography">.

Possible values include:

  • <ab type="typography" rend="center">******</ab> (asterisk, also used as the default if the visual marker cannot be captured any other way)
  • <ab type="typography" rend="center">------</ab> (dash)
  • <ab type="typography" rend="center">~~~~~~</ab> (tilde)
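In context, a separator between two stanzas might be placed as in this sketch. The line text is a placeholder, and placing the <ab> between the two <lg> tags is an assumption about where the marker sits on the printed page.

```xml
<lg type="couplet">
  <l n="1">First line</l>
  <l n="2">Second line</l>
</lg>
<!-- Row of asterisks printed between the stanzas -->
<ab type="typography" rend="center">******</ab>
<lg type="couplet">
  <l n="3">Third line</l>
  <l n="4">Fourth line</l>
</lg>
```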

Other Tags and Attributes used for formatting:

Epigraph <epigraph>

For more detail about encoding epigraphs, see the general guidelines.

  • Same author, multiple lines of poetry
  • Different author, single line, citation:
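The two cases listed above could be sketched as follows. This is only an illustration built from standard TEI elements (<lg>, <l>, <cit>, <quote>, <bibl>); the project's general guidelines may prescribe a different structure, and the text and citation are placeholders. Whether epigraph lines carry an n attribute is not stated here, so it is omitted.

```xml
<!-- Same author, multiple lines of poetry -->
<epigraph>
  <lg>
    <l>First line of the epigraph</l>
    <l>Second line of the epigraph</l>
  </lg>
</epigraph>

<!-- Different author, single line, with citation -->
<epigraph>
  <cit>
    <quote>A single quoted line</quote>
    <bibl>Name of the quoted author</bibl>
  </cit>
</epigraph>
```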

Argument <argument>

The <argument> tag is used to mark a brief description of the contents of the section of text to follow, or of the occasion that prompted its writing.
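A minimal sketch of an argument at the start of a canto, assuming the summary is a single prose paragraph wrapped in <p>. The heading and text are placeholders.

```xml
<div type="canto">
  <head rend="center">Canto I</head>
  <!-- Brief summary of what follows, as printed before the verse -->
  <argument>
    <p>Summary of the contents of this canto, as printed.</p>
  </argument>
  <lg type="couplet">
    <l n="1">First line of the canto</l>
    <l n="2">Second line of the canto</l>
  </lg>
</div>
```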


Some poems may include information such as an attribution following the poem. In such cases, use the <closer> element. For more detail about encoding closers, see the general guidelines.
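As a sketch, an attribution printed after the last stanza might be encoded like this. The use of <signed> inside <closer> is an assumption drawn from standard TEI; the general guidelines may specify otherwise. All text is a placeholder.

```xml
<div type="poem">
  <head rend="center">Title of the Poem</head>
  <lg type="couplet">
    <l n="1">First line of the poem</l>
    <l n="2">Last line of the poem</l>
  </lg>
  <!-- Attribution printed after the poem -->
  <closer>
    <signed rend="right">Name or initials as printed</signed>
  </closer>
</div>
```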


Tags used for Italics, Bold, and text size

When a word or phrase is typographically different from its surrounding text, use the <hi> tag (for "highlighted text") with the rend attribute to signal the difference. Various values can be used with the rend attribute, but these are the most common:

  1. i for italics
  2. b for bold
  3. uc when a word is all uppercase capitals and the font is larger than the surrounding text
  4. sc when a word is all capitals but the font size is the same as the surrounding text

For more general information about additional values see the general guidelines.

The <hi rend="i"> tag will be used both for text that is italicized for formatting and for text that is italicized for emphasis.


Italics for formatting (to distinguish speaker and stage directions) and emphasis:
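A sketch of both uses within verse lines follows. The speaker name, stage direction, and line text are placeholders.

```xml
<lg type="couplet">
  <!-- Italics for formatting: the speaker's name and a stage direction -->
  <l n="1"><hi rend="i">Speaker.</hi> First line of the speech <hi rend="i">(aside)</hi></l>
  <!-- Italics for emphasis: a single stressed word -->
  <l n="2">A line in which one word is <hi rend="i">stressed</hi> by the poet</l>
</lg>
```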

Back Matter

See the Back Matter page for more detailed information.


If a part of the verse text that you are trying to encode does not fit one of the above described features, document the problem in the VWWP Encoding Problems page.
