Academic Text Service (ATS)

380 Meyer Library
email: ats@lists.stanford.edu
phone: 725-3163


Textual Mark-Up

TEI-compliant textual mark-up can be, as you've probably gathered, an expensive proposition. It requires:

The DTD you use will be determined by the level of mark-up required for individual projects. This will probably involve:

On the basis of this general analysis, you should be able to sketch out how you will treat the variety of documents you plan to serve. You might, for example, decide that for dramas it is sufficient to tag act breaks, scene breaks, speeches, stage directions, cast lists, and front and back matter. In individual cases, or for specific pedagogical or research purposes, the level of mark-up might increase to include lineation, subdivision of speeches into other components (e.g., prose, verse, songs), metrical markings (for verse drama), and any number of other features of value to the particular purpose.
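A basic level of drama mark-up of this kind might be sketched as follows. The fragment is only illustrative: the element names (div1, div2, sp, speaker, stage, castList) follow TEI conventions as we understand them, the play text is a placeholder, and the sample is written so that it is also well-formed XML, an assumption made purely so that Python's standard library can read it. A real SGML document would need an SGML parser and the DTD itself.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Placeholder drama fragment at the "basic" level of mark-up:
# cast list, act and scene breaks, speeches, stage directions.
SAMPLE = """\
<text>
  <front>
    <castList>
      <castItem>FIRST SPEAKER</castItem>
      <castItem>SECOND SPEAKER</castItem>
    </castList>
  </front>
  <body>
    <div1 type="act" n="1">
      <div2 type="scene" n="1">
        <stage>Enter both speakers.</stage>
        <sp>
          <speaker>FIRST SPEAKER</speaker>
          <p>A speech, left as one undivided block.</p>
        </sp>
        <sp>
          <speaker>SECOND SPEAKER</speaker>
          <p>A reply.</p>
        </sp>
      </div2>
    </div1>
  </body>
</text>
"""


def count_features(markup):
    """Count how often each element occurs in a marked-up text."""
    root = ET.fromstring(markup)
    return Counter(element.tag for element in root.iter())


counts = count_features(SAMPLE)
for tag in ("div1", "div2", "sp", "stage", "castItem", "l"):
    print(tag, counts[tag])
```

A richer level of mark-up would show up here as additional elements, for instance verse lines tagged individually inside each speech; at the basic level the count for "l" is simply zero.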

The TEI Lite, a version of the TEI that attempts to gather in the 'most useful' tags from the full range of tag sets, is well-suited to perform the basic task, but you are likely to find that you will occasionally need to make modifications to the DTD that permit the inclusion of some of the other, richer and more specific tag sets.

This general analysis is also useful because it leads directly into the formation of a methodology for document analysis. Defining the major document types that you will serve, and their components, transforms the job of the staff into one of analysis rather than the drudgery of marking text up. While the text, once analyzed at this level, may not "tag itself," developing in staff the ability to recognize ambiguities will lead to a further refinement of the document analysis process itself, and thereby to documents with a "richer" set of features. The actual process of marking text up according to an established DTD can be made much easier by using one of a number of SGML editors.

Author/Editor runs on Unix, Windows, and Macintosh platforms; Adept runs on Unix and Windows; psgml is a major mode of Emacs that runs primarily on Unix and is best suited to an X11 windowing environment. Author/Editor and Adept are commercial products; psgml is free software distributed under the GNU General Public License.

These products (by no means the only ones available) all read and interpret DTDs, inform users of which tags are correct in which context, enforce (if instructed) the rules of the DTD, allow users to enter the entities defined by the DTD or to create new ones local to the current document instance, and permit rapid navigation of the tree-structure of the document.
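To make the idea of "which tags are correct in which context" concrete, here is a minimal sketch of the kind of check such an editor performs. It is not how any of these products is implemented: the table of allowed children is a hypothetical, heavily simplified stand-in for a DTD, and it ignores the ordering and occurrence rules that real content models express.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified content model: element -> permitted child elements.
ALLOWED_CHILDREN = {
    "body": {"div1"},
    "div1": {"div2", "stage"},
    "div2": {"sp", "stage"},
    "sp": {"speaker", "p", "l", "stage"},
    "speaker": set(),
    "stage": set(),
    "p": set(),
    "l": set(),
}


def allowed_here(parent):
    """List the tags that may be inserted inside the given element."""
    return sorted(ALLOWED_CHILDREN.get(parent, set()))


def find_violations(root):
    """Report every child element that its parent does not permit."""
    problems = []
    for parent in root.iter():
        permitted = ALLOWED_CHILDREN.get(parent.tag, set())
        for child in parent:
            if child.tag not in permitted:
                problems.append(
                    f"<{child.tag}> not allowed inside <{parent.tag}>"
                )
    return problems


good = ET.fromstring(
    "<body><div1><div2><sp><speaker>A</speaker><p>text</p></sp></div2></div1></body>"
)
# Here a speech sits directly in an act, skipping the scene level.
bad = ET.fromstring("<body><div1><sp><speaker>A</speaker></sp></div1></body>")

print(allowed_here("div2"))
print(find_violations(good))
print(find_violations(bad))
```

An editor that enforces the DTD would refuse the second insertion outright; one that merely informs would offer only the tags returned by a lookup like allowed_here.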

Author/Editor and Adept also come with validation routines, while psgml does not (it does, however, permit easy add-on of validation programs such as sgmls). Both Author/Editor and Adept use pre-compiled forms of the DTD, and compiling a DTD requires the purchase of a specialized add-on (RulesBuilder for Author/Editor and Document Architect for Adept); psgml has a compilation routine built in.

Unless you are interested in creating an SGML-based publication environment (which would preclude selecting psgml), the choice among the products is largely a matter of personal preference and productivity. Although Adept is the most expensive, it offers more options for customization to a local environment, and might for that reason be the best choice for Windows and Unix users. Alternatively, if your environment is Mac-based, then Author/Editor is effectively your only option. Similarly, psgml is a good option only where Unix workstations, or X11 servers on Macs or Windows machines, are available.

In all cases, however, you will need to adjust to the SGML environment and rethink how you use the software. With pre-existing text, using any of these products requires a strategy that is tree-based, not linear. That is, the first operations mark the large textual structures, and later ones refine these into their logical components. This process is repeated until the basic structure is in place, and from there 'phrase' or 'word' level components are added. There are fewer difficulties if the text is being tagged and created at the same time: here the DTD can act more like a 'style sheet.' In both instances, your users will soon discover that knowing the DTD and how it works is essential to successful and productive mark-up, and that the product that helps them work most effectively is the right one.
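The tree-based strategy for pre-existing text can be sketched in a few lines. This is an illustration under stated assumptions, not a production tool: it assumes a hypothetical plain-text convention in which acts and scenes are introduced by lines such as "ACT 1" and "SCENE 1", stage directions are in square brackets, and every speech is a single line of the form "NAME: text". The function names are our own.

```python
import re
from xml.sax.saxutils import escape

# Placeholder text in the assumed plain-text convention.
PLAIN_TEXT = """\
ACT 1
SCENE 1
[Enter two speakers.]
FIRST: Opening line of the play.
SECOND: A reply.
SCENE 2
FIRST: A line in the second scene.
ACT 2
SCENE 1
SECOND: A line in the second act.
"""


def split_on(heading, text):
    """Split text into (number, chunk) pairs at lines like 'ACT 1'."""
    parts = re.split(rf"^{heading} (\d+)\n", text, flags=re.MULTILINE)
    # parts is [preamble, number, chunk, number, chunk, ...]
    return list(zip(parts[1::2], parts[2::2]))


def tag_line(line):
    """Tag one line as either a stage direction or a speech."""
    stage = re.fullmatch(r"\[(.+)\]", line)
    if stage:
        return f"<stage>{escape(stage.group(1))}</stage>"
    speaker, _, speech = line.partition(": ")
    return f"<sp><speaker>{escape(speaker)}</speaker><p>{escape(speech)}</p></sp>"


def mark_up(text):
    out = ["<body>"]
    for act_n, act in split_on("ACT", text):           # pass 1: largest units
        out.append(f'<div1 type="act" n="{act_n}">')
        for scene_n, scene in split_on("SCENE", act):  # pass 2: refine each act
            out.append(f'<div2 type="scene" n="{scene_n}">')
            for line in scene.splitlines():            # pass 3: components
                if line.strip():
                    out.append(tag_line(line))
            out.append("</div2>")
        out.append("</div1>")
    out.append("</body>")
    return "\n".join(out)


result = mark_up(PLAIN_TEXT)
print(result)
```

Each pass works only inside the structures established by the pass before it, which is the point of the strategy: phrase- and word-level tagging would be a further pass over the content of each speech.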

Last Update: July 6, 1995
