The database and network environment in which electronic texts can be served is quite varied. In general, however, the best current technology is based on Unix and Windows NT servers. Depending on the platform, clients to these servers will reside on the workstations of the users, or, in the case of Unix, application clients can also run under X Windows and be delivered to the workstations via X11 servers on those individual workstations. I am not aware of any current SGML-based database servers that run on the Mac platform, but there are Mac clients (and MacX) for some Unix database products.
Most electronic text centers that permit textual analysis and deliver information over a network rely on Open Text's database engine (also known as PAT). It fully supports SGML data and can search very large databases very rapidly. The Open Text DBMS runs on a variety of Unix platforms and on Windows NT. Open Text can provide a number of 'generic' clients, including a character-based VT100 front-end and clients for Windows, Unix, and, more recently, the Mac. Open Text has also reported plans to develop an API for Web-based access to PAT, and some institutions, Stanford among the first, have developed local clients to the PAT DBMS.
The PAT DBMS and its associated clients are excellent at retrieval and can be configured to return KWIC lists, concordances, portions of SGML documents, or entire documents. Depending on the needs of your users and researchers, these configuration options may be enough. It may also be necessary to use the PAT system as a front-end into large textual corpora whose filtered data is then extracted for analysis using other systems; in such cases clients will need to be configured correctly, and for more exacting searching, the 'native PAT' search interface may be appropriate.
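To make the KWIC (keyword-in-context) output mentioned above concrete, here is a minimal sketch in Python. It is not PAT's implementation or interface; the function name kwic and the width parameter are hypothetical, chosen only for illustration, and the sketch assumes a plain-text string with no SGML markup.

```python
import re

def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per whole-word match of keyword.

    Hypothetical helper for illustration; matching is case-insensitive.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for m in pattern.finditer(text):
        # Take up to `width` characters of context on each side of the match.
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{m.group(0)}] {right:<{width}}")
    return lines

if __name__ == "__main__":
    sample = "The cat sat on the mat. The dog ignored the cat."
    for line in kwic(sample, "cat", width=15):
        print(line)
```

A concordance is essentially the same listing produced for every word in the text rather than for one keyword.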
The use to which the e-text will be put will be a determining factor when considering indexing, character mappings, stop words, and the like. It is possible to create indexes that give the user entry into the text string at any point, i.e. from the middle of one word through to the middle of the next. We should also note that different scholars or discipline areas may have different requirements: while the literary scholar may find word indexes appropriate, the linguistic scholar may wish to focus on suffixes, lemmas, or other sub-word particles.
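The difference between a word index and an index with entry points at every character can be sketched as follows. This is an illustrative toy in Python, not a description of how PAT builds its indexes: the functions build_index and find are hypothetical, the sort is naive, and it would not scale to a large corpus.

```python
import bisect
import re

def build_index(text, every_char=False):
    """Return entry positions sorted by the text that follows each one.

    every_char=False indexes only word starts (a word index);
    every_char=True indexes every character position.
    """
    if every_char:
        points = range(len(text))
    else:
        points = [m.start() for m in re.finditer(r"\w+", text)]
    return sorted(points, key=lambda i: text[i:])

def find(text, index, query):
    """Return the indexed positions at which the text continues with query."""
    # Prefixes of lexicographically sorted suffixes are themselves sorted,
    # so a binary search over them is valid.
    keys = [text[i:i + len(query)] for i in index]
    lo = bisect.bisect_left(keys, query)
    hi = bisect.bisect_right(keys, query)
    return sorted(index[lo:hi])

if __name__ == "__main__":
    text = "singing kings sing"
    word_index = build_index(text)
    char_index = build_index(text, every_char=True)
    print(find(text, word_index, "ing"))  # no word starts with "ing"
    print(find(text, char_index, "ing"))  # mid-word matches are reachable
```

With the word index, a search for a suffix such as "ing" finds nothing because no indexed entry point begins with it; with the character-level index the same search reaches every occurrence, which is the kind of access a linguist studying suffixes would need.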
Designing your system so that access issues, client configuration, and indexing are all in sync is a familiar set of problems for library system administrators and analysts, but less so for the usual providers of e-texts. Involving and integrating knowledgeable technical staff is critical to ensuring that the process of creating and updating e-text databases runs smoothly.
Finally, server-based network delivery systems are all subject to the inherent problems associated with fluctuating user demand. You can count on periods of high network activity coinciding with periods of high server activity, both of which may degrade response time and lead to user dissatisfaction. The question of "who do you call when the e-text server melts during finals week" will need to be answered, ideally before it actually does, and preferably by placing responsibility for the operational provision of the service in the hands of operational staff.
Last Updated: July 6, 1995