Archives and On-Line Resources

by Henry Lowood

Conference on the History of Computing organized by the IEEE History Center,

William & Mary College, Williamsburg, Virginia, 15 June 1997


Writing the history of computing, as in any historical field, usually means writing about or around historical sources. If you want to interpret historical evidence, first you have to find it. Today, I will tell you about archival and on-line research materials and how to work with institutions, technologies and staff to facilitate your access to them.

For both traditional and on-line sources:

  • What are the creators and curators of these historical resources trying to do, given the disciplinary and institutional contexts of their activities?
  • What kinds of collections are available, and where are they?
  • How can you gain access to these sources for research?
  • What pitfalls might be expected, and how can you avoid them?

As the mix of topics in my talk suggests, I am inclined to view digital resources as at least potentially equivalent to other resources found in archival and manuscript repositories, especially for the history of computing. In fact, this is not yet the case today, but the trend, as weathermen and stockbrokers say, is favorable.

It is important to keep in mind that there are hundreds, if not thousands, of repositories with collections that are relevant, at least in part, to the history of computers and computing. Consider topics such as the history of hospital information management, library database technology, scientific computation, digital typography, or computer graphics in the film industry: Many kinds of institutions have produced significant and relevant records, and, in turn, these records may be found in many different kinds of repositories, from relatively open government record centers and university archives to relatively closed private collections and corporate records centers. The spectrum ranges from the Library of Congress to the Disney archives. Rather than cover everything, a hopeless task, I will touch on a few salient issues and relevant examples, most of them from my own experiences as curator of history of science and technology collections at Stanford University.


In the United States, the development of archival collections in the history of computing resembles other areas of postwar documentation in the history of science, technology, and medicine. The common factor is the massive expansion of activity in techno-scientific fields of every ilk, especially when created by programs tied to federal and military funding, large-scale projects, or areas of intense industrial or commercial development. In each case, besides merely changing all of our lives fundamentally, these activities have stimulated the evolution of new institutions that rely on or provide funding and support for research and development, including funding agencies, national laboratories, new kinds of corporate divisions, and independent think-tanks. In short, the rapid expansion of techno-science in our society has resulted in a many-fold increase in the production of records by such institutions.

This growth has been multi-faceted. Obviously, the production of records continues unabated. There is very little to say about this evident fact; clearly, it has enormous implications for the capacity of existing repositories to document recent and contemporary computing. Also, the advance of research, technology, applications, and uses, as well as the institutions that support research and development, has led to new forms or increasingly complicated systems of records; some examples are grant applications, semi-published technical reports arising from funded projects, legal documentation (patents, copyright, anti-trust law, ergonomics, and privacy are among the legal issues related in various ways to computing), reports to shareholders, and the lobbying records of trade associations. Finally, the diversity of formats of historical records is unprecedented today, with a rate of change that appears to be increasing. The transition from paper to electronic forms of record storage and the proliferation of video and digital media both have enormous implications for archival programs.

In addition to these general trends, circumstances more or less specific to the history of computing also influence the nature and extent of documentation available, as well as playing a role in decisions about preserving these records. First, the relentless advance of computer technology on an ever-expanding set of fronts is redefining the nature and scope of computing itself. It could be argued, at least from the vantage-point of the present, that human beings interact directly with computers more than with any other technology. In The Road Ahead, published in 1995, Bill Gates' vision of the near future of computing includes all "mediated experiences" in almost every social and economic realm. Its historians, therefore, may have to venture into every niche, nook and cranny of society. I think that computing, since the "PC" or "microcomputer revolution" of just over twenty years ago, also differs from fields such as physics or genetics in the degree to which hobbyists, entrepreneurs, artists, and others outside any particular well-defined technical/professional community have not only influenced its assimilation into daily life, but also technical vectors and research, whether at Interval Research Corporation, Silicon Graphics or Stanford University. In short, it is more difficult to locate the edges of computing as a discipline and the boundaries of its impacts on society than for most other technical and scientific fields. The open-ended nature of computing challenges archivists, librarians, and curators, and it complicates matters for researchers looking for disparate materials in a variety of repositories.

Collecting Programs and Collections

How, then, have archives and other repositories responded to the documentary needs of the history of computing?

Let me begin with a bit of terminology. One can hardly talk about archives without addressing the ways in which the term is understood. As archivists understand archives, they are preserved records of the activities of organizations or individuals. A narrower understanding of this term is limited to departments responsible for historical records of enduring value in a larger organization of which they are a part; examples are the Hewlett-Packard Archives or IBM Archives. Such archives may or may not be found in the same piece of the organizational chart as records managers responsible for current records, retention schedules, and the like. In other words, records do not necessarily ever reach the archives. Finally, with the widespread movement and acquisition of historical records by hundreds, if not thousands, of archival repositories and special collections units of libraries (often called manuscript collections) in the United States alone, the term "archives" is expanding to mean documents themselves, often as shorthand for manuscript collections, personal papers, collections of historical documentation in a particular field, or even unpublished materials. At issue here is whether we think of archives simply as any collections of historical records or as necessarily arising out of the context of their creation. For those concerned primarily with historical research and not, in the first instance, with records management, it is reasonable to focus on archives simply as documentary records. As users of these records, however, you should keep in mind that the people who originally assembled, organized, and preserved them were probably guided by record-keeping needs and requirements in a specific institutional context, not by the motive of organizing them for future use by historians. Archivists know about these issues, and you should always discuss the organization of a collection and review your use of it with archival staff.

As you can imagine, the records of individuals and institutions are often, if not usually, dispersed, discarded, weeded, or lost. Many migration patterns are possible: Entire archives may be transferred to or acquired by repositories with no organic connection to their original site of creation; the files of distinct laboratories or projects may end up in the garage of a lab director or principal investigator; the personal papers of important researchers may be acquired by libraries, historical societies, or disciplinary historical centers; and individual documents may end up in museums or private collections, to name only a few possibilities.

Permutations and combinations of these scenarios abound: The records of Burroughs Corporation, originally held in the Burroughs and then in the Unisys corporate archives, are now at the Charles Babbage Institute at the University of Minnesota; series of records from Sperry-UNIVAC can be found at The Hagley Museum and Library in Wilmington, Delaware, as the result of acquisitions that included large collections of documentation created as a result of patent infringement litigation; the surviving records of the Augmentation Research Center at the Stanford Research Institute (now: SRI International) were acquired by the Department of Special Collections at Stanford with the personal papers of Douglas C. Engelbart from McDonnell-Douglas Information Systems, which had acquired Tymshare Corporation, which had earlier acquired the project from the Stanford Research Institute; and so on. It is important to remember that, even in the case of a large, active organization or a prominent individual associated with a particular institution, records can be scattered to distant repositories. Incidentally, while archives often distinguish between organizational records and personal papers of individuals, the Engelbart example reminds us that these two categories are often permeable, something to keep in mind when you are hunting for archival collections.

The history of computing is a relatively new field of scholarship. Yet, the scope and size of its documentary record will rival any other techno-scientific discipline. Multi-institutional research collaborations, federal funding, litigation, venture capital, financial markets, and other factors have created a complex environment for the creation of documentation, as well as greatly increasing its volume. Organizations and individuals responsible for creating records of potential historical value may or may not feel the need to retain them, and in any case their capacity to do so is finite. Obviously, much historical documentation will be lost. How are archivists responding to this situation?

By the late 1970s, archival organizations, historical repositories, and professional societies began to pay closer attention to the records of recent science and technology. Disciplinary history centers such as the AIP History Center, the IEEE History Center, and the Charles Babbage Institute were established in part to coordinate and support the preservation of historical documentation and work with other repositories to address issues of archival appraisal, preservation, and access. The Society of American Archivists, History of Science Society, Society for the History of Technology, and the Association of Records Managers and Administrators co-sponsored the work of a Joint Committee on Archives of Science and Technology (JCAST). The JCAST report, Understanding Progress as Process: Documentation of the History of Post-War Science and Technology in the United States, represented an important milestone when it was published in 1983, in that it raised awareness among American archivists of their need to understand better the records of post-war science and technology.

A loosely-knit group of archival repositories and, just as important, an evolving set of principles and practices emerged out of archival research and projects like the JCAST report in the early 1980s. Guidelines for appraisal of records and documentation strategies set the stage for projects. Progress made during the 1980s resulted by the end of the decade in published guides to collections. In the history of computing, the three indispensable guides remain Resources for the History of Computing, edited by Bruce Bruemmer, The High-Technology Company: A Historical Research and Appraisal Guide by Bruce Bruemmer and Sheldon Hochheiser, both published by the Babbage Institute, and Archives of Data-Processing History: A Guide to Major U.S. Collections, edited by James Cortada and published by Greenwood Press. All three guides emerged from this period of intense activity at the end of the 1980s, and they effectively document the strategies and programs that guided the growth of archival resources in the history of computing.

So, where are the collections? Archives of Data-Processing History provides a good overview of the major repositories in the field, even though many collections have been made available since 1990 and, as Cortada pointed out in his preface, a few important collections could not be represented. A recent list of "archives specializing in the history of computing," prepared by Bruce Bruemmer for History of Programming Languages II, published by the ACM in 1996, updates contact information, but otherwise the roster of important archival institutions has not changed much. This core group of archives consists then of the Charles Babbage Institute, the Computer Museum, the Hagley Museum and Library, the Library of Congress, the National Archives and Records Administration, the Smithsonian Institution, and the Stanford University Libraries, plus several corporate archives (IBM, AT&T, Texas Instruments, etc.). Other significant collections located in university libraries or archives can be found at Dartmouth, Harvard, MIT, Carnegie-Mellon, Illinois, and Pennsylvania. A few independent museums of computing have been founded since 1990, and both Microsoft and Intel are establishing archives or museums. In short, there are certainly fewer than ten institutions in the United States that actively collect research materials in the history of computing; another dozen or more feature important, but generally static collections or limit the scope of their collecting to the mission of institutional archives in the narrow sense.

The archival map in the United States for history of computing thus includes a small number of institutions with significant collections, and an even smaller number with an active collecting program. When I said earlier that there were hundreds, if not thousands, of institutions with potentially relevant materials, I was referring to the presence of needles of various shapes and sizes in many different archival haystacks. Finding historical records in the history of computing, as in many other areas, usually involves two mindsets. On the one hand, you should invest time to understand the resources available in the major repositories and centers focused on the history of computing, where book and reference collections, knowledgeable staff, and other support materials can inform and refine your search for sources on a specific topic. On the other, you will need to learn to use printed and online guides and databases to locate archival sources in repositories that have not specialized in the history of computing and may not be equipped to describe or interpret what they have in detail. Since I am also talking about on-line resources today, I will describe the electronic databases later; for the next few minutes, I will concentrate instead on archival collecting programs and traditional forms of access to archives and manuscripts.

The Stanford University Libraries, where I have been curator of the history of science and technology collections since 1983, maintains an active archival program in the history of computing. Let me take a few minutes now to use our program as an example of how institutions go about acquiring collections of historical records. The library's program in the history of computing grew on two legs: first, an archival orientation in the narrow sense, focused on records of activities that took place at Stanford, and, second, a collecting program founded in 1984 and called the "Stanford and the Silicon Valley Project." The idea behind the Silicon Valley Project is straightforward: Compile documentation tracing relationships connecting Stanford faculty and graduates to emerging high-technology industries in the surrounding region since the 1930s. It extends the archival program that, by the mid-1980s, had assembled collections of faculty papers and university records in the sciences and engineering. For computing, relevant collections in the archives already included the papers of Frederick Terman, George and Alexandra Forsythe, and Donald Knuth, as well as records of the Center for Information Technology (Stanford's computation center), the BALLOTS project papers (an early project in the area of library automation and database technology), the ACME Project collection (a collaboration of Edward Feigenbaum, Joshua Lederberg, and others from which emerged path-breaking programs in the field of expert systems such as MYCIN and DENDRAL), and the Heuristic Programming Project. As the Department of Computer Science, founded in 1965, has become perhaps the leading university program in its field, the University Archives has, by preserving records of its programs and faculty papers, grown in importance for the history of computing.

By 1984, it was clear that the explosive growth of Silicon Valley not only dominated regional development, but that it also signalled the emergence of concentrated techno-scientific regions as a defining characteristic of our era. Due to the close connections between Stanford and specific business ventures located in this region, such as Varian Associates and Hewlett-Packard, the University Archives already owned significant collections relevant to the study of Silicon Valley's development. It seemed like a logical step for the Department of Special Collections and University Archives to move forward and actively collect records of Silicon Valley enterprises and individuals not directly tied to Stanford. The archival record of semiconductor, hardware, and software companies located in Silicon Valley was inadequate, and it appeared that no other institution would invest resources to locate and preserve archival materials documenting research and business growth in Silicon Valley. Moreover, faculty in Stanford's academic programs expressed interest in such a program. One aspect of our strategy was to work outward from Stanford by focusing first on those areas with close ties to Stanford departments, but then to expand our efforts by following these personal and historical connections to build, as it were, a network of related collections. Another was to emphasize contacts in areas of R&D activity, rather than business history. We intended all along to encourage the creation of archives or historical collections in corporations ready to accept that responsibility, such as Hewlett-Packard or Intel, but when companies or organizations have been less inclined to do so, and their records fit our collecting program, we have acquired company and laboratory records, such as those of Fairchild Semiconductor, the American Association for Artificial Intelligence, and SRI laboratories under the direction of Douglas Engelbart and Charles Rosen.

Once the parameters of our project had been established, we proceeded to work with faculty who were known to have contacts in Silicon Valley industry, such as Edward Feigenbaum in computer science, an original member of the Computer Science Department, former chairman of the Computer Science Department, and former director of the Stanford Computation Center. Feigenbaum's work in expert systems and knowledge engineering had led to applications in science, business, medicine, and engineering, and from there to the founding of at least two companies. The acquisition of his voluminous papers led to other collections, such as records of the American Association for Artificial Intelligence (AAAI), with the archives of AI Magazine; the papers of Louis G. Robinson, the first executive director of the AAAI and co-founder of Spang-Robinson, a publishing venture specializing in AI trade publications such as the Spang-Robinson Report: The Artificial Intelligence Business Newsletter; videotapes of tutorials on artificial intelligence by leading figures such as Feigenbaum, Allen Newell, Raj Reddy, and others from the National Conferences on Artificial Intelligence; and the papers of Donald A. Waterman from the Information Sciences Department at Rand Corporation, Stanford's first Ph.D. in computer sciences and one of the early theorists of knowledge-based systems. Similar vectors from Stanford out to Silicon Valley have been followed in digital typography (Euler Project papers: Hermann Zapf and Donald Knuth, Emigré), in the acquisition of the System Development Foundation archives, and in preserving the papers of Charley Rosen, Douglas Engelbart, Philip Rice, and others who worked in SRI laboratories.

When a repository such as Stanford finally acquires the papers of a computer scientist or the records of a corporation, it does so on the basis of an agreement with the previous owners of the records. The agreement typically addresses issues such as the transfer of ownership, conditions of access by researchers to the collection, and matters pertaining to copyright. The repository then concentrates on processing the collection, which means doing what is necessary to prepare it for use, including preparing or verifying an inventory list, preservation measures (such as refoldering in acid-free folders), organization of the collection for long-term storage, and the creation of finding aids and cataloging records.

Your use of archival materials depends to some degree on the agreements reached with the donors of collections, on the descriptive aids and cataloging records provided by the archives, and on your own preliminary research. I do not intend today to offer a seminar on the application of intellectual property law, copyright and fair use guidelines to unpublished manuscripts and archival collections. Suffice it to say that physical ownership of a collection by a repository does not necessarily convey unlimited access to the collection and often (usually, in the case of Stanford) does not come with copyright ownership; yes, copyright bears on the uses that can be made of archival sources, as well as publications. In fact, until the Copyright Law of 1976, rights for unpublished works under common law never expired, and the J.D. Salinger and L. Ron Hubbard case decisions of the late 1980s seemed to set fair use more narrowly than for published materials under the new law. Most archival repositories will produce forms before you can use collections or make photocopies, and these forms will outline their understanding of copyright law as it pertains to archives and manuscripts; if you are in a special situation or expect to reproduce (say, for a website) or quote profusely from archival records in print, I would recommend a preliminary conversation with knowledgeable archival staff.

Now that you have an idea of some of the problems faced by archives and how collections are acquired, I would like to look more closely at what you can do to find archival resources. Keep in mind that the search for primary materials is triggered by many kinds of projects. You may be working on a substantial body of materials, such as the corporate or laboratory archives of Burroughs or Control Data at the Charles Babbage Institute, Doug Engelbart's development of the NLS system documented in his papers at Stanford, or the design of the Macintosh at Apple Computer, with scattered documentation, including the Jef Raskin papers recently acquired by Stanford, at Apple Computer, and in several private and museum collections. Archives also can be used for much more specific needs: a reproduction of Bill Gates' essay on software piracy in the first issue of the newsletter of the Homebrew Computer Club, or a photograph of the Whirlwind computer, or access to an unpublished technical report or oral history. At Stanford, attorneys have also been heavy users of the collections, particularly in areas such as intellectual property and environmental law; indeed, one memorable request for the Gates essay just mentioned was related to the "look and feel" case involving Microsoft and Apple Computer.

Preparation for the use of archival collections begins with a general understanding of the fundamental principles archivists apply to organizing and recording information about collections. These principles lead to a general approach to intellectual and physical access to materials held in archives and libraries that, if understood by the user, greatly simplifies the process of identifying documents of interest in a repository. The first principle is that of provenance, which means simply that the arrangement of an archival collection, whether personal papers or institutional records, follows its original organization and order, if it has not been destroyed. It is assumed that this order reflects the activities and organization that produced those documents; remember that archival records are generally not preserved by the same agencies or for the same reasons that they were created, so every effort is made to respect the original arrangement of these materials as flowing from and, hence, telling us about the context of their creation. The second principle, flowing from the first, is that the most natural way of capturing this organization is a hierarchical, multi-level description of a collection that likewise reflects some understanding of the activities undertaken by the individual or organization producing the records. Typical levels of description are the depository, series, box/folder unit, and individual document, and it is assumed that these will correspond to provenance, if possible.

What all this means to you as a potential user of archival collections is that you will need to use and understand the instrument that describes archival collections according to these principles, typically called the finding aid. The finding aid is a document that describes and provides a level of inventory control for an archival collection; it is not a catalog or index in a bibliographic sense. In keeping with the emphasis on provenance, the context of creation, and multi-level description, the finding aid includes many kinds of notes and descriptive text and rarely catalogs individual items. The finding aid will tell you whose records are in a collection and when they were created, the repository's identifying number for the collection, the size of the collection, when it was acquired and who prepared the finding aid. Notes will tell you when the collection was acquired and from whom, whether its use is restricted, who owns copyright, and something about the history of the organization or the biography of the person that produced the records. A scope or content note will then describe the contents of the collection in a paragraph or two, telling you about the kinds of material - correspondence, audio tapes, laboratory notebooks, office files, artifacts, etc. - in the collection and their arrangement. A listing then follows, usually according to a division into series of related records and to the box or folder level. For example, you might learn that the records of a Computer Science Department include a series devoted to grant files, and that in this series, Box 1 includes grant applications from 1965 to 1975; this part of the finding aid may not list the particular faculty member or project of interest to you, but other information will lead immediately to the correct box, if you know the date of the particular grant application of interest to you. Or, you might discover from the finding aid that the series is closed to researchers.
Once you have worked with the finding aid, you are ready to request and work with the archival materials of interest to you.
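
To make the multi-level description concrete, here is a minimal sketch in Python of how the series and box levels of a finding aid might be modeled and searched by date, following the grant-file example above. The class names, the collection title, the second box, and the "Personnel files" series are hypothetical illustrations; they do not describe any actual Stanford finding aid or any archival software.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Box:
    number: int
    description: str
    years: Tuple[int, int]  # inclusive range of dates covered by the box


@dataclass
class Series:
    title: str
    restricted: bool = False  # True if the series is closed to researchers
    boxes: List[Box] = field(default_factory=list)


@dataclass
class FindingAid:
    collection: str
    scope_note: str
    series: List[Series] = field(default_factory=list)

    def locate(self, series_title: str, year: int) -> Optional[Box]:
        """Return the box in the named series whose date range covers `year`."""
        for s in self.series:
            if s.title != series_title:
                continue
            if s.restricted:
                raise PermissionError(f"Series '{s.title}' is closed to researchers")
            for box in s.boxes:
                if box.years[0] <= year <= box.years[1]:
                    return box
        return None


# Hypothetical collection, loosely following the example in the text.
aid = FindingAid(
    collection="Computer Science Department records (hypothetical)",
    scope_note="Correspondence, grant files, and office files.",
    series=[
        Series("Grant files", boxes=[
            Box(1, "Grant applications", (1965, 1975)),
            Box(2, "Grant applications", (1976, 1985)),  # invented for illustration
        ]),
        Series("Personnel files", restricted=True),  # invented for illustration
    ],
)

box = aid.locate("Grant files", 1971)
print(box.number)  # prints 1
```

The point of the sketch is that the finding aid gets you to a box, not to a document: knowing the series and the date of a grant application is enough to request the right container, even though no individual item is cataloged.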

On-Line Sources and Other Electronic Resources

About two weeks ago, as I was preparing for this talk, I found a webpage offered by Yahoo on the topic "Digital, Cyrix Sue Intel." It offered a nice collection of links to Cyrix, DEC, and Intel websites, dozens of press releases and news stories, additional links to the "Intel Secrets" page, the IBM Patent Server, patents pages, etc. I thought that this page would be a good example of how a single well-organized webpage could save days of traditional research time and even offer several sources not otherwise available in any other format. I bookmarked the page. A week later, as I was putting my paper together and checking the sites I had gathered, I discovered that this particular resource no longer existed. Previously listed under current events in the Yahoo classification, it had apparently ceased being current news, and was now gone; all I can show you today is a similar site under a more up-to-date rubric: "Intel sues Digital."

Archives and historical writing both respond to a societal need to preserve cultural memory. Archivists and historians concerned with the history of computing in the 1990s realize that most cultural records produced today are created and stored in digital form. At the same time, archivists in particular are not yet comfortable with digital media, though a series of task forces, reports, and studies have concentrated their attention on the difficulties they present. As one report put it, the essential problem is that "reading and understanding information in digital form requires equipment and software, which is changing constantly and may not be available within a decade of its introduction."

Problems remain to be solved before information in digital form can be considered archival resources in the traditional sense. Yet, from the standpoint of writing history of computing, especially recent computing, electronic media are fast catching up with traditional paper archives as indispensable sources for research. Also, the development of information tools useful for archival control or historical research, combined with an intensification of interest in solving problems standing in the way of reliable preservation of on-line sources and the explosive growth of the Internet and World Wide Web, is transforming archives and libraries.

I think it is useful to divide on-line sources into three categories: (1) traditional electronic databases and bibliographic utilities; (2) independent sources of information made available principally through the Web; and (3) digital libraries and archives.

1. Traditional electronic resources arose out of printed bibliographies, card catalogs, and multi-volume union catalogs, such as the National Union Catalog of Printed Books (NUC) and National Union Catalog of Manuscript Collections (NUCMC). Beginning in the 1960s, when the Library of Congress began to make its catalog records available not only in printed, but also in machine-readable form, library automation has led to the creation of standards and systems for bibliographic information. The MARC ("machine-readable cataloging") record format, completed in 1968, created a basis for the communication of this information, and it is now the basis for virtually every American on-line bibliographic catalog. In the 1970s, related standards were approved by the ISO and ANSI, so that there are now many national MARC standards. Formats have been developed for books, serials, maps, and other information formats, including archives and manuscripts with the adoption of the MARC-AMC format (for Archival and Manuscripts Control). For several years now, the digital equivalents of NUC and NUCMC - only much larger and with enhanced access and searching - have been available through bibliographic utilities such as OCLC and RLIN.
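
For readers who have never seen a machine-readable catalog record, the following Python sketch shows the general idea of a MARC-style record as a list of numbered fields carrying lettered subfields. It is a simplified in-memory model under stated assumptions, not the MARC communications format itself: it omits the leader, the directory, and indicators, and it allows only one subfield per letter within a field. The tags used (100 for a personal-name main entry, 245 for the title statement, 650 for a topical subject) are standard MARC tags; the sample values and the helper name `first_subfield` are invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Field:
    tag: str                   # three-digit field tag, e.g. "245"
    subfields: Dict[str, str]  # subfield code -> value (simplified: no repeats)


def first_subfield(record: List[Field], tag: str, code: str) -> Optional[str]:
    """Return the first value of subfield `code` in a field tagged `tag`, if any."""
    for f in record:
        if f.tag == tag and code in f.subfields:
            return f.subfields[code]
    return None


# Hypothetical record describing a collection of personal papers.
record = [
    Field("100", {"a": "Doe, Jane"}),                  # main entry: personal name
    Field("245", {"a": "Papers"}),                     # title statement
    Field("650", {"a": "Computers", "x": "History"}),  # topical subject heading
]

print(first_subfield(record, "245", "a"))  # prints Papers
```

Because every catalog that follows the format labels the same kinds of information with the same tags, a program written against one library's records can read another's, which is what makes the shared databases described here possible.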

What this all means is that catalog information is now readily exchanged among libraries and databases. An ANSI standard, known as Z39.50, regulates the structure of queries and replies from one system to an independent database over the Internet (or other means of communications), so that it is possible to query a database from within another application. With the implementation of a new generation of on-line library catalogs based on hypermedia and web technology, the day is fast approaching when integrated searching of a configurable suite of databases, catalogs, and indexes - many with cataloging based on current library standards and shared authority databases for controlled fields, names, geographical places and the like - will be available from a scholar's desktop.
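For readers curious about what "machine-readable" means in practice, the following is a minimal sketch in Python of the general layout used by MARC communication records: a 24-character leader, a directory of 12-character entries (tag, field length, starting position), and variable fields separated by control characters. It is an illustration under stated assumptions, not a cataloging tool: the sample record is invented, the function names `build_record` and `parse_record` are hypothetical, and the text is assumed to be plain ASCII so that character counts equal byte counts.

```python
# Control characters used to delimit MARC fields and records.
FIELD_END = "\x1e"    # terminates the directory and each field
SUBFIELD = "\x1f"     # introduces a subfield code inside a data field
RECORD_END = "\x1d"   # terminates the whole record


def build_record(fields):
    """Serialize (tag, value) pairs into a MARC-style record string (ASCII only)."""
    directory, data = "", ""
    for tag, value in fields:
        body = value + FIELD_END
        # Directory entry: 3-char tag, 4-digit length, 5-digit start offset.
        directory += f"{tag}{len(body):04d}{len(data):05d}"
        data += body
    base = 24 + len(directory) + 1   # leader + directory + directory terminator
    total = base + len(data) + 1     # plus the record terminator
    # Leader: positions 0-4 hold the record length, 12-16 the base address of data.
    leader = f"{total:05d}nam  22{base:05d}   4500"
    return leader + directory + FIELD_END + data + RECORD_END


def parse_record(record):
    """Read the leader and directory, then slice out each field they point to."""
    base = int(record[12:17])
    directory = record[24:base - 1]
    fields = []
    for i in range(0, len(directory), 12):
        entry = directory[i:i + 12]
        tag, length, start = entry[:3], int(entry[3:7]), int(entry[7:12])
        raw = record[base + start:base + start + length - 1]  # drop terminator
        if tag < "010":
            # Control fields (001-009) carry plain data with no subfields.
            fields.append((tag, raw))
        else:
            # Data fields: two indicator characters, then coded subfields.
            indicators, *subfields = raw.split(SUBFIELD)
            fields.append((tag, indicators, [(s[0], s[1:]) for s in subfields]))
    return fields


# An invented two-field record: a control number and a title statement.
sample = build_record([
    ("001", "demo0001"),
    ("245", "10" + SUBFIELD + "aAn example title"),
])
parsed = parse_record(sample)
```

Because every field is located through the directory rather than by fixed position, a receiving system can read a record without knowing in advance which fields it contains, which is part of what makes exchange among independent catalogs workable.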

These standards, and the database and communications technologies based upon them, have made it possible to move structured bibliographic information around easily in machine-readable form, especially over the Internet. Thus, it is now possible to locate catalogs throughout the world and to query the contents of libraries. A good place to begin your search for library information is the Library Resources on the Internet site at Northwestern University.

Specialized bibliographies with relevant citations to articles and other publications in the history of computing are also available. An important resource is the History of Science and Technology File made available by the Research Libraries Group. This database combines the on-line versions of a growing set of bibliographies: currently the Isis Current Bibliography of the History of Science from 1976 to the present and the Current Bibliography in the History of Technology (published annually in the journal Technology and Culture) from 1987 to the present. The Bibliografia italiana di storia della scienza, published since 1982, will be added to the file shortly, and other bibliographies are under consideration. The file is available via telnet or a web interface to subscribers and subscribing institutions, and at no cost to members of the Society for the History of Technology (SHOT) and the History of Science Society (HSS); you can find the Eureka site by navigating from the SHOT or HSS homepages. Other widely available files with numerous citations in the history of computing include the Computer Articles file provided by Information Access Company (covering approximately 200 publications back to 1988); INSPEC, the on-line version of Computer and Control Abstracts, Electrical and Electronics Abstracts, and Physics Abstracts from the IEE; and EI/Compendex, the on-line version of Engineering Index, with coverage back to 1970.

2. The web has made it possible to make almost anything widely available. It has thus opened up access to a variety of new formats of information and to information that previously would have been available only through personal contact. On-line publications, listservs and bulletin boards, private homepages, guides and Frequently-Asked-Question lists, compilations of links to other sites, and topical sites all gather resources, most of them not available in print. Many of these are linked to research projects in the history of computing. Some examples are:

Alan Turing Home Page, by Andrew Hodges, author of Alan Turing: The Enigma.

The history of computing page by J. A. N. Lee, editor of IEEE Annals of the History of Computing.

SiliconBase, a site created by the Information Technology & Society Project at Stanford University that includes guided tours, digitized documents, and courseware focused on the history of Silicon Valley and related topics.

One of the great benefits of easy access to information over the Web is that it has opened up access - virtual access - to previously ephemeral resources, such as filmed documentaries and interviews, museums, and private collections. For example, it is no longer unusual to find companion websites for documentaries, just as the Turing page mentioned above accompanies a book. Examples are the sites devoted to "Silicon Valley: A One Hundred Year Renaissance," a documentary produced by John McLaughlin and featuring documentary footage and original interviews with Gordon Moore, Doug Engelbart, Steve Wozniak, and others; and "The Machine That Changed the World," based on the WGBH series broadcast on PBS a few years ago.

Corporate and some government sites also fall into this category. Webpages created by private companies and corporations frequently offer company histories as "extras" (cf. the directory structure of the Apple History Home Page), corporate backgrounders, investor information, or (as in the case of Adobe, for example) interviews and other background material about their technology (Aureal Semiconductor), leaders, or founders. These pages are also a good place to look for information from or about corporate archives, libraries and museums, both real and virtual. The Virtual Museum of Computing offers a useful list of corporate histories, computing organizations, on-line museums, and other links to sites in all of these categories.

National archival programs are described on websites such as those maintained by the French National Center for Scientific Research (Centre National de la Recherche Scientifique, or CNRS; also available in English) and the National Archives of Canada. The Australian Science Archives Project has led the way in providing information about national archival resources in the history of science and technology, and it offers hundreds of links to related sites, as well as detailed information about Australian topics and archives. National resources in the history of computing have also been made accessible in this fashion, such as the National Archive for the History of Computing at the University of Manchester, which claims to be the largest collection of documents relating to the development of computing in Great Britain.

3. Only a year or two ago, digital or "virtual" libraries and archives were dismissed as "scanning projects." Since then, enormous progress has been made in defining what is meant by these terms. By digital archives and libraries, I mean collections of on-line documents (not just bibliographic information, but the sources themselves) maintained for use, delivery, or preservation by organizations capable of continuing this responsibility for the long term and following a set of agreed-upon professional and technical standards (such as MARC) to do so.

The Digital Collections Inventory Report, sponsored by the Commission on Preservation and Access and the Council on Library Resources, is an inventory conducted during the second half of 1995. The list, available over the web, turns up dozens and dozens of projects including large projects featuring national literature, history, and/or politics; projects covering disciplines, such as history of science and technology; sites devoted to special, archival, and manuscript collections; and clearinghouses of electronic texts.

Today, few projects merely scan published or unpublished materials for delivery as images. Current projects include provisions for the creation and searching of metadata (such as cataloging information); links between catalogs, finding aids and other indexes, on the one hand, and a collection of authorized and carefully maintained documents on the other; and attention to issues of long-term preservation and migration of data. They may take the form of image databases with metadata; searchable electronic texts (whether generated from scanned images or entered manually); or full-blown "text encoding" projects involving textual mark-up using SGML or HTML. In the archival realm, an additional focus has been to open up the multiple levels of descriptive information available through archival finding aids by encoding them and making them available on websites, often with links to either on-line catalogs or electronic documents. A list of more than 2000 repositories of primary resources with home pages has been compiled by Terry Abraham, head of the Special Collections and Archives at the University of Idaho Library. A leading example in this area is the "Finding Aids for Archival Collections" project, which includes finding aids from Berkeley, Stanford, the Hoover Institution, Duke University, and several University of California campuses; it is available for use via the Berkeley Sunsite.

Let us not forget that one of the earliest applications of web technology in the realm of research materials was the conversion of published materials to electronic form. This is neither the time nor the place to discuss the myriad conceptual, bibliographic and economic difficulties embedded in this misleadingly straightforward notion. Different approaches to this goal have been tried, ranging from homepages offering information about journals and subscriptions in paper form to full-text on-line versions of journals, with archives, and even a few titles published exclusively in digital form. The IEEE Annals of the History of Computing offers current-issue and other information on its website. Of course, it is possible to search the contents of these publications through search engines such as Yahoo or AltaVista, and such searches often yield unexpected sources. As one of a hundred examples, see "Doug Engelbart: Father of the Mouse," an interview with Engelbart on the invention of the mouse, published on the Superkids site, an on-line educational software review. The interview includes a hyperlink to the digital version of Vannevar Bush's "As We May Think," which, when published in the July 1945 issue of Atlantic Monthly, played a seminal role in Engelbart's thinking. On-line versions of journals and magazines such as Byte, Datamation, PC Magazine, and Technology Review can be found in the many lists of on-line periodicals available on the web, such as those offered by Yahoo in the "Computers and Internet" section of its magazines classification and by the Michigan Electronic Library.

Digital conversion projects and the various websites discussed thus far provide content for on-line digital collections. The next step in digital libraries and archives will integrate techniques for producing and preserving collections; web technology for browsing, linking, and searching; and a commitment to the various library and archival standards and practices in place to locate, authenticate, control, and provide access to source materials, whether publications, archives, or their digital equivalents. Digital archives, for example, will integrate finding aids that describe the contents of archival collections; local and union databases of catalog records for archives based on the AMC format; citation and authority files with related information; and images of archival documents, together with SGML- or HTML-encoded databases searchable by text or by metadata provided according to emerging standards such as the Dublin Core standard for data elements. From this point of view, the digital finding aids project is an initiative covering one piece of this enormous puzzle. Large-scale, library-based projects attempting to offer large collections of digitized books, periodicals, archival materials, images, other media or combinations of these formats are underway. Collaborative undertakings such as Project Muse (journals published by the Johns Hopkins University Press), the JSTOR project (a retrospective journal image project), Making of America (5000 volumes in American history published 1850-1877), and RLG's "Studies in Scarlet Project" (19th-century primary sources delivered via the Arches "archival server and test bed") provide an indication of what is possible.

As these relatively large collections of resources begin to populate the computer-based networks that are, increasingly, a part of our research lives, the notion of digital libraries and archives as vast storehouses of information akin to their physical counterparts - as real themselves and not merely virtual - will slowly be realized.
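As a rough illustration of what searching by metadata might look like, the sketch below describes three digitized documents using a handful of element names from the Dublin Core set (title, creator, date, type) and filters them by exact match. Everything in it is an assumption made for the example: the records are invented, and the helper `search` is hypothetical rather than part of any project named in this talk.

```python
# Invented records; the keys are Dublin Core element names.
documents = [
    {"title": "Laboratory notebook, volume 1", "creator": "Example, A.",
     "date": "1962", "type": "image"},
    {"title": "Project memorandum", "creator": "Example, B.",
     "date": "1968", "type": "text"},
    {"title": "Annual report", "creator": "Example, A.",
     "date": "1968", "type": "text"},
]


def search(records, **criteria):
    """Return the records whose metadata matches every given element exactly."""
    return [
        record for record in records
        if all(record.get(element) == value for element, value in criteria.items())
    ]


# All textual documents dated 1968, then narrowed to a single creator.
texts_1968 = search(documents, date="1968", type="text")
by_creator = search(documents, creator="Example, A.", date="1968")
```

The value of a shared element set lies in the keys rather than the code: if independent repositories agree on names such as "creator" and "date," the same query can in principle be run against all of their collections.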

As Patricia McClung put it in the Digital Collections Inventory Report:

There has recently been a burst of activity, much of it experimental in nature, towards this end. However, several very large electronic conversion projects (and related initiatives) intended to test and shape the new information infrastructure are also getting underway. All of them presuppose an information system that has either the Internet, or a more robust successor, at its core. If successful, they will go a long way towards real implementation of the digital revolution that has been predicted for so long.


Aspray, William and Bruce Bruemmer, eds. Guide to the oral history collection of the Charles Babbage Institute. Minneapolis : Charles Babbage Institute, Center for the History of Information Processing, Univ. of Minnesota, 1986.

Bruemmer, Bruce H. Resources for the history of computing: A guide to U.S. and Canadian records. With the assistance of Thomas Traub and Celeste Brosenne. Minneapolis : Charles Babbage Institute, The Center for the History of Information Processing, Univ. of Minnesota, 1987.

Cortada, James W. Archives of Data-Processing History: A Guide to Major U.S. Collections. New York, Greenwood, 1990.

Contents: Nancy Y. McGovern, "U.S. National Archives"; David K. Allison, "National Museum of American History"; Leonard C. Bruno, "Library of Congress"; Bruce H. Bruemmer, "The Charles Babbage Institute: The Center for the History of Information Processing, University of Minnesota"; James W. Cortada, "Massachusetts Institute of Technology, Institute Archives and Special Collections and the MIT Museum"; Clark A. Elliott, "Harvard University Archives"; Maynard Brichford, "University of Illinois Archives"; Kenneth C. Cramer, "Dartmouth College Archives"; Michael Nash, "The Hagley Museum and Library"; Robert E. Pokorak, "International Business Machines (IBM) Archives"; Anne Frantilla, "Unisys Archives"; Henry Lowood, "Sources on the History of Computing: Stanford University and the Silicon Valley"; James W. Cortada, "Bibliographical essay on U.S. archival holdings."

Doorn, Peter, "Opportunities and Pitfalls of the Internet for Historians." Pp. ?? in: Historical Informatics: An Essential Tool for Historians? ed. Peter Oldervoll, et al. [Bergen?], [Association for History and Computing?], [1994?]

Elliott, Clark A., ed. Understanding Progress as Process: Documentation of the History of Post-War Science and Technology in the United States. Chicago: Society of American Archivists, 1983.

Haas, Joan K., Helen Willa Samuels, and Barbara Trippel Simmons. Appraising the Records of Modern Science and Technology: A Guide. Cambridge, Massachusetts Inst. of Technology, 1983.

McClung, Patricia A., Digital Collections Inventory Report (Feb. 1996).

Morris, R. J., "Electronic Documents and the History of the Late 20th Century: Black Holes or Warehouses? What do Historians Really Want?" Pp. 302-16 in: Electronic Information Resources and Historians: European Perspectives, ed. Seamus Ross and Edward Higgs. St. Katharinen, Scripta Mercaturae, 1993.

Nash, Michael. Computers, Automation, and Cybernetics at the Hagley Museum and Library. Wilmington, Del., Hagley Museum and Library, 1989.

Swade, Doron, "Collecting Software: Preserving Information in an Object-Centred Culture." Pp. 93-103 in: Electronic Information Resources and Historians: European Perspectives, ed. Seamus Ross and Edward Higgs. St. Katharinen, Scripta Mercaturae, 1993.

Williams, Michael R., "Preserving Britain's Computer Heritage: The National Archive for the History of Computing." Annals of the History of Computing 11 (1989): 313-19.

--------------------------------, ed. "Museums and Archives." Annals of the History of Computing 10 (1989): 305-29.

Last modified: June 27, 2005
