A View of the World from Houston
The Houston Daily Post, January 25, 1901
1 Henri Lefebvre, The Production of Space, trans. Donald Nicholson-Smith (Wiley-Blackwell, 1991). A sampling of other influential works on the production of space includes David Harvey, Social Justice and the City (Johns Hopkins University Press, 1973); Doreen Massey, Spatial Divisions of Labor (Routledge, 1984); Neil Smith, Uneven Development: Nature, Capital, and the Production of Space (Blackwell, 1984); Michel Foucault, “Of Other Spaces,” trans. Jay Miskowiec, Diacritics, 16 (April 1, 1986), 22–27; Edward W. Soja, Postmodern Geographies: The Reassertion of Space in Critical Social Theory (Verso, 1989); and Derek Gregory, Geographical Imaginations (Wiley-Blackwell, 1994). For an introduction to the major issues and writers addressing space and place, see the introduction to Phil Hubbard and Rob Kitchin, eds., Key Thinkers on Space and Place, Second ed. (Sage Publications, 2010). Some of the more influential writings on place and place-making include Yi-Fu Tuan, Space and Place: The Perspective of Experience (University of Minnesota Press, 1977); Denis E. Cosgrove, Social Formation and Symbolic Landscape (Croom Helm, 1984); Michel de Certeau, The Practice of Everyday Life (University of California Press, 1984); Doreen Massey, “A Global Sense of Place,” Marxism Today, 35 (no. 6, 1991): 24–29; and Edward Casey, The Fate of Place: A Philosophical History (University of California Press, 1997). For accessible introductions to the concept of place, see John A. Agnew and James S. Duncan, eds., The Power of Place: Bringing Together Geographical and Sociological Imaginations (Unwin Hyman, 1989); and Tim Cresswell, Place: A Short Introduction (Wiley-Blackwell, 2004).
2 My use of "imagined geography" is a conscious reference to Edward Said’s "imaginative geography" and Benedict Anderson’s "imagined communities." Both concepts address many of the themes I explore in my project: socially constructed space, relations of power, and overlapping scales, among others. See Edward W. Said, Orientalism (Vintage, 1979), 53-55; and Benedict Anderson, Imagined Communities: Reflections on the Origin and Spread of Nationalism (Verso, 1983). Americans’ relationship to geography during this period is articulated in Susan Schulten, The Geographical Imagination in America, 1880-1950 (University of Chicago Press, 2001) and Susan Schulten, Mapping the Nation: History and Cartography in Nineteenth-Century America (University of Chicago Press, 2012).
3 For examples of Gulf shipping and commerce, see Democratic Telegraph and Texas Register (Houston, Tex.), Aug. 20, 1845, p. 3. Online at: http://texashistory.unt.edu/ark:/67531/metapth78113/m1/3/
4 See the fixation on the Rio Grande as a sovereign boundary in Telegraph and Texas Register (Houston, Tex.), May 11, 1842, p. 1. Online at: http://texashistory.unt.edu/ark:/67531/metapth48061/m1/1/
5 Democratic Telegraph and Texas Register (Houston, Tex.), Vol. 14, No. 28, Thursday, July 12, 1849. Online at: http://texashistory.unt.edu/ark:/67531/metapth48547/ and Democratic Telegraph and Texas Register (Houston, Tex.), Vol. 14, No. 33, Ed. 1, Thursday, Aug. 16, 1849. Online at http://texashistory.unt.edu/ark:/67531/metapth48551/
6 Harold L. Platt, City Building in the New South: The Growth of Public Services in Houston, Texas, 1830-1910 (Temple University Press, 1983).
7 As of September 3, 2013, the Library of Congress’s Chronicling America project listed 1,005 different newspaper titles on its website. Rough calculations for manual reading were based on a generous reading speed of 300 words per minute, multiplied by 480 minutes in an eight-hour workday, multiplied by 260 workdays in a year.
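The reading-rate arithmetic described in this note can be written out as a short calculation. This is only a sketch of the stated multiplication; the function and parameter names are illustrative assumptions, not part of the original analysis.

```python
def words_read_per_year(words_per_minute=300, minutes_per_day=480, workdays_per_year=260):
    """Words one reader could cover in a year at the rates given in the note."""
    return words_per_minute * minutes_per_day * workdays_per_year


if __name__ == "__main__":
    # 300 words/minute x 480 minutes/day x 260 workdays/year
    print(words_read_per_year())  # 37440000
```

At these rates a single reader covers roughly 37.4 million words in a working year.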
8 Franco Moretti, “Conjectures on World Literature,” New Left Review (Jan.-Feb. 2000), 54–68; Franco Moretti, Graphs, Maps, Trees: Abstract Models for a Literary History (Verso, 2005), 1. For an example of the quantitative application of distant reading in literature, see the Stanford Literary Lab’s pamphlet publication, Ryan Heuser and Long Le-Khac, “A Quantitative Literary History of 2,958 Nineteenth-Century British Novels: The Semantic Cohort Method.” Available online at: http://litlab.stanford.edu/LiteraryLabPamphlet4.pdf. For a collection of responses to Moretti and “distant reading,” see Jonathan Goodwin and John Holbo, eds., Reading Graphs, Maps, and Trees: Responses to Franco Moretti (Parlor Press, 2011), http://www.parlorpress.com/pdf/ReadingMapsGraphsTrees.pdf. Literary historian and digital humanities luminary Matthew Jockers, meanwhile, prefers the term macroanalysis to distant reading. Matthew Jockers, Macroanalysis: Digital Methods and Literary History (University of Illinois Press, 2013).
9 The source material was produced by the University of North Texas Library’s Texas Digital Newspaper Program. For more information about the program, see http://tdnp.unt.edu/. To access the digitized collections, see http://texashistory.unt.edu/. My many thanks to Andrew Torget for giving me access to the collection.
10 For a brief introduction to the digitization of sources, see Marilyn Deegan and Simon Tanner, “Conversion of Primary Sources,” in Companion to Digital Humanities, ed. Ray Siemens, John Unsworth, and Susan Schreibman (Oxford: Blackwell Publishing Professional, 2004). For problems posed by copyright laws for text analysis, see Matthew L. Jockers, Matthew Sag, and Jason Schultz, “Brief of Digital Humanities and Law Scholars as Amici Curiae in Authors Guild v. Google,” Aug. 3, 2012. Available at SSRN: http://ssrn.com/abstract=2102542 or http://dx.doi.org/10.2139/ssrn.2102542
11 Many thanks to Tze-I Yang at the University of North Texas for providing OCR accuracy rates for the two newspapers.
12 The three scales of my gazetteer can be thought of as sequential filters: national, regional, and local. The program iterated through every word in the newspaper text. If a word (unigram) or pair of words (bigram) was capitalized (Kansas City, for example), it paused and sent the word or pair through the first filter of national place-names. If it found a matching record (Kansas City in the national filter), the program updated the tally for that place-name in that issue and moved on to the next word in the text. If it did not match any record in the first filter, the program sent it through the second and third filters looking for a match. The program looked first for a bigram and then for a unigram, meaning that in the case of Kansas City it would identify the two-word place-name Kansas City and subsequently skip over the one-word place-name Kansas without trying to identify it as a separate place.
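The filtering procedure described above can be sketched in a few lines of Python. This is a minimal illustration rather than the program actually used: the function name, the tiny sample gazetteer, and the tokenizing rule (runs of letters, ignoring sentence boundaries) are assumptions, as is the choice to try a bigram against all three filters before falling back to the unigram.

```python
import re
from collections import Counter


def tally_place_names(text, filters):
    """Count place-names in one issue's text.

    `filters` is an ordered list of (scale, set_of_names) pairs,
    checked in sequence: national, then regional, then local.
    """
    words = re.findall(r"[A-Za-z]+", text)  # assumed tokenizer: runs of letters
    tally = Counter()
    i = 0
    while i < len(words):
        word = words[i]
        if not word[0].isupper():
            i += 1
            continue
        # Try the two-word name first, then the one-word name.
        candidates = []
        if i + 1 < len(words) and words[i + 1][0].isupper():
            candidates.append((word + " " + words[i + 1], 2))
        candidates.append((word, 1))
        step = 1
        for name, length in candidates:
            scale = next((s for s, names in filters if name in names), None)
            if scale is not None:
                tally[(scale, name)] += 1
                step = length  # a matched bigram skips its second word
                break
        i += step
    return tally


# Hypothetical three-scale gazetteer for illustration only.
sample_filters = [
    ("national", {"Kansas City", "Kansas"}),
    ("regional", {"Galveston", "Havana"}),
    ("local", {"Harrisburg"}),
]
sample_text = "Trains left Kansas City for Galveston. Kansas wheat reached Harrisburg."
sample_tally = tally_place_names(sample_text, sample_filters)
```

In the sample sentence, Kansas City is counted once as a bigram and its first word is not counted again as Kansas; the later standalone Kansas is tallied separately.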
National place-names included any city that had been in the top 100 highest-populated cities during any census year, in addition to all states excluding Texas. Regional place-names included any U.S. town within a 200-mile radius of Texas that had more than 10,000 residents during any census year, along with major cities in Mexico that were within a 200-mile radius of the border with Texas. I also manually included the major place-names of Texas, Mexico, Cuba, and Havana in the regional filter. Local place-names included any place listed in the Geographic Names Information System (GNIS) database that fell within a thirty-mile radius of Houston (full GNIS data available for download at: http://geonames.usgs.gov/domestic/index.html). The full gazetteer that I used, along with the frequency of each place-name in the two newspapers, can be downloaded at http://spatialhistory.stanford.edu/viewoftheworld/PlaceNameFrequencies.csv
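A radius rule such as the thirty-mile local filter can be expressed with a great-circle distance test. The sketch below is an assumption about how such a filter might be built, not a description of the method used here: it relies on the haversine formula, a mean Earth radius of 3,958.8 miles, a single center point for Houston, and approximate sample coordinates chosen for illustration.

```python
import math

EARTH_RADIUS_MILES = 3958.8    # mean Earth radius (assumed constant)
HOUSTON = (29.7604, -95.3698)  # approximate city-center coordinates


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def within_radius(places, center, radius_miles):
    """Return names of (name, lat, lon) records within the radius of center."""
    return [
        name
        for name, lat, lon in places
        if haversine_miles(center[0], center[1], lat, lon) <= radius_miles
    ]


# Illustrative records with approximate coordinates (not taken from GNIS).
sample_places = [
    ("Harrisburg", 29.72, -95.28),
    ("Galveston", 29.3013, -94.7977),
    ("Austin", 30.2672, -97.7431),
]
local_names = within_radius(sample_places, HOUSTON, 30)
```

With these sample points only Harrisburg falls inside the thirty-mile radius; Galveston and Austin lie farther out and would have to enter the gazetteer through another filter.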
After compiling a gazetteer of place-names, I then had to make decisions about which words to exclude based on ambiguous identification (such as Washington). Due to the poor quality of the digitized text, it would have been fruitless to try to evaluate each individual instance of a place-name using contextual clues. I instead either included or excluded each place-name entirely. For each potentially ambiguous place-name I evaluated a random sampling of occurrences in the text to determine whether the name’s ambiguity would have a discernible impact on the end results. If a place-name consistently referred to more than one location, I omitted it entirely. If it consistently referred to a single location, I included the name and used that location. For instance, Abilene is a city in both Kansas and Texas. Houston papers overwhelmingly referred to Abilene, Texas, so I included the place-name and used its Texas location. The same process applied to disambiguating between place-names and proper names. In the case of Jackson the name too often referred to people such as Andrew Jackson rather than places such as Jackson, Mississippi, so it was omitted. The full list of the place-names I omitted and the reasons for doing so can be downloaded at http://spatialhistory.stanford.edu/viewoftheworld/PlaceNameRemovals.csv
The literature in computational linguistics and natural language processing is vast. Many of these approaches rely on higher-quality text than the sources I used, which allows for more sophisticated approaches. One of the most relevant recent examples in the field of digital humanities is Ian Gregory and Andrew Hardie, “Visual GISting: Bringing Together Corpus Linguistics and Geographical Information Systems,” Literary and Linguistic Computing, 26 (Sept. 2011), 297-314. See also Kalev H. Leetaru, “Fulltext Geocoding Versus Spatial Metadata for Large Text Archives: Towards a Geographically Enriched Wikipedia,” D-Lib Magazine, 18 (Sept./Oct. 2012), online at: http://www.dlib.org/dlib/september12/leetaru/09leetaru.html
13 I performed content analysis on a seventeen-issue sample of the Houston Daily Post (representing 1% of the total number of issues) by developing a program called ImageGrid. The program overlays a grid onto each page image and categorizes each cell in the grid, in this case into one of six different news categories. Each category was then aggregated according to the percentage of the total printed page space it took up. ImageGrid is an open-source program available at http://www.cameronblevins.org/imagegrid/. This sampled content analysis resulted in the following categories, with accompanying 95% confidence intervals. Statistical least squares estimates were used to produce statistically significant estimates for the entire collection.
- Traditional narrative news (stories, editorials, reports): 48.6% (confidence interval between 38.5% and 51.6%)
- Narrative-based miscellany (jokes, advice columns, sermons, speeches): 8.8% (confidence interval between 6.1% and 17.9%)
- Advertisements and classifieds: 29% (confidence interval between 26.5% and 32.2%)
- Commercial nonnarrative (stock listings, price reports, ship registers, exchange rates, etc.): 5.3% (confidence interval between 3.9% and 6.3%)
- Miscellaneous nonnarrative (railroad schedules, weather tables, hotel guest lists): 4.7% (confidence interval between 3.8% and 5.3%)
- Marginalia (newspaper headers, page numbers, subscription rates, contact information): 3.6% (confidence interval between 3.1% and 4.9%)
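The aggregation step behind these percentages, in which categorized grid cells are pooled into shares of page space, can be sketched as follows. This is an illustrative assumption about the bookkeeping, not ImageGrid's own code: it treats every cell as covering equal area, pools all sampled pages together, and does not attempt the least squares estimates or confidence intervals. The function name and category labels are hypothetical.

```python
from collections import Counter


def category_shares(pages):
    """Percentage of all grid cells assigned to each category.

    `pages` is a list of grids; each grid is a list of rows, and each
    row is a list of category labels, one label per cell.
    """
    counts = Counter(label for grid in pages for row in grid for label in row)
    total = sum(counts.values())
    return {label: 100.0 * n / total for label, n in counts.items()}


# Two hypothetical 2x2 page grids for illustration only.
sample_pages = [
    [["news", "news"], ["ads", "ads"]],
    [["news", "news"], ["news", "marginalia"]],
]
sample_shares = category_shares(sample_pages)
```

Here five of the eight cells are labeled news, two ads, and one marginalia, giving shares of 62.5%, 25%, and 12.5%.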
14 Fernand Braudel, The Mediterranean and the Mediterranean World in the Age of Philip II, 2 vols., trans. Sian Reynolds (Harper and Row, 1972). William Cronon, Nature’s Metropolis: Chicago and the Great West (W.W. Norton, 1991).
15 Publications that came out of these projects include Frederick W. Gibbs and Daniel J. Cohen, “A Conversation with Data: Prospecting Victorian Words and Ideas,” Victorian Studies, 54 (no. 1, 2011): 69–77; Caroline Winterer, “Where Is America in the Republic of Letters?,” Modern Intellectual History, 9 (no. 3, 2012): 597–623; and Edward L. Ayers and Scott Nesbit, “Seeing Emancipation: Scale and Freedom in the American South,” Journal of the Civil War Era, 1 (no. 1, 2011), 3–24.
16 For an overview of the field of digital humanities, see Matthew K. Gold, ed., Debates in the Digital Humanities (University of Minnesota Press, 2012). The book of abstracts for the annual Digital Humanities Conference also offers a glimpse into the current popular topics in the field. The 2013 conference book of abstracts is available online at: http://dh2013.unl.edu/abstracts/files/downloads/DH2013_conference_abstracts_print.pdf. For an early roundtable on the role of computing specifically in the field of history, see “Interchange: The Promise of Digital History,” The Journal of American History, 95 (Sept. 2008), 452–491.
17 For more on the changing boundaries of academia in the humanities, see Bethany Nowviskie, ed., #alt-academy: Alternative Academic Careers for Humanities Scholars, http://mediacommons.futureofthebook.org/alt-ac/.