Digital Humanities: Final Project

Theoretical Issues

We decided to focus our deeper encoding on marking up the quotations, because dialogue is central to the text and quoted speech is a defining feature for differentiating characters. Working with the quotes raised several theoretical issues. First, we needed to decide where a quote begins and ends: should the narrative clauses that introduce or interrupt speech be included within the <q> tags? We decided that including these additional words, which are technically not part of the quote, would make searching the quotes less accurate, so we left them outside the tags. Second, we needed to address the nesting of quotes within paragraphs. In many places in the text, a paragraph ends before the quote does. In theory the quote continues, but the paragraph doesn’t. Because the document must be well-formed, we had to open and close the quote (<q>, </q>) within each paragraph (<p>, </p>). Third, we discovered that it was sometimes difficult to identify the speaker. Since four group members were each encoding a section of the text, and the text relies heavily on pronouns and aliases, keeping character names consistent was a recurring issue. Finally, characters sometimes “thought” something instead of “saying” the words out loud. We addressed this by making type="spoken" the default attribute value for quotes and type="thought" an alternate option, which brings another level of specificity to searching quoted material.
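The following is a minimal sketch, in Python, of how these encoding decisions might look in practice and how a researcher could filter quotes by type. The sample passage, the who attribute and the speaker labels (speakerA, speakerB) are invented for illustration and are not taken from our encoded text. The sketch only assumes that a <q> without a type attribute counts as spoken.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample: wording, the `who` attribute and the speaker labels
# are assumptions made for this illustration only.
SAMPLE = """<div>
  <p><q who="speakerA">I will not go,</q> said the first speaker. <q who="speakerA">The road is long.</q></p>
  <p><q who="speakerA">Besides, who would mind the house?</q></p>
  <p>The other looked away. <q who="speakerB" type="thought">She will never agree.</q></p>
</div>"""


def quotes(root, quote_type=None):
    """Return (who, type, text) for each <q>; a missing type means 'spoken'."""
    found = []
    for q in root.iter("q"):
        q_type = q.get("type", "spoken")  # "spoken" is the default value
        if quote_type is None or q_type == quote_type:
            # Collapse internal whitespace so line wrapping does not matter.
            text = " ".join("".join(q.itertext()).split())
            found.append((q.get("who"), q_type, text))
    return found


root = ET.fromstring(SAMPLE)
spoken = quotes(root, "spoken")
thought = quotes(root, "thought")

for who, q_type, text in spoken + thought:
    print(f"{who} ({q_type}): {text}")
```

Note that the quote continuing across the first two paragraphs is closed and reopened inside each <p>, and that the narrative clause stays outside the <q> tags, so it never appears in the search results.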

We chose to mark up the quotes because we thought that researchers could gain valuable information from searching them. For example, what does the content of the dialogue in The Wilderness imply about the perspective of Irish Americans on Native Americans? Specific characters could also be analyzed: how stereotypical are they? Are the names of the characters significant, and how do the characters contrast with each other? Can the Irish American identity be essentialized from this text? The language of each character would also be interesting to analyze: how does the language in the dialogue define the character, taking into account ethnicity, class and dramatic role in the narrative? Searching within the individual quotes for keywords could also reveal some interesting juxtapositions.

We noticed that the poetry at the beginning of chapters was sometimes divided line by line and at other times ran together in one string. We decided to break these strings into lines based on the capitalization of letters, and to note the line break criteria in the metadata. There were various instances of a "song" in the text, so we noted this aspect as well. If we came across a word or punctuation mark that seemed to be a mistake not intended by the author of the original text, we marked it using [sic] directly after the perceived mistake, without "correcting" the word or punctuation.
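The capitalization rule can be sketched as a small Python function. This is only an illustration of the criterion, not the procedure we actually followed (we divided the lines by hand), and the sample verse is invented rather than quoted from the text.

```python
import re


def split_verse(run_together):
    """Split a run-together verse string before each capitalized word."""
    # Normalize whitespace first so stray line wrapping does not matter.
    text = " ".join(run_together.split())
    # Break at any whitespace that is immediately followed by a capital letter.
    return re.split(r"\s+(?=[A-Z])", text)


# Invented sample verse, used only to demonstrate the rule.
sample = "The wind is low upon the hill And soft the evening falls"
lines = split_verse(sample)

for line in lines:
    print(line)
```

A rule this simple will also break before proper nouns and before the pronoun "I" in the middle of a line, so its output would still need to be checked by a human reader.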

We would be interested in seeing how our resource could help someone create a graphic representation of two characters talking, or a searchable way to look at instances of particular character clusters speaking near each other. Our group was fascinated by the idea of creating a graphical representation of many characters' speeches, charting trends in the length of dialogue of major versus minor characters in one text, or comparatively between texts. Also interesting would be the possibility of charting the flow of structural elements, i.e., narrative versus dialogue, throughout the novel itself: is there more dialogue towards the end, or an abundance of narration at the beginning of each chapter? In the end, we would be interested in seeing someone come up with an idea that we did not even think of.
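As a rough sketch of where such charts could start, the Python below counts the words each character speaks and the share of each chapter that is dialogue. The sample text, the <div n="..."> chapter wrapper, the who attribute and the speaker labels are all assumptions made for the example. It also assumes that <q> elements are never nested inside one another.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical sample: the chapter structure, `who` attribute and wording
# are invented for this illustration.
SAMPLE = """<text>
  <div n="1">
    <p>The night was cold and still.</p>
    <p><q who="speakerA">Who goes there?</q> asked the guard.</p>
  </div>
  <div n="2">
    <p><q who="speakerB">A friend,</q> came the answer. <q who="speakerA">Then come in.</q></p>
  </div>
</text>"""


def word_count(text):
    """Count whitespace-separated words."""
    return len(text.split())


def words_by_speaker(root):
    """Total number of quoted words attributed to each speaker."""
    counts = Counter()
    for q in root.iter("q"):
        counts[q.get("who", "unknown")] += word_count("".join(q.itertext()))
    return counts


def dialogue_share(div):
    """Fraction of a chapter's words that fall inside <q> elements."""
    total = word_count("".join(div.itertext()))
    quoted = sum(word_count("".join(q.itertext())) for q in div.iter("q"))
    return quoted / total if total else 0.0


root = ET.fromstring(SAMPLE)
by_speaker = words_by_speaker(root)
shares = {div.get("n"): dialogue_share(div) for div in root.iter("div")}

print(dict(by_speaker))
print(shares)
```

The resulting numbers could then be passed to any plotting tool to chart dialogue length per character or the balance of narration and dialogue from chapter to chapter.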