The First Cursive Handwriting Recognizer Needed a Spelling Checker and so did the Rest of the World
Lester Earnest (les at cs.stanford.edu)
At the Massachusetts Institute of Technology (MIT) in the period 1959-63 I developed the first cursive handwriting recognizer, which included the first spelling checker. In 1966 I initiated the much simpler task of creating spelling checkers for use in editing text files. We gave that software away beginning in 1971 and it soon spread around the world via the new ARPAnet, which I also helped start.
Gettin the spelin rite. When I was growing up in the 1930s and ’40s our world was even more racist and sexist than now and only girls took typing classes, so that they could become secretaries. In business, men either gave dictation or wrote longhand and expected their secretaries to get the syntax and spelling right. Embarrassingly my handwriting turned out to be an ugly scrawl with many misspellings. I thought about learning to type but my first two years in high school were in Louisville, Kentucky, where the schools were oddly segregated three ways: white boys, white girls, and colored. Naturally there were no typing classes offered in my school.
In my senior high school year (1947-48) my family moved back to San Diego, California, where the schools were integrated and I signed up for typing, figuring it would help me avoid revealing my embarrassing scrawl. This naturally resulted in some derision from my male classmates but being in a class just with girls was quite enjoyable and the teacher was so pleased to have me there that she invited me to cheat by always giving me the same paragraph to type on exams, so that I was able to memorize and type it really fast. Unfortunately when I composed new material my spelling remained bad, with my fingers doing even worse at spelling than my brain.
By the 1950s and early 1960s there were still no computerized collections of documents anywhere in the world and, as near as I could tell, not even a list of common English words, which was a prerequisite for creating a spelling checker. As discussed further on, I undertook the development of a cursive handwriting recognizer and found it necessary to create a spelling checker in order to make it work. I found a book that claimed to list the most common English words and punched the top 10,000 onto paper tape for use in experiments.
Thorndike & Lorge, The Teacher’s Word Book of 30,000 Words, Columbia University, New York, 1944.
As I got into doing that I discovered that for some reason the authors had given heavy weight to the King James version of the Bible, with the result that many archaic terms were listed such as “thee”, “thy” and “thou”. Also a number of modern words were missing including “radar” and “television” so I added those and some others. The words were all in singular form but I decided they would still provide an adequate test of cursive handwriting recognition.
10,000 English words on seven reels of 6-hole punched paper tape.
After I came to Stanford University in 1965 to organize what became the Stanford Artificial Intelligence Lab (SAIL) I noted that we and others were using newly developed timesharing systems to write articles and reports but those of us with poor spelling were producing some embarrassing results. It occurred to me that my word list could be used to construct a spelling checker but the fact that it included no plural words posed a problem. I got around that by devising a suffix-stripping process that converted plurals to singular forms. This scheme could sometimes let a misspelled word pass scrutiny but in practice that seldom happened.
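The suffix-stripping idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tiny word list and the three suffix rules are invented here and are not the rules used in the original checker. The last test word shows how such a scheme can occasionally let a misspelling pass.

```python
# Hypothetical illustration: the word list and suffix rules are assumptions,
# not the rules used in the original Stanford checker.
SINGULAR_WORDS = {"box", "cat", "city", "dish", "word"}

# (suffix, replacement) pairs, tried in order from longest to shortest suffix.
SUFFIX_RULES = [("ies", "y"), ("es", ""), ("s", "")]


def candidate_stems(word):
    """Yield the word itself, then each form produced by stripping a suffix."""
    yield word
    for suffix, replacement in SUFFIX_RULES:
        if word.endswith(suffix) and len(word) > len(suffix):
            yield word[: -len(suffix)] + replacement


def is_recognized(word, word_list=SINGULAR_WORDS):
    """Accept a word if it, or any suffix-stripped form, is in the list."""
    return any(stem in word_list for stem in candidate_stems(word.lower()))


for w in ["cities", "boxes", "words", "wurds", "cates"]:
    print(w, is_recognized(w))
```

Here "wurds" is rejected, but "cates" is accepted because stripping "es" leaves "cat", which is the kind of rare false pass described above.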
Aram Grayson, one of our systems programmers, was kind enough to read all those reels of paper tape and convert them to a single magnetic tape, which was the only kind of secondary storage we had initially. I then recruited a graduate student to write a spelling checker, which he accomplished in one or two days by writing it in the LISP programming language. However it was a bit slow and worked only in batch processing mode, delivering a list of unrecognized words together with the page and line numbers in which they occurred, but offered no help on what the correct spelling was. Consequently it was not very popular.
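For readers who want a concrete picture of that batch mode, here is a minimal Python sketch. The word list, the page length and the report format are all assumptions made for illustration; it does not reproduce the original LISP program.

```python
import re

# Hypothetical sketch: the word list, the page length, and the report format
# are assumptions; the original LISP program is not reproduced here.
KNOWN_WORDS = {"the", "cat", "sat", "on", "mat", "a"}
LINES_PER_PAGE = 2  # unrealistically small so the example spans two pages


def batch_check(text, known=KNOWN_WORDS, lines_per_page=LINES_PER_PAGE):
    """Return (page, line, word) for each word not found in the word list."""
    report = []
    for index, line in enumerate(text.splitlines()):
        page, line_on_page = divmod(index, lines_per_page)
        for word in re.findall(r"[A-Za-z]+", line):
            if word.lower() not in known:
                # Pages and lines are reported starting from 1.
                report.append((page + 1, line_on_page + 1, word))
    return report


document = "The cat sat\non teh mat\na kat sat"
for entry in batch_check(document):
    print(entry)
```

The report says where each unknown word is but, like the batch checker described above, offers no suggestion for the correct spelling.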
In 1970 I got another graduate student, Ralph Gorin, to write a more advanced version, called SPELL, in machine language that interactively suggested possible correct spellings. The ARPAnet began working a short time later and we put SPELL in our public file area, whence it promptly migrated around the world and became a standard editing tool.
A decade later, when personal computers became popular, nearly all text editors included spelling checkers and most of the people who created them claimed that they had invented them. As a result a number of histories have been written about these developments but I don’t know of a single one that is reasonably accurate. I find it amusing that the name “spell check” is now prevalent, which sounds to me like something that Harry Potter and his colleagues should be using.
A cursive beginning. Earlier, while working at MIT Lincoln Lab in the late 1950s, I took some graduate courses with the intention of getting a Science Masters degree, which required that I do a thesis. In 1959 I chose the problem of recognizing cursive handwriting at the suggestion of faculty members Jack Dennis, Murray Eden and Morris Halle. The task of recognizing cursive writing turned out to be rather difficult. In fact I informally called my system “Curse” because that is what it made me do when it malfunctioned.
MIT required that I do my thesis as a full time student so I went on leave from MITRE but did a bit of consulting there. I began my thesis research on the TX-0 computer, using its light pen to write samples on a bitmap matrix. I shared that machine with a number of other students, typically in half-hour runs, including Gordon Bell who went on to fame at Digital Equipment Corp. (DEC), Carnegie Mellon University and elsewhere and had the good sense to found the Computer Museum in Boston, Massachusetts, then helped its successor Computer History Museum in Mountain View, California.
Bitmap processor. In order to work on handwriting recognition I needed a representation of line graphics and chose bitmap representations (two-dimensional arrays of bits, also known as Boolean matrices) in which each bit represented either black (1) or white (0). I then wrote an emulator of a new kind of computer that processed these images instead of ordinary computer words in a manner similar to a proposal by Unger.
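To make the idea of operating on whole images concrete, here is a small Python sketch of a Boolean matrix with two whole-image operations. The operation names (parse, shift, bit_and, show) are invented for this illustration and are not taken from the TX-0 emulator.

```python
# Hypothetical sketch: these operation names are illustrative and are not
# taken from the original TX-0 emulator.

def parse(rows):
    """Build a bitmap (list of rows of 0/1) from strings of '#' and '.'."""
    return [[1 if ch == "#" else 0 for ch in row] for row in rows]


def shift(bitmap, dx, dy):
    """Return a copy moved dx columns right and dy rows down, padded with 0."""
    height, width = len(bitmap), len(bitmap[0])
    out = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            sx, sy = x - dx, y - dy
            if 0 <= sx < width and 0 <= sy < height:
                out[y][x] = bitmap[sy][sx]
    return out


def bit_and(a, b):
    """Combine two bitmaps of equal size cell by cell with AND."""
    return [[p & q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]


def show(bitmap):
    """Render a bitmap back into strings of '#' (black) and '.' (white)."""
    return ["".join("#" if bit else "." for bit in row) for row in bitmap]


image = parse([
    ".#...",
    ".#...",
    ".###.",
    ".....",
])

# Black cells whose left neighbour is also black: one whole-image shift and
# one whole-image AND, with no per-pixel logic in the calling code.
has_left_neighbour = bit_and(image, shift(image, 1, 0))
print("\n".join(show(has_left_neighbour)))
```

Only the two right-hand cells of the horizontal stroke survive, which is the sort of simple stroke test that can be built by combining shifted copies of an image.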
Using that processor I was able to write subroutines that would dismantle both handwriting and other line drawings. In fact I was able to transform line drawings into dot-to-dot puzzles such that if you drew lines between the resultant printed dots in numerical order it would reconstruct the original drawing.
However I didn’t get very far on recognizing handwriting. My thesis advisors wanted me to continue full time but inasmuch as I had a wife and three kids and had been able to do only a bit of consulting on the side I concluded that this would be infeasible and managed to persuade them to accept what I had done as qualifying for a Masters Degree in June 1960.
A short time later one of my advisors, Morris Halle, invited me to continue my work on the TX-2 computer at Lincoln Lab, a much larger and faster machine that was located near my home, and I did that just for fun, working evenings and weekends while continuing my regular day job at MITRE Corp. I started out by replicating the bitmap processor emulation I had developed on the TX-0, which ran a lot faster on TX-2.
By examining the vertical density of lines in handwriting samples I was soon able to reliably estimate both the imaginary base line on which words were written and the upper bound of the small letters, namely a c e m n o r s u v w x z, with other letters poking above and/or below that envelope. This allowed me to decompose the handwriting into a number of small features such as closed curves in the center, which I called “O”, small vertical strokes, called “I”, and up-downs that I called “R”. Strokes going up high I called either “L” or “T” depending on whether there was a crossbar and those going below the baseline were called “J”.
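One plausible way to estimate that envelope is to count the ink in each row and keep the rows that are nearly as busy as the busiest one. The Python sketch below does this; the 50% threshold and the crude sample image are assumptions chosen for illustration, not values from the original program.

```python
# Hypothetical sketch: the 50% threshold and the tiny test image are
# assumptions chosen for illustration only.

def row_density(bitmap):
    """Count black cells in each row of a bitmap given as strings of '#'/'.'."""
    return [row.count("#") for row in bitmap]


def estimate_core_band(bitmap, fraction=0.5):
    """Return (top_row, bottom_row) of the dense band holding the small letters.

    Rows whose ink count reaches `fraction` of the busiest row are treated as
    part of the band; ascenders and descenders fall outside it.
    """
    density = row_density(bitmap)
    threshold = fraction * max(density)
    dense_rows = [y for y, count in enumerate(density) if count >= threshold]
    return dense_rows[0], dense_rows[-1]


sample = [
    "#.........",  # ascender only
    "#.........",
    "#.##..##..",  # top of the small letters
    "##..##..#.",
    "#.##..###.",  # base line
    "........#.",  # descender only
    "......##..",
]

print(row_density(sample))
print(estimate_core_band(sample))
```

In this sample the dense band runs from row 2 to row 4, so ink above row 2 would be treated as a tall stroke and ink below row 4 as a descender.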
Of course, real handwriting is imperfect so, for example, handwritten letters “a” or “o” might show up either as “O” or as “I I” depending on whether the strokes come together at the top.
“Analysis by synthesis” was a buzz phrase in research projects of that era, the idea being that if you could build a computer simulation of some process it could tell you how the real process worked. In thinking about that it occurred to me that if I had a list of potential words I could transliterate their individual characters into the simpler features mentioned above and try to get a match. Toward that end I created the word list mentioned earlier and, for each word, estimated the number of times the strokes in its handwritten version would cross a horizontal line midway between the upper and lower bounds of the small characters. I used that crossing count to select a set of candidate words from the word list, then went through each of their letter sequences to see if any matched the features found in the sample. The sample below is one that worked.
A correct recognition. As shown, the computer first estimated the upper and lower boundaries of the central parts of the writing, as shown by the two horizontal lines, then looked for elementary features such as a tall character sticking up, indicated by the “L” below the first character, a closed curve in the center, indicated by the “O” just to the right, and “I”-like characters shown next, then an “O”-like feature (the upper part of the “g”) and a “J”-like feature below (the lower part of the “g”), etc.
In general this process might identify a single word, such as “LANGUAGE” in the case above, or it might show more than one word if it was unable to distinguish between them. Part way through these experiments a BBC film crew showed up to do a documentary interview and asked me to test it with the word “television,” which I agreed to do, knowing that I had included it in the list of words. As it turned out the program gave an ambiguous answer, showing both “TEDIOUS” and “TELEVISION.” The film crew loved that and zoomed in for a close-up of the screen.
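The matching step can be illustrated with a small Python sketch. The feature table and the four-word list below are invented for this example, and the preselection by crossing count is left out for brevity. What it shows is how transliterating each listed word into feature strings can yield either one match or several.

```python
from itertools import product

# Hypothetical feature table, invented for this sketch. Each letter maps to
# the alternative feature strings it might produce when handwritten.
LETTER_FEATURES = {
    "a": ["O", "II"],  # closed loop, or two strokes if the top stays open
    "o": ["O", "II"],
    "i": ["I"],
    "u": ["II"],
    "l": ["L"],
    "t": ["T"],
    "g": ["OJ"],       # loop in the center plus a stroke below the base line
}

WORD_LIST = ["tall", "toll", "tail", "gull"]


def feature_spellings(word):
    """Return every feature string the word could produce under the table."""
    alternatives = [LETTER_FEATURES[letter] for letter in word]
    return {"".join(choice) for choice in product(*alternatives)}


def recognize(observed, word_list=WORD_LIST):
    """Return all words with a feature spelling equal to the observed one."""
    return [w for w in word_list if observed in feature_spellings(w)]


print(recognize("TOIL"))   # one candidate
print(recognize("TIIIL"))  # same word written with an open-topped "a"
print(recognize("TOLL"))   # two candidates, so the result is ambiguous
```

Because "a" and "o" share the same feature alternatives in this toy table, "tall" and "toll" cannot be told apart, much as the real program could not separate “TEDIOUS” from “TELEVISION” in the BBC test.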
While running tests with other people’s handwriting I made a discovery that took me a while to figure out, namely that this program understood my sloppy handwriting better than anyone else’s. Because my handwriting was ugly I had given up on it as soon as I learned how to do block printing in a college freshman mechanical drawing class, but I had to take it up again in order to provide samples for my experiments. But why did the computer understand me better than others?
I figured it out after a bit of reflection. While running tests I could see cases in which my writing was misunderstood and could also see what went unrecognized. That apparently caused me to unconsciously adjust my handwriting to avoid such problems. In other words, the computer had subtly trained me to write in a way that it understood.
Because I was trying to develop a system that would work from paper images rather than interactively, from then on when running tests with others I recorded the results but didn’t display them so that my test subjects would not be similarly trained.
I eventually figured out that my faculty mentor apparently wanted me to turn this project into a PhD dissertation, which I wasn’t interested in since I viewed myself as an engineer rather than an academic. Besides, with a family to support I didn’t think I could afford it. Little did I know that three years later I would be managing a university research lab for aspiring PhDs.
In the meantime I turned my results into a conference paper that won a free trip to Europe. A few years later that talk was turned into an IEEE Spectrum article by Nilo Lindgren.
L. Earnest, Machine recognition of cursive writing, Proc. IFIP Congress 1962 (Munich), North-Holland, Amsterdam
The overall accuracy of my system was about 90%, which was not good enough for it to be used to replace a keyboard. However I believe it could have been used to physically sort mail envelopes semi-automatically by diverting to a human those cases where either a word was missed or where it gave an ambiguous result. As far as I know, nobody ever tried that. Instead the U.S. Postal Service later added ZIP Codes, which are easier to read by both humans and machines.
Write on. Over the years others experimented with various cursive writing recognizers and beginning in the 1980s some tablet makers began offering handwriting recognition though I see that, as of this writing, Wikipedia still hasn’t figured out that I created the first such capability. The first to offer it in a personal digital assistant was apparently Apple in their Newton tablet from 1987 to 1998. Some experiments with that device indicated to me that its recognition success rate was similar to the system I had developed 25 years earlier, which was not good enough.
By chance that project was headed by Larry Tesler, a former colleague and friend, but I hadn’t mentioned to him my recognition software. Apple reportedly did a literature search for cursive recognition software but didn’t go back far enough to find my paper (and neither has Wikipedia). They then contracted with a Russian group to develop new software at a cost of a million dollars or so when they could have had my system for nothing.
I still have all the code and being “Old School” I consider it an honor if someone chooses to use a program I have written. As you might guess, I support the Free Software Foundation (FSF) and am strongly opposed to software patents, which constitute a senseless scheme by lawyers and big business to make money and block competition. Whereas patents are supposed to stimulate innovation, software patents have the opposite effect. Allowing software patents makes about as much sense as allowing mathematicians to patent new mathematical concepts so that no one else can use them without paying a fee. Indeed, computer programs are, in fact, expressions of mathematical concepts. (Full disclosure: after I was fired by a company I co-founded, they took out a software patent in my name on a cryptographic scheme I had developed for distributing software.)
As things stand, no one seems to be using cursive handwriting recognition today because it doesn’t work reliably enough, as I learned 50 years ago. However voice recognition, which was pioneered in the mid-1960s by my friend and former colleague Raj Reddy, has since come into widespread use.