I’ve been working with orthography development a bit, and it has been a major challenge getting all the different pieces of the work to fit together well. One of the main divides I see in work is between people who would use computers exclusively, and others would who use them not at all (during the research process at least). I have never felt comfortable in either camp, in part because I am of the computer generation, but also because I have seen a lot that paper and pencil methodology has to offer. Perhaps the most relevant point to our work is that computers are inaccessible to most of our national colleagues. Which means that if I’m doing everything on a computer, I’m doing it by myself. Or I’m teaching people things I started learning in the third grade (they know what the 0/1 symbol is from cell phones, but other than that, I’m usually starting from scratch).
Fortunately for me, there are people working on making linguistic computer work more accessible to the less computer literate, so we can take advantage of computers, without pushing our national colleagues out of the work. One bright shining example is WeSay. Its interface is straightforward and simple. Think of a word and type it in. Give it a short meaning. Next word. Later, you can go back and add longer meanings, other senses, etc. You can work through a wordlist, or use semantic domains — both great ways to help people think of new words to put in their dictionary. I had someone working on it for several days, to bring his wordlist up to over 2,000 words, and we were both quite happy with how it went. I didn’t realize just how happy I should have been until I had the same guy do some other tasks on the same computer (I think it was still just typing, but in openoffice). Things immediately bogged down, and I was constantly needed to fix something.
So here’s my problem: I really like dictionaries, and I think a dictionary is a backbone to any other work done in a language. And it just doesn’t make sense to write a dictionary on pen and paper. But the majority of my time is developing writing systems, which is better done as a community process (i.e., not everyone standing around watching me use a computer). The first thing I do is collect a wordlist, which I eventually make the beginnings of a dictionary. But that dictionary is going to need to be put in a database in a standardized writing system, or it will be a mess. So writing system development and lexicography go best together, but how?
Constance Kutsch Lojenga (among others) has developed a participatory linguistic research methodology, which is good at bringing speakers of a language into the language discovery process from the beginning (c.f., “Participatory Research in Linguistics. ”Notes on Linguistics Vol. 73(2):13-27. Dallas, TX: SIL. 1996.). Language community members see the sound distinctions in their language (at about) the same time I do, and we get to make writing system decisions as a group, given the linguistic facts we have observed together. The basic discovery process puts two words together, and asks, “is the sound in question in these two words the same, or different?” If they are the same, they are put in the same pile, if different, in different piles. There is, of course, lots of background work to be done (like sorting words by syllable profiles, so we’re looking at the same position in the same type of word at a time), but we attempt to make the discovery process itself as attainable as possible, and I have seen it work well.
Which brings me to my dilemma. This methodology as currently conceived uses words written on paper, which are sorted as a group. Which means, that if I use WeSay to collect a wordlist, I (or some other computer savvy person) need to export that wordlist into a format that will print onto paper that can be cut into cards (not hard with mailmerge, and document templates, but it is work), and then after the sorting process, all the information gained (e.g., “fapa” really should be spelled “paba”) needs to be put back into the database — or else it will just remain on those cards. Even if we write up a summary of selected cards in a report, the wordlist/dictionary won’t improve, unless we can get the information off those cards, and back into the computer — which again is not hard, but it is time consuming, and prone to error.
Which got me thinking, would there be a way to simplify the round-trip, and make the same/different decisions in a way that the spelling change information would be immediately returned to the database? As nice as the simple card-sorting interface is, if we could make something (at least nearly) as simple on a computer, I know our colleagues would be up to it. If we could make it simple. Which got me thinking about orthography and WeSay.
Anyway, I’ve written up some thoughts on how this might work practically, which you can find here.
2 thoughts on “Using computers to Help the Computer Illiterate Develop their Language”