When I get back from my mission, I’m going to buy a scanner and start scanning and OCRing old language texts from the university library. There are lots of old books from the 1800s that would clear copyright easily. I think my first project after I get back will be Sweet’s Anglo-Saxon Primer (1905) and Anglo-Saxon Reader (1894). I wanted to do Champollion’s Grammaire égyptienne (I already have the images), but Egyptian hieroglyphs aren’t yet part of Unicode. Maybe by the time I get back… But there are lots of Latin and Greek grammars that wouldn’t be too hard to encode. Latin especially: the only odd characters would be the vowels with macrons over them, and those are easy enough. The whole etext movement is terrifically exciting to me, if it isn’t obvious. I also want to etextify Stanley’s Through the Dark Continent, which is his account of his trip to Africa after Livingstone’s death. If I have a scanner, I’ll be able to put the books on Distributed Proofreaders, which can probably proof them a bit faster than I can. (It depends on how popular the book is.)
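To show why the macron vowels in a Latin grammar are easy: Unicode already has a precomposed character for each of them. Here is a minimal Python sketch, not anything I actually use for the etexts. The helper name `with_macron` is made up for illustration. It builds each long vowel from a plain vowel plus the combining macron (U+0304) and lets NFC normalization fold the pair into a single character.

```python
import unicodedata

# Vowels that take a macron in Latin grammars (y turns up in Greek loanwords).
VOWELS = "aeiouyAEIOUY"
COMBINING_MACRON = "\u0304"


def with_macron(vowel: str) -> str:
    """Return the precomposed long-vowel character for a plain vowel.

    Hypothetical helper: appends the combining macron, then uses NFC
    normalization to compose the pair into one code point.
    """
    return unicodedata.normalize("NFC", vowel + COMBINING_MACRON)


if __name__ == "__main__":
    # Print each long vowel with its code point and official Unicode name.
    for v in VOWELS:
        long_v = with_macron(v)
        print(f"{long_v}  U+{ord(long_v):04X}  {unicodedata.name(long_v)}")
```

Typing the macron as a separate combining character and normalizing afterwards means a proofreader only needs the plain vowels plus one extra keystroke.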

The uniprint utility that comes with Yudit is extremely cool. It can print a text file using any TrueType font. And it actually works. No hassles with converting the font to Type 1 and listing it in a fonts.dir and making an AFM file and such. All you do is give it the font name on the command line and it works perfectly. Yum.

It would be really nice if Linux let you categorize your fonts. Yes, you can put them in different directories, but that doesn’t help when you’re actually selecting a font in an application. I’m envisioning something like the bookmarks in a browser. I doubt it’ll ever happen, though. I have 829 fonts installed right now. It’s rather unwieldy.

I don’t know why I didn’t do this before, but today I added the Etexts page. There aren’t many projects on Distributed Proofreaders at the moment.