This is cool: Haverford College has built a tool called Bridge that generates Latin or Greek vocab lists from texts and textbooks. For example, I was able to start with the vocab from Moreland and Fleischer’s Latin: An Intensive Course (the text we used in my first Latin course in college) and then limit it to just nouns and verbs. You can export to Excel/TSV as well. Pretty neat.
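If you'd rather do the noun/verb filtering yourself on an exported TSV, something like this rough Python sketch would work. I'm guessing at the column names here (`lemma`, `part_of_speech`, `definition` are hypothetical, not Bridge's actual headers), so adjust them to whatever the export really contains.

```python
import csv
import io


def filter_vocab(tsv_text, keep=("noun", "verb"), pos_column="part_of_speech"):
    """Return only the rows whose part of speech is in `keep`.

    `pos_column` is a hypothetical header name; change it to match
    the real export.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [
        row for row in reader
        if (row.get(pos_column) or "").strip().lower() in keep
    ]


# Made-up sample data standing in for an exported vocab list.
sample = (
    "lemma\tpart_of_speech\tdefinition\n"
    "amo\tverb\tto love\n"
    "puella\tnoun\tgirl\n"
    "et\tconjunction\tand\n"
)

for row in filter_vocab(sample):
    print(row["lemma"], row["definition"], sep="\t")
```

For a real file, read it with `open(path, encoding="utf-8").read()` and pass the text in; keeping UTF-8 explicit matters for Greek lemmas.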
We need transcriptions of public domain print editions to provide a starting point for work. These editions do not have to be the most up-to-date and they do not even have to be error free (99% may be good enough rather than 99.95%). If the community has the ability to correct and augment and to add features such as are described above and to receive recognition for that work, then the editions will evolve rapidly and outperform closed editions. If no community emerges to improve the editions, then the edition is good enough for current purposes. This model moves away from treating the community as a set of consumers and towards viewing members of the community as citizens with an obligation to contribute as well as to use.
This is cool:
Ancient Greek OCR is free software to accurately convert scans of printed Ancient Greek into unicode text and PDF files, which can be easily searched, copied, archived, and transformed. It uses the excellent Tesseract OCR engine, tailored for Ancient Greek typography, syntax and vocabulary.
I haven’t used Tesseract in 10+ years, but back then it wasn’t too great. According to their website, however: “Between 1995 and 2006 it had little work done on it, but since then it has been improved extensively by Google.” That’s encouraging. (I wonder if that’s what they’re using behind the scenes for Google Books and Google Drive and their other things.)
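For anyone who wants to try it from a script, here is a minimal sketch that just builds the Tesseract command line for one scanned page. It assumes the `tesseract` binary is installed and that Ancient Greek trained data is available under the language code `grc`; the file names are made up, and I haven't checked how the Ancient Greek OCR project itself wraps this.

```python
import shlex


def tesseract_command(image_path, output_base, lang="grc", make_pdf=True):
    """Build (but do not run) a tesseract invocation for one page image.

    Assumes trained data for `lang` is installed. `txt` and `pdf` are
    standard Tesseract output configs producing <output_base>.txt and
    a searchable <output_base>.pdf.
    """
    cmd = ["tesseract", image_path, output_base, "-l", lang]
    if make_pdf:
        cmd += ["txt", "pdf"]
    return cmd


# Hypothetical file names, for illustration only.
print(shlex.join(tesseract_command("page-001.png", "page-001")))
```

To actually run it, pass the list to `subprocess.run(cmd, check=True)`. Building the command as a list rather than a string avoids shell-quoting problems with odd file names.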