
Blog: #mormon-texts-project


A Voice of Warning

New release: A Voice of Warning, by Parley P. Pratt.


Reply via email or office hours

Succession in the Presidency

New release: Succession in the Presidency of The Church of Jesus Christ of Latter-day Saints, by B.H. Roberts.



The Great Apostasy

New release: The Great Apostasy, by James E. Talmage.

Also, if you haven’t yet heard of Grandin Press, check them out. They’re publishing nice paperback versions of a lot of these Mormon classics and have some good videos on their site explaining why these books matter.



Key to the Science of Theology

New release: Key to the Science of Theology, by Parley P. Pratt.

And we’ve got three more I’m finishing up for submission to Project Gutenberg, plus another four we’re proofing.



More on Unbindery

For the first year of the Mormon Texts Project, our process went like this: Volunteers would tell me they were ready for a batch. I would send them a range of five page numbers (145-149, for example) and a text file containing the unproofed text for those pages. (I would also track this in a Google spreadsheet.) The volunteer would then go to the book in Google Books, open the text file in a text editor, and proof the text. After they finished, they would email the text file back to me and I would assign them a new batch. Rinse and repeat.

Too much overhead. So I started working on Unbindery, a web app to automate almost all of this. This is what it looked like a couple weeks ago (a functional but completely unpolished bare-bones app):

I was kind of disappointed about the project — it was moving incredibly slowly, I didn’t really care about it anymore, etc. — and I had made up my mind to let it quietly fall by the wayside.

Then I was at the temple two weeks ago and got a clear impression that I wasn’t going to get off the hook that easily, and that I needed to keep going with MTP and finish Unbindery. With that impression came some inspiration on how to polish the app, and I’ve been working on it since then. Here’s what it looks like now:

We’ve been using it for MTP work for the past week, and our productivity has skyrocketed. Here are my educated guesses on why:

  1. Smaller chunks. Volunteers proof one page at a time instead of five. It’s easier and, because it’s easier, volunteers proof more than before.

  2. Progress bars. Since volunteers can see their progress visually, there’s more of a drive to keep proofing so they can make the black bar go all the way to the end.

  3. Scoreboard and leaderboard. Volunteers get points every time they finish proofing a page. With the leaderboard there’s competition, and I’ve already seen productivity go up because of it (even if the competition is only subconscious).

  4. Convenience. Since Unbindery is a web app, volunteers can proof pages anywhere, instead of having to download/upload text files and all that.
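The points, leaderboard, and progress bar ideas in the list above can be sketched in a few lines. Unbindery itself is written in PHP and JavaScript, so this Python is only an illustration, not the app's code. The function names, the one-point-per-page value, and the volunteer names in the usage note are all assumptions made up for the example.

```python
from collections import Counter

# Hypothetical value: the post doesn't say how many points a page earns.
POINTS_PER_PAGE = 1


def leaderboard(proofed_pages):
    """Rank volunteers by points, given (volunteer, page_number) records."""
    scores = Counter()
    for volunteer, _page in proofed_pages:
        scores[volunteer] += POINTS_PER_PAGE
    # most_common() returns (volunteer, points) pairs, highest score first.
    return scores.most_common()


def progress_bar(done, total, width=20):
    """Render a text progress bar for `done` pages out of `total`."""
    if total <= 0:
        raise ValueError("total must be positive")
    filled = width * done // total
    percent = 100 * done // total
    return "[" + "#" * filled + "." * (width - filled) + f"] {percent}%"
```

For example, `leaderboard([("Ann", 12), ("Bob", 13), ("Ann", 14)])` puts Ann first with two points, and `progress_bar(5, 20)` draws a bar that is one quarter full.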

There’s also a new feature I added today, where volunteers can see how much of a given project they proofed:

This way there’s more of a sense of ownership to the work. We’ll see how it goes.

Like I said a few days ago, we went from taking eight months per book to a few days per book. My volunteers have proofed 373 pages in the last week, and I’m now scrambling to get enough books into Unbindery so we don’t run out. That’s a very different problem from the stagnation I had a month or two ago. (Yes, it’s a good problem to have.)



Joseph Smith the Prophet-Teacher

New release: Joseph Smith the Prophet-Teacher, by B.H. Roberts.

As a side note, it took us eight months to proof and release our first book, Joseph Smith as Scientist, which was 173 pages long. It took us two days to proof and release Joseph Smith the Prophet-Teacher, with only three volunteers involved. Yes, that’s two days, not weeks, not months. Unbindery is really speeding things up for us. Expect more releases in the near future.



Life of Heber C. Kimball

New release: Life of Heber C. Kimball, an Apostle, by Orson F. Whitney. It’s a great book and we hope y’all enjoy it.

Thanks to the volunteers who helped out: Hilton Campbell, Byron Clark, Meridith Crowder, Cameron Dixon, Brian Jarvis, and Ted Lee.



MTP Q&A on AMV

A Motley Vision just posted a Q&A I did with them about the Mormon Texts Project:

You probably know Ben Crowder as the Editor-in-Chief of Mormon Artist magazine. But Ben is the kind of guy who always has several projects going on at one time, and I thought that one of them that he is actively working on right now — the Mormon Texts Project — would be of interest to AMV’s readers.

You can read the rest of the interview on AMV’s site (along with lots of other great articles).



Inside the Mormon Texts Project

Here’s a bit more in-depth detail on the Mormon Texts Project and our process for digitizing books and getting them on Project Gutenberg.

Why we do what we do

First, we’re doing this to make more Mormon books available through Project Gutenberg. Why PG? They’ve been around for almost forty years, they have lots of mirrors, and they use the lowest common denominator (plain text) so that everyone can read the texts.

We’re also doing this because people in other countries often don’t have easy access to these books. Full view on Google Books is often limited to readers in the U.S., for example.

Another reason is that most of these books aren’t available in Braille editions, and screen readers can reliably read plain text files.

Choosing the books

We’re looking for books published before 1923 so that we know they’re 100% public domain in the U.S. (Books published in or after 1923 are most likely still in copyright. There are exceptions, but it’s a hassle to figure out which books fall in that category, so for MTP we’re restricting ourselves to pre-1923 books. Luckily that still gives us almost a hundred years’ worth of Mormon books to choose from.)

As for selecting which pre-1923 Mormon books to do, we started with a short list of books that I’d heard of and wanted to digitize (Joseph Smith as Scientist, The Life of Heber C. Kimball, etc.). We’ve taken a few reader requests as well, and I’ve added to our list by searching for other books by some of these same authors (Orson F. Whitney, B.H. Roberts, John A. Widtsoe, etc.). I should add that we’re not interested in digitizing anti-Mormon books. We’re doing this to build the kingdom, not to try to tear it down.

When we finish with the books currently on our list, I plan to start working through A Mormon Bibliography to find other books to digitize.

Getting the books

A lot of Mormon books are already available on Google Books and the Internet Archive, which makes our job a lot easier. We can download page images (as PDFs) and unproofed OCRed text from both sites. And I’ve checked with Project Gutenberg and they have no problem with our using Google Books images as our original source, as long as the book is pre-1923.

For books not on Google Books or the Internet Archive, we’ll have to scan them ourselves and then run the page images through OCR software.

Copyright clearance

At this point I take the images for the title page and verso (the back of the title page, which usually has the copyright statement) and submit them to Project Gutenberg for copyright clearance via their website. They’ve always responded within a few days to let me know whether we’re clear. (Since we only do pre-1923 works, we always get clearance.)

Digitizing

This is where most of the work takes place. I split the book up into batches of one to five pages each (for a book from Google Books I usually go with five pages, since that’s how Google gives me the OCRed text) and assign the batches to volunteers. They go through the OCRed text for their assigned pages, comparing it to the page images to eliminate typos and make sure we’re digitizing things correctly. They also make sure the final text follows our MTP guidelines for formatting and such.
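Splitting a book into batches is simple enough to sketch. This is a rough Python illustration of the idea, not code from MTP or Unbindery; the function name `page_batches` and its parameters are made up for the example.

```python
def page_batches(first_page, last_page, batch_size=5):
    """Split an inclusive page range into (start, end) batches.

    The last batch may be shorter than batch_size if the page count
    doesn't divide evenly.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    start = first_page
    while start <= last_page:
        end = min(start + batch_size - 1, last_page)
        batches.append((start, end))
        start = end + 1
    return batches
```

Calling `page_batches(145, 157)` gives the batches 145-149, 150-154, and a shorter final batch of 155-157. Passing `batch_size=1` gives one page per batch.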

I’ve been emailing the OCRed text and page numbers to the volunteers (who can then go to Google Books to see the page images and use Notepad, TextEdit, or another text editor to edit the OCRed text), but once I finish Unbindery, everything will be in the app and they won’t have to download anything. A web-based system like that has worked pretty well for Project Gutenberg Distributed Proofreaders (PGDP).

(Why are we not just using PGDP, then? Mostly because I wanted a cleaner user interface. And it’s nice knowing that if we need any special features, I can easily add them to Unbindery.)

Proofing

After everyone finishes and returns their batches, I collate the batch text files into a single file and make a quick pass through the text making sure that things generally look right. Then we make a final, more thorough pass to make sure everything is formatted correctly and that we didn’t miss anything.
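The collating step could be scripted along these lines. This is a minimal sketch that assumes each returned batch is a UTF-8 text file named after its page range (such as `145-149.txt`), which is a naming convention invented for this example. The function name `collate` is also hypothetical.

```python
import re
from pathlib import Path


def collate(batch_dir, output_file):
    """Join batch text files into one file, ordered by first page number.

    Returns the batch file names in the order they were joined.
    """
    def first_page(path):
        # Sort numerically on the first number in the name, so that
        # "95-99.txt" comes before "145-149.txt".
        match = re.search(r"\d+", path.stem)
        return int(match.group()) if match else 0

    paths = sorted(Path(batch_dir).glob("*.txt"), key=first_page)
    texts = [p.read_text(encoding="utf-8").strip() for p in paths]
    # Separate batches with a blank line and end the file with a newline.
    Path(output_file).write_text("\n\n".join(texts) + "\n", encoding="utf-8")
    return [p.name for p in paths]
```

Write the output somewhere other than a `.txt` file in the batch folder, or a second run will pick up the collated file as if it were a batch.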

I should add here that with Unbindery, each batch will be proofed twice before it’s considered complete, which will help with accuracy.
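The two-pass rule boils down to a count per page. Here is a small Python sketch of that check, assuming we track how many times each page has been proofed; the function name and the data shape are hypothetical and not taken from Unbindery.

```python
REQUIRED_PROOFS = 2  # each batch is proofed twice before it counts as complete


def pages_needing_proof(proof_counts, required=REQUIRED_PROOFS):
    """Return the page numbers that still need at least one more pass.

    proof_counts maps page number -> number of completed proofing passes.
    """
    return sorted(page for page, count in proof_counts.items() if count < required)


def is_book_complete(proof_counts, required=REQUIRED_PROOFS):
    """A book is done once no page is short of the required passes."""
    return not pages_needing_proof(proof_counts, required)
```

With counts of `{1: 2, 2: 1, 3: 0}`, pages 2 and 3 would still be in the queue and the book would not yet be complete.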

Releasing

Once we finish the proof, I go to the Project Gutenberg upload page, fill out the form, submit the finished text, and wait. It usually only takes a couple days — sometimes just a few hours — before the text is up on Project Gutenberg and available for download.

And then we start the process all over again with another book.

Volunteering with MTP

If this sort of thing interests you, email me. We’d love to have you.



In-progress: Unbindery

I’ve been spending more time on the Mormon Texts Project (we’re almost done with The Life of Heber C. Kimball, by the way), and I’ve realized that having a nice integrated system for assigning and editing pages would make things much easier. Enter Unbindery:

My friend Rikker and I started Unbindery a few years ago, but it petered out before it got off the ground, and there it languished until a month or two ago. Since then, I’ve gotten the core up and running and it’s now usable enough to start doing actual MTP work with it. (Taking OCRed text and cleaning it up, that is.) I still have a lot of polish left to do, though.

It’s written in PHP and JavaScript and I’ll be releasing it on GitHub in the near future, once I clean up the source a bit.

