
Scanning journals

I’ve recently begun scanning my journals using my iPhone and the Scanner Pro app, and it’s working out fairly well. My process:

  • Using the built-in iPhone camera app, I long press to lock focus and exposure (this saves time so it doesn’t have to autofocus each time), then photograph each page of the journal. It’s not as high quality as it would be if I used an actual scanner, but it’s much, much faster, and far more portable.
  • After I’m done photographing, I open Scanner Pro and select the images from the camera roll, then use the Black & White Document setting to process them into a PDF.
  • From Scanner Pro, I export the PDF to Dropbox.

The resulting PDF is nice and clean and easy to read, and the files aren’t too big (150 pages is usually between 80 and 200 megs — for me, very much worth the space to preserve important documents).

A concocted example:


That’s before (the image is straight from my iPhone camera, no postprocessing), and this is after Scanner Pro is done with it:


I should add that ordinarily, with actual journals there wouldn’t be as much empty border around the content.

One hitch I’ve run into is that Scanner Pro chokes on anything larger than around 150 pages (it crashes), so I do long journals in chunks.

For that reason and a few other small annoyances, I’ve been looking into replacing Scanner Pro with a desktop-based script that takes a list of photos and processes them into a nice black and white PDF. ImageMagick gets me part of the way there with this command:

convert input.jpg -threshold 50% -blur 1x1 output.jpg

Here’s what it looks like for the above note card scan, at 30%, 50%, and 70% threshold, respectively:


At some point I’ll try writing a Python script that dynamically evaluates each page and adjusts the threshold as necessary to get the best result. Until then, though, I’m still using Scanner Pro.
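
A minimal sketch of what that per-page evaluation might look like, assuming Otsu’s method is an acceptable way to pick the threshold (that is my substitution for illustration, not something this post settled on). The function names are hypothetical, and `pixels` is assumed to be a flat sequence of 8-bit grayscale values that has already been loaded from the photo with some imaging library.

```python
def otsu_threshold(pixels):
    """Pick the gray level (0-255) that best separates dark ink from light paper.

    `pixels` is a flat iterable of 8-bit grayscale values for one page.
    Pixels at or below the returned level are treated as ink.
    """
    histogram = [0] * 256
    for value in pixels:
        histogram[value] += 1

    total = sum(histogram)
    sum_all = sum(level * count for level, count in enumerate(histogram))

    best_level, best_variance = 0, -1.0
    count_dark = 0   # pixels at or below the candidate level
    sum_dark = 0     # sum of the gray values of those pixels

    for level in range(256):
        count_dark += histogram[level]
        sum_dark += level * histogram[level]
        count_light = total - count_dark
        if count_dark == 0:
            continue   # nothing on the dark side yet
        if count_light == 0:
            break      # everything is on the dark side; no split left to try

        mean_dark = sum_dark / count_dark
        mean_light = (sum_all - sum_dark) / count_light
        # Between-class variance: large when the two groups are well separated.
        variance = count_dark * count_light * (mean_dark - mean_light) ** 2
        if variance > best_variance:
            best_level, best_variance = level, variance

    return best_level


def threshold_percent(pixels):
    """Express the chosen level as a percentage, the form -threshold takes."""
    return round(100 * otsu_threshold(pixels) / 255)
```

The percentage could then stand in for the fixed 50% in the command above, one page at a time. Unevenly lit pages would probably still need something smarter, such as computing the threshold per region rather than per page.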

Life sketches in Family Tree

I’ve been doing more family history lately (more on that soon), and one thing I’ve started doing is writing simple life sketches for each ancestor and putting them in Family Tree. For example, I took the data for Manuela Gandara Cobo and wrote this:

Manuela Gandara Cobo was born around 1811 in Setién (Marina de Cudeyo, Santander, Spain) to José Gandara Valdecilla of Ceceñas (Medio Cudeyo, Santander, eight kilometers from Setién) and Josefa Cobo of Setién. She was the second oldest of five children that we know of (she had an older brother, Manuel, younger sisters Nicolasa and Vicenta, and a younger brother Remigio).

She married José Fuentevilla Fuentevilla when she was 18 and he was 20, in his hometown of Polanco (around 36 kilometers from Setién). They had nine children (seven girls, two boys), three or four of whom lived to adulthood. Their first child died within the first year, Josefa died when she was two, Francisca died when she was almost eight, Maria Dolores died the day after she was born, and José Maria died when he was six months old.

Her mother died at age 47 in 1836 when Manuela was 25, and her father died at age 71 in 1853 when she was 42.

Manuela was 68 years old when she died in 1879, a year after her husband José died. (Incidentally, she died just ten days after her older brother Manuel.) At her death she had had ten grandchildren through her daughters Maria Remedios and Maria Isabel.

The prose is far from poetic, and it’s just a restatement of the basic facts of her life (and as you can see, the information I have is somewhat death-heavy), but I think it has a couple benefits: it’s a story, so it’s more parseable and memorable, and including ages and context makes the dates more meaningful. For example, seeing that Francisca was born in 1835 and died in 1843 doesn’t convey the same weight for me as reading that she was seven years old when she died.
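
Turning dates into ages is easy to automate. Here is a small sketch, assuming full birth and death dates are available; the function names are hypothetical, and the month and day used for Francisca in the usage note are invented for illustration (only the years 1835 and 1843 come from the records mentioned above).

```python
from datetime import date


def age_in_years(born, event):
    """Whole years elapsed between a birth date and a later event."""
    years = event.year - born.year
    # Not yet reached the birthday in the event year: subtract one.
    if (event.month, event.day) < (born.month, born.day):
        years -= 1
    return years


def death_sentence(name, born, died):
    """One sentence of a life sketch, stating the age at death."""
    age = age_in_years(born, died)
    return f"{name} was {age} years old at death in {died.year}."
```

With a made-up birth date of `date(1835, 9, 1)` and a made-up death date of `date(1843, 6, 15)`, `age_in_years` gives 7, which is the number that carries the weight in the sketch.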

I haven’t added these for very many ancestors yet, but I plan to do it for all of them, even the ones we know next to nothing about.

Update: My friend Barney Lund recommended adding world events as well. I haven’t done this yet, but I like the idea a lot. I’d probably add a paragraph at the end listing the major events that happened during Manuela’s lifetime (and probably how old she was and what her family composition was at that point).


For a while I had been itching to have a better way to track genealogy research todo items — something that organizes items by family and links back to Family Tree, mainly. And thus Gent was born:


I’ve been using it for a couple months now and really like it. Before, I felt disorganized and didn’t know where all my notes were; now, even if I come back to my research after a month or two away, it’s easy to get back into it.

The app is built on Django (Python), and I’m calling it a 0.1 release since there are almost certainly bugs I haven’t found. But it does work for me. (For what that’s worth.)

Sparkline pedigree chart

This proof of concept takes the genealogy sparklines idea and puts it on a pedigree chart:

The white diamond represents a marriage, and the small circles represent children. The length of the line corresponds to how long the person lived. (Also, the data is very made up.)

As I’m writing this, I’m thinking these sparklines might work better on a family group sheet instead of a pedigree chart.

Assertion-based genealogy proof of concept

This is a rough, experimental proof of concept showing how an assertion-based genealogy app could work. The basic idea is that you type in the facts you know (usually from a record you’ve found), and the app pieces together the people and relationships. Here’s the video:

And a screencap:


  • I’m not very good at After Effects. (Thus the oversize mouse cursor I threw together, the less than ideal pacing, etc.)
  • Overall, the idea of assertion-based genealogy continues to intrigue me. It feels simpler — I add a source and list the facts/hypotheses found in it, and the system takes care of linking it all up.
  • Toward the end of the video, the mouse clicks on the first fact and it dims, and Domenico and Mariantonia disappear from the chart area. The idea here is similar to toggling a layer’s visibility in Photoshop — disable a fact to see what the chart looks like without those conclusions.
  • I didn’t mock this up, but I envisioned the right (empty) sidebar being used for analysis, somewhat like my Family Analysis prototype, and for flagging errors (a father born after his child is born, a mother dying two years before she gives birth, etc.).
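
To make the idea a little more concrete, here is a rough sketch of how the data side might work. Everything in it is an assumption for illustration: the `Assertion` fields, the three fact kinds, and the year-only dates are all invented, and the prototype in the video is not built this way (it has no backend at all).

```python
from dataclasses import dataclass


@dataclass
class Assertion:
    """One fact taken from a source (hypothetical structure)."""
    kind: str             # "birth", "death", or "parent"
    person: str
    value: object         # a year for birth/death, the child's name for "parent"
    source: str = ""
    enabled: bool = True  # switch off to hide the fact, like a layer


def build_people(assertions):
    """Piece together people and relationships from the enabled facts."""
    people = {}
    for fact in assertions:
        if not fact.enabled:
            continue
        person = people.setdefault(fact.person, {"children": []})
        if fact.kind == "parent":
            person["children"].append(fact.value)
            people.setdefault(fact.value, {"children": []})
        else:
            person[fact.kind] = fact.value
    return people


def find_problems(people):
    """Flag conclusions that cannot be right, for the analysis sidebar."""
    problems = []
    for name, info in people.items():
        for child in info["children"]:
            child_birth = people[child].get("birth")
            if child_birth is None:
                continue
            if "birth" in info and info["birth"] >= child_birth:
                problems.append(f"{name} was born in or after the year {child} was born")
            if "death" in info and info["death"] < child_birth - 1:
                problems.append(f"{name} died more than a year before {child} was born")
    return problems
```

Disabling a fact is then just setting `enabled = False` and rebuilding, so the chart and the error list both reflect what is left.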

Experimental family pedigree

This experiment takes the style introduced in January and uses it for a family pedigree (this time with real names and dates from my Italian side in Morrone del Sannio):

Three generations would have been better than four (mostly because of spacing). There’s also a bit of redundancy — people on the main lines show up twice, once as a child and once as a father/mother. Overall, though, I like being able to see the children of each family across multiple generations.

Census source tracker

A few weeks ago the FamilySearch blog posted about Source Tracker, a web app that hooks into FamilySearch and shows you which U.S. censuses your ancestors should have been in, and (more importantly) which ones have already been sourced. For example:

Yes, please. It’s brilliant. And it makes it really, really easy to see where the holes are — in mine, for example, you can see that I need to find Mary Louise Chambers in the 1880, 1930, and 1940 censuses. Clicking the magnifying glass takes you to a search for that person in that census. I’ve spent a fair amount of time these past couple weeks going through and hunting people down in the censuses (and I still have a lot to do, as you can see).
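
The core calculation is simple enough to sketch. This is my own guess at the logic, not Source Tracker’s code: the function names are hypothetical, 1940 is assumed to be the newest searchable census, and I skip 1890 because most of those schedules were destroyed. The dates in the usage note are invented, too.

```python
FIRST_CENSUS = 1790
LATEST_CENSUS = 1940   # assumption: newest census available to search
MOSTLY_LOST = {1890}   # most 1890 schedules were destroyed, so skip that year


def expected_censuses(birth_year, death_year):
    """Census years during which the person was alive."""
    return [
        year
        for year in range(FIRST_CENSUS, LATEST_CENSUS + 1, 10)
        if birth_year <= year <= death_year and year not in MOSTLY_LOST
    ]


def missing_censuses(birth_year, death_year, sourced):
    """Expected census years that have no source attached yet."""
    return [
        year
        for year in expected_censuses(birth_year, death_year)
        if year not in sourced
    ]
```

For someone with a made-up lifespan of 1875 to 1945 and sources already attached for 1900, 1910, and 1920, `missing_censuses(1875, 1945, {1900, 1910, 1920})` returns `[1880, 1930, 1940]`. It says nothing about whether the person was actually living in the U.S. in those years, which still has to be checked by hand.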

Experimental pedigree chart

Because I apparently can’t stop making genealogy charts: here’s a pedigree chart I put together as an experiment to see what things would look like if the more recent names were larger. The result:

I do like the larger names, but it seems that on the left side of the chart the hierarchy is harder to read. This kind of chart might work better with just four generations instead of six.

On interviewing family members

I’ve got this itch to record as much of the stories of my family members as I can — particularly the histories of my parents and grandparents who are still around. They’re all getting older and memories are fading and at some point relatively near in the future they’re each going to go full incommunicado. At that point, family history research gets harder, working in the realm of conjecture and secondhand reporting. Much easier to talk to primary sources while they’re still alive. (Sounds coldblooded when you put it that way, though, doesn’t it.)

Yet in spite of these lurking deadlines (literally), I hardly ever actually talk to my parental and grandparental units about their histories.

It’s a pity. Every time I do talk with them, it’s wonderful, and I learn things about their past and my past that make my life more meaningful and that help me relate more to them, especially now that I’m a father. Tonight, for example, we visited my parents and somehow ended up talking about one of my younger brothers who was born at only twenty-one weeks along and passed away when he was forty-five minutes old. I sort of knew the story from when it happened, but I was only seven at the time and my memory’s fuzzy. Now, though, I’m an adult with two children of my own, including a daughter with some fairly severe medical issues. It wasn’t till I heard my parents talk about it tonight that I really even understood what losing their son must have been like. And now I’ve got the story recorded so I can refresh my memory later when my kids are old enough for us to talk about it, and even better, they can hear it from their grandparents themselves. That’s worth a lot to me.

The thing, too, is that it’s far easier to record these things now than it ever was before. I have a phone in my pocket almost all the time. That phone has a microphone and can record audio to MP3s, which take up so little space that I can store hours and hours and hours of conversations on my phone. It’s amazing.

Now I just need to figure out a way to remind myself to do more of these oral interviews before it’s too late…

Genealogy notebook proof of concept

Ever since seeing the IPython notebook, I’ve been thinking about how its notebook idea would be great for genealogical research. So I put together a (very rough) proof of concept:

Some notes:

  1. A text-based query interface (the queries are in purple) to get information from a database. I don’t know that the best query language would be natural language like this (it’s more just to get the idea across), but the important thing is being able to easily do these types of queries against the genealogical information you’ve stored, enabling all sorts of analysis. “Who in my tree was born more than nine months after their father died?” “Who got married when they were younger than 12 years old?” “Who are all the children who died when they were younger than eight years old?” And so on.
  2. Interleaving queries, their results, and other text (using Markdown, of course). This is the core notebook idea. Or you can look at it as an annotated transcript of a query session. Rather than just having a list of queries and their results, you sort of embed them in between your writing about the research. (Similar to literate programming.) For me, writing things out helps me to see what I think and wrap my head around the research.
  3. There’s a lot of room for data visualization here. I’ve only shown pedigree charts and some basic tables, but it’d be easy enough to have a query return any other type of chart — bar, pie, fan, you name it. Including interactive things like the family analysis tool.
  4. I didn’t include this in the mockup (because I, uh, just barely thought of it), but you could extend the query language to support hypotheticals. “Show me what William Crowder’s family would look like if he and Sarah got married in 1820” or “From now on, pretend Samuel Shinn was born in 1860.” And then following queries would act as if that were true. It’s nice to be able to establish a few suppositions and then see what the ramifications would be, whether they’re plausible, etc. Or to compare two different hypotheses.
  5. Which brings us to the purpose of all this: to have a good place to do the rough, messy side of genealogical research. Thinking through things, seeing what comes of it, and keeping a record of the journey, basically.
  6. Since the information in the database would be subject to change (as you do more research), it would probably be good to cache the results. Or maybe highlight the queries whose results have changed when you reload a notebook. Personally, I lean more towards caching, so that the notebook accurately represents the database at the time it was written (making it a valuable historical record of your research), but you could also look at it as a living document that updates automatically when the background data is updated.
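
Underneath the natural-language surface, queries like these come down to simple filters. Here is a sketch of a few of them plus the hypotheticals idea, assuming the tree is a plain dictionary of records with year-only dates. The `Person` fields and every function name are hypothetical, and nothing like this exists behind the static mockup.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class Person:
    name: str
    born: Optional[int] = None     # years only, to keep the sketch short
    died: Optional[int] = None
    married: Optional[int] = None
    father: Optional[str] = None   # key of the father in the tree, if known


def died_younger_than(tree, age):
    """Everyone whose recorded lifespan is shorter than `age` years."""
    return [p.name for p in tree.values()
            if p.born is not None and p.died is not None
            and p.died - p.born < age]


def married_younger_than(tree, age):
    """Everyone recorded as marrying before reaching `age`."""
    return [p.name for p in tree.values()
            if p.born is not None and p.married is not None
            and p.married - p.born < age]


def born_long_after_father_died(tree):
    """Conservative, year-level stand-in for the nine-months query.

    With only years to go on, a birth two or more calendar years after
    the father's death is certainly too late; closer cases are missed.
    """
    names = []
    for p in tree.values():
        father = tree.get(p.father)
        if (father is not None and p.born is not None
                and father.died is not None and p.born > father.died + 1):
            names.append(p.name)
    return names


def suppose(tree, name, **changes):
    """Return a copy of the tree with one person's facts changed.

    The original tree is left alone, so two hypotheses can be compared.
    """
    return {**tree, name: replace(tree[name], **changes)}
```

“From now on, pretend Samuel Shinn was born in 1860” would become something like `what_if = suppose(tree, "Samuel Shinn", born=1860)`, with the following queries run against `what_if` instead of `tree`.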

Anyway, the proof of concept is just a static HTML page. I’m not planning to do anything more with it, but I wanted to get the idea out there.