
I’ve reorganized my Mormon page, primarily to gather together all of the scripture-related materials into their own pages, like the new Book of Mormon page. I also have a few new related projects in the works, some of which are very exciting. More details later.

From Charles Mann’s 1491:

Almost 150 years before Columbus set sail, a Tartar army besieged the Genoese city of Kaffa. Then the Black Death visited. To the defenders’ joy, their attackers began dying off. But triumph turned to terror when the Tartar khan catapulted the dead bodies of his men over the city walls, deliberately creating an epidemic inside. The Genoese fled Kaffa, leaving it open to the Tartars. But they did not run away fast enough; their ships spread the disease to every port they visited.

Whoa.

Progress on Press has been a bit slower lately. I’ve fixed most of the errors I discovered by running the exported PDFs through the 3-Heights PDF validator. I also refactored the code and reorganized the package per Kenneth Reitz’s advice.

I’ve implemented initial support for embedding subsetted fonts (doing the subsetting via fontTools.subset), and while the fonts (including uninstalled fonts) display fine on my macOS box, the PDFs don’t validate properly and the fonts don’t show at all on iOS, which means the embedding isn’t actually working right. Current suspects include the /Differences array (which I’m not generating properly yet) and the CMap (which I haven’t implemented at all yet). I still have to implement ToUnicode as well, so that copying and pasting does what it should, but I’m fairly certain that isn’t what’s causing the fonts to not embed properly.
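As a rough illustration of the ToUnicode piece, here is a minimal sketch that builds the text of a ToUnicode CMap from a glyph-to-code-point mapping. The function name `to_unicode_cmap` is hypothetical and not part of Press, and the sketch assumes two-byte glyph codes (as with an Identity-H encoded font), which may not match how Press ends up encoding text.

```python
def to_unicode_cmap(glyph_to_codepoint):
    """Build the text of a ToUnicode CMap from {glyph code: Unicode code point}.

    Hypothetical helper; assumes every glyph code fits in two bytes.
    """
    def utf16_hex(codepoint):
        # Values are written as UTF-16BE, so code points above U+FFFF
        # come out as a surrogate pair.
        return chr(codepoint).encode("utf-16-be").hex().upper()

    entries = sorted(glyph_to_codepoint.items())
    lines = [
        "/CIDInit /ProcSet findresource begin",
        "12 dict begin",
        "begincmap",
        "/CIDSystemInfo << /Registry (Adobe) /Ordering (UCS) /Supplement 0 >> def",
        "/CMapName /Adobe-Identity-UCS def",
        "/CMapType 2 def",
        "1 begincodespacerange",
        "<0000> <FFFF>",
        "endcodespacerange",
    ]
    # A single bfchar block holds at most 100 entries, so emit in chunks.
    for start in range(0, len(entries), 100):
        chunk = entries[start:start + 100]
        lines.append(f"{len(chunk)} beginbfchar")
        for gid, codepoint in chunk:
            lines.append(f"<{gid:04X}> <{utf16_hex(codepoint)}>")
        lines.append("endbfchar")
    lines += [
        "endcmap",
        "CMapName currentdict /CMap defineresource pop",
        "end",
        "end",
    ]
    return "\n".join(lines) + "\n"


# Glyph codes 1-3 are made up for the example.
cmap = to_unicode_cmap({1: ord("T"), 2: ord("h"), 3: 0x1F600})
print(cmap)
```

The resulting text would go into a stream object referenced from the font dictionary's /ToUnicode entry. It only affects text extraction (copy and paste, search), not whether the glyphs render.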

I’m also trying to figure out color spaces. In general I believe I want the output to be either DeviceRGB or DeviceCMYK, with some way of specifying an output intent, and also an option for the user to embed an ICC profile if they want. I’m part of the way there.
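To make the DeviceRGB/DeviceCMYK choice concrete, here is a small sketch of how a fill color might be written into a content stream for either space. The function names are hypothetical, and the RGB-to-CMYK conversion is the naive profile-free formula, not a color-managed one; a real output intent or ICC profile would change the numbers.

```python
def rgb_to_cmyk(r, g, b):
    """Naive, profile-free conversion of RGB (0..1) to CMYK (0..1)."""
    k = 1 - max(r, g, b)
    if k == 1:
        # Pure black: avoid dividing by zero below.
        return (0.0, 0.0, 0.0, 1.0)
    return tuple((1 - ch - k) / (1 - k) for ch in (r, g, b)) + (k,)


def fill_operator(rgb, space="DeviceRGB"):
    """Return the content-stream operator that sets the fill color.

    `rg` selects DeviceRGB and `k` selects DeviceCMYK for non-stroking
    operations (the stroking equivalents are `RG` and `K`).
    """
    if space == "DeviceRGB":
        return "{:.3f} {:.3f} {:.3f} rg".format(*rgb)
    if space == "DeviceCMYK":
        return "{:.3f} {:.3f} {:.3f} {:.3f} k".format(*rgb_to_cmyk(*rgb))
    raise ValueError(f"unsupported color space: {space}")


print(fill_operator((1, 0, 0)))                # 1.000 0.000 0.000 rg
print(fill_operator((1, 0, 0), "DeviceCMYK"))  # 0.000 1.000 1.000 0.000 k
```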

Anyway, the font stuff is far more complicated than I expected going in, but I’m still making progress, and I’m learning a lot.

A quick update: I was stuck for a while on the Dagh story, but I’ve started spending my lunch hour working on it, and it’s coming together nicely. I should have a complete first draft done soon. (I’ve got 7,000 words on it so far.)

Rather than starting work on Ink with the low-level typesetting engine, I’m thinking it’ll be worthwhile instead to start with a processor that goes through Ink rules and outputs TeX and/or SILE code. More to come later.

Press can now generate PDFs. For example:

from press import Press

with Press('press-demo.pdf', size=Press.LETTER, margin=1*Press.INCH) as p:
    # Top and bottom borders
    p.pen(0.25)
    p.line(p.page_min_x, p.page_min_y, p.page_max_x, p.page_min_y)
    p.line(p.page_min_x, p.page_max_y, p.page_max_x, p.page_max_y)

    # Rotated colored rectangle
    p.push()
    p.pen(15)
    p.stroke(hsl=(0, 0.5, 0.5))
    p.fill('#fd0')
    p.translate(p.page_min_x + 4*Press.INCH, p.page_max_y - 4*Press.INCH)
    p.rotate(45)
    p.rect(0, 0, 2*Press.INCH, 2*Press.INCH)
    p.pop()

    # Lines of varying thickness
    p.translate(p.page_min_x + 1*Press.INCH, p.page_min_y + 1*Press.CM)
    for i in range(1, 20):
        p.pen(i / 2)
        p.line(0, 0, 30, 0)
        p.translate(0, 20)

That code generates the following PDF (linked):

press-demo.png

I’m working on text/font support now, which is by far the most complicated thing about this project.

Since there isn’t a clean, cross-platform way to select a font via code, I’ve decided to use font maps (inspired by @font-face in CSS):

fontmap = {
    'paths': [ './fonts', '/Library/Fonts', ],
    'Minion Pro': [
        { 'weight': 300, 'italics': False, 'filename': 'MinionPro-Regular.otf', },
        { 'weight': 300, 'italics': True,  'filename': 'MinionPro-It.otf', },
        { 'weight': 600, 'italics': False, 'filename': 'MinionPro-Bold.otf', },
        { 'weight': 600, 'italics': True,  'filename': 'MinionPro-BoldIt.otf', },
    ],
}

with Press('output.pdf', size=Press.LETTER, fontmap=fontmap) as p:
    p.font('Minion Pro', size=24, weight=300, italics=True, dlig=True, smcp=True, tracking=50)
    p.align(Press.LEFT)
    p.text("This is a test.", 50, 50)

Font maps are admittedly extra work, but they do have some advantages as well: you can use fonts you haven’t installed, for example, and you can specify exactly which font files you want to use. And I can’t see any good way around the lack of a cross-platform font selection mechanism (meaning, a way to pass in ‘Minion Pro’ with specific weight and styles, and get a font filename in return).
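For illustration, resolving a font from a map shaped like the one above could look something like this. `resolve_font` is a hypothetical helper, not Press's actual API, and it assumes exact matches on weight and italics with no fallback to the nearest weight.

```python
import os


def resolve_font(fontmap, family, weight=300, italics=False):
    """Return the path of the first matching font file, or None.

    Hypothetical sketch: 'paths' is treated as a reserved key holding the
    directories to search; every other key is a family name.
    """
    if family == "paths":
        return None
    for face in fontmap.get(family, []):
        if face["weight"] == weight and face["italics"] == italics:
            for directory in fontmap.get("paths", ["."]):
                candidate = os.path.join(directory, face["filename"])
                if os.path.isfile(candidate):
                    return candidate
    return None
```

With the fontmap above, `resolve_font(fontmap, 'Minion Pro', weight=300, italics=True)` would return the path to MinionPro-It.otf in whichever listed directory contains it first, or None if the file isn't found.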

Anyway, I’m in the middle of reading the PDF spec on CIDFonts and CMaps. It’s … complicated. It makes my head hurt. But it’ll be awesome when it’s done.

Rule-based typesetting with Ink

The plan for Ink took a bit of a turn a few nights ago. Erlang’s pattern matching was on my mind (having read about it earlier that evening) when I came across a passage from Mitchell’s Book Typography on house rules:

The following are examples of the authors’ own house rules:

  • Speech to be indicated by single quotation marks (‘quote’ not “quote”)
  • Circa shortened to italic c. with no word-space (c.1895 not c. 1895)
  • Use multiplication symbol, not ‘x’ for dimensions (24 × 36 not 24 x 36)
  • Letter-space strings of capital letters (A B C D not ABCD)

The two ideas came together and I saw that declarative typesetting (rule-based typesetting) could be a much nicer way to typeset.

For example: if you want to use the multiplication symbol for dimensions (× instead of x), you usually have to edit your source file (InDesign, TeX, etc.), find matching instances, and change them. It’s a one-time thing, a permanent transformation.

If, on the other hand, you had a rule that said “find any ‘x’ characters between numbers and transform them to ‘×’,” then you could leave your source file alone and let the rule do the work for you instead. Rules like this (textual and stylistic transformations applied at compile time) seem far more reusable and shareable, and easier to use.
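The idea can be prototyped in a few lines of Python. This is only a sketch of the concept using regular expressions; `TIMES_RULE` and `apply_rules` are made-up names, and Ink's eventual rule engine may work quite differently.

```python
import re

# Hypothetical rule: an "x" between two digits (with optional spaces)
# becomes a multiplication sign. Lookarounds keep the digits themselves
# out of the match, and the captured spacing is preserved.
TIMES_RULE = (re.compile(r"(?<=\d)(\s*)x(\s*)(?=\d)"), r"\1×\2")


def apply_rules(text, rules):
    """Return a transformed copy of text; the source string is left untouched."""
    for pattern, replacement in rules:
        text = pattern.sub(replacement, text)
    return text


source = "Trim size 6x9 in; plate 24 x 36 cm; a box of wax."
print(apply_rules(source, [TIMES_RULE]))
# Trim size 6×9 in; plate 24 × 36 cm; a box of wax.
```

The source text stays as written, and the rule can be reused on any other document.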

From there, the Ink language morphed almost completely from how I was envisioning it earlier (TeX with nicer syntax, basically) to this new thing, inspired by Erlang, XSLT/XPath, CSS, Inform, and more.

A few quick notes before we get to the examples:

  • I’m leaning very much towards a template/data separation, like in Django and Mustache and other template engines popular in web frameworks.
  • For this rule-based thing to work well, you have to be able to set general rules but also fix specific cases where the rule doesn’t apply, or where it doesn’t make sense to write a general rule. At the moment I’m leaning towards having those specific fixes be rules as well, rather than tagging the source file. See the second rule listed in Exceptions below for an initial stab at this idea.
  • Ink will be three languages — High Ink (or just Ink), which is the rule-based language shown below; Medium Ink, a tagged version of the source text with all the rules applied; and Low Ink (har har), a page description language that gets compiled to PDF.
  • Splitting it up like this allows for extra flexibility — it would be relatively easy, for example, to write a compiler that takes Medium Ink and outputs HTML/CSS or EPUB or what have you. I don’t know that that would actually be a good idea, but it’s more possible this way. Splitting it up also makes it more manageable.
  • Rephrased, the Ink-to-Medium-Ink compilation involves applying the rules intelligently. Medium-Ink-to-Low-Ink compilation involves the typesetting itself — line breaks, page breaks, etc. Low-Ink-to-PDF compilation will be easiest, translating the Low Ink code to PDF code.
  • This morning I came across Jon Gold’s post on declarative design tools, with somewhat similar ideas. I like the direction he’s gone in with the combinations — it’s a nice workflow. We could do something in that vein here, with syntax to output a bunch of variations (typefaces, sizes, leading, etc.) with minimal effort.
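As a sketch of the variations idea in the last note, generating every combination of a few settings is straightforward. The `variations` helper and its keyword names are hypothetical; how Ink would actually express this is still open.

```python
from itertools import product


def variations(**options):
    """Yield one settings dict for every combination of the given options."""
    names = list(options)
    for values in product(*(options[name] for name in names)):
        yield dict(zip(names, values))


combos = list(variations(
    typeface=["Arno Pro", "Minion Pro"],
    size=[10, 11],
    leading=[13, 14],
))
print(len(combos))  # 8
print(combos[0])    # {'typeface': 'Arno Pro', 'size': 10, 'leading': 13}
```

Each dict could then drive one rendering of the same document, giving a set of proofs to compare side by side.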

Examples (gist)

Note: these are all first-draft thoughts on how to do this kind of thing. None of the syntax is set in stone: rule/endrule vs. rule { }, selector syntax, whether to use regular expressions or something simpler, etc.

rule
    size 6x9";
    # alternates
    size letter;
    size 210x297mm;
    size a4;

    font Arno Pro, 10/13pt;

    margin 1";
    inner-margin .75"; # overrides earlier margin value
endrule

# Named rule (for use later)
rule @times
    find (\d)x(\d) : replace \1 × \2;
endrule @times

rule @year-labels
    # Find "a.d." and turn on the smcp OpenType feature
    find a.d. : feature smcp;
    # alternate way
    find [b.c. | b.c.e. | c.e. | a.d.] : feature smcp;
endrule

# Exceptions
rule
    # If a.d. is found in heading style text, don't run @year-labels on it
    find a.d. (style=heading) : ignore @year-labels;

    # Find a specific "m.a.d." and don't run @year-labels on it
    # This selector language needs a lot of work
    find /chapter:4/paragraph:2/word:[m.a.d.] : ignore @year-labels;

    # Usually, though, you'd want to revise the general rule like this
    # Only run the rule if it's by itself (word boundaries) and not a heading
    find \ba.d.\b (style!=heading) : feature smcp;
endrule

rule @paragraph-indents
    # Indent first line of all paragraphs 1.25em
    find %paragraph : initial-indent 1.25em;

    # Override for first paragraph of a section/chapter
    find %paragraph:nth(1) : initial-indent 0;

    # Paragraphs following tables aren't indented
    find %table + %paragraph : initial-indent 0;

    # Hanging indents for paragraphs with hanging tag
    find %paragraph.hanging : hanging-indent .125";
endrule

# Tracking/kerning
rule
    # Find sequential uppercase and bump tracking up
    find [A-Z]+ : tracking 50;

    # Find V followed by a and kern -25
    find Va : kern -25;
endrule

# Coptic (named Unicode range)
range @coptic
    u+2c80 .. u+2cff;
    u+03e2 .. u+03ee;
    u+03ef; # separated for demo purposes
endrange

rule @coptic-font
    # Any characters in the Coptic range should be set in Antinoou
    find range @coptic : font Antinoou, 24/28pt, dlig;
endrule

# Unicode properties
rule @numbers
    # Replace any numbers with old-style figures
    find [ unicode.Nd | unicode.No ] : feature onum;
endrule

# Masters
master @a
    frame ...; # incomplete, but this part would have text frames
endmaster

# Apply masters
rule
    # All pages get master @a by default
    page * : master @a;

    # Remove master for pages i-iv
    page i-iv : master none;

    # Ignore @paragraph-indents rule on page 9
    page 9 : ignore @paragraph-indents;

    # Set data to be put into masters
    data @main, @index;

    # Variable used for running heads
    $title War and Peace;

    # Set running heads
    # (@inside-header etc. are frames in the master)
    page.odd : @inside-header $pagenum;
    page.odd : @outside-header $section; # set in data
    page.even : @center-header $title;
endrule

# Data
data @main
    transform @ingest;
    include preface.txt;
    include chapters/chapter*.txt;
enddata

data @index
    include index.json;
enddata

# Transform
# Used for an initial transform if necessary
transform @ingest (text)
    // JavaScript or other scripting language
    // These aren't great examples, though
    var response = text.replace(/CHAPTER/g, "\n\nChapter");
    response = response.replace(/\bteh\b/g, "the");
    return response;
endtransform

# Styles
# I don't have an example yet of how to use this, but imagine
# something à la InDesign or CSS
style @heading1
    font Warnock Pro;
    size 18/24pt;
    onum;
    -smcp; # turn off small caps
    space-before 1.24em;
    space-after 1.24em;
endstyle
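To show how a named range like @coptic in the gist above might be applied, here is a sketch that splits text into runs inside and outside the range, so that each run could be given its own font. The names `COPTIC`, `in_range`, and `split_runs` are hypothetical.

```python
from itertools import groupby

# Mirrors the @coptic range block: inclusive (first, last) code points.
COPTIC = [(0x2C80, 0x2CFF), (0x03E2, 0x03EE), (0x03EF, 0x03EF)]


def in_range(char, ranges):
    """True if the character's code point falls in any of the ranges."""
    return any(first <= ord(char) <= last for first, last in ranges)


def split_runs(text, ranges):
    """Split text into (matches_range, substring) runs, in document order."""
    return [
        (matched, "".join(chars))
        for matched, chars in groupby(text, key=lambda c: in_range(c, ranges))
    ]


# The escaped letters are Coptic, U+2C9B U+2C9F U+2CA9 U+2CA7 U+2C89.
runs = split_runs("The word \u2c9b\u2c9f\u2ca9\u2ca7\u2c89 means god.", COPTIC)
for matched, run in runs:
    print("Antinoou" if matched else "default", repr(run))
```

A rule like @coptic-font would then only need to attach its font settings to the runs flagged True.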

Going forward

I’m open to feedback on all of this, of course, so feel free to comment or get in touch with me.

I’ve renamed inkpdf to Press (as in printing press).

I reached the point where creating the PDF manually is no longer feasible, so I’ve been working on getting Press to a point where I can implement the PDF generation. The basic structure is in place, sans the PDF part. (That’s next.)

Here’s what a Press script looks like right now:

from press import Press

p = Press('output.pdf', width=6*Press.INCH, height=11*Press.INCH,
          margin=1*Press.INCH)

# Horizontal borders at top and bottom of page
p.stroke('#000')
p.pen(1.0)
p.line(p.page_min_x, p.page_min_y, p.page_max_x, p.page_min_y)
p.line(p.page_min_x, p.page_max_y, p.page_max_x, p.page_max_y)

# Page 2
p.page(2)
p.layer('base')
p.stroke(rgb=(1, 0, 0))
p.line(150, 150, 300, 300)
p.layer('fg')
p.stroke(hsl=(0, 0.5, 0.8))
p.line(300, 300, 450, 150)

# Go back and add another line to page 1
p.page(1)
p.stroke('#025')
p.line(p.page_min_x, p.page_min_y, p.page_min_x, p.page_max_y)

p.save() # this doesn't work yet

You can also do something like this:

with Press('output2.pdf', size=Press.LETTER,
           margin=(0.5*Press.INCH, 1.0*Press.INCH),
           inner_margin=0.5*Press.INCH,
           outer_margin=1.25*Press.INCH,
           bleed=.125*Press.INCH) as p:
    p.line(50, 50, 250, 50)
    # And so on

(Context manager, inner/outer margin, bleed, built-in paper sizes.)
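For readers curious how the context manager and page bounds might fit together, here is a stripped-down sketch. `PageSketch` is a hypothetical stand-in, not Press's real implementation: it assumes a single uniform margin, PDF's native bottom-left origin, and 72 units per inch, and it only records drawing operators instead of writing a file.

```python
class PageSketch:
    """Hypothetical stand-in for Press: tracks page bounds, records operators."""

    INCH = 72.0                    # default PDF user space: 72 units per inch
    CM = INCH / 2.54
    LETTER = (8.5 * INCH, 11 * INCH)

    def __init__(self, size=LETTER, margin=0.0):
        self.width, self.height = size
        self.margin = margin
        self.ops = []              # content-stream operators, in order
        self.closed = False

    def __enter__(self):
        return self

    def __exit__(self, exc_type, exc, tb):
        # A real implementation would write the PDF here; this sketch
        # just marks the document closed and lets exceptions propagate.
        self.closed = True
        return False

    @property
    def page_min_x(self):
        return self.margin

    @property
    def page_max_x(self):
        return self.width - self.margin

    @property
    def page_min_y(self):
        return self.margin

    @property
    def page_max_y(self):
        return self.height - self.margin

    def line(self, x1, y1, x2, y2):
        # moveto, lineto, stroke
        self.ops.append(f"{x1:g} {y1:g} m {x2:g} {y2:g} l S")


with PageSketch(size=PageSketch.LETTER, margin=1 * PageSketch.INCH) as p:
    p.line(p.page_min_x, p.page_min_y, p.page_max_x, p.page_min_y)

print(p.ops)  # ['72 72 m 540 72 l S']
```

Separate inner/outer margins and bleed would extend the same properties, with the inner and outer sides swapping between odd and even pages.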

Up next: adding more primitives, designing the font selection mechanism, getting it to generate an actual PDF, embedding fonts, using arbitrary Unicode code points, integrating HarfBuzz, etc.