Over the past 2 days I described hypothetical hardware to layout & render a webpage. So how would I incorporate text into this? That's crucial for a hypothetical browser!
The first step would be to select which "glyphs" from the font to display onscreen (like Harfbuzz). The second step would be to rasterize those (vector) glyphs so they can be composited onscreen.
See if this'll be one thread or two?
1/?
The tricky bit with glyph selection is that it has to run inside webpage layout, parallelized per-paragraph. It would have to run it on the hypothetical tree-SIMT I described the other day.
The process for looking up & laying out glyphs is complex, possibly including opcodes from the font or writingsystem-specific code. But mostly it involves looking up data from various tables in the font.
From a hardware perspective I'll focus on data-lookup & opcode interpretation.
2/3 + rasterization.
@alcinnz Generally I think that in cases where glyph selection depends on context, the context function needed can be expressed as a DFA, and DFA execution can be parallelized using the log-time parallel prefix sum algorithm (I saw a paper a few years ago using some version of SSE for this for regular expression evaluation).
Would be pretty cool to support glyph selection that *can't* be expressed as a DFA, like TeX parenthesis size selection, but for just rendering writing systems like Arabic, Devanagari, and country flags, I think a DFA is probably enough.
@radehi Ah yeah! That would fit this hypothetical hardware very well, I specifically made sure to include support for running-sums! Since that's heavily used in layout...
And I didn't know about that research! Also in this hypothetical I'd have a hardware pushdown automaton handling networking & styling, so that could be for e.g. parentheses selection.
@alcinnz Right, the parallel prefix sum algorithm generalizes to arbitrary monoids (the observation that launched Stepanov on the exploration that led to the STL) including, in particular, the finite map composition that gives the semantics of DFA execution.
(Does that make sense? I'd be happy to elaborate if it's too telegraphic.)
@alcinnz Well, so, the effect of any finite-length string on a particular DFA is a finite map from possible states at the beginning of the string to resulting states at the end of the string. For the empty string it is the identity function.
You can compute the effect of a concatenation of strings by composing these maps. If you do this bottom-up on a long string, starting from all its one-character substrings and then consolidating them into a tree of substrings of length 2, 4, 8, etc., each exhaustively covering the original string, you have a log-time parallel algorithm for DFA evaluation on the string. The final step is to apply the finite map computed for the entire string to the initial state defined by the DFA.
Then, if desired, you can propagate the results back down the tree to find the state of the DFA at every character.
Is precisely the parallel prefix-sum algorithm, with the monoid operation being function composition rather than, for example, integer addition.
Does that make sense? I don't know how to evaluate the clarity of my explanation in part because I don't know how familiar you are with the background.
@radehi Yes, it makes perfect sense!