**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 20:59

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 20:59

alcinnz @alcinnz@floss.social

Nov 17, 2022, 20:59

Over the past 2 days I described hypothetical hardware to layout & render a webpage. So how would I incorporate text into this? That's crucial for a hypothetical browser!

The first step would be to select which "glyphs" from the font to display onscreen (like Harfbuzz). The second step would be to rasterize those (vector) glyphs so they can be composited onscreen.

See if this'll be one thread or two?

1/?

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:11

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:11

Nov 17, 2022, 21:11

alcinnz @alcinnz@floss.social

The tricky bit with glyph selection is that it has to run inside webpage layout, parallelized per-paragraph. It would have to run it on the hypothetical tree-SIMT I described the other day.

The process for looking up & laying out glyphs is complex, possibly including opcodes from the font or writingsystem-specific code. But mostly it involves looking up data from various tables in the font.

From a hardware perspective I'll focus on data-lookup & opcode interpretation.

2/3 + rasterization.

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:25

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:25

Nov 17, 2022, 21:25

alcinnz @alcinnz@floss.social

To support data-lookups in the hypothetical tree-SIMT I described the other day I'd start by adding an opcode which splices a subtree into the binary tree. Or would I do that as a preprocessing step?

Either way I'd have a child of each inline element be the font to apply to it. Add opcodes for traversing up, left, or right in that subtree. Which could be needed anyways to store nontrivial amounts of text for processing!

So add GRAFT, UP, LEFT, & RIGHT opcodes!

3/4! + rasterization.

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:33

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:33

Nov 17, 2022, 21:33

alcinnz @alcinnz@floss.social

Usually opcode interpretation is easy, but given we're saying we're running this on a SIMT... Overusing branching can inefficiencies. Since with all compute-units receiving the same data the shared control unit cannot skip any branches at least one compute unit needs to take.

With enough cleverness or automation a bytecode interpreter could minimize this cost by sharing code between branches. Or I could further strategy of letting compute units choose their instruction stream.

4/4! + rasterize

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:48

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:48

Nov 17, 2022, 21:48

alcinnz @alcinnz@floss.social

Once we know which glyphs to render at which offsets (the biggest chunk of layout computation on this hypothetical hardware!) we'd need to convert them into raster images we can composite onscreen. We'd want to rasterize glyphs between frames & cache them so the same rasterization can be reused for each occurrence of e.g. "e".

The first step would be to apply matrix transforms & interpolate any curves into line segments. Then we can pair those up into trapezoids.

1'/2'?

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:56

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 21:56

Nov 17, 2022, 21:56

alcinnz @alcinnz@floss.social

One algorithm for filling shapes formed from line segments which fits this hypothetical hardware well is the "Bentley-Ottmann Algorithm".

Keep 2 collections under both branches. One being a minheap of upcoming line-starts, intersections, & line-ends, & the other being a binary-search-tree of line-segments intersecting the current row being rendered. Each "event" from the minheap would be processed generating upcoming intersection & line-end events whilst emitting trapezoids.

2'/3'!

**R. A. Dehi** @radehi@qoto.org · Nov 17, 2022, 22:10

**R. A. Dehi** @radehi@qoto.org · Nov 17, 2022, 22:10

Nov 17, 2022, 22:10

R. A. Dehi @radehi@qoto.org

@alcinnz If you're just rasterizing trapezoids to pixels, do you really need to know the intersections of the edges? I'd think it would be enough to know the ordering of edges on the current scan line, which you can generate in guaranteed linearithmic time by heapsorting the current edges by their current X-coordinates.

You can also do this in usually linear time by insertion-sorting the list, but then your worst-case execution time guarantee is only quadratic, because you can construct a pathological case where the ordering of active edges reverses from one scan line to the next, hitting insertion sort's worst case.

@freemo, is there a chance of getting Markdown code blocks fixed on #qoto? I'd like to be able to explain what I mean in cases like this to Adrian using actual code instead of prose, but that's not really feasible if the pseudo-Markdown handling is eating all my newlines and indentation. And I imagine this kind of desire is common among your intended target audience, people who discuss ideas using reasoning and evidence; evidence very often comes in the form of code nowadays.

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 22:20

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 22:20

Nov 17, 2022, 22:20

alcinnz @alcinnz@floss.social

@radehi @freemo@qoto.org Depends...

Are you using "winding" or "even-odd"? Are the lines intersecting at the outer edge of the font.

Oh and, intersections reorders the line-segments between scanlines!

In regards to fonts these are arguably edge cases not worth handling. And the Tor Scanlines algorithm is probably a better fit to the problem. But Bentley-Ottmann is a better fit to the hypothetical I designed for the broader layout issues.

**R. A. Dehi** @radehi@qoto.org · Nov 17, 2022, 22:29

**R. A. Dehi** @radehi@qoto.org · Nov 17, 2022, 22:29

Nov 17, 2022, 22:29

R. A. Dehi @radehi@qoto.org

@alcinnz I don't think matters if you're using a nonzero winding rule or an even-odd winding rule; what you care about is which side of each edge each pixel on scanline Y=218 is on, not whether two edges intersected at Y=217.5 or Y=217.2.

Yes, intersections are what reorder the line segments between scanlines. But if you have \(N\) line segments you can have \(\frac{N(N-1)}{2} = O(N^2)\) intersections between them between two successive scanlines; heapsort can nevertheless get them back into the right order for winding-rule tests in only linearithmic time, which you can't do if part of your algorithm involves enumerating all the intersections.

You could be right about Bentley-Ottman; there are surely aspects of the problem you're trying to solve that I don't understand. I've only ever written extremely simple scanline renderers myself.

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 22:43

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 22:43

Nov 17, 2022, 22:43

alcinnz @alcinnz@floss.social

@radehi A couple of Bentley-Ottmann points:

1) The basic point is to track all the scanlines intersecting the current scanline in sorted order. From which trapezoids can be generated for the given fill rule.

2) Only the linesegments intersecting are considered for intersections against the linesegment off the minheap.

3) Caching an iterator that can move left or right along the scanline's intersections is an effective hueristic to avoid the O(N^2) performance problem.

**R. A. Dehi** @radehi@qoto.org · 2022-11-17T22:46:17Z

R. A. Dehi @radehi@qoto.org

@alcinnz How does #3 help if you have O(N^2) intersections "on" (really just above) that scanline? I agree that it should work fine most of the time because you usually have comparatively few intersections, but in those cases insertion-sorting an array of active lines will also work fine, and will probably be faster than maintaining a binary heap?

Nov 17, 2022, 22:46 · · · ·

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 22:55

**alcinnz** @alcinnz@floss.social · Nov 17, 2022, 22:55

Nov 17, 2022, 22:55

alcinnz @alcinnz@floss.social

@radehi It optimizes for the commoncase where the intersection is a line continuation.

And yes, #2 probably helps more.

Anyways that's something Cairo Vector Graphics does and I don't know if I fully understand why. Probably do the same here in part to minimize how much data I'd be shifting around.

---

As for array indexing: That's a great technique. Didn't include it in my hypothetical design, minimal need for it beyond array iteration. Which can be handled largely within the control unit.