Okay, so I figured this part out: a matrix multiplied by its transpose is a covariance matrix (strictly, once each axis is mean-centered, and up to a constant scale factor of 1/(n-1)). By which I mean: the higher the value in a given (row, col), the more the data on those two axes vary together.

https://en.wikipedia.org/wiki/Covariance_matrix

To simplify, consider a 3x3 matrix `A` and multiply `A` by `transpose(A)`.

What each cell of the result is telling you is how strongly the value on the column axis changes the same way when you change the value on the row axis. So the diagonal is never negative, because data on an axis always correlates with itself (i.e. when you change the value of `x`, the value of `x` changes in *exactly* the same way, `x*x = x^2`); each diagonal cell is just the variance along that axis. Cell 0,2, for example, tells you how much changing `x` goes along with `z` changing the same way (if it's the same value as both cell 0,0 and cell 2,2, then the points lie on a diagonal in the xz-plane: changing `x` causes the exact same change in `z`).
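
A minimal sketch of that reading, in plain Python. The data in `A` and the helper names (`center`, `times_transpose`, `covariance`) are made up for illustration; the only assumptions are that rows are axes, columns are points, and each axis gets mean-centered before multiplying.

```python
# Toy data (hypothetical): rows are the x, y, z axes, columns are 5 points.
# z is an exact copy of x; y is chosen to be uncorrelated with both.
A = [
    [1.0, 2.0, 3.0, 4.0, 5.0],    # x
    [1.0, -2.0, 2.0, -2.0, 1.0],  # y
    [1.0, 2.0, 3.0, 4.0, 5.0],    # z
]

def center(rows):
    """Subtract each row's mean so every axis averages to zero."""
    centered = []
    for row in rows:
        mean = sum(row) / len(row)
        centered.append([value - mean for value in row])
    return centered

def times_transpose(rows):
    """rows * transpose(rows): cell (i, j) is the dot product of row i with row j."""
    return [[sum(a * b for a, b in zip(ri, rj)) for rj in rows] for ri in rows]

def covariance(rows):
    """Sample covariance: center, multiply by the transpose, divide by n - 1."""
    n = len(rows[0])
    product = times_transpose(center(rows))
    return [[cell / (n - 1) for cell in row] for row in product]

C = covariance(A)
for row in C:
    print(row)
```

With this data, cell 0,2 comes out equal to cells 0,0 and 2,2 (all 2.5) because `z` tracks `x` exactly, while every cell pairing `y` with `x` or `z` is 0.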

I still need to cogitate a bit on why the eigenvector with the largest eigenvalue of this matrix is the axis along which the data has the highest variance in the original coordinate space.
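
One way to poke at that numerically: for mean-centered data, the variance of the points projected onto a unit vector `u` works out to `u^T C u`, where `C` is the covariance matrix, and that quantity is largest when `u` is the eigenvector with the largest eigenvalue. The sketch below is only an illustration under assumptions: the 2-D data in `X` is invented, the helper names are hypothetical, and power iteration stands in for a real eigen-solver.

```python
import math

# Hypothetical mean-centered 2-D data: rows are axes (x, y), columns are points.
X = [
    [-2.0, -1.0, 0.0, 1.0, 2.0],  # x
    [-1.0, -1.0, 0.0, 1.0, 1.0],  # y
]

def covariance(rows):
    """Sample covariance of already-centered rows: (rows * rows^T) / (n - 1)."""
    return [[sum(a * b for a, b in zip(ri, rj)) / (len(ri) - 1) for rj in rows]
            for ri in rows]

def mat_vec(m, v):
    return [sum(a * b for a, b in zip(row, v)) for row in m]

def normalize(v):
    length = math.sqrt(sum(a * a for a in v))
    return [a / length for a in v]

def projected_variance(rows, u):
    """Variance of the points after projecting each one onto unit vector u."""
    count = len(rows[0])
    projections = [sum(rows[d][i] * u[d] for d in range(len(rows)))
                   for i in range(count)]
    return sum(p * p for p in projections) / (count - 1)

def quadratic_form(c, u):
    """u^T C u."""
    return sum(a * b for a, b in zip(u, mat_vec(c, u)))

def top_eigenpair(c, iterations=100):
    """Power iteration: repeatedly apply C and renormalize."""
    u = normalize([1.0] * len(c))
    for _ in range(iterations):
        u = normalize(mat_vec(c, u))
    return quadratic_form(c, u), u

C = covariance(X)
top_value, top_vector = top_eigenpair(C)

# Sweep unit vectors at 1-degree steps and record the best projected variance.
best_swept = max(
    projected_variance(X, [math.cos(math.radians(k)), math.sin(math.radians(k))])
    for k in range(180)
)
print(top_value, top_vector, best_swept)
```

No direction in the sweep beats the top eigenvalue, and projecting onto `top_vector` gives back exactly that variance, which is the claim in miniature.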

**Mark Tomczak** @mtomczak@qoto.org · Nov 09, 2022, 16:33

Hey lazymastodon, I have a linear algebra question.

So I've been thinking a bit about principal component analysis as of late. The way to find the vector of most variance in a multidimensional dataset is to put every datapoint in its own column of a matrix, multiply that matrix by its transpose, and find the eigenvectors of the resulting square matrix.

Here's my question: I don't have a good intuition for what "multiply the matrix by its transpose" is doing. That compares every point to every other point by multiplying the same-dimension components together and summing the result across dimensions, but like... Why does that result in an interesting matrix instead of a pile of noise?

**Mark Tomczak** @mtomczak@qoto.org · Nov 08, 2022, 19:37

@drquuxum I've definitely eaten that.

Not a great experience.

Klingons were involved.

**Mark Tomczak** @mtomczak@qoto.org · Nov 08, 2022, 18:49

Election day in the US.

It's indicative, I think, of the modern nature of social interaction that I meet more of my geographic neighbors working the polls than I do on any other day of the year.

They're all fine and lovely people, but we spend our lives living next to each other and commuting to thousands of different places to spend most of our waking time.

**Mark Tomczak** · Nov 08, 2022, 13:14

Mark Tomczak boosted

**Mallory Moore** @chican3ry@social.transsafety.network · Nov 08, 2022, 13:14

Hi there. If you see this toot, please boost it to help our new Trans Safety Network instance federate across the 'verse.

**Mark Tomczak** @mtomczak@qoto.org · Nov 04, 2022, 16:54

@jleedev Not a moment too soon!

**Mark Tomczak** @mtomczak@qoto.org · Nov 03, 2022, 13:53

Oop, looks like I've been out of the C++ game for long enough that I forgot some fundamentals. ;)

CoderPad does support `unique_ptr`; you have to `#include <memory>` to pull it in because C++'s std lib is carved up into a bunch of smaller headers.

It's been a while!

**Mark Tomczak** @mtomczak@qoto.org · Nov 03, 2022, 13:43

@orcmid Very true. Regarding infringement, the relevant question, I think, is whether creation of the recognizer itself is an infringement. Once the recognizer has been created, the act of using it to synthesize art *smells* a lot like "creating novel work in the style of the artist," and (unlike Google v. Oracle where we had no precedent on the copyrightability of APIs) precedent for copyright is that a style cannot be copyrighted.

If creation of the recognizer is infringement, that opens a whole can of worms... ContentID (which protects artists) is also a recognizer, and I don't think artists in general want it to be illegal overnight to create a ContentID thumbprinting algorithm.

**Mark Tomczak** @mtomczak@qoto.org · Nov 03, 2022, 13:38

Anyone familiar with the CoderPad environment:

Is there anything special one must do to access `std::unique_ptr`? I'd expect that to be available but it seems to be missing from their C++ environment.

**Mark Tomczak** @mtomczak@qoto.org · Nov 02, 2022, 19:10

@rob Back in the day I attempted to source a simple USB pushbutton. It ended up being such a bear that I soldered a button to an Arduino and fed the push signal back to the computer over serial instead.

... which almost turned into a mistake. Did the math wrong on my voltages and put not nearly enough resistance in the circuit, resulting in a *very* warm pushbutton.

**Mark Tomczak** @mtomczak@qoto.org · Nov 02, 2022, 19:08

@jleedev The only thing I will not miss about having lost my job working on self-driving vehicles is it will probably be a *long* time before I need trouble myself with the gory details of school zones again.

**Mark Tomczak** @mtomczak@qoto.org · Nov 02, 2022, 19:06

@jleedev I *must* know which one. ;)

**Mark Tomczak** @mtomczak@qoto.org · Nov 02, 2022, 19:05

Hypothetically speaking, I can see no obvious reason copyright law cannot be extended to exclude from "fair use" the injection of images into a machine learning engine without consent of the copyright owner of the image.

This will need to be done carefully, because if the goal is to throw a wrench into Stable Diffusion and its siblings, that wrench can easily ping-pong into e.g. banning Google from creating ContentID fingerprints to protect artists from copyright abuse. Some specific dimensions to pin down:

- what is a machine learning engine?

- what does it mean to train one on an artist's work?

- what does 'consent' look like? How explicit must it be?

Implementation will be messy but it's always messy; this is copyright, there's no other kind of implementation of copyright. "...other nations have thought that these monopolies produce more embarrassment than advantage to society" (Thomas Jefferson). But it may be a good idea if we have no alternative that protects the livelihoods of artists in a world where everyone is now a mediocre visual artist.

... And, of course, US copyright law has no bearing on China's law. Nor Russia. Nor dozens of other countries. So we would have to be prepared for our entertainment industries paying top-dollar for human labor competing on the international stage with a sea of mediocre artists that cost electrons and little else.

**Mark Tomczak** @mtomczak@qoto.org · Oct 31, 2022, 17:36

When you want to update a LaTeX document with some information provided on the command line.

https://blog.fixermark.com/posts/2022/latex-content-from-command-line/

**Mark Tomczak** @mtomczak@qoto.org · Sep 20, 2022, 14:59

General #GraphQL thought:

I fear the possibility GraphQL is a trap. I think, in principle, it's a good idea. In practice, I'm concerned that pretending one can offer a pretty flat access to the underlying data in a backend store ignores some irreducible complexity in the question of data storage and retrieval.

How easy is it, in general, to build a GraphQL query that is expensive to answer because it skips all the indexes the backing database supports? I don't know; I've only worked with one or two GraphQL APIs. But one advantage to REST and RPC is that since you have to be intentional about creating new procedures or new resources, you have to be intentional about creating indexes.

Can modern DB engines compute those indexes based on usage patterns or is it still a very manual process?