I've been using ChatGPT's image recognition to extract information from receipts (with identifying information removed, of course), and it works vastly better than any OCR program I've used. It almost never makes mistakes

I hate cookies! Every time I see a cookie I try to eliminate it from the world as quickly as possible in the only way I know how!

It seems to me that voting systems improve when they collect more information from voters. But it's a trade-off between the simplicity of ballots (and of tallying them) and the information received from them. That's probably why approval voting is superior to plurality, and why variants of score voting sit on the voting-system-quality Pareto frontier

I'd propose that a better voting system would be like score (or approval) voting, but where you vote on issues, policies, and candidates' arguments rather than on the candidates themselves. That way you gain even more information about the voters' preferences without significantly increasing the complexity of the ballot -- though it would increase the length of the ballot. Then you select the candidate whose responses to the same ballot are most similar to the voters' responses
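
Just to make the selection rule concrete, here's a minimal sketch of how the tally could work -- the mean aggregation, the Euclidean distance, and all the toy numbers are my own assumptions, not part of the proposal:

```python
# Hypothetical sketch of issue-based score voting: voters and candidates both
# fill out the same issue ballot, and the winner is the candidate whose
# answers are closest to the aggregated voter answers.
import numpy as np

def issue_vote_winner(voter_ballots, candidate_ballots):
    """voter_ballots: (n_voters, n_issues) scores; candidate_ballots: dict name -> (n_issues,) scores."""
    consensus = np.mean(voter_ballots, axis=0)            # aggregate voter preference per issue
    distances = {name: np.linalg.norm(scores - consensus)
                 for name, scores in candidate_ballots.items()}
    return min(distances, key=distances.get)              # most similar candidate wins

# toy example: 3 voters, 4 issues, scores in [0, 5]
voters = np.array([[5, 0, 3, 1],
                   [4, 1, 2, 0],
                   [5, 2, 4, 1]])
candidates = {"A": np.array([5, 1, 3, 1]),
              "B": np.array([0, 5, 1, 4])}
print(issue_vote_winner(voters, candidates))  # -> "A"
```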

I've found that a very fundamental methodology for discovery is the experimentation + induction loop: first do various things without any real intentions, lock on to the most interesting and unexplained thing you did, do variants of that thing, then model the results of the thing and its variants

I really wish this were more common in R&D in particular. Typically, R&D sets out with the goal of solving something or developing something for a particular end. You could call this top-down R&D. Inductive / bottom-up R&D, by contrast, probably looks like combining and assembling things you already have and understand with no particularly strong goals or uses in mind, then finding uses for the results

A hypothetical example of this: combine food ingredients in various combinations and proportions until you discover a good recipe from the results. This might go far beyond what you imagine. For example, what happens if you mix clay in with bread dough? Can you use soy sauce in cookies? Is it possible to make something edible with salt or sugar epitaxy on a regular food substrate?

If we use the time-mediated pseudocausal influence loop definition of reality, then your own nonphysical qualia informationspace is real iff its time evolution operator exists

By time-mediated map loop I mean some composition of maps between objects that starts and ends at the same place, though not necessarily at the same time. And these are pseudocausal maps, which point in the direction of pseudocausal influences. Pseudo- because causal influences probably only actually go one way (meaning no reverse causality: two things cannot mutually cause each other), but changes in one thing (eg: your brain) might causally influence another thing (eg: your mind) in a way that looks like it's the other direction (eg: your mind moves your arm)

So for instance, for a physical process P, another physical process Q is real from P's perspective iff there is a pci map G from P to Q and a pci map H back to P from Q such that, for P's time evolution operator Tp and Q's time evolution operator Tq, H Tq G P = Tp P. Note that the bare composition (H G) takes P back to itself (it's an identity), but the composition conjugated with Q's time operator (H Tq G) takes P back to itself at a different point in time (H Tq G = Tp). Q is real from P's perspective because of this loop
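
A toy numeric check of that loop condition, with made-up linear maps standing in for the pci maps and time operators (nothing here is derived from actual physics):

```python
# Toy numerical check of the "reality loop" condition H Tq G = Tp, using
# linear maps on tiny state spaces. The specific matrices are invented for
# illustration; the point is only that the loop composes back to P's own
# time evolution.
import numpy as np

Tp = np.array([[0., 1.],
               [1., 0.]])          # P's time evolution (just swaps two components)
G  = np.array([[1., 0.],
               [0., 2.]])          # pseudocausal map P -> Q (a relabeling/rescaling)
H  = np.linalg.inv(G)              # pseudocausal map Q -> P
Tq = G @ Tp @ H                    # Q's time evolution, induced so the maps are consistent

print(np.allclose(H @ G, np.eye(2)))    # the bare loop is an identity on P
print(np.allclose(H @ Tq @ G, Tp))      # the loop conjugated with Tq equals Tp
```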

For your own qualia informationspace u with time operator t, and the physical process P causally simulating it with time operator T, there is a pci map e to u from P and an inverse h to P from u. From *your* perspective P is real, since e T h u = t u. And the identity function Iu for u can form a trivial loop with t: Iu^-1 t Iu = t, so u is always real provided it has a time evolution operator. Of course, if you have no time evolution operator then you're kind of fucked anyway

I think I'd categorize this idea as not-useful for a variety of reasons. But it could probably be made useful (well, maybe not useful, but consequential or at least implicative) if you tweaked some assumptions. For instance, if you assume all pseudocausal influences are actually causal, that would lead to some crazy shit, let me tell you hwhat

Here are more metaphysical pseudotheoretical ramblings. Assume you believe that: A) you yourself have a nonphysical consciousness (NC), B) having an NC is good, and C) it's uncertain whether other people have NCs (because you aren't directly experiencing from their perspective, I suppose, so you can't be sure). Then there is a weak moral imperative to form a groupmind with as many people as possible. You could even take this further: there is a weak moral imperative to incorporate as many physical processes into your psyche (via sensors, memory synchronization, recording, etc) as possible, with the goal of incorporating *the entire universe* into your conscious experience so that everything has an NC

Note: any philosophical zombies (neurologically conscious but not nonphysically conscious entities) might believe they have an NC. There is no known way to identify whether something has an NC, or even whether NCs exist in the first place (which makes the existence of NCs a huge and completely unjustified assumption)

One HUGE problem I run into literally every day: the display-and-switch problem, where a webpage (especially a search results page) partially loads and shows something you want to click on but is still loading, so you go to click on that thing, only for another element to suddenly load in and shift everything on the page, causing you to click on some random thing and land on a different page. Another example: you're doing just about anything on a computer, you go to click on something, and another program produces a popup which appears an instant before you click, causing you to click on the popup. Generally I have no idea what these sorts of popups even say, because by then they're gone

I propose: a web standard where HTML elements that will appear once fully loaded have placeholder widths and heights settable via attributes, and are shown as placeholder elements of that size until they're swapped out with their loaded forms. Also, at the OS level, popups should have a grace period (of like 1 second) during which you cannot click on their contents, to prevent accidental clicks

---

Along similar lines is another major issue I've been dealing with over the last few days: when you're working on some physical thing that requires you to move tools and resources between locations, and you (me) then forget where you left them. It's been such a tremendous problem for me (I've been working on my house's plumbing) that I've probably spent a combined 1+ hour over the last week just looking for things I don't remember setting down

This is related to CPU caching and search algorithms. It's generally cheaper to maintain a cache, pull a bunch of related things into it, and drag the cache around with you than it is to go back to the source of those things and fetch them again the original way

I haven't found a solution to this, but my first-order approximation is to keep a bin and demand of myself that I put anything I will use repeatedly (eg: tools; resources like pipe, glue, etc) into it. Then I am guaranteed to know where to look, with a relatively cheap search instead of an exhaustive one

I've been trying to set up reasonable TTS on my Linux machine again, for the Nth time in many years. Each time I've tried in the past I've failed, because either the TTS voices have been of such low quality that they're literally incomprehensible (still), the programs themselves have so many problems they're essentially impossible to use, or downstream programs don't accept installed TTS programs (eg: speech-dispatcher) / the tooling is bad

My conclusion: native TTS on Linux is still *completely and utterly fucked beyond comprehension*. Time to install Balabolka in Wine for the Nth time, I guess

Due to seeming inadequacies with combining the probabilities from multiple agents (averaging, elementwise multiply + renormalize, etc are all unsatisfying in some cases), I've been looking at this betting model of predictions. In this model, multiple agents' estimates are explicitly taken into account, and instead of a list of probabilities -- one probability for each outcome -- you have a matrix of real-valued bets and a payout multiplier vector. So I've been thinking: well, how is this useful when you only have *one* bettor? Simple: that one bettor has multiple sub-bettors who each bet differently. In the case where you have one sub-bettor for each outcome, each sub-bettor bets on only one outcome, and no two sub-bettors bet on the same outcome, there is a simple correspondence between the expected payout for any outcome and the probability of that outcome. But the really interesting version is when you have fewer sub-bettors than outcomes, and when the sub-bettors' bets can overlap. It seems that in that case the sub-bettors each correspond to models of the outcome-space, and out falls MoE. ie: It seems natural to combine multiple models when making predictions, even for a single predictor (at least when using this betting model for prediction)
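
A small sketch of the single-bettor-with-sub-bettors case. The parimutuel-style pooling (implied probability = share of the total pool, payout multipliers = fair odds) is my own assumption about how the bet matrix and payout vector relate, and the numbers are made up; the point is just that the pooled prediction decomposes as a stake-weighted mixture of each sub-bettor's normalized bets, which is where the MoE flavor comes from:

```python
# Sketch of the betting model with sub-bettors: rows of B are sub-bettors,
# columns are outcomes, entries are stakes. Pooling the stakes parimutuel-style
# (implied probability = share of the pool) makes the combined prediction a
# stake-weighted mixture of each sub-bettor's normalized bets -- i.e. MoE.
import numpy as np

B = np.array([[3., 1., 0.],        # sub-bettor 0: mostly believes outcome 0
              [0., 2., 2.]])       # sub-bettor 1: splits between outcomes 1 and 2

pool = B.sum()
implied = B.sum(axis=0) / pool               # combined implied probabilities per outcome
multipliers = 1.0 / implied                  # parimutuel payout multiplier vector (fair odds)

experts = B / B.sum(axis=1, keepdims=True)   # each sub-bettor as a normalized "expert" distribution
weights = B.sum(axis=1) / pool               # gate weights = each sub-bettor's share of the stake
mixture = weights @ experts                  # mixture-of-experts combination

print(implied)                               # [0.375 0.375 0.25]
print(np.allclose(mixture, implied))         # True: pooled bets == stake-weighted MoE
```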

There's this (semi-)common qualitative formula for motivation that goes like: M = (V E) / (I D), where M is your motivation toward something, E is your confidence of getting something out of it, V is the combined value of the task and its outcome, I is your distractibility, and D is the delay in arriving at the outcome (task length, etc). But -- even though this is purely qualitative, and this is extremely minor and stupid, but still -- you can take the logarithm of both sides and, relabeling each logged quantity with the same letter, get a linear combination of qualitatively-equivalent metrics: M = V + E - I - D. The derivative of this new equation is much simpler: dM = dE + dV - dI - dD. The derivative form can be used to determine which thing you should focus on; for instance, if dE > dV then you should prefer increasing your confidence of success over increasing the value of the task and its outcome

Note that the derivative of the typical form is something like: dM = (dV / V + dE / E - dI / I - dD / D) M, which is significantly more cumbersome. Though, this is all qualitative, not quantitative, so...
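
Spelled out, with lowercase letters for the logged metrics (which the relabeling above hides):

```latex
% (amsmath) With m = \ln M, v = \ln V, e = \ln E, i = \ln I, d = \ln D:
\begin{align*}
M = \frac{VE}{ID}
  \;\Rightarrow\;
  m &= v + e - i - d \\
\mathrm{d}m = \mathrm{d}v + \mathrm{d}e - \mathrm{d}i - \mathrm{d}d
  \;\Longleftrightarrow\;
  \frac{\mathrm{d}M}{M} &= \frac{\mathrm{d}V}{V} + \frac{\mathrm{d}E}{E} - \frac{\mathrm{d}I}{I} - \frac{\mathrm{d}D}{D}
\end{align*}
```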

Also, note that some studies have estimated that human value metrics for money are roughly logarithmic, so there is reason to believe the linear form should be preferred. Though, the original and logarithmic forms aren't compatible if I or D can be 0 (ie: value-less) or negative, since the logarithm isn't defined there

You can turn any function with N parameters into N non-additive derivatives:
* Take the regular additive derivative of the function N times; this gives you N equations (on top of the original)
* Solve across those equations for each of the parameters and your input x
* The parameters (written in terms of y and its derivatives) are the derivatives

eg: a x^2 = y
Take the derivative twice
2 a = y''
So we have
a = y'' / 2

We can also solve this by first solving the original equation to get x:
x = +- (y / a)^(1/2)
Then plug that in and solve to get the derivative (squaring removes the +-):
a = y'^2 / (4 y)

Since we have a 'y' in the denominator of that last parametric derivative, we have singularities, and that's unattractive. Instead let's do this:
eg: a x^2 + b x = y
Take the derivative twice:
2 a = y''
So
a = y'' / 2
Then we can solve the original equation: x = (-b +- sqrt(b^2 + 4 a y)) / (2 a)
And take the derivative of the original equation once:
2 a x + b = y'
Plug stuff in and solve for b:
b = +- sqrt(y'^2 - 2 y y'')

eg: For the regular additive derivative it's really simple:
a x = y
Take the additive derivative:
a = y'
Which is already solved

eg: a^x = y
Solve for x:
x = ln y / ln a
Take the derivative, plug in for x, and solve for a:
a = exp(y' / y)
Which is the regular multiplicative derivative

eg: x^a = y
Solve for x:
x = y^(1/a)
Take the derivative, plug in for x, and solve for a:
a = ln y / W(y ln(y) / y')
This gives you the exponent

eg: sin(a x) = y
Take the derivative twice:
y'' = -a^2 sin(a x) = -a^2 y
Solve for a:
a = sqrt(- y''/y)
Which gives you the equivalent oscillation rate for a 0-phase sine wave best matching your function y
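
A quick symbolic sanity check of a few of these, via sympy (just plugging known closed forms for y into the recovered-parameter expressions derived above):

```python
# Verify a few of the parametric "derivatives" above by plugging in a known
# closed form for y and confirming the expression reduces back to the parameter.
import sympy as sp

x, a, b = sp.symbols('x a b', positive=True)

# a x^2 = y  ->  a = y''/2
y = a * x**2
print(sp.diff(y, x, 2) / 2)                                    # a

# a x^2 + b x = y  ->  b = sqrt(y'^2 - 2 y y'')
y = a * x**2 + b * x
yp, ypp = sp.diff(y, x), sp.diff(y, x, 2)
print(sp.sqrt(sp.expand(yp**2 - 2 * y * ypp)))                 # b

# a^x = y  ->  a = exp(y'/y)  (the multiplicative derivative)
y = a**x
print(sp.exp(sp.simplify(sp.diff(y, x) / y)))                  # a

# sin(a x) = y  ->  a = sqrt(-y''/y)
y = sp.sin(a * x)
print(sp.sqrt(sp.simplify(-sp.diff(y, x, 2) / y)))             # a
```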

Ok so I previously talked about how natural physical processes P in a physical universe U can simulate other universes u as long as the time evolution operator T for U, the time evolution operator t for u, and the simulation state function e (which takes P to u, like e P = u) are consistent, like: e T P = t u. I had a new related idea which is pretty heavily in woo territory (in a sort-of way), but if you reverse e as d, you get a situation where a universe u with time evolution operator t simulates a process P with time evolution operator T such that d t u = T P (and d u = P). Assuming P supervenes on u (like how u supervenes on P in the U-simulates-u case), then what is U and where does it come from in this new scenario? Since the original e takes a process within U, and not U itself, it is lossy wrt U, and so no function inverse exists which can reconstruct U from u; so in this new case, you have the presumption of P existing in a larger context (U) which cannot possibly supervene on u, despite P entirely supervening on u. In other words, if S is a positive supervenience relationship, then S(P, U) and S(u, P) in the original scenario, and S(P, u) and S(P, U) in the new scenario

If the original scenario is ever valid in physicalism, then the analog of this in the new scenario would inevitably be idealism. Which -- again, this is in woo territory and my beliefometer here is hovering at 0% -- sort of implies an analogy between the idea of a natural simulation (which is probably likely to some degree, conditional on physicalism) and multi-player / cooperative idealism between multiple independent mental universes that must agree on reality. And this maybe opens up the possibility that supervenience relations can only be connected vs can be non-connected (see en.wikipedia.org/wiki/Connecte). Or something like that... The word salad here will inevitably only get worse, so I'll stop

(actually, come to think of it, are there any arguable properties of supervenience??)

Universe simulations might naturally occur all over the place. By simulation I mean: when there is some physical process P in universe U with time evolution operator T and there is a map M from P to another universe u with time evolution operator t such that M P = u and M T P = t M P, ie M P' = u', then universe U is simulating universe u (tentative definition, but it seems to work) -- see the attached image
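
A toy instance of that condition, with everything (states, maps, dynamics) invented purely for illustration: a four-state process whose coarse-grained parity behaves like a two-state universe:

```python
# Toy instance of the "natural simulation" condition M T P = t M P: a four-state
# process P whose coarse-grained parity acts like a two-state universe u.

P_states = [0, 1, 2, 3]

def T(p):            # time evolution of the process P (cycle through its states)
    return (p + 1) % 4

def M(p):            # map from P to the simulated universe u (keep only parity)
    return p % 2

def t(s):            # induced time evolution of u (parity flips each step)
    return 1 - s

# M T P = t M P holds for every state of P, so by the tentative definition
# above, U (via the process P) is simulating u
print(all(M(T(p)) == t(M(p)) for p in P_states))   # True
```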

If time-evolution operator consistency like this is all that's necessary for a universe to be simulating another universe, then there might be arbitrarily complex universes embedded in some real-world physical processes. There could be entities living in things around us as encrypted information-containing processes

But, the maps between physical processes in our universe and these simulated universes might be incredibly complex and high-entropy (when taken together). The physical processes themselves might be incredibly complex and individually high-entropy, and they may be open systems (as long as there is time-operator consistency), so they might be spread through and around other processes

I think too there is maybe some more-abstract model here involving self-simulation and holographic self-encoding. The universe could in essence be simulating itself in some sense, if there are larger patterns of time-evolution-operator consistencies in arbitrary maps between information-spaces in sections of the universe. In essence, the laws of physics themselves might be some image of a much grander physics, and the universe we see an image of a much grander universe. This coming, of course, from me as I continue to look at quantum mechanics from an information-theoretical perspective, where this kind of stuff is natural, because a classical universe a la MWI is one term in a decomposition of a universal superposition

Note: there may be other preferred operator consistency properties like energy conservation, momentum conservation, etc. But this general time-operator consistency puts no constraints on the physics of the simulated universe (other than that it changes in time)

Another note: ultimately, a person thinking there is a simulated universe in their coffee mug or whatever implies that there is some closed cycle of maps which allows them to consistently extract coherent information from that universe -- and if it quacks like a duck and so on, it is a simulated universe

I've recently discovered that box breathing (inhale 4 seconds, hold 4 seconds, exhale 4 seconds, hold 4 seconds) is extremely effective and reliable at calming irritation. Based on my experience so far, I'd estimate that doing it while irritated eliminates the irritation 90%+ of the time, which is extraordinarily reliable

Though, it only seems to work optimally when you are deliberately paying attention to doing it. So you can't just be doing it without thinking about it (which I've learned is possible). Probably something about activating your prefrontal cortex or something

Last year (because of fuckin long covid) I had what amounted to panic attacks (I think the name is stupid because it implies something more than what it is; but it was happening to me) and discovered that box breathing stops them with pretty high reliability (like 70%+) too

Hope this helps someone else :)

I also discovered that you can estimate the sinuosity using the width of the path (the maximum straight-line-orthogonal distance between points on the path). Roughly, you take the width of the path, multiply it by 1.6 and add 0.9. The attached image is the fit for that line

I wanted to see what various sinuosity values actually looked like, so I wrote a small mma (Mathematica) script using filtered random walks to visualize it. In the attached image a path is shown in each grid cell, and in the top-left corner of each is the percent you have to add to the straight-line distance to get the path distance. eg: +20% is a 1.2x multiplier, ie: add 20% of the straight-line length to get the path length

Sinuosity is the length of a path divided by the shortest distance between its endpoints. It's how much you have to multiply the straight-line distance between two points by to get the real distance. Very useful for estimating distances
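
A minimal sketch of computing this for a sampled path, plus the rough width-based estimate from the post above -- where I'm assuming the width is taken as a fraction of the straight-line distance, since otherwise the 1.6 / 0.9 fit wouldn't be dimensionless:

```python
# Sinuosity of a sampled 2D path, plus the post's rough width-based estimate
# (sinuosity ~ 1.6 * width + 0.9). The normalization of width by the chord
# length is my assumption.
import numpy as np

def sinuosity(points):
    pts = np.asarray(points, dtype=float)
    seg = np.diff(pts, axis=0)
    path_len = np.sum(np.linalg.norm(seg, axis=1))       # length along the path
    straight = np.linalg.norm(pts[-1] - pts[0])          # endpoint-to-endpoint distance
    return path_len / straight

def width_estimate(points):
    pts = np.asarray(points, dtype=float)
    chord = pts[-1] - pts[0]
    normal = np.array([-chord[1], chord[0]]) / np.linalg.norm(chord)
    offsets = (pts - pts[0]) @ normal                    # signed distance from the chord
    width = (offsets.max() - offsets.min()) / np.linalg.norm(chord)
    return 1.6 * width + 0.9                             # the post's rough linear fit

zigzag = [(0, 0), (1, 1), (2, 0), (3, 1), (4, 0)]
print(sinuosity(zigzag))         # ~1.414
print(width_estimate(zigzag))    # ~1.3 (rough, as advertised)
```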

Recently while programming I've been trying to think of the design of systems in terms of *entanglemes*, which are fundamental units (-eme) of entangled structures (entangl-). These correspond (mostly; they're a superset) to [cross-cutting concerns](en.wikipedia.org/wiki/Cross-cu) in aspect-oriented programming. But unlike that programming-only term, entanglemes are much more general. For instance, the pipes, wiring, and structural elements in a house are all entanglemes because the house is an entanglement of the pipes, wires, etc

If you do any preliminary systems design at all when programming, it seems particularly important to identify entanglemes because entanglemes strongly inform what abstractions to use and they cannot be encapsulated. So identifying entanglemes is like identifying the limits of what can be encapsulated in your system

Ideally, you can decompose your system into a minimal set of entanglemes and maximally encapsulate everything else, but this seems to be an exceptionally hard, nonlinear problem

Consider some universe at a moment in time which appears classical (all waves are unitary / point-like) and which contains a brain-like information-containing structure (ICS) that is encoding the present state of the static universe as a classical universe, along with sensors attached to the ICS which update the encoded information in the obvious way. Now time-evolve the system for a moment. The universe now will be in a superposition (ie: non-unitary) of possible classical and non-classical evolutions. The ICS will also be in a superposition of such states

There should be a possible unitary state now, consistent with this time-evolved universe, where the collapsed time-evolved ICS still encodes a classical universe. It possibly encodes only some variant of what it initially encoded, updated with new information from its sensors. Obviously it's possible for the ICS to record information about the non-classical time-evolution of the system through statistical analysis of observations, because we can actually do that -- that's how we know about quantum-mechanical effects. But did the ICS only observe what it believed were classical events?

If we define a classical universe as what an ICS in one term of a superposition of classical universes encodes the universe as, then obviously the ICS will encode the universe as classical, by definition

Is a scenario possible where you have some arbitrary non-unitary or unitary universe with some ICS + sensors within it, with some arbitrary time evolution operator T that isn't necessarily a classical or non-classical time evolution operator, such that the information the ICS encodes is consistent with T? What I'm getting at: is a classical / semi-classical time evolution operator arbitrary, and only relative to an arbitrary ICS which happens to encode information using that operator?

I suppose in this scenario such a time evolution operator might treat certain superpositions which are unitary only after some transform as unitary. I'd imagine there are certain transforms corresponding to certain time evolution operators with this property

This is just a thought, or rather a more-or-less succinct way of putting a line of thought I've been having for a while. I don't know if this actually works, or even if I believe it at all, but it's interesting nonetheless:

When the collapse of a wavefunction occurs, then according to the information we have about a system, that system has changed faster than the speed of light allows. But if we assume the system hasn't changed faster than the speed of light allows, then the obvious explanation is that our information about the system isn't complete, and we merely think the collapse happens faster than light allows -- probably because the information in our minds evolves from one incomplete set of information to another. The obvious conclusion is that, because we're stuck in a classical interpretation of the universe, the collapse is one classicalation of a process in the true nonclassical universe, and that the other classicalations (which presumably do exist) represent other possible collapse outcomes

So what is keeping us entirely classical? Or rather, what is preventing us from encoding non-classical information? I imagine it as a circular process: our minds are classical because we encode classical information in them, and we encode classical information in them because they are classical. But under this line of reasoning, there are likely infinite other encoding schemes which define different types of universes. These would all just be different interpretations of the true universe, though

Epistemic status: probably wrong, and way too vague (what is information?), but very interesting. Idea about ANN grokking, generalization, and why scaling works: a particular generalization forms only when its prerequisite information is learned by the ANN. Since larger nets can encode more information at a given time, at any given time they hold the prerequisite information for more generalizations, so those generalizations form. In smaller nets, information is constantly being replaced in each new training batch, so it's less likely any given generalization's prerequisite information is actually learned by the net

This would also imply:
* Training on ts examples in a particular order will develop the generalization for that order (if there is one)
* Generalizations are also learned information, and they are more resistant to being replaced than regular information, probably because they are more likely to be used in any given training example

As training time increases, it becomes more and more likely a net encounters batches in the right order to learn the information for a generalization to develop

This would indicate superior training methods might:
* Somehow prevent the most common information (which might merge into the most common generalizations) from being replaced (vector rejections of ts subset gradients maybe?)
* Train on more fundamental ts examples first (pretraining is probably also in this category)

I've been meaning to try training a simple net N times on various small ts subsets, taking the one with the best validation-set loss, mutating it via random example replacement, and recursing. Might be a step in the direction of discovering ts subsets which form the best generalizations
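
Roughly the loop I have in mind, as a sketch -- `train_and_eval` is a placeholder you'd supply (train a small net on the given subset, return validation loss), and all the knob values are arbitrary:

```python
# Evolutionary search over training-set subsets: train on a small subset, keep
# the best-by-validation-loss subset seen so far, mutate it by swapping in a
# few fresh examples, and repeat. Duplicate examples are possible after
# mutation; fine for a sketch.
import random

def search_subsets(train_ids, train_and_eval, subset_size=256, population=8,
                   swaps=16, generations=10, seed=0):
    rng = random.Random(seed)
    best = rng.sample(train_ids, subset_size)
    best_loss = train_and_eval(best)
    for _ in range(generations):
        for _ in range(population):
            # mutate: replace a few examples with fresh ones from the full set
            candidate = best.copy()
            for i in rng.sample(range(subset_size), swaps):
                candidate[i] = rng.choice(train_ids)
            loss = train_and_eval(candidate)
            if loss < best_loss:
                best, best_loss = candidate, loss
    return best, best_loss

# usage (hypothetical): best_ids, loss = search_subsets(list(range(50000)), my_train_and_eval)
```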
