Ok so I previously talked about how natural physical processes P in a physical universe U can simulate other universes u as long as the time evolution operator T for U, the time evolution operator t for u, and the simulation state function e (which takes P to u, like e P = u) are consistent, ie: e T P = t u. I had a new related idea which is pretty heavily in woo territory (in a sort-of way), but if you reverse e as d you get a situation where a universe u with time evolution operator t simulates a process P with time evolution operator T such that: d t u = T P (and d u = P). Assuming P supervenes on u (like how u supervenes on P in the U-simulates-u case), then what is U and where does it come from in this new scenario? Since the original e takes a process within U and not U itself, it is lossy wrt U, so no inverse function exists which can reconstruct U from u; so in this new case, you have the presumption of P existing in a larger context (U) which cannot possibly supervene on u despite P entirely supervening on u. In other words, if S is a positive supervenience relationship, then S(P, U) and S(u, P) in the original scenario, and S(P, u) and S(P, U) in the new scenario
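Laying the two directions out side by side (nothing new here, just the relations above collected in one place):

```latex
% Original direction: U simulates u via the map e
\[ e\,P = u, \qquad e\,T\,P = t\,e\,P = t\,u \]
% Reversed direction: u ``simulates'' P via the reversed map d
\[ d\,u = P, \qquad d\,t\,u = T\,d\,u = T\,P \]
```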
If the original scenario is ever valid under physicalism, then the analog of it in the new scenario would inevitably be idealism. Which -- again, this is in woo territory and my beliefometer here is hovering at 0% -- sort of implies an analogy between the idea of a natural simulation (which is probably likely to some degree, conditional on physicalism) and a multi-player / cooperative idealism between multiple independent mental universes that must agree on reality. And this maybe opens up the question of whether supervenience relations must be connected or can be non-connected (see https://en.wikipedia.org/wiki/Connected_relation). Or something like that... The word salad here will inevitably only get worse, so I'll stop
(actually, come to think of it, are there any arguable properties of supervenience??)
Universe simulations might naturally occur all over the place. By simulation I mean: when there is some physical process P in universe U with time evolution operator T and there is a map M from P to another universe u with time evolution operator t such that M P = u and M T P = t M P, ie M P' = u', then universe U is simulating universe u (tentative definition, but it seems to work) -- see the attached image
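To make the consistency condition concrete, here's a minimal toy sketch (my own made-up example, nothing physical about it): the process P is a 2-bit state, T flips both bits each step, the map M keeps only the first bit, and the simulated universe's t flips that one bit. M then commutes with time evolution, so by the tentative definition above, U is simulating u:

```js
const T = ([a, b]) => [1 - a, 1 - b]   // time evolution of the process P in U (flip both bits)
const t = x => 1 - x                   // time evolution of the simulated universe u (flip its one bit)
const M = ([a, _b]) => a               // the map from P's state to u's state

for (const P of [[0, 0], [0, 1], [1, 0], [1, 1]]) {
  console.log(M(T(P)) === t(M(P)))     // true for every state, ie: M T P = t M P
}
```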
If time-evolution operator consistency like this is all that's necessary for a universe to be simulating another universe, then there might be arbitrarily complex universes embedded in some real-world physical processes. There could be entities living in things around us as encrypted information-containing processes
But, the maps between physical processes in our universe and these simulated universes might be incredibly complex and high-entropy (when taken together). The physical processes themselves might be incredibly complex and individually high-entropy, and they may be open systems (as long as there is time-operator consistency), so they might be spread through and around other processes
I think too there is maybe some more-abstract model here involving self-simulation and holographic self-encoding. The universe could be simulating itself in some sense, if there are larger patterns of time-evolution operator consistencies in arbitrary maps between information-spaces in sections of the universe. In essence, the laws of physics themselves might be some image of a much grander physics, and the universe we see is an image of a much grander universe. This is coming, of course, from me as I continue to look at quantum mechanics from an information-theoretical perspective, where this kind of stuff is natural because a classical universe a la MWI is one term in a decomposition of a universal superposition
Note: there may be other preferred operator consistency properties like energy conservation, momentum conservation, etc. But this general time-operator consistency puts no constraints on the physics of the simulated universe (other than that it changes in time)
Another note: ultimately, whether a person thinks there is a simulated universe in their coffee mug or whatever comes down to whether there is some closed cycle of maps which allows them to consistently extract coherent information from that universe; and if it quacks like a duck and so on, it is a simulated universe
I've recently discovered that box breathing (inhale 4 seconds, hold 4 seconds, exhale 4 seconds, hold 4 seconds) is extremely effective and reliable at calming irritation. Based on my experience so far, I'd estimate that doing it while irritated eliminates the irritation 90%+ of the time, which is extraordinarily reliable
Though, it only seems to work optimally when you are deliberately paying attention to doing it. So you can't just be doing it without thinking about it (which I've learned is possible). Probably something about activating your prefrontal cortex or something
Last year (because of fuckin long covid) I had what amounted to panic attacks (I think the name is stupid because it implies something more than what it is; but it was happening to me) and discovered that box breathing stops them with pretty high reliability (like 70%+) too
Hope this helps someone else :)
I also discovered that you can estimate the sinuosity using the width of the path (the maximum straight-line-orthogonal distance between points on the path). Roughly, you take the width of the path, multiply it by 1.6 and add 0.9. The attached image is the fit for that line
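As a one-liner (with the caveat that I'm assuming here that the width w is taken relative to the straight-line distance between the endpoints, since the fit has to be dimensionless):

```js
// rough width-based estimate from the fit above
// (assuming w = path width divided by the straight-line endpoint distance)
const estimateSinuosity = w => 1.6 * w + 0.9
```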
I wanted to see what various sinuosity values actually looked like, so I wrote a small mma script using filtered random walks to visualize it. In the attached image a path is shown in each grid cell, and in the top-left corner of each is the percent you have to add to the straight-line distance to get the path distance. eg: +20% is a 1.2 times multiplier, ie: add 20% of the straight-line length to get the path length
Sinuosity is the length of a path divided by the shortest distance between its endpoints. It's how much you have to multiply the straight-line distance between two points by to get the real distance. Very useful for estimating distances
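In code, the definition is just (a quick sketch, not the mma script from the other post):

```js
// sinuosity of a polyline given as [[x, y], ...] points
function sinuosity(points) {
  const dist = (p, q) => Math.hypot(q[0] - p[0], q[1] - p[1])
  let pathLength = 0
  for (let i = 1; i < points.length; i++)
    pathLength += dist(points[i - 1], points[i])
  const straightLine = dist(points[0], points[points.length - 1])
  return pathLength / straightLine
}

// eg: a right-angle detour: path length 2, straight-line distance sqrt(2)
console.log(sinuosity([[0, 0], [1, 0], [1, 1]]))  // ~1.414, ie: add ~41% to the straight-line distance
```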
Recently while programming I've been trying to think of the design of systems in terms of *entanglemes* which are fundamental units (-eme) of entangled structures (entangl-). These correspond (mostly; they're a superset) to [cross-cutting concerns](https://en.wikipedia.org/wiki/Cross-cutting_concern) in aspect-oriented programming. But unlike that programming-only term, entanglemes are much more general. For instance, the pipes, wiring, and structural elements in a house are all entanglemes because the house is an entanglement of the pipes, wires, etc
If you do any preliminary systems design at all when programming, it seems particularly important to identify entanglemes because entanglemes strongly inform what abstractions to use and they cannot be encapsulated. So identifying entanglemes is like identifying the limits of what can be encapsulated in your system
Ideally, you can decompose your system into a minimal set of entanglemes and maximally encapsulate everything else, but this seems to be an exceptionally hard, nonlinear problem
Consider some universe at a moment in time which appears classical (all waves are unitary / point-like) and which contains a brain-like information-containing structure (ICS) that is encoding the present state of the static universe as a classical universe, along with sensors attached to the ICS which update the encoded information in the obvious way. Now time-evolve the system for a moment. The universe now will be in a superposition (ie: non-unitary) of possible classical and non-classical evolutions. The ICS will also be in a superposition of such states
There should be a possible unitary state now consistent with this time-evolved universe where the collapsed, time-evolved ICS still encodes a classical universe. It only possibly encodes some variant of what it initially encoded, updated with new information from its sensors. Obviously it's possible for the ICS to record information about the non-classical time-evolution of the system through statistical analysis of observations, because we can actually do that -- that's how we know about quantum-mechanical effects. But did the ICS only observe what it believed were classical events?
If we define a classical universe as what an ICS in one term of a superposition of classical universes encodes the universe as, then obviously the ICS will encode the universe as classical by definition
Is a scenario possible where you have some arbitrary non-unitary or unitary universe with some ICS + sensors within it, with some arbitrary time evolution operator T that isn't necessarily a classical or non-classical time evolution operator, such that the information the ICS encodes is consistent with T? What I'm getting at: is a classical / semi-classical time evolution operator arbitrary, and only relative to an arbitrary ICS which happens to encode information using that operator?
I suppose in this scenario such a time evolution operator might treat certain superpositions which are unitary only after some transform as unitary. I'd imagine there are certain transforms corresponding to certain time evolution operators with this property
This is just a thought, or rather a more-or-less succinct way of putting a line of thought I've been having for a while. I don't know if this actually works or not, or even if I believe it at all, but it's interesting, nonetheless:
When the collapse of a wavefunction occurs, then as far as the information we have about a system goes, that system has changed faster than the speed of light allows. But if we assume the system hasn't changed faster than the speed of light allows, then the obvious explanation is that our information about the system isn't complete and we simply think the collapse happens faster than light allows. Probably because the information in our minds evolves from one incomplete set of information to another. The obvious conclusion is that, because we're stuck in a classical interpretation of the universe, the collapse is one classicalization of a process in the true nonclassical universe. And that the other classicalizations (which presumably do exist) represent other possible collapse outcomes
So what is keeping us entirely classical? Or rather, what is preventing us from encoding non-classical information? I imagine it as a circular process: our minds are classical because we encode classical information in them, and we encode classical information in them because they are classical. But under this line of reasoning, there are likely infinite other encoding schemes which define different types of universes. These would all just be different interpretations of the true universe, though
Epistemic status: probably wrong, and way too vague (what is information?), but very interesting: Idea about ANN grokking, generalization, and why scaling works: a particular generalization forms only when its prerequisite information is learned by the ANN. Since larger nets can encode more information at a given time, they have at any given time more generalizations' prerequisite information, so those generalizations form. In smaller nets, information is constantly being replaced in each new training batch, so it's less likely any given generalization's prerequisite information is actually learned by the net
This would also imply:
* Training on ts (training set) examples in a particular order will develop the generalization for that order (if there is one)
* Generalizations are also learned information, and they are more resistant to being replaced than regular information, probably because they are more likely to be used in any given training example
As training time increases, it becomes more and more likely a net encounters batches in the right order to learn the information for a generalization to develop
This would indicate superior training methods might:
* Somehow prevent the most common information (which might merge into the most common generalizations) from being replaced (vector rejections of ts subset gradients maybe? see the sketch after this list)
* Train on more fundamental ts examples first (pretraining is probably also in this category)
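By vector rejection I mean something like the following (just the linear algebra; how you'd wire it into an actual training loop is hand-waving): take a batch gradient g and remove its component along some protected direction h (eg: an average gradient over the most common examples), so updates along g don't overwrite whatever h encodes:

```js
// vector rejection of g from h: g minus its projection onto h
const dot = (a, b) => a.reduce((sum, x, i) => sum + x * b[i], 0)
const reject = (g, h) => {
  const scale = dot(g, h) / dot(h, h)
  return g.map((x, i) => x - scale * h[i])
}

console.log(reject([1, 1], [1, 0]))  // [0, 1]: the part of [1, 1] orthogonal to [1, 0]
```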
I've been meaning to try training a simple net N times on various small ts subsets, taking the one with the best validation set loss, mutating it via random example replacement, and recursing. Might be a step in the direction of discovering ts subsets which form the best generalizations
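Roughly what I have in mind (a sketch only; trainAndScore here is a hypothetical function that trains a fresh net on a subset and returns its validation loss):

```js
function searchSubsets(trainingSet, trainAndScore, { subsetSize = 256, candidates = 8, rounds = 100 } = {}) {
  const randomSubset = () =>
    [...trainingSet].sort(() => Math.random() - 0.5).slice(0, subsetSize)

  // train on several random subsets, keep the one with the best validation loss
  let best = { subset: randomSubset(), loss: Infinity }
  for (let i = 0; i < candidates; i++) {
    const subset = randomSubset()
    const loss = trainAndScore(subset)
    if (loss < best.loss) best = { subset, loss }
  }

  // then repeatedly mutate the best subset by swapping in a random replacement example
  for (let r = 0; r < rounds; r++) {
    const mutated = [...best.subset]
    mutated[Math.floor(Math.random() * mutated.length)] =
      trainingSet[Math.floor(Math.random() * trainingSet.length)]
    const loss = trainAndScore(mutated)
    if (loss < best.loss) best = { subset: mutated, loss }
  }
  return best  // the subset (and its loss) that generalized best so far
}
```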
https://jmacc93.github.io/essays/skyrim_game_player.html
Here's a (long!) essay I wrote about what it would take to build a SOTA game playing agent that can finish a modern videogame zero-shot. It's an AGI; it would take an AGI to beat Skyrim. There's just no way around it as far as I can see. Note: I'm a layperson in machine learning and AI. I'm not involved in those fields professionally and am just a hobbyist
I wrote this interactive dynamic-dispatch type conjunction prototype a few days ago
https://jmacc93.github.io/TypeConjunctionDispatchPrototype/
It's pretty neat I think
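This isn't the prototype's actual API, but the core idea is dispatching on a *conjunction* of type predicates: a method is only chosen if the argument satisfies *all* of the types it was registered under. A minimal sketch:

```js
const methods = []
const defmethod = (types, fn) => methods.push({ types, fn })
const dispatch = (x, ...args) => {
  // first registration whose whole conjunction of type predicates passes wins,
  // so register more specific conjunctions first
  const match = methods.find(m => m.types.every(t => t(x)))
  if (!match) throw new Error('no method matches')
  return match.fn(x, ...args)
}

const isNumber  = x => typeof x === 'number'
const isInteger = x => Number.isInteger(x)
defmethod([isNumber, isInteger], x => `integer ${x}`)
defmethod([isNumber],            x => `number ${x}`)
console.log(dispatch(3))    // "integer 3"
console.log(dispatch(3.5))  // "number 3.5"
```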
I realized the other day that I prefer the more general ontological equivalent of duck typing, which I now call duck ontology (I have no idea if there's another name for this; there probably is). If two things look the same, and you can't do anything to distinguish them, then they're the same thing, is how it goes in duck ontology. If it looks like a duck, and it quacks like a duck, then it's a duck. Ducks are things that look, act, sound, etc like ducks
The ship of Theseus is the same ship as before because the before and after ships have the same name, they look the same, they are used the same, etc
If you remove small bits of sand from a mound of sand, then at some point you can't do the same things with the mound (eg: load a bunch into a shovel at once), so it isn't the same thing as before
This is a very pragmatic model of what things are. It's probably an example of the category of models that try to use immediate tautologies or their equivalent. Like: the sun is bright because the sun is bright, is a trivial and unhelpful tautology. The analogical equivalent of that in duck ontology is like: Theseus' ship looks the same to me, simple as
This is probably all stupid and has been fleshed out much better by other people. Works for me
I've noticed this thing for a long time: when I listen to music, have music stuck in my head, or even just listen to a metronome, the tempo affects how I think. I know it's primarily the tempo because a metronome alone seems to produce most of the same effect. The effect it has seems to correspond to what makes sense conceptually: fast tempo encourages quick judgements, not thinking things through, etc; slow tempo encourages contemplative thought, etc
I feel like (but don't necessarily believe that) the preferred [neurological oscillation](https://en.wikipedia.org/wiki/Neural_oscillation) frequency of various parts X of your brain must vary, and if the rhythmic activity in other areas (eg: your auditory and motor cortex) is harmonic with X's activity then that is excitatory for X. No clue if this is true, but it feels true
If I'm listening to a metronome while I'm working, it seems like my work when I'm working fast corresponds to a tempo of ~170 BPM, and functionally slow work corresponds to a tempo of ~80. If I listen to a BPM not corresponding to the right working tempo, then it seems to trip me up
Could be just expectation effects, of course
It is currently -2f outside where I live right now (Missouri). Room temperature is, let's say, 72f. A typical very high temperature on earth is 120f. That is 48f over room temperature. Room temperature minus 48f is 24f. The difference between the current temperature outside where I live and room temperature is 74f. Room temperature plus 74f is 146f. The max recorded temperature on earth is 134f
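Same arithmetic, spelled out:

```js
const outside = -2, room = 72, typicalHigh = 120, recordHigh = 134
console.log(typicalHigh - room)                    // 48: a typical very hot day is 48f above room temp
console.log(room - (typicalHigh - room))           // 24: the mirror-image cold day
console.log(room - outside)                        // 74: how far below room temp it is outside right now
console.log(room + (room - outside))               // 146: the mirror-image hot day
console.log(room + (room - outside) > recordHigh)  // true: hotter than the 134f record
```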
Epistemic status: provides absolutely zero definitive knowledge
PC: physical / neurological consciousness. Analogous to the state of a computer running an AI
MPC: metaphysical consciousness. This is consciousness in the *watching from elsewhere* sense. A transcendent consciousness. I think it's likely that a consciousness in idealism is probably always an MPC
Assume that MPCs can only attach to one PC at a time; that MPCs are always attached to some PC; and that PCs can be divided to get two PCs (the justification for this is that PCs are abstractly equivalent to the state of some physical process as it changes in time, so we can divide that process into two physical processes with two states). Note: all of these assumptions are extremely tenuous
If we divide a PC with an attached MPC into two PCs A and B, our MPC must still be attached to either A or B (not both). Since A and B are both PCs, we can do this procedure again as many times as we want. We can then narrow down what physical point the MPC is attached to on the original, unsplit PC, either in this particular case (potentially just as the result of where we divided the initial PC, and the PC's state as we divided it) or universally (where the MPC attaches every time)
If the MPC attaches at a singular point every time, there is a heavy implication of some unknown physics governing the MPC-PC connection, and that potentially the MPC is essentially an unknown physical process (or attaches to some other, unknown physical process), or a physical-like process that can be modeled in the same way we model regular physics
If the MPC attaches randomly, then that indicates some small scale process in the vein of thermodynamics that determines the MPC-PC connection
Both of these cases could also be contrived by some intelligent force behind the scenes, which may itself imply that MPCs are epiphenomenal in some exotic model of reality. This also may imply pantheism (or really any other form of deism / theism), and potentially supernaturalism in general
If you split a PC and you get two PCs with an MPC attached to each, then that seems to imply panpsychism (in the sense that consciousnesses are attached to everything). If you split a PC and you have two PCs with no MPCs attached, then what does that imply?
Note: MPCs are extremely paradoxical, partly because most humans will say they are principally an MPC attached to a PC, but all PCs without attached MPCs can say the same thing. So it's safe to say there is no known test to determine if a PC has an attached MPC. And, more extremely, it's unknown whether MPCs exist at all (PCs do definitively exist)
In my opinion, if people truly believe MPCs exist (I do), people should be trying to develop tests that identify MPCs. Though, MPCs may be beyond the scientific process to explain. If there *is no test* to identify MPCs, then it's probably impossible beyond speculation to reason about them. It may be possible somehow (idk how) to make predictions using them, though
Epistemic status: needs to be tested to be confirmed, but it seems right
After thinking about it for a while: humans actually go and search out information they don't have memorized, whereas most modern LLMs have (almost) all of the information they use memorized, rather than using external resources. If trained so that all semantic information is presented on their inputs, along with their prompt input, I imagine LLMs will not memorize the semantic information (not store that information in their parameters), but *will* store metainformation about that semantic information, and will store information about what they actually have to model (how words go together, the syntax of the language, etc)
So this might be a viable way to train a model so its parameters hold information primarily about some particular aspect of its training set, rather than the training set verbatim. In the LLM case: you train the model so it models how facts interact, rather than both facts and how they interact
To train a model like this, luckily you can use big LLMs already in existence, because they act like big databases. You can also use internet searches
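As a sketch of what a single training example might look like under this scheme (retrieveFacts is hypothetical; per the above it could be backed by a big existing LLM or an internet search):

```js
// build a training example where the semantic information is supplied on the input,
// so the model only has to learn how to *use* facts, not store them
function buildTrainingExample(prompt, completion, retrieveFacts) {
  const facts = retrieveFacts(prompt)                // hypothetical external lookup
  const input = facts.join('\n') + '\n\n' + prompt   // facts presented alongside the prompt
  return { input, target: completion }
}
```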
I think you could probably have a system of models that each have been trained to store different sorts of information. For instance, you could have a database model that stores facts about the world (eg: the capital of the USA is Washington DC) but with no world modeling, along with a world modeling model that stores how things interact and procedural information (eg: if I splash water on myself I'll get cold), and integrate them into a unified model
This is also related to biased models. If you train an LLM on one particular kind of prompt, you bias the information it has encoded in its parameters toward that prompt. For instance, an LLM with N parameters that is trained on a big training set B (eg: a set of questions and answers about geography) will be able to achieve a lower loss on B than an LLM with N parameters that is trained on a set A (eg: a set of all sorts of questions and answers) which is a superset of B. The LLM trained on just B is biased toward the question-answer pairs in B. Now, there's a risk of the B model overfitting to B if B is small enough. But I'm assuming B is a huge set
A model biased toward, for example, solving equations would synergize with a model that is biased toward memorizing notable equations
It appears that ChatGPT has memorized the latitude and longitude coordinates of almost every significant city and town on earth (or at least all of the ones I tested). Try a prompt like: "I'm at 40.97 degrees north and -117.73 degrees west. What town do I live in?". ChatGPT gave me: "The coordinates you provided, 40.97 degrees north and -117.73 degrees west, correspond to a location in Nevada, USA, near the town of Winnemucca. ...". Which is correct...
This is the kind of shit I've been talking about. Like, a human is considered *more* intelligent than ChatGPT, and a human being absolutely *cannot* memorize the latitudes and longitudes of literally every fuckin town on earth. Yet, estimators of machine intelligence metrics complain that we'll never have AI as intelligent as humans because of the huge amount of memory and processing power required to match the human brain. Well, clearly those 10 trillion (or whatever) parameters that go into building GPT3.5 aren't being used in the same way a human brain uses its parameters. Clearly, a *much* larger emphasis is on memorizing particular details than on world modeling
So how do we make LLMs do more world modeling? I imagine that the hallucination problem would be solved with the same technique as inducing more world modeling. Preventing the LLM from learning particular details necessarily requires stripping some information from the outputs (and probably inputs too) before training. I'd imagine using an AE (autoencoder) or similar dimensionality-reducing function
I recently noticed that the form:
```js
let procI = N
let args = [...]
while(true) {
switch(procI) {
case 0:
...
case 1:
...
...
}
}
```
And the form:
```js
function proc0(...) {...}
function proc1(...) {...}
...
procN(...)
```
Are functionally equivalent from a stackless perspective, where calls like `procM(...)` in the 2nd form are equivalent to `args = [...]; procI = M; continue` in the 1st form. The while+switch / 1st form simulates pushing to an argument stack and jumping to different instruction offsets
This is really useful because it allows you to "call" certain functions as continuations without increasing the stack depth
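A concrete (if silly) runnable instance of the 1st form, so the equivalence isn't just schematic: mutual recursion between an even-check and an odd-check, where the "calls" never grow the stack no matter how deep they go:

```js
function isEven(n) {
  let procI = 0            // 0 = even-check, 1 = odd-check
  let args = [n]
  while (true) {
    switch (procI) {
      case 0: {            // proc0: isEven
        const [k] = args
        if (k === 0) return true
        args = [k - 1]; procI = 1; continue   // "call" isOdd(k - 1)
      }
      case 1: {            // proc1: isOdd
        const [k] = args
        if (k === 0) return false
        args = [k - 1]; procI = 0; continue   // "call" isEven(k - 1)
      }
    }
  }
}

console.log(isEven(1000000))  // true, with no risk of blowing the call stack
```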
I still contend that beating Skyrim zero-shot should be a major goal for general game playing AI development. It would be an incredible accomplishment, I think. Though, I think it would be easier than it seems at first (eg: I don't think it requires any text / speech comprehension). After that, of course, there are much more difficult games to aim for. Like fully completing The Witness / Taiji, Noita, La Mulana, etc. If a general game playing AI could fully beat *any* of those games zero-shot, I would be completely amazed. Pretty sure we'll have superintelligent AGI before a general game player could do that