While messing around in Mathematica (MMA), I was trying to draw shapes in `Spectrogram`s of `Audio@Table[__]`s, and I discovered that you can make a curve appear in the spectrogram with `Sin[F[t]]`, where `F[t] := Integrate[f[x],{x,0,t}]` and `f` is the function (from time to frequency) of the curve you want to appear. If you want to color in a whole region of the spectrogram, you add curves together a la `Audio@Sum[Table[__],_]`. To get something like white / brown / etc. noise, you add up many copies of your function with one parameter drawn from a random distribution. E.g. `fa t + f (-fa + fb) t` linearly interpolates between `fa` and `fb`, with `f` as the interpolating parameter; if you substitute a uniform random sample on `[0, 1]` for `f`, you fill in the band between `fa` and `fb` in the spectrogram
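The same trick is easy to sketch outside Mathematica. Here is a minimal Python translation (my own, using `numpy` and `scipy`), with a linear sweep as the curve to draw; note the `2 * pi` factor, which makes `f` a frequency in Hz rather than rad/s:

```python
import numpy as np
from scipy.signal import spectrogram

sr = 8000                              # sample rate (Hz)
T = 2.0                                # duration (s)
t = np.arange(int(sr * T)) / sr

# Curve to draw in the spectrogram: a linear sweep from fa to fb Hz.
fa, fb = 500.0, 1500.0
f = fa + (fb - fa) * t / T             # instantaneous frequency f(t)
F = np.cumsum(f) / sr                  # F(t) ~ Integrate[f[x], {x, 0, t}]
x = np.sin(2 * np.pi * F)              # 2*pi converts Hz to radians

freqs, times, S = spectrogram(x, fs=sr, nperseg=256)
peak = freqs[S.argmax(axis=0)]         # ridge of the spectrogram
```

`peak` traces the spectrogram's ridge over time; for this sweep it climbs from roughly `fa` to `fb`.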
Idea: in the usual web ChatGPT interface, a subdialog where you can discuss with ChatGPT a particular response it produced: what it did well, what it did poorly, and how it could improve, to aid in its development
I have to imagine that the future of LLMs for chat / dialog like ChatGPT involves training using its own predictions about what it did well and what it could improve, then retraining to correct what it didn't do well
Idea: train a smaller LLM / classifier that takes input text and produces YES/NO/MAYBE/FAIL answers, using ChatGPT or another fluent LLM to generate the training data
You can potentially generate training data from a set of input questions: append the subprompt `(Only give Yes, No, Maybe, or Fail answers. An answer that isn't Yes, No, or Maybe should be Fail)` to each question and feed them into ChatGPT. Its responses (when they match Yes, No, or Maybe; anything else is implicitly Fail) become the one-hot target vectors used to train the new classifier
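As a sketch of the labeling side of this pipeline (the call to ChatGPT itself is omitted, and `response_to_target` / `make_example` are hypothetical names, not any real API):

```python
import numpy as np

LABELS = ["Yes", "No", "Maybe", "Fail"]

SUBPROMPT = ("(Only give Yes, No, Maybe, or Fail answers. "
             "An answer that isn't Yes, No, or Maybe should be Fail)")

def response_to_target(response: str) -> np.ndarray:
    """Map a raw model response to a one-hot training target.
    Anything that isn't exactly Yes/No/Maybe is implicitly Fail."""
    answer = response.strip().rstrip(".").capitalize()
    if answer not in ("Yes", "No", "Maybe"):
        answer = "Fail"
    return np.eye(len(LABELS))[LABELS.index(answer)]

def make_example(question: str, response: str):
    """One (input, target) pair; the subprompt is appended to the
    question, just as it was when the response was collected."""
    return question + " " + SUBPROMPT, response_to_target(response)
```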
You could also potentially produce training data by taking random snippets S of text from some large dataset of arbitrary text and asking ChatGPT: `Given the text "S", please list N questions related to the above text that can be answered with Yes, No, or Maybe, and at the end of each question write its answer (one of: Yes, No, or Maybe)`, where `N` is some small integer (maybe `5 <= N <= 100`)
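A rough parser for the generated list might look like this (the numbered `question? Answer` layout is an assumption about how the model formats its output, so treat it as a sketch):

```python
import re

def parse_generated_qa(text: str):
    """Parse a numbered question list into (question, answer) pairs.
    Assumes each line looks like '3. <question>? <Yes|No|Maybe>';
    the exact layout the model produces will vary."""
    pairs = []
    for line in text.splitlines():
        m = re.match(r"\s*\d+[.)]\s*(.+\?)\s*\(?(Yes|No|Maybe)\)?\s*$",
                     line, re.IGNORECASE)
        if m:
            pairs.append((m.group(1).strip(), m.group(2).capitalize()))
    return pairs
```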
This classifier could potentially be used to update a system that tracks some human-programmable state when the evolved state is not human-programmable but is human-describable: you evolve the system, describe the result in text, then ask a finite set of questions about that description to synchronize the programmable state with the new system state
For example, anyone who played the old AI Dungeon back when it used GPT-2 (and probably still now), or who has run a text adventure through ChatGPT (which is really fun: try it out!), knows that the finite input length of those systems means they frequently lose track of information, and many small details are lost in general. A human-programmable text adventure, on the other hand, has limited generality but a definitive state. With the above classifier you could potentially get both: a program with a definitive, human-programmable state, where the state is evolved using an LLM and the human-programmable state is then updated from the new state's text description using the classifier
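A toy sketch of that synchronization loop, with a keyword stub standing in for the trained classifier (`classify`, `sync`, and the state fields are all hypothetical names, not anything real):

```python
def classify(question: str, description: str) -> str:
    """Stand-in for the Yes/No/Maybe classifier trained above; a real
    implementation would run the model on (question, description)."""
    key = question.lower()
    if "torch" in key:
        return "Yes" if "torch" in description.lower() else "No"
    if "injured" in key:
        return "Yes" if "wounded" in description.lower() else "No"
    return "Maybe"

# The definitive, human-programmable state...
state = {"has_torch": False, "injured": False}

# ...and a fixed question per state field.
questions = {
    "has_torch": "Is the player carrying a torch?",
    "injured": "Is the player injured?",
}

def sync(state, description):
    """Ask one question per field and fold the answers back into the
    state; a 'Maybe' leaves the field unchanged."""
    for field, question in questions.items():
        answer = classify(question, description)
        if answer in ("Yes", "No"):
            state[field] = (answer == "Yes")
    return state

narration = "You pick up a torch and press on, wounded but determined."
sync(state, narration)
```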
This same technique might be useful for LLMs themselves to generate notes to augment their memories
Another Stable Diffusion ControlNet idea:
A module similar to the reference preprocessor but with a text prompt. The prompt controls what the model's attention goes to in the reference image. Presumably this would allow you to reference just one feature of the reference image, and essentially ignore everything else
Averaging images generated from different seeds at the same denoising strength in img2img shows the spatial scale that the denoising strength affects
As seen in this video / image: https://imgur.com/a/pz2BCgS
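A toy numpy model of why the averaging works, treating the img2img output as a `strength`-weighted mix of the input and seed-dependent detail (a big simplification of what the diffusion model actually does, but the averaging argument is the same):

```python
import numpy as np

rng_master = np.random.default_rng(0)
image = rng_master.normal(size=(64, 64))   # stand-in for the input image

def fake_img2img(image, strength, seed):
    """Toy model of img2img: keep the input, replace a `strength`
    fraction of it with seed-dependent detail."""
    rng = np.random.default_rng(seed)
    detail = rng.normal(size=image.shape)
    return (1 - strength) * image + strength * detail

def seed_average(image, strength, n_seeds=200):
    outs = [fake_img2img(image, strength, seed) for seed in range(n_seeds)]
    return np.mean(outs, axis=0)

# The seed-dependent detail averages away, leaving roughly
# (1 - strength) * image: the part of the input that this
# denoising strength leaves intact.
avg_low = seed_average(image, strength=0.2)
avg_high = seed_average(image, strength=0.8)
```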
Idea for Stable Diffusion: train a model to correct an image which has been randomly deformed. It may be cheap enough to use Perlin noise or similar to generate the random deformations, but otherwise something like GIMP's pick noise, which just randomly exchanges pixels with nearby pixels n times, may be faster
Theoretically, you could then use a regular image as the initial "noisy" image, and the model would deform it to match what it thinks is the un-deformed equivalent. This might allow for, e.g., correcting anatomical problems for characters, composition problems, etc.
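A quick sketch of the pick-noise deformation itself (a toy re-implementation for grayscale images, not GIMP's actual algorithm), which could serve as the cheap "noising" step for such a training set; the assumption that local pixel exchanges are a good enough proxy for smooth warps is untested:

```python
import numpy as np

def pick_noise(image, n_swaps, radius=2, seed=0):
    """Toy version of GIMP's 'pick' noise on a 2D (grayscale) image:
    repeatedly swap a random pixel with a randomly chosen nearby pixel.
    The pixel multiset is preserved; only positions change."""
    rng = np.random.default_rng(seed)
    out = image.copy()
    h, w = out.shape
    ys = rng.integers(0, h, n_swaps)
    xs = rng.integers(0, w, n_swaps)
    dys = rng.integers(-radius, radius + 1, n_swaps)
    dxs = rng.integers(-radius, radius + 1, n_swaps)
    for y, x, dy, dx in zip(ys, xs, dys, dxs):
        y2 = np.clip(y + dy, 0, h - 1)
        x2 = np.clip(x + dx, 0, w - 1)
        out[y, x], out[y2, x2] = out[y2, x2], out[y, x]
    return out
```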
I've been using [obsidian](https://obsidian.md/)'s [canvas](https://obsidian.md/canvas?trk=public_post-text) feature as a sort of multi-dimensional [kanban board](https://en.wikipedia.org/wiki/Kanban_board) plus calendar and goal timeline, and it works amazingly well. It seems like it's really important to have the right representation for these sorts of things, and this one works really well for me. I highly recommend checking it out