Idea: use ChatGPT or another fluent LLM to generate training data, then train a smaller LLM / classifier on it that takes input text and produces YES/NO/MAYBE/FAIL answers

You can potentially generate training data given a set of input questions: append the subprompt `(Only give Yes, No, Maybe, or Fail answers. An answer that isn't Yes, No, or Maybe should be Fail)` to each question and feed the result into ChatGPT. Its responses (Yes, No, or Maybe when they match one of those; anything else is implicitly Fail) become the one-hot ("unit-vector") target outputs used to train the new classifier
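
A minimal Python sketch of that labelling step, under some assumptions: `ask_llm` is a hypothetical callable standing in for whatever ChatGPT client you use, and the label order in `LABELS` is an arbitrary choice. It only shows how a raw response could be collapsed to Yes/No/Maybe/Fail and turned into a one-hot target.

```python
from typing import Callable, List, Tuple

SUBPROMPT = ("(Only give Yes, No, Maybe, or Fail answers. "
             "An answer that isn't Yes, No, or Maybe should be Fail)")

# Assumed label order; any fixed order works as long as it stays consistent.
LABELS = ["Yes", "No", "Maybe", "Fail"]


def build_prompt(question: str) -> str:
    """Append the subprompt to a question."""
    return f"{question} {SUBPROMPT}"


def normalize_label(response: str) -> str:
    """Map a raw response to Yes/No/Maybe; anything else becomes Fail."""
    cleaned = response.strip().strip(".!").strip().capitalize()
    return cleaned if cleaned in ("Yes", "No", "Maybe") else "Fail"


def one_hot(label: str) -> List[int]:
    """Encode a label as a unit vector over LABELS."""
    return [1 if label == name else 0 for name in LABELS]


def make_examples(questions: List[str],
                  ask_llm: Callable[[str], str]) -> List[Tuple[str, List[int]]]:
    """Build (question, one-hot target) pairs. ask_llm is hypothetical."""
    return [(q, one_hot(normalize_label(ask_llm(build_prompt(q)))))
            for q in questions]
```

In practice you would pass a real client function as `ask_llm`; a stub that returns canned strings is enough to check the plumbing.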

You could also potentially produce training data by taking random snippets S of text from some large dataset of arbitrary text, and asking ChatGPT: `Given the text "S", please list N questions related to the above text that can be answered with Yes, No, or Maybe, and at the end of each question write their answer (one of: Yes, No, or Maybe)`, where `N` is some small integer (maybe `5 <= N <= 100`)
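
A rough Python sketch of the snippet approach, assuming the model puts one question per line and ends each line with its answer (real output may be formatted differently, so the regex here is a guess that just skips lines it cannot parse). The function names are made up for illustration.

```python
import re
from typing import List, Tuple


def build_snippet_prompt(snippet: str, n: int) -> str:
    """Fill S and N into the prompt from the post."""
    return (f'Given the text "{snippet}", please list {n} questions related to '
            'the above text that can be answered with Yes, No, or Maybe, and at '
            'the end of each question write their answer '
            '(one of: Yes, No, or Maybe)')


# Optional list marker, a question ending in "?", then the answer.
_LINE = re.compile(
    r"^\s*(?:\d+[.)]\s*|[-*]\s*)?(?P<q>.+?\?)\W*(?P<a>yes|no|maybe)\W*$",
    re.IGNORECASE,
)


def parse_qa_pairs(response: str) -> List[Tuple[str, str]]:
    """Extract (question, answer) pairs; unparseable lines are dropped."""
    pairs = []
    for line in response.splitlines():
        match = _LINE.match(line)
        if match:
            pairs.append((match.group("q"), match.group("a").capitalize()))
    return pairs
```

Each parsed pair, together with the snippet S, could then serve as one training example: the input is the snippet plus the question, and the target is the answer.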

This classifier could potentially be used to keep a human-programmable state in sync with a system whose evolution is not human-programmable but is human-describable: you evolve the system and describe the result in text, then ask a finite set of questions about that description to synchronize the programmable state with the new system state

For example, anyone who played the old AI Dungeon back when it used GPT-2 (and probably still now), or who has played a text adventure using ChatGPT (which is really fun: try it out!), knows that the finite input length of those systems means they frequently lose track of information, and a lot of small details get lost in general. A human-programmable text adventure, on the other hand, has limited generality, but has a definitive state. With the above classifier you could potentially make a program with a definitive, human-programmable state, evolve the state using an LLM, then update the human-programmable state from the new state's text description using the classifier
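
A toy Python sketch of that update loop, with several assumptions: the programmable state is just a dict of boolean flags, each flag has one hand-written question, and `classify` is a hypothetical callable wrapping the trained classifier. Leaving a flag unchanged on Maybe or Fail is also only one possible policy.

```python
from typing import Callable, Dict

# Hypothetical classifier interface: (description, question) -> label.
Classifier = Callable[[str, str], str]


def sync_state(state: Dict[str, bool],
               questions: Dict[str, str],
               description: str,
               classify: Classifier) -> Dict[str, bool]:
    """Return a copy of state updated from the LLM's text description."""
    new_state = dict(state)
    for key, question in questions.items():
        answer = classify(description, question)
        if answer == "Yes":
            new_state[key] = True
        elif answer == "No":
            new_state[key] = False
        # Maybe / Fail: keep the previous value (assumed policy).
    return new_state


def toy_classify(description: str, question: str) -> str:
    """Keyword stub standing in for the real classifier."""
    if "lantern" in question:
        return "Yes" if "lantern" in description else "No"
    return "Maybe"


state = {"has_lantern": False, "door_open": True}
questions = {
    "has_lantern": "Does the player have the lantern?",
    "door_open": "Is the door open?",
}
description = "You pick up the lantern and step into the cave."
updated = sync_state(state, questions, description, toy_classify)
```

Here `updated` has `has_lantern` set to `True`, while `door_open` keeps its old value because the stub answers Maybe for it.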

This same technique might be useful for LLMs themselves to generate notes to augment their memories
