Aug 15, 2024 (updated Sep 3, 2024)

Program search & synthesis

Prompting LLMs is program search

Francois Chollet wrote a great tweet in October 2023 on how to work with LLMs. His take is that we should think of prompting an LLM as program search.

An LLM prompt:

1. Serves as a kind of database key to fetch a program from the LLM.
2. Provides the parameters to run the fetched program.
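
To make the two roles concrete, here is a minimal sketch that keeps the "key" and the "parameters" apart using Python's standard `string.Template`. The names `PROGRAM_KEY` and `build_prompt` are hypothetical, chosen only for illustration; the example prompt is borrowed from Chollet's tweet quoted in the references.

```python
from string import Template

# Hypothetical prompt "key": the fixed instruction that selects a behaviour.
PROGRAM_KEY = Template("Write this paragraph in the style of $style: $paragraph")


def build_prompt(key: Template, **arguments: str) -> str:
    """Fill the key's slots with the arguments for this particular run."""
    return key.substitute(**arguments)


prompt = build_prompt(
    PROGRAM_KEY,
    style="Shakespeare",
    paragraph="The meeting is at noon.",
)
print(prompt)
```

Swapping `PROGRAM_KEY` for a different template while holding the arguments fixed is one step of the search over keys.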

Importantly, because LLMs can generate code, we can separate (1) and (2). We can:

1. Prompt an LLM to generate a program.
2. Evaluate the generated program (by running it, running unit tests against it, or applying static analysis).

And we can repeat this process – iteratively editing and evaluating programs.
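
The generate-and-evaluate loop can be sketched roughly as follows. This is an illustration under stated assumptions, not a reference implementation: `evaluate`, `search`, and `fake_llm` are hypothetical names, `fake_llm` is a stand-in for a real model call, and evaluation here means checking a candidate against input-output examples.

```python
from typing import Callable, Iterable, Optional

# An input-output example: (positional arguments, expected result).
Example = tuple[tuple, object]


def evaluate(source: str, examples: Iterable[Example], entry_point: str = "solve") -> bool:
    """Execute candidate source and check it against every example."""
    namespace: dict = {}
    try:
        # NOTE: exec is only acceptable for trusted code; real LLM output
        # should run in a sandbox.
        exec(source, namespace)
        fn = namespace[entry_point]
        return all(fn(*args) == expected for args, expected in examples)
    except Exception:
        return False


def search(
    generate: Callable[[str, Optional[str]], str],
    task: str,
    examples: list[Example],
    max_iters: int = 5,
) -> Optional[str]:
    """Ask for candidates until one passes all examples or the budget runs out."""
    previous: Optional[str] = None
    for _ in range(max_iters):
        candidate = generate(task, previous)
        if evaluate(candidate, examples):
            return candidate
        previous = candidate  # feed the failed attempt back for editing
    return None


def fake_llm(task: str, previous: Optional[str]) -> str:
    """Stand-in for an LLM call: a buggy first draft, then a corrected one."""
    if previous is None:
        return "def solve(a, b):\n    return a - b\n"
    return "def solve(a, b):\n    return a + b\n"


examples = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]
program = search(fake_llm, "add two numbers", examples)
print(program)
```

In this toy run the first draft fails the examples and the second passes, so `search` returns the corrected source. A real system would replace `fake_llm` with a model call and would likely pass the failing examples back to the model as well.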

Autonomous program synthesis is the next frontier

Recently, Chollet ignited a campaign for more creative thinking in AI research with the ARC Prize. His framing of the challenge is closely related to the interpretation of LLMs described above.

Essentially, the thrust of ARC-AGI is that LLMs alone are not enough – but we can use them to build a self-regulating system of programs that autonomously evolves towards a goal.

References

Francois Chollet, Prompt Engineering

(Oct 2023) https://x.com/fchollet/status/1709242747293511939

A LLM is a repository of many (millions) of vector programs mined from human-generated data, learned implicitly as a by-product of language compression. A "vector program" is just a very non-linear function that maps part of the latent space unto itself.

When you're prompting, you're fetching one of these programs and running it on an input -- part of your prompt serves as a kind of "program key" (as in database key) and part serves as program argument(s). Like, in "write this paragraph in the style of Shakespeare: {my paragraph}", the part "write this paragraph in the style of X: Y" is a program key, with arguments X=Shakespeare and Y={my paragraph}.

The program fetched by your key may or may not work well for the task at hand. There's no reason why it should be optimal. There are lots of related programs to choose from.

Prompt engineering represents a search over many keys in order to find a program that is empirically more accurate for what you're trying to do. It's no different than trying different keywords when searching for a Python library.

Everything else is unnecessary anthropomorphism on the part of the prompter. You're not talking to a human who understands language the way you do. Stop pretending you are.

Rich Sutton, The Bitter Lesson

(Mar 2019) http://www.incompleteideas.net/IncIdeas/BitterLesson.html

General methods that leverage computation are ultimately the most effective, and by a large margin.

ARC Prize, Impact

(June 2024) https://arcprize.org/arc

At minimum, solving ARC-AGI would result in a new programming paradigm. It would allow anyone, even those without programming knowledge, to create programs simply by providing a few input-output examples of what they want.

This would dramatically expand who is able to leverage software and automation. Programs could automatically refine themselves when exposed to new data, similar to how humans learn.

If found, a solution to ARC-AGI would be more impactful than the discovery of the Transformer. The solution would open up a new branch of technology.