
Designing the Substrate API

We decided to stop building Substrate because we stopped believing in the inference service. But across the board, people appreciated our take on API design. Here's how we designed the Substrate API around three principles:

  1. Minimal abstractions
  2. Simple inputs & outputs
  3. Lazy evaluation

Minimal abstractions

We were obsessed with reducing cognitive overhead. Every abstraction you introduce is a new concept for users to learn and remember. There’s only one high-level abstraction in the Substrate API: Nodes.

topic = "a magical forest"
story = ComputeText(prompt=f"Tell me a story about {topic}")

Names do a lot of conceptual work in an API. We chose the term Compute to convey the underlying process when you call a language model.
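To illustrate the idea (this is a toy sketch, not Substrate's actual implementation), a node can be modeled as a small object that simply records its inputs on construction; all class and attribute names here are hypothetical:

```python
class Node:
    """Hypothetical node base class: captures its parameters, does no work yet."""
    def __init__(self, **kwargs):
        self.args = kwargs

class ComputeText(Node):
    """Sketch of a text-generation node; only the prompt is captured."""
    pass

topic = "a magical forest"
story = ComputeText(prompt=f"Tell me a story about {topic}")
print(story.args["prompt"])  # the node holds its input; nothing has run yet
```

The point of the single abstraction is that every capability looks the same: a node with inputs, constructed but not executed.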

Simple inputs & outputs

A guiding API design principle I learned at Stripe is that advanced use cases shouldn't complicate the simple path. The OpenAI Chat Completions API has become the de facto standard, but feels suboptimal in this regard.

topic = "a magical forest"
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": f"Tell me a story about {topic}"}],
)
print(completion.choices[0].message)

We kept the simple use case (generating a single choice) simple, and supported generating multiple choices via a separate node. Because any Substrate node can be connected to any other node, we aimed to simplify inputs and outputs as much as possible.

Substrate nodes were inspired by Unix: an enduring standard library of simple, composable programs for common tasks. The Unix philosophy begins with two relevant principles:

  1. Make each program do one thing well. To do a new job, build afresh rather than complicate old programs by adding new ‘features’.
  2. Expect the output of every program to become the input to another, as yet unknown, program. Don’t clutter output with extraneous information.
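Those two principles can be demonstrated with ordinary Python functions (a loose analogy, not Substrate code): each does one thing, and each emits plain output the next can consume.

```python
def fetch_lines(text):
    """Do one thing: split raw text into lines."""
    return text.splitlines()

def grep(lines, needle):
    """Do one thing: keep only lines containing the needle."""
    return [line for line in lines if needle in line]

def count(items):
    """Do one thing: count items, returning a bare number."""
    return len(items)

log = "ok\nerror: disk\nok\nerror: net"
# The output of each step is plain data, ready to become another program's input.
n_errors = count(grep(fetch_lines(log), "error"))
print(n_errors)  # 2
```

Because no step clutters its output with extraneous structure, the composition reads like a Unix pipeline.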

Lazy evaluation

Our programming model let you describe compound AI workflows as a computation graph, and submit the entire graph to our inference engine. This made it possible to optimize workflows (e.g. merging multiple parallel ComputeText nodes into a single batch call), and automatically schedule them with optimal parallelism.

You didn't have to build a graph explicitly. By referencing the future value of one node in the input of another node, you could implicitly connect nodes. It was an interesting declarative approach that simulated imperative, eager evaluation.

topic = "a magical forest"
story = ComputeText(prompt=f"Tell me a story about {topic}")
summary = ComputeText(prompt=sb.format(
    "Summarize in one sentence: {story}",
    story=story.future.text,
))
res = substrate.run(summary)
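A toy model of that future-reference mechanism, with invented names throughout: nodes record their inputs, `future` returns a placeholder, and a runner resolves each placeholder's dependency before evaluating the node.

```python
class Future:
    """Placeholder for a node's eventual output."""
    def __init__(self, node):
        self.node = node

class LazyNode:
    """Hypothetical lazy node: stores a function and inputs, which may include Futures."""
    def __init__(self, fn, **inputs):
        self.fn = fn
        self.inputs = inputs

    @property
    def future(self):
        return Future(self)

def run(node, cache=None):
    """Resolve Futures depth-first, then evaluate each node exactly once."""
    cache = {} if cache is None else cache
    if id(node) in cache:
        return cache[id(node)]
    resolved = {k: run(v.node, cache) if isinstance(v, Future) else v
                for k, v in node.inputs.items()}
    cache[id(node)] = node.fn(**resolved)
    return cache[id(node)]

# Wiring two nodes by referencing a future value, as in the example above.
story = LazyNode(lambda prompt: prompt.upper(), prompt="a magical forest")
summary = LazyNode(lambda text: f"Summary of: {text}", text=story.future)
print(run(summary))  # Summary of: A MAGICAL FOREST
```

Because the whole graph is known before anything runs, a real engine in this position can batch sibling nodes and schedule independent branches in parallel.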

The example below gives a feel for what it was like to use Substrate in practice.

prompt = "Recipe for banana chiffon pie"
get_store = FindOrCreateVectorStore(collection_name="almanac", model="jina-v2")
fetch_sources = QueryVectorStore(
    collection_name=get_store.future.collection_name,
    model=get_store.future.model,
    query_strings=[prompt],
    include_metadata=True,
)
template = """
{{ prompt }}
Use the reference materials from the farmers almanac provided below and cite page numbers.
{% for item in results %}
{{ item.metadata }}
{% endfor %}
"""
answer_question = ComputeText(
    prompt=sb.jinja(
        template,
        prompt=prompt,
        results=fetch_sources.future.results[0],
    )
)

Unreleased ideas

Before we stopped, we were working on several initiatives to further simplify the experience of using Substrate.

1. Blending local and remote evaluation

Working with future references and sb operators was too challenging for beginners. We were working on enabling normal references and locally-defined functions – under the hood, we could automatically schedule your graph between local and remote evaluation.

def log_input(input):
    print(input)
    return input

b = MyNode(some_input=log_input(other_node.output))
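One way such a scheduler could work, sketched with invented names: walk the plan in order, execute plain Python callables in-process, and collect the remaining steps for a (here, simulated) remote call to the inference engine.

```python
class Remote:
    """Stand-in for a node the engine would evaluate server-side.
    The value simulates the result the remote call would return."""
    def __init__(self, name, value):
        self.name, self.value = name, value

def run_mixed(value, steps):
    """Toy scheduler: callables run locally; Remote steps are 'sent' away."""
    sent = []
    for step in steps:
        if isinstance(step, Remote):
            sent.append(step.name)   # scheduled for remote evaluation
            value = step.value       # simulated remote result
        else:
            value = step(value)      # ordinary local function call
    return value, sent

def log_input(x):
    print("local:", x)
    return x

result, remote = run_mixed("seed", steps=[
    Remote("other_node", "OUTPUT"),  # remote evaluation
    log_input,                       # local function, interleaved
    Remote("my_node", "FINAL"),      # remote again
])
print(result, remote)  # FINAL ['other_node', 'my_node']
```

The appeal is that users write straight-line code while the runtime decides where each step executes.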

2. Shareable graphs

A nice benefit of the declarative approach is the ability to reference programs as data. We were close to enabling this feature: the ability to publish a graph as a module. From there, we planned to build a library of useful pre-built workflows.

x = sb.var(type="string", default="hello")
y = sb.var(type="string")
z = sb.var(type="object", properties={})
publication = substrate.module.publish(
    name="my reusable graph",
    nodes=[a, b],
    inputs={"x": x, "y": y, "z": z},
)
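Because a declarative graph is just data, publishing one could amount to serializing its nodes, edges, and declared inputs. A sketch with invented structures, standing in for whatever wire format a real module registry would use:

```python
import json

def to_module(name, nodes, inputs):
    """Serialize a graph description to JSON: the 'programs as data' idea."""
    return json.dumps({
        "name": name,
        "inputs": inputs,
        "nodes": [{"id": i, "type": t, "args": a}
                  for i, (t, a) in enumerate(nodes)],
    }, sort_keys=True)

module = to_module(
    "my reusable graph",
    nodes=[("ComputeText", {"prompt": "{x}"}),
           ("ComputeText", {"prompt": "Summarize: {node:0}"})],
    inputs={"x": {"type": "string", "default": "hello"}},
)
restored = json.loads(module)
print(restored["nodes"][1]["args"]["prompt"])  # Summarize: {node:0}
```

Once a graph round-trips through plain JSON like this, sharing it, versioning it, and composing it into larger workflows all come for free.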

References

Doug McIlroy, Bell System Technical Journal (1978). https://en.wikipedia.org/wiki/Unix_philosophy

Christopher Alexander, The Timeless Way of Building (1979). https://param.dev/blog/notes/the-timeless-way-of-building

The search for a name is a fundamental part of the process of inventing or discovering a pattern. So long as a pattern has a weak name, it means that it is not a clear concept.

Sebastian Bensusan, APIs as ladders (Jan 2022). https://blog.sbensu.com/posts/apis-as-ladders

What developers need:

  1. In order to get started, beginners need an API to be convenient.
  2. In order to take the next step, novices need the API to be gradual.
  3. In order to solve most problems, experts need the API to be flexible.

What the developer market empirically cares about:

  1. If the API is not convenient → beginners don’t adopt the API
  2. If the API is not gradual → novices find it complicated and don’t become experts.
  3. If the API is not flexible → experts eventually “eject” to something else to solve their problems.

While having a flexible API makes the other two steps easier, the market doesn’t care about flexibility at first. It is tempting to start by making the API convenient and ignore its flexibility. When targeting beginners, convenience has the most immediate impact on adoption but starting with it leads to a dead end.

But when targeting developers with an existing project (like a large enterprise) convenience is less important. There, the developer will spend lots of time learning and scoping out an integration. What matters is that they can use your API in the first place. More often than not, your API has one restriction that makes it impossible for them to adopt it. The more flexible your API, the more likely it is to satisfy the project’s constraints.
