Everyone’s building “async agents,” but almost no one can define them

For an example of what an "async" agent implementation should help you accomplish: https://youtu.be/hGhnB0LTBUk?si=q78QjgsN5Kml5F1E&t=5m15s

You can use the idea to spin off background agent tasks that can then be seamlessly merged back into context when they complete.
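A minimal sketch of that spin-off-and-merge pattern, assuming Python's asyncio. The names `background_research`, `session`, and `merge_finished` are hypothetical and are not taken from the product in the video; the sleeps stand in for real agent work.

```python
import asyncio

async def background_research(topic: str) -> str:
    # Hypothetical long-running job; a real one would call an agent or tool.
    await asyncio.sleep(0.01)
    return f"findings on {topic}"

async def session() -> list[str]:
    context: list[str] = []              # running conversation context
    pending: set[asyncio.Task] = set()   # background tasks still in flight

    def merge_finished() -> None:
        # Fold results of completed background tasks back into the context.
        for task in [t for t in pending if t.done()]:
            pending.discard(task)
            context.append(f"[background] {task.result()}")

    context.append("user: look into butter logistics, meanwhile let's keep talking")
    pending.add(asyncio.create_task(background_research("butter logistics")))

    # The interactive loop keeps going while the background task runs.
    for turn in ("user: what's for lunch?", "user: and dessert?"):
        merge_finished()
        context.append(turn)
        await asyncio.sleep(0.03)        # stand-in for producing a reply

    if pending:
        await asyncio.gather(*pending)   # drain anything still running
    merge_finished()
    return context

if __name__ == "__main__":
    for line in asyncio.run(session()):
        print(line)
```

The merge happens at turn boundaries here, which is one possible design choice; a real system might instead interrupt or notify as soon as the task finishes.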

The example above is a product-specific approach, but the idea should be applicable in other environments. It's really an attempt to integrate long-running background tasks while continuing with the existing context in an interactive manner.

When you start working with automation programs (AKA agents) in an interactive, human-in-the-loop fashion, you will naturally run into these kinds of problems.

We've all seen sci-fi movies with AI assistants that seamlessly work with humans in a back-and-forth manner. Async spin-offs are essential for making that work in practice for long-running background tasks.

- I ask for butter and walk away.
- It passes the butter to where I expect it to be when I return.
- That is its purpose.

That's just a slow response with extra steps.

There's also the concept of a daemon process that looks for work to do and tells you about it without being prompted.

(Rick and Morty reference: https://www.youtube.com/watch?v=X7HmltUWXgs)

Do you try to pull the butter onto your knife periodically, or do you wait somehow until it pushes the butter onto your knife? When does it become less work to just go get the butter yourself?
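The pull-versus-push question can be sketched with asyncio. This is only an illustration: `fetch_butter`, `pull`, and `push` are made-up names, and the timings are arbitrary. Pull pays for every empty check; push registers interest once and gets told.

```python
import asyncio

async def fetch_butter() -> str:
    # Hypothetical slow job.
    await asyncio.sleep(0.03)
    return "butter"

async def pull(poll_interval: float = 0.01) -> tuple[str, int]:
    # Pull: the caller checks repeatedly and pays for every empty check.
    task = asyncio.create_task(fetch_butter())
    checks = 0
    while not task.done():
        checks += 1
        await asyncio.sleep(poll_interval)
    return task.result(), checks

async def push() -> list[str]:
    # Push: the caller registers a callback once and is told when it's ready.
    delivered: list[str] = []
    task = asyncio.create_task(fetch_butter())
    task.add_done_callback(lambda t: delivered.append(t.result()))
    await task
    await asyncio.sleep(0)  # give the callback a chance to run
    return delivered

if __name__ == "__main__":
    print(asyncio.run(pull()))
    print(asyncio.run(push()))
```

The number of checks in `pull` is the "when is it less work to get it yourself" cost: a shorter poll interval means fresher butter and more wasted trips.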

You just need a self-buttering knife, like a self-licking ice cream cone:

https://en.wikipedia.org/wiki/Self-licking_ice_cream_cone

hey, ishaan here (kartik's cofounder). this post came out of a lot of back-and-forth between us trying to pin down what people actually mean when they say "async agents."

the analogy that clicked for me was a turn-based telephone call—only one person can talk at a time. you ask, it answers, you wait. even if the task runs for an hour, you're waiting for your turn.

we kept circling until we started drawing parallels to what async actually means in programming. using that as the reference point made everything clearer: it's not about how long something runs or where it runs. it's about whether the caller blocks on it.
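a rough sketch of the "does the caller block" distinction, assuming Python's asyncio. `agent_task` is a hypothetical stand-in for any long-running agent job; the task is identical in both cases and only the caller's behavior changes.

```python
import asyncio

async def agent_task(seconds: float) -> str:
    # Hypothetical stand-in for a long-running agent job.
    await asyncio.sleep(seconds)
    return "report ready"

async def blocking_caller(log: list) -> None:
    # The caller awaits immediately: nothing else happens until the task returns.
    result = await agent_task(0.05)
    log.append(result)
    log.append("caller did other work")

async def non_blocking_caller(log: list) -> None:
    # The caller starts the task, keeps working, and collects the result later.
    task = asyncio.create_task(agent_task(0.05))
    log.append("caller did other work")
    log.append(await task)

def run_both() -> tuple[list, list]:
    blocking_log: list = []
    non_blocking_log: list = []
    asyncio.run(blocking_caller(blocking_log))
    asyncio.run(non_blocking_caller(non_blocking_log))
    return blocking_log, non_blocking_log

if __name__ == "__main__":
    print(run_both())
```

the order of the log entries is the whole difference: in the blocking version the result lands before the caller does anything else, in the non-blocking version it lands after.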

Not to be all captain hindsight, but I was puzzled as I was skimming the post, as this seemed obvious to me:

Something is async when it takes longer than you're willing to wait without going off to do something else.

IMO feels sorta like Simon Willison's definition of agents. "LLMs in a loop with a goal" feels super obvious, but not sure if I would have described it that way in hindsight

Maybe, but that's what I thought while reading the "what actually is async?" part of the post, so I don't think I got biased towards the answer by that point.

One nuance that helps: “async” in the turn-based-telephone sense (you ask, it answers, you wait) is only one way agents can run.

Another is many turns inside a single LLM call — multiple agents (or voices) iterating and communicating dozens or hundreds of times in one epoch, with no API round-trips between them.

That’s “speed of light” vs “carrier pigeon”: no serialization across the boundary until you’re done. We wrote this up here: Speed of Light – MOOLLM (the README has the carrier-pigeon analogy and a 33-turn-in-one-call example).

Speed of Light vs Carrier Pigeon: The fundamental architectural divide in AI agent systems.

https://github.com/SimHacker/moollm/blob/main/designs/SPEED-...

The Core Insight: There are two ways to coordinate multiple AI agents:

  Carrier Pigeon
    Where agents interact: between LLM calls
    Latency: 500 ms+ per hop
    Precision: degrades each hop
    Cost: high (re-tokenize everything)
  Speed of Light
    Where agents interact: during one LLM call
    Latency: instant
    Precision: perfect
    Cost: low (one call)
  MCP = Carrier Pigeon
    Each tool call:
      stop generation → 
      wait for external response → 
      start a new completion
    N tool calls ⇒ N round-trips
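
Taking the 500 ms per hop figure above at face value, the round-trip overhead is simple arithmetic. This is a back-of-the-envelope sketch: the function name is made up, and it counts only hop overhead, not generation time or token cost.

```python
def round_trip_overhead_ms(tool_calls: int, hop_latency_ms: float = 500.0) -> float:
    # N tool calls => N round-trips, each paying the per-hop latency.
    if tool_calls < 0:
        raise ValueError("tool_calls must be non-negative")
    return tool_calls * hop_latency_ms

if __name__ == "__main__":
    # A 33-turn exchange done as separate calls, at the assumed 500 ms per hop.
    print(round_trip_overhead_ms(33))
```

At that assumed rate, 33 separate hops add up to 16.5 seconds of waiting before any generation time is counted.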

MOOLLM Skills and agents can run at the Speed of Light. Once loaded into context, skills iterate, recurse, compose, and simulate multiple agents — all within a single generation. No stopping. No serialization.

i just imagine it as the swap between "human watching agent while it runs"

vs "agent runs for a long time, tells the user over human interfaces when it's done", e.g. sends a slack. or something like gemini deep research.

an extension would be that they are triggered by events and complete autonomously with only human interfaces when it gets stuck.

there's a bit of a quality difference rather than an exactly functional one, in that the agent mostly doesn't need human interaction beyond a starting prompt and a notification of completion or stuckness. even if i'm not blocking on a result, it can't immediately need babying or i can't actually leave it alone

"Background job"?

The real question is what happens when the background job wants attention. Does that only happen when it's done? Does it send notifications? Does it talk to a supervising LLM? The author is correct that it's the behavior of the invoking task that matters, not the invoked task.
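One way to make those questions concrete is to have the job emit typed events and let the invoking task decide what to do with each. A minimal sketch, assuming asyncio queues; the event kinds and function names are hypothetical, not from any particular framework.

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Event:
    kind: str      # "progress", "needs_input", or "done" (hypothetical kinds)
    payload: str

async def background_job(events: asyncio.Queue, answers: asyncio.Queue) -> None:
    await events.put(Event("progress", "step 1 finished"))
    # The job wants attention mid-run and waits for whoever is supervising.
    await events.put(Event("needs_input", "overwrite existing file?"))
    answer = await answers.get()
    await events.put(Event("done", f"finished (overwrite={answer})"))

async def invoking_task() -> list[str]:
    events: asyncio.Queue = asyncio.Queue()
    answers: asyncio.Queue = asyncio.Queue()
    job = asyncio.create_task(background_job(events, answers))
    seen: list[str] = []
    while True:
        event = await events.get()
        seen.append(event.kind)
        if event.kind == "needs_input":
            # Policy lives in the invoker: ask the human, or a supervising LLM.
            # Here a canned answer stands in for either.
            await answers.put("yes")
        elif event.kind == "done":
            break
    await job
    return seen

if __name__ == "__main__":
    print(asyncio.run(invoking_task()))
```

The job is the same whether a human, a notification channel, or a supervising model answers the `needs_input` event; only the branch in `invoking_task` changes.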

(I still think that guy with "Gas Town" is on to something, trying to figure out how to connect up LLMs as a sort of society.)

Marvin Minsky thought of it a long time before Gas Town, and yes, he was on to something.

https://en.wikipedia.org/wiki/Society_of_Mind

>The Society of Mind is both the title of a 1986 book and the name of a theory of natural intelligence as written and developed by Marvin Minsky.

>In his book of the same name, Minsky constructs a model of human intelligence step by step, built up from the interactions of simple parts called agents, which are themselves mindless. He describes the postulated interactions as constituting a "society of mind", hence the title. [...]

>The theory

>Minsky first started developing the theory with Seymour Papert in the early 1970s. Minsky said that the biggest source of ideas about the theory came from his work in trying to create a machine that uses a robotic arm, a video camera, and a computer to build with children's blocks.

>Nature of mind

>A core tenet of Minsky's philosophy is that "minds are what brains do". The society of mind theory views the human mind – and any other naturally evolved cognitive system – as a vast society of individually simple processes known as agents. These processes are the fundamental thinking entities from which minds are built, and together produce the many abilities we attribute to minds. The great power in viewing a mind as a society of agents, as opposed to the consequence of some basic principle or some simple formal system, is that different agents can be based on different types of processes with different purposes, ways of representing knowledge, and methods for producing results.

>This idea is perhaps best summarized by the following quote:

>What magical trick makes us intelligent? The trick is that there is no trick. The power of intelligence stems from our vast diversity, not from any single, perfect principle. —Marvin Minsky, The Society of Mind, p. 308

That puts Minsky either neatly in the scruffy camp, or scruffily in the neat camp, depending on how you look at it.

https://en.wikipedia.org/wiki/Neats_and_scruffies

Neuro-symbolic AI is the modern name for combining both; the idea goes back to the neat/scruffy era, the term to the 2010s. In 1983 Nils Nilsson argued that "the field needed both".

https://en.wikipedia.org/wiki/Neuro-symbolic_AI

For example, combining Gary Drescher’s symbolic learning with LLMs grounds the symbols: the schema mechanism discovers causal structure, and the LLM supplies meanings, explanations, and generalization—we’re doing that in MOOLLM and spell it out here:

MOOLLM: A Microworld Operating System for LLM Orchestration

See: Schema Mechanism: Drescher's Causal Learning

https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...

Also: LLM Superpowers for the Gambit Engine:

https://github.com/SimHacker/moollm/blob/main/designs/LEELA-...

Schema Mechanism Skill:

https://github.com/SimHacker/moollm/blob/main/skills/schema-...

Schema Factory Skill:

https://github.com/SimHacker/moollm/blob/main/skills/schema-...

Example Schemas:

https://github.com/SimHacker/moollm/tree/main/skills/schema-...

People can appreciate others for their work but... Minsky is not just named several times in the Epstein files: he went to Epstein's island after Epstein had already been charged several times with sex offenses. And one of the main witnesses, Virginia Giuffre, said Epstein instructed her to have sex with Minsky.

> "minds are what brains do"

And "a man is what he does".

The record doesn’t say what you’re implying. Virginia Giuffre’s deposition is that Epstein told her to have sex with Minsky. It does not say that Minsky agreed, touched her, or did anything. That’s “he was instructed to be offered to,” not “he did it.”

What we have from people who were there:

Greg Benford (physicist and SF author, present that day) stated publicly: "I was there. Minsky turned her down. Told me about it." [InstaPundit, Aug 2019, quoting Benford: https://instapundit.com/339725/ ]

>Typical Crap Journalism from NYT:

>“In a deposition unsealed this month, a woman testified that, as a teenager, she was told to have sex with Marvin Minsky, a pioneer in artificial intelligence, on Mr. Epstein’s island in the Virgin Islands. Mr. Minsky, who died in 2016 at 88, was a founder of the Media Lab in the mid-1980s.”

>Note, never says what happened. If Marvin had done it, she would say so. I know; I was there. Minsky turned her down. Told me about it. She saw us talking and didn’t approach me.

https://en.wikipedia.org/wiki/Gregory_Benford

Minsky was there with his wife, told her about the approach, and told Benford right afterward. So we have a first‑hand, on-the-record account that he declined, plus the fact that he immediately told his wife and a colleague. There is no evidence he “did” anything.

So: (1) the allegation that he did something is unsupported by the testimony and contradicted by an eyewitness; (2) even if it weren’t, “a man is what he does” has nothing to do with whether Society of Mind or his other theories are valid. Newton’s physics and Minsky’s cognitive architecture stand or fall on evidence and argument, not on moral purity. Conflating a disputed personal allegation with the worth of his ideas is a smear, not an argument.

David Henkel-Wallace (gumby) has posted about this before on HN:

https://news.ycombinator.com/item?id=22015840

>gumby on Jan 10, 2020

>I know several people who were at that island and have discussed this event; one even told me that he remembered it because Marvin came over to him and said "this woman just offered to have sex with me." Also Gloria, his wife, was there, though I haven't asked her about it (and wouldn't). This seems believable to me.

>OTOH I did read Giuffre's deposition and she says not just that she was told by Epstein to proposition various people but that it happened. I find that very hard to believe having known him so long, but she made that statement under oath. Also I'm not sure Marvin was famous enough to be worth making up a story about (as opposed to, say, a famous heir to a throne).

Gumby was mistaken in claiming the deposition says “it happened”; he was very likely inferring it from the same transcript. What "happened" is she was told to have sex with him, but there is absolutely no evidence or testimony that he did, and there is evidence from Greg Benford that he didn't.

Gwern draws the same distinction:

https://news.ycombinator.com/item?id=20774197

Look for yourself here:

https://www.documentcloud.org/documents/7010864-virginia-giu...

Now do you have anything interesting to say about his theories, other than trying to smear him?

So, say you want this. How do you do it with Claude Code?

Read my post on this from 9 months ago: https://jdsemrau.substack.com/p/designing-agents-architectur...

^^ requires paid subscription.