agents need interfaces

May 2, 2025
9 mins


ganesh kumar

i'm ganesh kumar, a design engineer. since 2018 i've built with mycelium, figma, typescript, and whatever's in between, and i believe the best interfaces are the ones you forget you're using.


how do you make a non-human actor’s behavior legible at a glance?

Just dove into this whole AI agent space (Google's A2A following MCP), and I'm honestly mind-blown at what's possible with such simple code. Wanted to capture my thoughts while they're fresh…

I define an agent as:

agent = AI + tools + autonomy to reach goals & decide when to stop

The coolest part is watching it solve problems on its own:

USER: "can you check if anyone's in the bedroom?"
SMART HOME AGENT:   *thinking*
                  1. Need to see bedroom
                  2. It's dark in there
                  3. Should turn on lights first
                          ↓
SMART HOME AGENT:   *turns on bedroom lights*
                    *activates camera*
                 "The bedroom is empty."

What blows my mind is how SIMPLE the code flow is to get this kind of emergent behavior. That's the behavior I'm obsessed with, but it also brings in a new set of UX problems:

SIMPLE CODE ──> COMPLEX BEHAVIOR
     │               │
     │               ▼
     │         ┌──────────┐
     │         │UNEXPECTED│
     │         │SOLUTIONS │
     │         └──────────┘
     │               │
     ▼               ▼
┌──────────┐    ┌──────────┐
│  HOW TO  │    │  HOW TO  │
│UNDERSTAND│    │ CONTROL? │
└──────────┘    └──────────┘

Now, say an AI agent is embedded in an electric stove and wants to troubleshoot… that raises huge UX questions:

     USER INTENT
         │
         ▼
    ┌─────────┐
    │ CONFIRM │◀───┐
    │ REPAIR  │    │
    └────┬────┘    │
         │         │
         ▼         │
    ┌─────────┐    │
    │ VISIBLE │    │
    │ ACTIONS │────┘
    └────┬────┘
         │
         ▼
    ┌─────────┐
    │  STOVE  │
    │  AGENT  │
    └─────────┘

like…

  • How does the user actually SEE what the agent is planning?
  • How do you let users veto or modify plans?
  • What’s the interaction model for “wait, don’t do that”?
  • What happens when the agent makes a bad decision with a physical device?

Since OpenClaw

OpenClaw and similar stacks normalized persistent memory, multi-agent routing, and autonomous actions across messaging, files, and devices. Same pattern across domains… Agentic commerce is one row in the table:

INTENT          →  PLAN (editable)  →   CONSENT / SCOPE  →  LOG / RECEIPT
"fix the bug"       diff + steps        branch + secrets    merge + audit
"reply to X"        draft               send-as + tone      sent + thread
"book + pay"        basket + fees       limits + tokens     order + trail

Same obligations everywhere… plan visibility, authority boundaries, audit and replay, plain-language failure, visible handoffs, off-by-default for irreversible or physical actions.
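
one way to see the sameness… every row in that table instantiates the same four-stage record. a sketch (the class and field names are mine, not from any protocol):

```python
from dataclasses import dataclass, field

# The same four-stage pipeline as one record per task.
# Names are illustrative, not from ACP/AP2 or any spec.
@dataclass
class AgentTask:
    intent: str                 # "fix the bug", "book + pay", ...
    plan: list                  # editable before execution
    scopes: list                # consent: what authority was granted
    log: list = field(default_factory=list)  # receipt: what happened

task = AgentTask(
    intent="book + pay",
    plan=["basket", "fees"],
    scopes=["spend <= $75", "payment token #8b2c"],
)
task.log.append("order placed")
print(task.intent, "->", task.log[-1])
# → book + pay -> order placed
```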

Agent-Native UX: Missed Opportunities

Agents traverse stacks… MCP for tools, A2A for coordination, plus domain layers: git, calendar, inbox, identity, and newer agent-commerce rails (ACP, AP2, x402, MPP, to name a few). The failure mode is the same for a card charge or a pull-request deploy… outcome in, invisible chain, opaque error out.

[ THE ABSTRACTION GAP ]

  "buy groceries under $75"
   │
┌──▼─────────────────────────────────────────────────────────┐
│  [ THE AGENT PROTOCOL STACK ]                              │
│                                                            │
│  1. MCP ──> discovers tools (GrocerMart API)               │
│  2. A2A ──> coordinates sub-agents (Pricing, Shopping)     │
│  3. ACP ──> standardizes checkout (Agent Commerce Protocol)│
│  4. AP2 ──> proves trust (Agent Payments Protocol)         │
└────────────────────────────────────────────────────────────┘
   │
   ▼
  ✓ done. receipt in email.

that creates new UX obligations:

  • Plan visibility… a live “what i’m about to budget” outline you can edit before execution
  • Consent and authority boundaries… valid substitutions, scoped tokens, sandboxed/timeboxed access, and a real kill‑switch
  • Audit + Replay… readable transcripts and action logs with diff/pr views for cart changes
  • Failure surfaces… partial fulfillment, alternatives, and “why i can’t” reasons in plain language
  • Cross‑agent handoffs… visible handover to payment agent, accountability, and a way to pull work back
  • Off‑by‑default for physical actions; explicit confirmation for irreversible steps
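
authority boundaries get concrete as scoped, timeboxed grants with a kill-switch. a sketch under assumed semantics… nothing here comes from ACP or AP2:

```python
import time

# A scoped, timeboxed grant with a kill-switch. All semantics assumed.
class Grant:
    def __init__(self, max_spend, ttl_seconds):
        self.max_spend = max_spend
        self.expires_at = time.time() + ttl_seconds
        self.revoked = False  # the kill-switch

    def kill(self):
        self.revoked = True

    def allows(self, amount):
        if self.revoked or time.time() > self.expires_at:
            return False
        return amount <= self.max_spend

g = Grant(max_spend=75.00, ttl_seconds=3600)
print(g.allows(64.99))   # True: in scope
g.kill()
print(g.allows(64.99))   # False: the kill-switch always wins
```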

Terminal-native tools (claude code, codex, openclaw-style setups) expose the obvious: agents are log-producing processes. the user sees none of it until something breaks. and when it breaks, the error says “payment failed.” from which layer? which hop? which agent?

Nobody is designing the failure surface for multi-protocol chains. We’re building the happy path and hoping the error messages sort themselves out.

$ agent "book cheapest flight NYC→SF Apr 1"

tool: flights_search(...)
tool: flights_filter(...)
... (more tools)
tool: book_flight({...})

✓ booked. confirmation: AA-7X2K9

After nine tool calls, the booking happened. no gate. no confirmation. the action was irreversible and the interface gave you a receipt, not a choice.
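
the missing gate is a few lines… wrap irreversible tools so they cannot run without an explicit human answer. a sketch (tool and callback names are mine):

```python
# Gate wrapper: irreversible tools require a confirmation callback
# before they execute. Names are illustrative, not a real agent SDK.

def gated(tool, confirm):
    def wrapper(*args, **kwargs):
        if not confirm(tool.__name__, args, kwargs):
            return {"status": "cancelled by human"}
        return tool(*args, **kwargs)
    return wrapper

def book_flight(flight_id):
    # Stand-in for the real irreversible action.
    return {"status": "booked", "confirmation": "AA-7X2K9"}

# Reject everything: the agent can plan but never commit.
always_no = lambda name, args, kwargs: False
safe_book = gated(book_flight, confirm=always_no)

print(safe_book("AA1234"))  # → {'status': 'cancelled by human'}
```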

What we actually need is an interface that treats the terminal not as a raw log, but as a parsed surface:

╭─────────────────────────────────────────────────────────────╮
│  ● TASK // book cheapest flight NYC→SF Apr 1                │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│  [✓] search_flights(JFK→SFO, 2026-04-01)                    │
│      ↳ 3 options found: $312 · $445 · $520                  │
│                                                             │
│  [✓] select_flight(AA1234)                                  │
│      ↳ $312 · 7h 20m · nonstop                              │
│                                                             │
│  [!] EXECUTION HALTED // IRREVERSIBLE ACTION                │
│  ┌───────────────────────────────────────────────────────┐  │
│  │  GATE: book_flight()                                  │  │
│  │  ───────────────────────────────────────────────────  │  │
│  │  amount: $312.00                 card: ****1234       │  │
│  │  policy: non-refundable          agent: claude-v1     │  │
│  └───────────────────────────────────────────────────────┘  │
│                                                             │
│  ❯❯ Action required: [Y] Confirm  [N] Cancel  [D] Diff      │
╰─────────────────────────────────────────────────────────────╯

Most terminal-native agent platforms are still designing around the assumption that users want to see tool calls. They don’t. They want to see decisions.

What gets surfaced isn’t every tool call. It’s decisions, gates before irreversible steps, and anomalies against the stated goal.

    [ LAYER 01 : AMBIENT ] ── always visible, never interrupts
    ░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░░
    ● shopping-agent [active] 4m 12s
    ↳ goal: groceries ≤ $75   ↳ current: $64.99


    [ LAYER 02 : SURFACED ] ── state changes & autonomous decisions
    ▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒▒
    [decide] swapped rice brand A → B (saved $1.50)
    [status] 12 items · 1 substitution · on track


    [ LAYER 03 : INTERRUPT ] ── the irreversible gate
    ▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓▓
    ⚠ CHECKOUT REQUIRED
      GrocerMart · $64.99 total
      [ APPROVE ]   [ EDIT CART ]   [ CANCEL ]


      RAW LOG (what happened)         GOAL VIEW (what matters)
    ───────────────────────────     ───────────────────────────
      14:23:01 search("milk")         ▸ buy groceries
      14:23:02 search("eggs")             $64.99 · 12 items
      14:23:03 cart_add("milk")           1 substitution
      14:23:04 cart_add("eggs")           next: checkout
      14:23:05 price_check()
      14:23:06 compare_brands()       ▸ [expand 48 events ↓]
      14:23:07 substitute("rice")
      ...48 more events

Group by goal, not timestamp… the raw log is for debugging. Surface only decisions (the agent chose between options), gates (before irreversible actions), and anomalies (behavior deviating from the stated goal).

When to surface: reversibility is the primary signal. Not urgency, not frequency.

REVERSIBLE ◀────────────────────────────────────────▶ IRREVERSIBLE

  search    plan    add-to-cart      pay     send     ship
    ✓         ✓          ✓           ⚠        ⚠        ✗
  silent   silent     silent      CONFIRM   CONFIRM   done

Everything right of center triggers a gate; anything left of it runs silently. Simple rule. Most interfaces don't implement it.
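
the rule in code, with the spectrum as data… the action list mirrors the diagram above:

```python
# Reversibility is the gate signal: "pay" and everything to its right
# requires confirmation, everything to its left runs silently.

SPECTRUM = ["search", "plan", "add-to-cart", "pay", "send", "ship"]
FIRST_IRREVERSIBLE = SPECTRUM.index("pay")

def needs_gate(action):
    return SPECTRUM.index(action) >= FIRST_IRREVERSIBLE

assert not needs_gate("add-to-cart")  # silent
assert needs_gate("pay")              # confirm
print("gate policy ok")
```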

AUDIT · session grocery-2026-03-25
────────────────────────────────────────────────────
14:23:01  [search]   "weekly groceries near me"
14:23:04  [plan]     cart drafted · $66.50 · 13 items
14:23:08  [decide]   ↺ rice brand A→B · -$1.50    ← why?
14:23:08  [decide]   ✗ soda · out of budget scope  ← why?
14:23:10  [gate]     PAYMENT · awaiting human
14:23:42  [human]    ✓ approved · $64.99
14:23:43  [action]   order #GRM-2847 · placed
14:31:00  [done]     delivered · receipt logged
────────────────────────────────────────────────────
[step-through]   [export]   [share with agent]

the "← why?" links are the key. each decision should carry its rationale, inspectable on demand… not in the ambient view, but always available. this is what turns a log into a real audit trail.
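
an audit entry can only answer "← why?" if the rationale is captured at write time. a sketch (field names and rationale strings are mine):

```python
from dataclasses import dataclass

# Each decision logs its rationale when it is made, so the trail can
# answer "why?" on demand. Names and text are illustrative.
@dataclass
class AuditEvent:
    ts: str
    kind: str           # search / plan / decide / gate / action
    summary: str
    rationale: str = "" # empty for non-decisions

trail = [
    AuditEvent("14:23:08", "decide", "rice brand A→B · -$1.50",
               rationale="same size, lower unit price"),
    AuditEvent("14:23:08", "decide", "dropped soda",
               rationale="outside the stated budget scope"),
]

def why(trail, ts, kind="decide"):
    return [e.rationale for e in trail if e.ts == ts and e.kind == kind]

print(why(trail, "14:23:08"))
```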

Coming to the accountability problem in multi-agent chains:

▼ TRACE: ORDER #GRM-2847
│
├─[ EXECUTION ]── GrocerMart checkout API
│                 status: 500 (Insufficient Funds)
│
├─[ DELEGATION ]─ Payment-Agent (ACP+AP2)
│                 token:  #8b2c [scoped: $65.00]
│                 action: applied payment policy
│
├─[ DELEGATION ]─ Shopping-Agent (A2A)
│                 token:  #3f9a [scoped: $75.00 max]
│                 action: built cart, requested checkout
│
└─[ ROOT AUTH ]── user@gktk.in
                  device: verified terminal
                  time:   14:21:00 UTC

> DIAGNOSTIC: Shopping-Agent allowed $75, but Payment-Agent
  token #8b2c was hard-capped at $65 during earlier session.
  Conflict found at hop 2.
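
that diagnostic can be computed mechanically… walk the chain from root to execution and flag the hop where a token's cap narrows below what the parent authorized. a sketch using the trace's numbers:

```python
# Walk a delegation chain root→leaf and report where the effective
# spending cap narrows. Hop data mirrors the trace above.

chain = [
    ("root",           75.00),  # user authorized up to $75
    ("shopping-agent", 75.00),  # token #3f9a
    ("payment-agent",  65.00),  # token #8b2c, hard-capped earlier
]

def find_narrowing(chain):
    conflicts = []
    for hop in range(1, len(chain)):
        name, cap = chain[hop]
        parent_cap = chain[hop - 1][1]
        if cap < parent_cap:
            conflicts.append((hop, name, parent_cap, cap))
    return conflicts

for hop, name, parent, cap in find_narrowing(chain):
    print(f"hop {hop}: {name} capped at ${cap:.2f} "
          f"(parent allowed ${parent:.2f})")
# → hop 2: payment-agent capped at $65.00 (parent allowed $75.00)
```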

Right now, most handoffs are invisible. The agent acts. The human sees the outcome. The chain of delegation disappears.

This is the design work nobody is shipping… making the chain legible at a glance, and fully reconstructable when something breaks, whether the inspector is a human or another agent.

When things break down… UX for Negative AI Experiences

I keep thinking about how terrible we are at handling the negative spaces in AI interfaces. Like, we’ve all seen those “I’m sorry, I can’t do that” messages that explain nothing and solve nothing.

  USER                     AGENT
    │                        │
    │      REQUEST           │
    │──────────────────────▶ │
    │                        │
    │                        │
    │  ┌────────────────┐    │
    │◀─┤sorry, i can't  │    │
    │  │do that because │    │
    │  │[generic reason]│    │
    │  └────────────────┘    │
    │                        │
    │      FRUSTRATION       │
    │──────────────────────▶ │
    │                        │

The deeper UX questions nobody’s solving:

  1. How do we show users what happened when the context window is exceeded? It's such an abstract concept, but it makes their experience break completely.
  2. What’s the right visual metaphor for “I understood what you asked but I’m not allowed to do it”? right now it’s this weird deflection that makes users feel gaslit.
  3. How do we design graceful degradation for AI systems? They don’t degrade gradually… they just hit walls and stop.
  4. In a multi-agent chain, when something fails, which agent do you blame? how does that attribution surface to the user? right now it doesn't.

┌──────────────────────────┐
│      NEGATIVE SPACE      │
│                          │
│ ┌─────┐      ┌────────┐  │
│ │LIMIT│─────>│BOUNDARY│  │
│ └─────┘      └────┬───┘  │
│                   │      │
│                   ▼      │
│              ┌─────────┐ │
│              │USER     │ │
│              │RECOVERY │ │
│              └─────────┘ │
└──────────────────────────┘
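
a failure that enables recovery has to say which agent stopped, why, and what the user can do next. a sketch… everything here is an assumed shape, not any standard:

```python
from dataclasses import dataclass, field

# A failure surface that attributes blame and offers recovery paths,
# instead of a generic "sorry, i can't do that". Shape is assumed.
@dataclass
class AgentFailure:
    blocked_at: str             # which agent / hop stopped
    reason: str                 # plain-language cause
    recoverable: bool
    next_steps: list = field(default_factory=list)

f = AgentFailure(
    blocked_at="payment-agent (hop 2)",
    reason="payment token scope too narrow for this cart",
    recoverable=True,
    next_steps=["raise the token cap", "trim the cart"],
)
print(f.blocked_at, "->", f.reason)
```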

Research Directions I'm Obsessed With Right Now…

Multiplayer human–AI, agents acting on the world, tool discovery at scale, text beyond prompting, ambient intelligence… and legibility: devtools optimize spans; humans need decisions and gates; agents need diffs and policy hooks.

                  +-------------------+
                  | my research zones |
                  +---------+---------+
                            |
          +-----------------+----------------+
          |                 |                |
+---------v--------+ +------v------+ +-------v----------+
| human-ai teams   | |simple agents| | text beyond      |
+---------+--------+ +-----+-------+ | prompting        |
          |                |         +--------+---------+
+---------v--------+ +-----v-------+          |
| multiplayer      | |ai that acts |          |
| experiences      | |on the world | +--------v---------+
+------------------+ +-------------+ | ubiquitous       |
                                     | intelligence     |
                                     +--------+---------+
                                              |
                                     +--------v---------+
                                     | embodiment &     |
                                     | physical ai      |
                                     +------------------+

What's Next for Me?

I'm looking for meaty projects, freelance or a role - something that could justify pulling together a small team for 3+ months, ideally with public outcomes.

My experience shows that focused prototypes with tangible outputs lead to:

PROTOTYPE ────> NEW PRODUCT IDEAS
     │
     ├─────> INTERESTING UX CHALLENGES
     │
     └─────> DRIVING TECHNICAL RESEARCH

Maybe it’s not a typical client project though? open to other approaches or even starting something completely new.

Honestly just putting this out there to see what resonates. this feels like such a fertile space right now and I’m itching to build something that matters, specifically on one of these UX problems:

  1. agent transparency and control… live plan surfaces + editable steps + audit logs
  2. multi‑agent coordination… sessions/routing ui, visible handoffs, conflict resolution
  3. physical interfaces… authority scopes, consent gates, kill‑switch semantics
  4. text as interactive medium
  5. agent legibility… the ambient/surface/interrupt stack — and what auditability means when the auditor might be an agent

last updated: 25th feb, 2026

note to self: reach out to folks in agentic product and multiplayer software — natural fit

Topics:

ai · ux · agents · research