We’re living in an interesting transition period. Most companies aren’t building pure AI applications from scratch; they’re adding AI agents to existing systems with established patterns, APIs, and expectations. This creates a fascinating architectural challenge: how do you bridge the reliable, synchronous world of traditional applications with the dynamic, asynchronous nature of AI agents?
Summary
A SaaS team introduced AI agents alongside their existing web app and backend services. The agents worked well in isolation but didn’t integrate cleanly with the rest of the system. Events were dropped, workflows were unpredictable, and debugging was painful. By introducing an event control plane, the team created a clean interface between the dynamic behaviour of AI and the reliability expectations of traditional software.
The problem: two worlds, no reliable handoff
The company had a mature backend system. It used REST APIs, well-defined services, and consistent data flows. Then they added AI agents to automate customer tasks and internal workflows.
The two architectures were built on different assumptions, and the handoff between them showed it:
- Traditional services expected strict delivery guarantees
- AI agents ran asynchronously and produced variable output
- Failures were often swallowed
- Message passing relied on polling or gRPC
- Coordination logic lived in glue code or ad hoc scripts
Agents dropped events without retries. Backend services processed the same event twice. Nobody could say with confidence what had happened, or when.
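To make the failure mode concrete, here is a rough sketch of the kind of glue code involved. The endpoint and function names are invented for illustration; the point is that the handoff is a bare HTTP call with no retry policy, no idempotency key, and no record of failure.

```typescript
// Hypothetical glue code between an agent and a backend service.
// If the request fails, the event is simply lost; if the agent retries
// naively, the backend may process the same result twice.
async function onAgentTaskComplete(taskId: string, result: string): Promise<void> {
  // Fire-and-forget HTTP call: no retry, no deduplication key, no audit trail.
  await fetch("https://backend.example.com/internal/task-results", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ taskId, result }),
  });
  // If fetch throws, nothing records that the handoff failed.
}
```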
What broke first
As usage grew, problems multiplied:
- Billing services processed duplicate requests
- Agents failed silently and skipped important tasks
- Debugging required digging across multiple systems
- Developers had no way to trace events across the full flow
- Customers reported inconsistent outcomes from the same inputs
There was no unified path for events, just a mix of queue handlers, HTTP calls, and shared database writes.
The change: one control plane, two interfaces
The team introduced Sailhouse as a shared event control plane. Rather than wiring systems together directly, each component subscribed to and emitted structured events.
Here’s how they made the connection:
- Agents emitted events like task.completed or summary.generated
- Traditional services subscribed with strict delivery guarantees
- AI workflows used more relaxed, eventually consistent delivery
- Deduplication and retries were handled at the event level
- Dashboards showed each event’s full lifecycle
This allowed both sides of the system to evolve independently, while still working together.
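A minimal sketch of that pattern, assuming a generic client interface rather than the real Sailhouse SDK; the interface, topic names, payload shapes, and helper functions below are illustrative assumptions.

```typescript
// Hypothetical interface for an event control plane (not the Sailhouse SDK).
interface EventControlPlane {
  emit(topic: string, payload: unknown, options?: { idempotencyKey?: string }): Promise<void>;
  subscribe(
    topic: string,
    handler: (event: { id: string; payload: unknown }) => Promise<void>,
  ): void;
}

// Agent side: emit a structured event when work finishes,
// instead of calling a backend service directly.
async function publishSummary(plane: EventControlPlane, docId: string, summary: string) {
  await plane.emit(
    "summary.generated",
    { docId, summary, generatedAt: new Date().toISOString() },
    { idempotencyKey: `summary-${docId}` }, // lets the control plane deduplicate retried emits
  );
}

// Backend side: subscribe with strict expectations. The handler only returns
// once the write has succeeded, so a failed delivery can be retried safely.
function registerBillingSubscriber(plane: EventControlPlane) {
  plane.subscribe("task.completed", async (event) => {
    const { taskId, customerId } = event.payload as { taskId: string; customerId: string };
    await recordBillableTask(taskId, customerId); // hypothetical persistence call
  });
}

declare function recordBillableTask(taskId: string, customerId: string): Promise<void>;
```

The important part is the boundary: agents only know about topics and payloads, and backend services only know about the events they subscribe to.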
Why not use Kafka or a message broker?
The team considered Kafka. It is powerful, but it would have introduced complexity they didn't want:
- Cluster management and topic planning
- Complicated dev environments and offset tracking
- Uniform delivery semantics across all consumers
- Manual retry and failure handling
They needed more flexibility. With Sailhouse, each subscriber could have its own delivery settings: some services required rate limiting, while others could simply accept a push whenever an event arrived. No brokers to manage. No infrastructure to provision.
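A sketch of what per-subscriber delivery settings might look like, written as a plain object for illustration; the field names are assumptions, not Sailhouse's actual configuration schema.

```typescript
// Two subscribers on the same control plane with different delivery settings.
const subscriptions = [
  {
    topic: "task.completed",
    subscriber: "billing-service",
    delivery: {
      mode: "push",
      rateLimit: { maxPerSecond: 10 },                // protect a rate-sensitive downstream API
      retries: { attempts: 5, backoff: "exponential" },
      deduplication: true,                            // strict: never bill twice
    },
  },
  {
    topic: "summary.generated",
    subscriber: "agent-postprocessor",
    delivery: {
      mode: "push",                                   // deliver whenever the event arrives
      retries: { attempts: 2, backoff: "fixed" },     // relaxed: occasional delay is tolerable
      deduplication: false,
    },
  },
] as const;
```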
Different reliability needs, one model
This setup gave the team clean separation without duplication:
- Billing services processed events once, with full audit trails
- AI agents handled failures without bringing down the whole system
- Retry behaviour was consistent and visible
- No more silent failures or unexpected duplicates
- Events were observable from creation to delivery
The backend remained stable. The agents stayed flexible.
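On the billing side, the exactly-once behaviour reduces to an idempotency gate plus an audit log. A minimal sketch, assuming a relational store with a unique constraint on event IDs; the `Db` interface and `chargeCustomer` helper are hypothetical.

```typescript
// Hypothetical persistence interface for the billing consumer.
interface Db {
  insertProcessedEvent(eventId: string): Promise<boolean>; // false if the ID was already recorded
  appendAuditLog(entry: { eventId: string; action: string; at: string }): Promise<void>;
}

async function handleBillingEvent(
  db: Db,
  event: { id: string; payload: { customerId: string; amountCents: number } },
): Promise<void> {
  // The unique insert is the deduplication gate: a redelivered event hits the
  // constraint and is skipped instead of charging the customer twice.
  const firstDelivery = await db.insertProcessedEvent(event.id);
  if (!firstDelivery) return;

  await chargeCustomer(event.payload.customerId, event.payload.amountCents); // hypothetical charge call
  await db.appendAuditLog({ eventId: event.id, action: "charged", at: new Date().toISOString() });
}

declare function chargeCustomer(customerId: string, amountCents: number): Promise<void>;
```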
What changed
- Agent-related failures dropped significantly
- Backend services no longer processed duplicated input
- Debugging cross-system workflows became much easier
- New agent features could be added without risk to existing systems
- Teams gained confidence in how events moved across boundaries
Why it matters
Modern systems aren’t built from scratch. They evolve. Adding AI capabilities to an existing platform means integrating two different styles of software development.
This team didn’t force one model to match the other. They used events to bridge the gap. Agents could remain dynamic. Traditional systems could stay safe and auditable. Everyone had the observability and control they needed — without rewriting what already worked.