Case Study · 003 · Live · Paying enterprise customers

From whiteboard to paying enterprise customers.

Client: DELOS Analytica · Zürich, CH
Sector: Public Affairs · ESG
Engagement: Embedded product team
Duration: Q3 2024 to present
Team: 5 Streavers · 2 DELOS
Stack: Next.js · Neon · Trigger.dev · GPT-5.5
4 months · Whiteboard to live production
60h → 4 min · Stakeholder research time per dossier
91% · News-to-entity matching precision
−71% · LLM inference cost vs. naive approach
The DELOS monitoring dashboard: a weekly feed where each card surfaces the issues detected and the stakeholders involved, with a search bar and filters across the top.
What we shipped · The Monitoring workspace: every signal DELOS gathers, triaged into a weekly board with issues and stakeholders attached to each item.
The DELOS Reporting view: a generated board-ready report with an executive summary, strategic-horizon sections, and a 'Top Articles of the Week' rail.
Board-ready output · Every week compiles into a cited, board-ready report: quadrant summaries up top, the week's key articles down the side.
01 · A new category of intelligence, born in Zürich.

The category exists. Nobody had built it natively for AI agents.

Founded: 2024
HQ: Zürich, Switzerland
Market: Enterprise public affairs & ESG teams in EU/UK
Backers: Bootstrapped
Competes with: FiscalNote, Quorum, Onclusive

DELOS sells into one of the most underserved enterprise functions: corporate affairs and ESG teams accountable for tracking thousands of stakeholders, regulators, journalists, NGOs and policy threads — and translating that noise into board-ready intelligence on a weekly basis.

The incumbents in this space were built between 2010 and 2018. They are databases with dashboards bolted on. DELOS' thesis: the entire workflow — discovery, classification, monitoring, summarisation — should be performed by specialised AI agents, with humans elevated to judgment and validation. Every claim the system makes should be timestamped, traceable, and defensible in a regulated context.

When Silvan Krähenbühl, DELOS' founder and CEO, came to Streaver, the product existed as a vision document, a Figma sketch, and three signed letters of intent from Tier-1 European corporates. There was no codebase. There was no engineering team. There was a runway clock.

02 · The Challenge

Four hard problems, all to be solved at once.

DELOS' pilots had already been written into commercial procurement cycles. Slipping the launch wasn't a missed milestone — it was a missed market window. The MVP had to be live, in front of paying customers, in sixteen weeks. Inside that window were four problems that don't typically coexist in a single product.

01

A multi-agent system that earns trust

Stakeholder classification, issue mapping, and news-to-entity linking all needed to run continuously — and every output had to be inspectable. A black-box agent that hallucinates a regulator's position is worse than no agent at all in this market.

AI Architecture
02

Real-time ingestion of the open web

Six thousand news sources, NGO sites, parliamentary feeds, regulator press rooms. Ingestion had to be cheap, polite to the source, and resilient to schema drift — without a six-month data engineering build-out.

Data Infrastructure
03

An interface that disarms novices

The buyer is a head of public affairs, not a power user. The first ten minutes had to feel less like a database and more like a researcher who already knows their portfolio. Every screen had to surface the next sensible action.

Product Design
04

Swiss-grade governance from day one

Audit logs, data residency, SSO, role isolation, GDPR posture. Standard for European enterprise sales — frequently fatal to startups that try to retrofit it later.

Enterprise Readiness
We interviewed three other studios. Streaver was the only team that pushed back on our architecture before the contract was signed — and the only one whose engineers had already shipped agentic systems into production.
Silvan Krähenbühl · Founder & CEO · DELOS
03 · Selection

Why DELOS chose Streaver.

Silvan ran a structured selection across three vendors. Two were larger Swiss/EU agencies; one was a remote-first AI specialist. The procurement scorecard weighted four dimensions.

Production AI experience, not prompt-jockeying.

Streaver had already shipped multi-agent systems that survived contact with real users, real cost constraints, and real latency budgets. The reference checks happened on Slack with prior CTOs, not on a curated case study page.

Time-zone and culture alignment.

Montevideo overlaps Zürich's afternoon and New York's morning. Daily standups happen at 10:00 CET with no ceremony. Two of our engineers had previously worked with Swiss clients and understood the directness of the feedback culture.

A predictable delivery model.

No shadow offshoring. Every engineer on the engagement is client-facing and shipping from day 1, and it is the same senior team that started with the initial concept.

Ownership, not staffing.

Streaver committed to delivering sustainable outcomes (MVP live, agents at high precision, three named integrations) rather than just billing hours. The pricing backs that commitment with a fractional CTO engagement and a stable team.

04 · Architecture

Four specialised agents, one validation loop.

We rejected the “one big agent that does everything” pattern early. It's the natural default — and it's wrong for this workload. Discovery, classification, monitoring and summarisation have different latency budgets, different cost profiles, different precision/recall trade-offs, and benefit from different model sizes.

We landed on four narrow agents, each with a single, evaluatable job, orchestrated by a deterministic state machine with a separate validation pass.
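A minimal sketch of what "deterministic state machine" can mean here: the order of steps lives in a fixed transition table, so no model output can change what runs next. The step names follow the agents named above, but `Step`, `NEXT`, `runPipeline` and the string payload are illustrative assumptions, not DELOS code. Summarisation is left out of the chain on the assumption that it runs on demand rather than per document.

```typescript
// Hypothetical sketch: names and payload shape are illustrative only.
type Step = "discovery" | "classification" | "monitoring" | "validation";
type State = Step | "done";

// Fixed transition table: the next step never depends on what a model returned.
const NEXT: Record<Step, State> = {
  discovery: "classification",
  classification: "monitoring",
  monitoring: "validation",
  validation: "done",
};

// Each agent is reduced to a function over a payload for the sake of the sketch.
type Handler = (payload: string) => string;

function runPipeline(
  handlers: Record<Step, Handler>,
  input: string
): { output: string; trace: Step[] } {
  let state = "discovery" as State;
  let payload = input;
  const trace: Step[] = [];
  while (state !== "done") {
    trace.push(state);                  // record the step for auditability
    payload = handlers[state](payload); // run exactly one narrow agent
    state = NEXT[state];                // advance along the fixed table
  }
  return { output: payload, trace };
}
```

Because the trace is produced by the orchestrator rather than by an agent, every run leaves the same inspectable sequence of steps.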

DELOS system architecture: ingestion sources feed the Trigger.dev job orchestrator, which dispatches work to four specialised agents (Discovery, Classification, Monitoring and Summarisation). A validation agent gates outputs before they reach the Next.js application backed by Neon Postgres with pgvector. A human-in-the-loop feedback signal informs the next discovery cycle.
Sources: News API (6,000+ outlets, global) · Custom scrapers (NGOs, regulators, gov) · RSS & feeds (parliamentary, trade press)
Ingestion: Trigger.dev job orchestrator · polite crawling · retries and backoff · dead-letter queue · ~14M docs / mo
Agent layer: 01 Discovery (GPT-5.5, public-web sweep) · 02 Classification (GPT-5.5-mini, high-volume) · 03 Monitoring (GPT-5.5-mini + embeddings) · 04 Summarisation (GPT-5.5, on-demand only)
Quality: validation quality gate · citation check · confidence floor · provenance trace · human review queue · REJECT → flag & loop back · ACCEPT → write & index
Storage: Neon Postgres, relational + pgvector (entities, issues, news, embeddings, audit log)
Application: Vercel, Next.js (app, API, SSO, audit log) · TypeScript, React Server Components, Radix
User: human-in-the-loop public affairs team (validates, annotates, exports) · human feedback informs the next discovery cycle
Figure 1 · DELOS system architecture · simplified for clarity
The DELOS Issues board: an impact-versus-urgency quadrant where the agents plot detected issues as colour-coded cards, with a 'Generate Issues' action.
Agent output · Issues · The discovery and classification agents surface issues onto an impact/urgency board, and one click regenerates them from the latest signals.
The DELOS Stakeholders map: a grid of Tier 1 to Tier 3 rows across Media, Government, Politics and Research columns, with a 'Generate Stakeholders' action.
Agent output · Stakeholders · The stakeholder agent maps actors into tiers across Media, Government, Politics and Research. This is the core public-affairs intelligence view.
05 · Decisions

Three technical bets that defined the product.

01

Two-tier model routing on every agent call.

Every agent call routes between GPT-5.5 and GPT-5.5-mini based on a confidence-and-cost heuristic computed at invocation time. High-volume, narrow tasks — entity classification, simple news-to-issue matching — run on mini. Anything that requires reasoning over a graph of relationships, or producing prose a customer will read, routes to GPT-5.5. Summaries are not generated eagerly. They are produced on-demand when a user opens a stakeholder card, with results cached for 24 hours.

Result · Average cost per stakeholder profile fell from CHF 0.34 to CHF 0.09 between week 6 and week 16. The mini/5.5 ratio stabilised at roughly 84/16 by call volume.
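The routing rule and the on-demand cache described above can be sketched roughly as follows. This is an illustration under stated assumptions: the task labels, the `expectedMiniConfidence` signal, the 0.8 threshold and the model identifier strings are hypothetical, and the real heuristic also weighs cost, which this sketch omits.

```typescript
// Hypothetical sketch: task labels, threshold and model id strings are assumptions.
type Model = "gpt-5.5" | "gpt-5.5-mini";

interface AgentCall {
  task: "entity-classification" | "news-to-issue-matching" | "relationship-reasoning" | "customer-prose";
  // Assumed signal in [0, 1]: how reliable the small model is expected to be on this input.
  expectedMiniConfidence: number;
}

const NARROW_TASKS = new Set<AgentCall["task"]>(["entity-classification", "news-to-issue-matching"]);

function routeModel(call: AgentCall, confidenceThreshold = 0.8): Model {
  // Graph reasoning and customer-facing prose always go to the larger model.
  if (!NARROW_TASKS.has(call.task)) return "gpt-5.5";
  // Narrow, high-volume tasks stay on mini unless confidence is expected to be low.
  return call.expectedMiniConfidence >= confidenceThreshold ? "gpt-5.5-mini" : "gpt-5.5";
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Summaries are generated only when requested, then reused for 24 hours.
class SummaryCache {
  private entries = new Map<string, { value: string; expiresAt: number }>();

  get(key: string, generate: () => string, now: number = Date.now()): string {
    const hit = this.entries.get(key);
    if (hit !== undefined && hit.expiresAt > now) return hit.value;
    const value = generate(); // the expensive model call happens only on a miss
    this.entries.set(key, { value, expiresAt: now + DAY_MS });
    return value;
  }
}
```

Passing `now` explicitly keeps the cache testable without waiting a day; in production the default `Date.now()` applies.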
02

A validation agent that can say 'I don't know.'

The validation pass is not a vibe check. For every claim an agent produces, the validator demands a citation, a confidence score, and a provenance chain back to the source document. Outputs below a configurable confidence floor are routed to a human review queue rather than published. The floor is tunable per customer — Swiss banks set it stricter than NGOs. Precision went from 67% (single-pass) to 91% (two-pass with validator) while recall held at 88%.

Principle · In regulated B2B, a system that occasionally says 'low confidence, please review' is trusted faster than one that always answers with false certainty.
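As a rough illustration of the gate, the sketch below checks the three things the validator demands and compares confidence against a per-customer floor. The `Claim` shape, the `gate` function and the floor values are hypothetical. Sending claims with a missing citation or provenance to review, rather than rejecting them outright, is also an assumption of this sketch.

```typescript
// Hypothetical sketch: field names and floor values are illustrative assumptions.
interface Claim {
  text: string;
  citation?: string;    // reference to the source backing the claim
  confidence: number;   // score in [0, 1]
  provenance: string[]; // chain of document ids back to the source document
}

type Verdict = "publish" | "human-review";

function gate(claim: Claim, confidenceFloor: number): Verdict {
  const hasCitation = typeof claim.citation === "string" && claim.citation.length > 0;
  const hasProvenance = claim.provenance.length > 0;
  // A claim that cannot be traced is never published automatically.
  if (!hasCitation || !hasProvenance) return "human-review";
  // Below the floor the system effectively says "I don't know" and asks a human.
  return claim.confidence >= confidenceFloor ? "publish" : "human-review";
}

// Assumed per-customer floors: a stricter setting for a bank than for an NGO.
const FLOORS: Record<string, number> = { "example-bank": 0.9, "example-ngo": 0.75 };

const sampleClaim: Claim = {
  text: "Regulator X opened a consultation on topic Y.",
  citation: "doc-123",
  confidence: 0.85,
  provenance: ["doc-123"],
};

console.log(gate(sampleClaim, FLOORS["example-bank"]), gate(sampleClaim, FLOORS["example-ngo"]));
```

The same claim can be published for one customer and queued for review for another, which is the point of making the floor tunable.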
03

Buy the backbone. Build the edges.

We argued against building proprietary news ingestion. A commercial news API gives us 6,000+ outlets, multilingual coverage, and a contractual SLA — for roughly 1/8th the cost of running an in-house crawl fleet that would still under-cover the long tail. What we did build: custom scrapers for the niche sources that mattered to DELOS' specific buyers — Swiss regulators, EU parliamentary committees, ESG-focused NGOs. Sixty-two of them, hand-tuned, polite, monitored. Commercial backbone, custom edge — the right shape for almost every 'AI on top of the open web' product.
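For the custom scrapers, "polite" typically includes backing off after failures instead of hammering a source. The sketch below shows one common shape: capped exponential backoff, with a hand-off to a dead-letter queue once attempts run out. The function names, the one-second base, the one-minute cap and the five-attempt limit are all assumptions for illustration, not the values DELOS runs with.

```typescript
// Hypothetical sketch: names and numeric defaults are illustrative assumptions.
type RetryDecision = { kind: "retry"; delayMs: number } | { kind: "dead-letter" };

// Delay doubles with each retry (attempt 0, 1, 2, ...) and is capped at capMs.
function backoffDelayMs(attempt: number, baseMs = 1000, capMs = 60000): number {
  return Math.min(capMs, baseMs * Math.pow(2, attempt));
}

// Decide what to do after the n-th failed attempt (failedAttempts counts from 1).
function afterFailure(failedAttempts: number, maxAttempts = 5): RetryDecision {
  // Out of attempts: park the job for inspection instead of retrying forever.
  if (failedAttempts >= maxAttempts) return { kind: "dead-letter" };
  return { kind: "retry", delayMs: backoffDelayMs(failedAttempts - 1) };
}
```

Keeping the decision a pure function makes the retry policy easy to test separately from whatever job runner executes it.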

06 · Honest

What didn't work. And what we'd do differently.

Pivot · Week 5

The single-agent prototype that wasn't.

Our first architecture used a single orchestrator agent that planned and executed all four tasks in sequence. It was elegant in slides. In production, it was slow, expensive, and produced wildly variable output quality depending on the order in which it decided to work. We rebuilt it as four specialised agents behind a deterministic state machine in week five. Two engineer-weeks lost. Worth it.

Discarded · Week 7

Fine-tuning we didn't need.

We spent eight days exploring a fine-tuned classification model for news-to-entity matching. The results were marginal — and the maintenance cost (re-training as entity sets evolve) was a tax DELOS shouldn't pay this early. We threw out the fine-tune and got a better outcome from careful prompting plus a retrieval step. We should have set a tighter time-box at the start.

Simplified · Week 9

The graph visualisation users hated.

The original stakeholder map was a force-directed graph. It was beautiful in the Figma. It was an unreadable hairball at 200+ entities, which is fewer than our average customer's portfolio holds. We replaced it with a quadrant view (influence × alignment) plus a structured list with filters. Beta NPS moved fifteen points in two weeks.

In progress

Tech debt we are paying down now.

Shipping a 16-week MVP means shipping with technical debt. Specifically: our test coverage on the agent orchestrator was 41% at launch, and three of the custom scrapers had no monitoring. We've since brought coverage to 78% and instrumented every scraper, but we want to be honest that 'MVP velocity' and 'production hardness' are different settings, and Q1 2026 was about converting one into the other.

07 · Outcomes

The numbers that mattered to DELOS.

We focus on outcomes the business uses internally — not vanity metrics. Each figure below is paired with the baseline it's measured against, and is reproducible from DELOS' internal telemetry.

60h manual → 4 min automated

Stakeholder dossier production time

End-to-end production of a 30-entity briefing pack. Previously a senior analyst week. Now a verification task.

67% baseline → 91% post-validation

News-to-entity matching precision

Measured against a 4,200-row gold set hand-labelled by DELOS' analyst team. Recall held at 88%.
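For readers less familiar with the two metrics, the sketch below shows how precision and recall can be computed for news-to-entity matches against a hand-labelled gold set. The `Match` shape and function names are assumptions for illustration; the actual evaluation harness is not shown in this case study.

```typescript
// Hypothetical sketch: the Match shape and helper names are illustrative assumptions.
interface Match {
  articleId: string;
  entityId: string;
}

const matchKey = (m: Match): string => m.articleId + "::" + m.entityId;

function precisionRecall(predicted: Match[], gold: Match[]): { precision: number; recall: number } {
  const goldKeys = new Set(gold.map(matchKey));
  const predictedKeys = new Set(predicted.map(matchKey));
  let truePositives = 0;
  predictedKeys.forEach((k) => {
    if (goldKeys.has(k)) truePositives += 1;
  });
  return {
    // Precision: of the links the system made, how many were correct.
    precision: predictedKeys.size === 0 ? 0 : truePositives / predictedKeys.size,
    // Recall: of the links the analysts labelled, how many the system found.
    recall: goldKeys.size === 0 ? 0 : truePositives / goldKeys.size,
  };
}
```

A validation pass that withholds low-confidence links raises precision by removing wrong predictions, and recall holds only if few correct links are withheld along with them.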

CHF 0.34 → CHF 0.09

Cost per stakeholder profile (LLM)

Net of the two-tier routing and on-demand summarisation decisions. 71% reduction over the first 90 days post-launch.

pre-pilot → 8 paying pilots

Commercial traction · first 6 months

Closed pilots across Swiss banking, EU energy, and Tier-1 corporate communications. Two converted to annual contracts within 90 days.

no telemetry → Sentry + PostHog

Monitoring & observability

Every exception surfaces in Sentry and every product event flows to PostHog — so regressions and usage shifts are caught and triaged in hours, not when a customer reports them.

no codebase → 16 weeks to live

Whiteboard to paying customer

From signed engagement to first revenue-generating customer. No standing engineering team prior to engagement.

Streaver doesn't feel like a vendor. They feel like the engineering team I would have built if I'd had eighteen months and a Swiss salary budget. The MVP is in live production with paying customers, and the team has kept everything on track without a single missed sprint.
Silvan Krähenbühl · Founder & CEO · DELOS · Zürich
08 · The Team

Five Streavers. Two from DELOS. One product.

Every person named below was on the team at week one and is still on the team today. No staffing churn, no shadow offshoring. This is the team that shipped, and the team that will still be shipping in 2027.

Fede
CTO

Provides technical leadership and strategic direction, helping shape the platform architecture, guide key engineering decisions, and ensure the team balances rapid delivery with long-term scalability.

Feld
Team Leader

Leads day-to-day execution, aligning priorities across disciplines and helping the team maintain a consistent delivery cadence while navigating evolving product requirements.

NicoW
Senior Full-Stack

Drives the development of core platform capabilities, contributing across frontend, backend, and infrastructure while solving complex technical challenges that support the product’s growth.

Joaquín
Junior Full-Stack

Contributes to feature development and product improvements across the stack, helping accelerate delivery through reliable execution and continuous learning.

Cate
Product Designer

Designs intuitive user experiences that simplify complex workflows, translating product requirements into interfaces that are clear, accessible, and effective for end users.

How the engagement is structured

Cadence

Two-week sprints. Daily standup at 10:00 CET. Sprint review with Silvan and DELOS product on Fridays. Monthly retro and quarterly business review.

Communication

Shared Slack workspace, Linear for engineering, Notion for product documentation. DELOS has full read access to every channel and repository.

Pricing

Fixed-fee per milestone for the first sixteen weeks. Monthly retainer with capacity commitment thereafter.

IP and security

All IP transfers to DELOS on payment. Audit log and access reviews monthly. Two of our engineers hold the credentials they need; no shared accounts.

Timeline

WEEK 00 · Engagement signed: Three LOIs in hand · zero code · architecture sketched on a Friday whiteboard.
WEEK 02 · Architecture frozen: Four-agent decision made · Neon + Vercel + Trigger.dev confirmed · evaluation harness scaffolded.
WEEK 05 · First end-to-end flow: Discovery → classification → DB write working on a single test customer. Output quality: not yet shippable.
WEEK 07 · Validation agent online: Precision jumps from 67% to 84% in a week. We stop optimising the upstream agents and start trusting the gate.
WEEK 09 · UI pivot: Graph viz replaced with quadrant + list. Onboarding flow rebuilt around 'first sensible action.'
WEEK 11 · Pilot customer onboarded: First paying customer in staging. Three feedback loops in seven days.
WEEK 16 · Production launch: Live. Paying. Audit log, SSO, role isolation all in place. Sprint cadence continues uninterrupted into Q2.
09 · Stack

Boring choices everywhere except where we couldn't afford them.

We don't reach for novelty. Every choice below earned its place — and could be operated by a small team without a dedicated platform engineer.


Languages & Runtime

  • TypeScript: strict mode, end-to-end
  • Node.js: 20 LTS, on Vercel + Trigger.dev
  • SQL: Postgres dialect

Application

  • Next.js: App Router, RSC
  • Radix · Tailwind: component primitives, design tokens
  • NextAuth + WorkOS: SSO, audit log
  • tRPC: end-to-end typed APIs

Data & Infra

  • Neon Postgres: + pgvector for embeddings
  • Trigger.dev: long-running agent jobs, retries
  • Vercel: app + edge + cron
  • S3 + CloudFront: document store + CDN

AI & Observability

  • GPT-5.5 · 5.5-mini: routed per-call
  • text-embedding-3-large: retrieval & similarity
  • Braintrust: eval harness
  • Sentry + PostHog: errors + analytics
10 · What's Next

The roadmap we're building toward.

The MVP is the beginning, not the deliverable. Three threads define the next six months of work with DELOS.

A conversational layer over the entire workspace.

A grounded chatbot that lets a user ask 'what changed about the proposed EU CSRD revisions this week, and which of our stakeholders said something?' and get an answer with cited sources in under fifteen seconds.

Multilingual depth.

Today the system reads English, German, French. Adding Italian, Spanish and Portuguese coverage in Q3 unlocks the southern European procurement pipeline.

Geographic and vertical expansion.

The same architecture works for US public affairs, healthcare advocacy, and pharma regulatory tracking. We are actively scoping the second vertical now.

Working on something similar?

Strengthen your tech with engineers who've shipped production AI.

Streaver embeds senior product teams inside companies building agentic and AI-native products. From whiteboard to live customers, with the same engineers throughout.
