WAB — Web Agent Bridge

Your AI Stack, Intelligently Managed.

WAB sits between your application and every AI provider. It classifies each request, routes it to the cheapest sufficient model, enforces your budget, and reports every dollar saved — all without changing a single line of your code.

Architecture

Intelligent Routing in 4 Layers

Every request passes through 4 WAB layers in under 5ms.

LAYER 1

Auth & Identity

Verify API key, load org plan, check spending caps — before any AI call is made.

LAYER 2

Classification Engine

ML + NLP analysis: task type, complexity (1-10), sensitivity flags, cache lookup. Decision in <3ms.

LAYER 3

Routing Decision

Apply your routing policy (COST_FIRST / QUALITY_FIRST / LOCAL_FIRST) to select the optimal provider.

LAYER 4

FinOps Metering

Count tokens, record actual vs. baseline cost, update spending caps, write audit log.

Model Routing

The 3-Tier Routing Cascade

WAB always starts with the cheapest tier and only escalates when the task demands it.

01~$0.00 / 1K tokens

Local / On-Device

Simple tasks: Q&A, classification, drafts

Llama 3.2 3B, Mistral 7B via Ollama

02~$0.001 / 1K tokens

Open-Source Cloud

Medium tasks: summarization, translation, code assist

Llama 3.3 70B (Groq), Gemini Flash, Mistral

03~$0.01–0.06 / 1K tokens

Frontier (Premium)

Complex tasks: reasoning, legal, medical, code review

GPT-4o, Claude 3.5 Sonnet, Gemini Pro

Result: A team that previously paid $10,000/mo directly to GPT-4 typically pays $3,800–5,800/mo after WAB routing — a 42–62% reduction with no quality compromise for routine tasks.

Platform Capabilities

Every Tool You Need to Control AI Spend

WAB Routing Engine

Intelligent request classification in milliseconds

  • Analyzes task complexity, type, and sensitivity before every request
  • Routes to local → open-source → frontier based on your cost/quality policy
  • ML-based classifier trained on 50+ task types
  • Sub-5ms routing decision overhead — invisible to end users
  • Configurable routing strategies: COST_FIRST, QUALITY_FIRST, LOCAL_FIRST, BALANCED

AI FinOps Dashboard

Real-time visibility into every dollar spent on AI

  • Cost breakdown by team, project, task type, and provider
  • Side-by-side: "what you paid" vs. "what you would have paid without WAB"
  • Daily, weekly, and monthly trend charts
  • Savings attribution: how much each routing decision saved
  • Export reports as CSV or via API for finance integrations

Policy & Compliance Engine

Data residency, privacy, and compliance-first routing

  • Define rules like: "never send data containing PII to external APIs"
  • Keyword-based and ML-based sensitive content detection
  • Auto-redact PII before sending to frontier models, re-inject on response
  • Regional routing: Saudi/UAE cloud options for data residency requirements
  • Full audit log of every routing decision with policy justification

Semantic Cache

Serve identical queries from cache — zero API cost

  • Exact-match and near-match (cosine similarity) cache lookup
  • Configurable TTL from 1 hour to 30 days
  • Cache hit rate visible in dashboard — typical teams see 15–25% of requests cached
  • Invalidation webhooks for real-time data use cases
  • Shared cache across team for maximum savings

BYOK — Bring Your Own Keys

Keep your direct provider relationships and billing

  • Configure API keys from any provider: OpenAI, Anthropic, Mistral, Google, Groq, etc.
  • Keys encrypted at rest with AES-256, never logged or transmitted in plain text
  • WAB routes through your keys — you retain full API usage visibility
  • Savings share model: pay us only a % of savings generated (15–30%)
  • Mix managed keys (ours) and BYOK keys per project

Hard + Smart Spending Limits

Budget enforcement at the API layer — not just alerts

  • Hard limit: WAB returns HTTP 429 when monthly cap is reached — zero overage possible
  • Smart degradation: at 80% cap, auto-routes all traffic to cheapest local models
  • Per-team and per-project budgets with independent caps
  • Webhook + email alerts at 50%, 80%, and 100% of any budget
  • Finance-grade guarantee: no request is proxied over your defined budget

Arabic-First Support

Native Arabic NLP routing and regional compliance

  • Routes Arabic prompts to models with best Arabic performance (AraGPT2, Jais, etc.)
  • Full RTL dashboard and Arabic UI support
  • Saudi Arabia and UAE cloud routing for data sovereignty
  • PDPL (Saudi Personal Data Protection Law) compliant configuration templates
  • Arabic language support team

Developer Experience

One endpoint. Zero code changes. Instant savings.

  • Drop-in replacement: change your base URL to WAB endpoint — no other changes
  • OpenAI-compatible API: works with any SDK that supports custom base URLs
  • Auto-generated WAB snippet per project (copy-paste in 30 seconds)
  • VS Code extension for inline code assistance routed through WAB
  • SDKs for Python, TypeScript/JavaScript, and REST

Deployment

Any Stack. Any Cloud. Any Scale.

Cloud SaaS

Use our hosted WAB endpoint. Zero infrastructure. Ready in 2 minutes. All plans.

Hybrid / On-Prem

WAB engine on your VPC, our dashboard in cloud. Sensitive data never leaves your network. Business+.

Fully Air-Gapped

Entire stack on your infrastructure. Custom SLA, dedicated deployment engineer. Enterprise only.

Ready to stop overpaying for AI?

Connect WAB to your first project in 2 minutes. Starter plan is free, forever.