Internal vs External AI Routing Without Data Leakage
By Skaira Labs
The Mixed-Workload Problem
Most enterprises running AI systems have at least two fundamentally different workload types flowing through their infrastructure: internal workloads that process proprietary data, and external-facing workloads that generate responses visible to customers or partners.
When these workloads share the same path — same API keys, same model endpoints, same logging pipeline — the separation between internal context and external output depends entirely on application-level discipline. That discipline is insufficient. A single misconfigured prompt, an unexpected model behavior, or an overly broad tool permission can route internal data into an external response.
The fix isn't better prompts. It's architectural route segregation — enforced at the infrastructure layer, before any model interaction occurs.
Why Application-Level Separation Fails
Application developers building AI features operate under a reasonable-sounding assumption: if I include only the right context in my prompt, the model will use only that context in its response. In practice, this assumption breaks in multiple ways.
Context bleed through shared state. When internal and external workloads use the same model deployment, conversation history, cached embeddings, or retrieval indexes can leak across contexts. A retrieval-augmented generation (RAG) system that indexes internal documents and customer-facing content in the same vector store has no architectural guarantee that an external query won't retrieve internal fragments.
Tool and function call leakage. Modern AI systems grant models access to tools — database queries, API calls, file reads, MCP servers. If internal tools are accessible from an external-facing workload path, the model can invoke them. The model doesn't understand organizational boundaries. It follows its instructions and the tools available to it.
Output filtering gaps. Post-processing filters that scan model outputs for sensitive content are a useful defense layer, but they're not a boundary. Filters catch known patterns — social security number formats, credit card numbers, specific keywords. They miss novel combinations, paraphrased internal knowledge, and proprietary reasoning patterns that don't match any filter rule.
Prompt injection exploits the shared path. When an external user can influence the prompt (directly or through injected content in retrieved documents), and that prompt executes in a context with access to internal tools or data, prompt injection becomes a data exfiltration vector. OWASP ranks this as the #1 risk in its 2025 Top 10 for LLM Applications for exactly this reason.
Request Class Taxonomy
Route segregation starts with classifying every request before it reaches a model. In the first article of this series, we introduced deny-by-default routing. Here, we go deeper into how classification drives isolation.
A practical taxonomy for mixed internal/external workloads:
Internal sensitive. Requests processing regulated data (PII, financial records, health information) or proprietary IP (trade secrets, unreleased product specifications, internal strategy documents). These requests must route to models deployed in controlled environments — self-hosted, private cloud, or contractually isolated endpoints. Tool access is restricted to read-only internal systems with explicit allowlists.
Internal general. Standard business operations — document summarization, code assistance, analytics queries. Data isn't regulated, but it's not intended for external consumption. Model access is broader, but tools are still restricted to internal systems.
External-facing. Any workload where the model's output is visible to customers, partners, or the public. These requests operate in a restricted context: no access to internal tools, no retrieval from internal knowledge bases, and output passes through content safety and brand compliance checks before delivery.
System/operations. Infrastructure-level AI usage — log analysis, monitoring alert triage, deployment automation. These require access to operational data but should never route through external-facing paths. A separate class prevents operational context from accidentally surfacing in customer interactions.
Each class maps to an isolation boundary: a distinct set of allowed models, permitted tools, accessible data sources, and logging policies.
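The class-to-boundary mapping can be expressed as a declarative policy table. The sketch below is illustrative, not a reference implementation; the model, tool, and data-source names are placeholders, and a real deployment would load this table from versioned policy configuration rather than hardcoding it.

```python
from dataclasses import dataclass
from enum import Enum


class RequestClass(Enum):
    INTERNAL_SENSITIVE = "internal_sensitive"
    INTERNAL_GENERAL = "internal_general"
    EXTERNAL_FACING = "external_facing"
    SYSTEM_OPERATIONS = "system_operations"


@dataclass(frozen=True)
class IsolationBoundary:
    allowed_models: frozenset      # models this class may route to
    allowed_tools: frozenset       # tools this class may invoke
    allowed_data_sources: frozenset
    log_policy: str                # e.g. full audit vs redacted logging


# Illustrative policy table; names are placeholder assumptions.
POLICY = {
    RequestClass.INTERNAL_SENSITIVE: IsolationBoundary(
        allowed_models=frozenset({"self-hosted-llm"}),
        allowed_tools=frozenset({"internal-db-readonly"}),
        allowed_data_sources=frozenset({"internal-rag-index"}),
        log_policy="full-audit",
    ),
    RequestClass.EXTERNAL_FACING: IsolationBoundary(
        allowed_models=frozenset({"approved-customer-model"}),
        allowed_tools=frozenset({"public-data-api"}),
        allowed_data_sources=frozenset({"curated-customer-kb"}),
        log_policy="redacted",
    ),
}
```

Because the boundaries are frozen data rather than scattered conditionals, a policy review is a diff of this table, and any class absent from it simply has no route.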
Fail-Closed Behavior
The most critical design decision in route segregation is what happens when classification is uncertain. The answer must be: deny the request.
Fail-closed means that a request that can't be confidently assigned to a class does not proceed. It doesn't default to the least restrictive class. It doesn't route to a general-purpose path. It stops, logs the classification failure, and returns a safe error response.
This sounds operationally expensive. In practice, classification failures at scale indicate one of two things: a gap in the classification rules that needs to be addressed (a healthy signal), or an anomalous request pattern that warrants investigation (a security signal). Both are more valuable than silently routing an unclassified request through the wrong path.
Implementation pattern: The classifier sits at the gateway layer, before model routing. It inspects request metadata (source application, API key scope, originating network), content signals (data classification headers, known sensitive field patterns), and policy rules (time-of-day restrictions, rate anomalies). If the classification confidence is below threshold, the request is held — not forwarded.
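The fail-closed decision itself is small enough to sketch. This is a minimal illustration, assuming a classifier that emits a class label and a confidence score; the threshold value is an arbitrary placeholder you would tune per deployment.

```python
from dataclasses import dataclass
from typing import Optional

CONFIDENCE_THRESHOLD = 0.90  # assumption: tune per deployment

audit_log = []  # stand-in for a real audit sink


@dataclass
class Classification:
    request_class: Optional[str]  # e.g. "internal_sensitive", or None
    confidence: float


def route(c: Classification) -> str:
    """Fail-closed routing: deny unless confidently classified.

    An uncertain request is held and logged -- it never falls
    through to a less restrictive or general-purpose path.
    """
    if c.request_class is None or c.confidence < CONFIDENCE_THRESHOLD:
        audit_log.append(("classification_failure", c))
        return "deny"
    return c.request_class
```

The important property is the shape of the conditional: every branch that isn't a confident match terminates in `deny`, so adding a new request class later can't accidentally widen the default path.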
Model and Tool Allowlists
Each request class defines explicit allowlists — not blocklists — for models and tools. The distinction matters: an allowlist means only listed resources are accessible. Everything else is denied by default, including resources added to the system after the policy was written.
Model allowlists per class:
For internal sensitive workloads, only models deployed in environments that meet your data handling requirements — self-hosted instances, private cloud deployments with contractual data isolation, or on-premises inference. Cloud API endpoints that process data in shared multi-tenant environments should be evaluated carefully — the provider's data isolation guarantees, contractual commitments, and your organization's regulatory requirements determine whether they qualify. When in doubt, default to stricter isolation.
For external-facing workloads, only models that have been tested and approved for customer-visible output. This typically means the subset of your model portfolio that has passed content safety evaluation, brand compliance review, and factual grounding validation.
Tool allowlists per class:
Internal tools — database connections, file system access, MCP servers for internal systems — are available only to internal request classes. External-facing request classes operate with a separate, restricted tool set: public data APIs, curated knowledge bases built specifically for customer interaction, and no access to internal infrastructure.
The tool allowlist is where most leakage incidents originate. A model with access to an internal knowledge base and an external-facing response path is a data leak waiting to happen. Segregating tool access by request class eliminates this vector at the architectural level.
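The deny-by-default property of an allowlist can be captured in a single lookup. A minimal sketch, with placeholder class and tool names: the key detail is that an unknown class resolves to an empty set, so any resource or class added after the policy was written stays denied until explicitly listed.

```python
def is_allowed(request_class: str, resource: str, allowlists: dict) -> bool:
    """Allowlist check: only listed resources pass.

    Unknown classes get an empty allowlist, so new tools and new
    request classes are denied by default until policy names them.
    """
    return resource in allowlists.get(request_class, frozenset())


# Placeholder tool allowlists per request class.
TOOL_ALLOWLISTS = {
    "internal_sensitive": frozenset({"internal-db-readonly"}),
    "external_facing": frozenset({"public-data-api", "curated-kb-search"}),
}
```

Contrast this with a blocklist, where the same lookup would have to enumerate everything forbidden and would silently permit anything it hadn't anticipated.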
Pre- and Post-Guardrails
Route segregation creates the boundaries. Guardrails enforce safety within those boundaries. The layered defense model applies checks both before and after model interaction:
Pre-model guardrails (input layer):
- Prompt injection detection. Scan incoming prompts for injection patterns — instruction overrides, role manipulation, encoding-based evasion. Applied at the gateway, before the request reaches any model. Both rule-based and classifier-based approaches have trade-offs; production systems typically combine both.
- PII and sensitive data scanning. Detect and redact or block requests containing personally identifiable information, credentials, or other sensitive content before it enters the model. This is especially critical for external-facing paths where user input is untrusted.
- Content classification. Verify that the request content matches the assigned request class. A request classified as "internal general" that contains financial account numbers should be reclassified or blocked.
Post-model guardrails (output layer):
- Output filtering. Scan model responses for sensitive content that shouldn't appear in the response context — internal terminology, system architecture details, employee names, or data that matches sensitive-field patterns.
- Factual grounding checks. For external-facing responses, validate that claims are grounded in provided context rather than hallucinated. Ungrounded claims in customer-visible output create liability and trust risk.
- Content safety. Brand compliance, toxicity detection, and regulatory compliance checks appropriate to the response destination.
The key architectural principle: guardrails are enforced at the gateway layer, not delegated to individual applications. This ensures consistent application regardless of which team built which feature.
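A gateway-level guardrail chain can be modeled as an ordered list of checks where the first failure blocks the request. The sketch below is a toy illustration: the two example checks use deliberately simple patterns, whereas production systems combine rule-based and classifier-based detection as noted above.

```python
import re
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class GuardrailResult:
    ok: bool
    reason: str = ""


def enforce(checks: Sequence[Callable[[str], GuardrailResult]],
            text: str) -> GuardrailResult:
    """Run guardrail checks in order; first failure blocks."""
    for check in checks:
        result = check(text)
        if not result.ok:
            return result
    return GuardrailResult(ok=True)


# Toy checks -- real detectors are far more sophisticated.
def injection_check(text: str) -> GuardrailResult:
    if "ignore previous instructions" in text.lower():
        return GuardrailResult(False, "possible prompt injection")
    return GuardrailResult(True)


def pii_check(text: str) -> GuardrailResult:
    if re.search(r"\b\d{3}-\d{2}-\d{4}\b", text):  # SSN-like pattern
        return GuardrailResult(False, "sensitive data pattern")
    return GuardrailResult(True)
```

The same `enforce` function runs twice per request at the gateway: once over the incoming prompt with the pre-model chain, once over the model's response with the post-model chain for that request class.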
Leakage Testing
Architectural segregation and guardrails create the defense. Leakage testing proves the defense works. Without systematic testing, you're trusting the architecture without verification.
Positive path tests. For each request class, confirm that authorized requests route correctly — to the right model, with the right tool access, through the right guardrail chain. These are your functional correctness tests.
Negative path tests. For each request class, confirm that unauthorized routes are blocked. An external-facing request should not reach an internal-only model. An internal general request should not access sensitive-tier tools. These tests verify the deny-by-default policy.
Canary data tests. Seed synthetic sensitive data — recognizable but fake — into internal systems. Then execute external-facing queries designed to surface that data. If the canary data appears in any external response, the segregation boundary has a leak. This is the most important test category because it validates the end-to-end isolation, not just individual policy rules.
Adversarial tests. Simulate prompt injection attacks that attempt to escalate a request's classification or access tools outside its allowlist. These tests should target the boundaries specifically: can an external user craft input that causes the classifier to route their request through an internal path?
Run these tests on every policy change, every model addition, and on a regular schedule. Treat them like security regression tests, not one-time validation.
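The canary-data check reduces to scanning external responses for tokens that should be impossible to see there. A minimal sketch, with invented canary values; in practice these tokens would be seeded into internal documents and indexes, and this scan would run inside your security regression suite against live external-facing endpoints.

```python
# Synthetic, recognizable, fake values seeded into internal systems.
CANARY_TOKENS = frozenset({
    "ACME-CANARY-ACCT-9913",            # fake account number
    "canary.user@example.internal",     # fake internal email
})


def leaked_canaries(response_text: str) -> set:
    """Return any canary tokens that surfaced in an external response.

    A non-empty result means the segregation boundary has a leak.
    """
    return {c for c in CANARY_TOKENS if c in response_text}
```

A leakage test then asserts `leaked_canaries(response) == set()` for every external-facing query in the adversarial suite, and any hit fails the build.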
What This Looks Like in Practice
Consider a mid-market enterprise running AI across three workloads: an internal document analysis system, a customer support chatbot, and an operational monitoring assistant.
Without route segregation, all three share a model endpoint. The document analysis system has access to a RAG index containing internal contracts and financial projections. The customer chatbot processes user questions. The monitoring assistant reads system logs.
The risk: a prompt injection in a customer message causes the chatbot's model to retrieve documents from the internal RAG index. Or the monitoring assistant's context — containing server names, IP addresses, and error patterns — bleeds into the chatbot's responses through shared model state.
With route segregation: each workload class has its own model routing path, its own tool allowlist, and its own guardrail configuration. The customer chatbot's path physically cannot access the internal RAG index or the monitoring tools. The monitoring assistant's path cannot route output to the external-facing response channel. The boundaries are enforced at the infrastructure layer, not by hoping each application team remembered to configure their access correctly.
Getting Started
For organizations operating mixed internal/external AI workloads, the migration path from shared to segregated architecture looks like this:
1. Inventory your workloads. Map every AI-powered feature to a request class. Identify which workloads touch external users, which process sensitive internal data, and which tools each workload accesses.
2. Identify shared resources. Find every point where internal and external workloads share infrastructure — model endpoints, RAG indexes, tool access, logging pipelines. Each shared resource is a potential leakage vector.
3. Deploy a gateway with classification. Route all AI traffic through a central gateway. Implement request classification and enforce class-based routing policies. Start with hard segregation between internal and external — the highest-risk boundary.
4. Build tool allowlists per class. Restrict each request class to its minimum required tool set. Internal tools should be inaccessible from external paths. This is where you'll find the most surprises — tools that "shouldn't" be accessible from external paths but are.
5. Implement and run leakage tests. Deploy canary data, run adversarial tests, verify that segregation boundaries hold under realistic conditions.
6. Establish ongoing monitoring. Track classification anomalies, tool access patterns that don't match expected class behavior, and guardrail trigger rates. These are your early warning signals for boundary degradation.
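The monitoring step can start as simply as aggregating gateway decisions into a health summary. This is a sketch under assumptions: the event shape and the 5% alert threshold are placeholders, and a real deployment would feed these signals into its existing alerting stack rather than a standalone function.

```python
DENY_RATE_ALERT = 0.05  # assumption: alert if >5% of requests are denied


def boundary_health(events: list) -> dict:
    """Summarize gateway events into early-warning signals.

    A rising deny rate means either a classification-rule gap
    (fix the rules) or an anomalous request pattern (investigate).
    """
    total = len(events)
    denies = sum(1 for e in events if e.get("outcome") == "deny")
    guardrail_hits = sum(1 for e in events if e.get("guardrail_triggered"))
    deny_rate = denies / total if total else 0.0
    return {
        "deny_rate": deny_rate,
        "guardrail_hits": guardrail_hits,
        "alert": deny_rate > DENY_RATE_ALERT,
    }
```

Because fail-closed routing turns every classification gap into a logged denial, this one metric doubles as both an operational health check and a security signal.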
This is Part 2 of a three-part series on enterprise AI control-plane architecture. Part 1 covers why the control plane matters more than model upgrades. Part 3 covers release rings for shipping governance changes safely.
Skaira Labs designs and implements AI route segregation architectures for enterprises managing mixed internal and external workloads. Explore our data infrastructure practice or learn about our AI automation services.