// AI security gateway // on-prem only // openai · anthropic · google · azure · vllm · ollama // air-gap capable // PII redaction // prompt-injection blocking // per-request audit log // tokenized streaming
[ 00 ]  intro · v 1.0 · 2026

the security gateway for the ai-native enterprise.

// what it is
In-network gateway between every AI app, agent, copilot and the LLM provider.
// what it does
Redacts PII · blocks injection · enforces policy · ships an audit row.
// where it runs
Your VPC. Your hardware. Air-gap capable. No outbound telemetry.
[ 01 ]  the problem
// every AI integration is a new exfiltration vector

marketing wired a chatbot. sales bought a notetaker. your copilot is logging prompts to somewhere else.

[ option a ]
block llms entirely

Lose the productivity gains. Watch shadow IT route around you anyway.

[ option b ]
trust each app to redact

Marketing's chatbot, sales' notetaker, and engineering's copilot each reinvent the same broken regex.

[ option c ]
use a saas guardrail

Add another vendor to your data flow, your audit scope, and your incident-response playbook.

[ option d ]
tell the auditor "we don't really know"

Watch them ask again next quarter, and the quarter after that.

[ secureprompt — the fourth option ]
one in-network gateway. owned by you. observable to your team.

Every byte of AI-bound traffic in your network funnels through a single inspection point. Every request answered, attributed, and auditable.

[ 02 ]  // the platform
three products. one inspection point.
not multiple vendors, multiple audit trails, multiple surfaces to defend.
[ 02·1 ]

gateway. drop-in openai-compatible api.

//
swap a base url
Set OPENAI_API_BASE_URL to your gateway. One sp_… key works across every provider you've registered.
//
six providers, one shape
OpenAI, Anthropic, Google, Azure, vLLM, Ollama — same request, same audit row, same policy.
//
streaming-safe
Placeholder fragments straddling chunk boundaries are held in a request-scoped vault until they can be safely emitted or restored.
+
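The streaming-safe behavior above can be sketched as a small buffer loop. This is an illustrative sketch, not the shipped implementation: it assumes placeholders look like `{{Person_1}}` (the shape shown in the audit row) and that `vault` maps placeholder tokens back to the original values.

```python
# Sketch: restore {{Placeholder}} tokens in a streamed response, holding any
# fragment that straddles a chunk boundary until it is complete.
def restore_stream(chunks, vault):
    buf = ""
    for chunk in chunks:
        buf += chunk
        out = ""
        while True:
            start = buf.find("{{")
            if start == -1:
                # No fragment pending: emit everything except a possible lone "{".
                safe = len(buf) - 1 if buf.endswith("{") else len(buf)
                out += buf[:safe]
                buf = buf[safe:]
                break
            end = buf.find("}}", start)
            if end == -1:
                # Placeholder may continue in the next chunk: emit the prefix,
                # keep the partial fragment buffered.
                out += buf[:start]
                buf = buf[start:]
                break
            token = buf[start:end + 2]
            out += buf[:start] + vault.get(token, token)
            buf = buf[end + 2:]
        if out:
            yield out
    if buf:
        yield buf  # trailing text (or an unterminated fragment) flushes at end
```

The point of the buffer: a chunk ending in `{{Per` is never emitted raw, so the client sees either the restored value or nothing yet.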
[ 02·2 ]

chat. a chatgpt tied to your idp.

//
SSO from day one
Per-user attribution. Every chat tied to a user and device in the audit log.
//
same pipeline as the gateway
PII tokenized upstream. Restored on return. The model never sees Alice's data.
//
desktop & web
Productivity surface that doesn't push your team back to shadow IT.
+
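The tokenize-upstream, restore-on-return round trip can be sketched in a few lines. This is a toy illustration, not the product's detection pipeline: detection here is a single email regex, and the `{{Email_N}}` token shape mirrors the placeholders shown in the audit row.

```python
import re

# Toy sketch of the round trip: PII swapped for placeholders before the
# request leaves, restored from a per-request vault on the way back.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+")

def tokenize(prompt):
    vault, count = {}, 0
    def swap(match):
        nonlocal count
        count += 1
        token = f"{{{{Email_{count}}}}}"
        vault[token] = match.group(0)   # original value stays in the vault
        return token
    return EMAIL.sub(swap, prompt), vault

def restore(text, vault):
    for token, original in vault.items():
        text = text.replace(token, original)
    return text
```

The model only ever sees the placeholder; the vault lives and dies with the request.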
[ 02·3 ]

console. the operator surface.

//
answer the auditor's question
What got sent, by whom, to which model, and whether anything sensitive was in it.
//
policy as code
Workspace-scoped rules. Block, redact, flag, allow — configurable per route.
//
analytics that map to spend
Reconciled token counts. Per-user, per-workspace, per-model.
+
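A workspace-scoped, per-route rule set could look like the sketch below. The rule shape and field names are illustrative assumptions, not the console's actual schema; it only shows the block / redact / flag / allow decision described above.

```python
# Hypothetical policy table: first matching rule wins, default is allow.
POLICY = {
    "workspace-strict": [
        {"route": "/v1/chat/completions", "on": "injection", "action": "block"},
        {"route": "/v1/chat/completions", "on": "pii", "action": "redact"},
        {"route": "*", "on": "pii", "action": "flag"},
    ],
}

def decide(workspace, route, finding):
    for rule in POLICY.get(workspace, []):
        if rule["on"] == finding and rule["route"] in (route, "*"):
            return rule["action"]
    return "allow"
```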
the security gateway you'd build yourself
— if you had a quarter to spend on it.
// observed at three customers · 2025–2026
[ 03 ]  mechanics
// honest mechanics. visible numbers. no magic.

we show our work.

[ 03·1 ]

pii redaction that actually works.

//
structural matchers
Card numbers, phone, email, IBAN, SSN — exact, deterministic.
//
multilingual NER
Names, organizations, addresses across English, Spanish, German, French, Russian, Uzbek.
//
positional offsets
Replacement happens on exact byte ranges. Never lazy substring substitution that mangles the rest of your prompt.
+
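Offset-based replacement can be sketched as: collect exact spans from every matcher, number them left to right, then substitute right to left so earlier offsets stay valid. The two regexes are simplified stand-ins for the structural matchers, not the product's patterns.

```python
import re

# Simplified matchers; real ones cover phone, IBAN, SSN, etc.
MATCHERS = {
    "Card": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
    "Email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
}

def redact(text):
    spans = []
    for label, rx in MATCHERS.items():
        for m in rx.finditer(text):
            spans.append((m.start(), m.end(), label))
    spans.sort()
    counts, numbered = {}, []
    for start, end, label in spans:
        counts[label] = counts.get(label, 0) + 1
        numbered.append((start, end, f"{{{{{label}_{counts[label]}}}}}"))
    # Substitute right-to-left: each replacement leaves earlier spans untouched.
    for start, end, token in reversed(numbered):
        text = text[:start] + token + text[end:]
    return text
```

Because spans come from the match objects themselves, the surrounding prompt is never touched — the failure mode of naive substring substitution.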
[ 03·2 ]

injection blocked at ≥ 0.99.

//
realistic threshold
Every meta-prompt looks like injection at low confidence. We block at 0.99. Templating gets through. Real attempts don't.
//
full audit context
Every block records the score and the reason. Reviewers see exactly why.
+
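The decision itself is simple, and a sketch makes the audit contract explicit: every verdict carries the score and a human-readable reason. Field names mirror the audit row shown later; the function shape is an assumption.

```python
# Sketch of the block-at-0.99 decision with full audit context.
INJECTION_THRESHOLD = 0.99

def screen(request_id, score):
    action = "block" if score >= INJECTION_THRESHOLD else "allow"
    reason = (
        f"score {score:.2f} >= threshold {INJECTION_THRESHOLD}"
        if action == "block"
        else f"score {score:.2f} below threshold {INJECTION_THRESHOLD}"
    )
    return {
        "request_id": request_id,
        "injection_score": score,
        "final_action": action,
        "reason": reason,
    }
```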
[ 03·3 ]

three honest numbers for tokens.

//
estimate, charged pre-flight
So concurrent bursts can't slip past the budget.
//
actual, from the provider
The number your bill is based on.
//
reconciled, in the dashboard
What you charged minus what you used. Refunds applied automatically.
// estimate
247
charged pre-flight
// actual
211
from provider
// reconciled
−36
refund applied
+
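The three-number ledger can be sketched as charge-then-reconcile. This is an illustrative model, not the gateway's code: the estimate is charged before the request so concurrent bursts can't overrun the limit, and the provider's actual count settles the difference afterwards.

```python
# Sketch of the pre-flight charge / reconcile cycle for one budget.
class Budget:
    def __init__(self, limit):
        self.limit = limit
        self.charged = 0

    def preflight(self, estimate):
        # Charge before sending, so parallel requests can't all fit "just under".
        if self.charged + estimate > self.limit:
            raise RuntimeError("budget exceeded")
        self.charged += estimate
        return estimate

    def reconcile(self, estimate, actual):
        delta = actual - estimate   # negative delta is a refund
        self.charged += delta
        return delta
```

With the numbers from the panel above: charge 247 pre-flight, the provider reports 211, reconciliation refunds 36.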
[ 03·4 ]

two latency timers, not one.

//
upstream TTFT
Time-to-first-byte from the provider — the metric that matches user experience.
//
gateway overhead
Our pre-flight cost. When chat is slow, you know exactly which one is at fault.
+
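Splitting the two timers is just two timestamps around each phase. A minimal sketch, with the pre-flight work and the upstream call passed in as callables (both names are illustrative):

```python
import time

# Sketch: separate clocks for gateway pre-flight and upstream time-to-first-byte.
def timed_request(preflight, upstream_first_byte):
    t0 = time.perf_counter()
    preflight()                # redaction, injection screen, policy check
    t1 = time.perf_counter()
    upstream_first_byte()      # blocks until the provider's first byte arrives
    t2 = time.perf_counter()
    return {
        "overhead_ms": (t1 - t0) * 1000,
        "ttft_ms": (t2 - t1) * 1000,
    }
```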
[ 03·a ]  audit row
// every request, fully attributed

one row per request. written in order.

request_id       req_8c2f3a91…d44c
workspace        acme-prod
user             alice@acme.com
budget_pre       247 tokens
redactions       2 · {{Person_1}}, {{Email_1}}
injection_score  0.07 — allow
policy_match     workspace-strict
upstream         openai/gpt-4o
ttft_ms          318
overhead_ms      42
budget_actual    211 · reconciled (−36)
final_action     allow
[ 04 ]  on-prem only
// not "an on-prem option" — the only deployment

the cloud version does not exist.

[ 04·1 ]

air-gapped capable.

//
All required AI models pre-bundled or downloaded once at install. After that, runs without internet.
+
[ 04·2 ]

local ai inference.

//
PII detection and injection classification stay inside your cluster. Sensitive data never leaves to be inspected.
+
[ 04·3 ]

minimal attack surface.

//
Production binaries stripped of build tooling, runtimes, and shells. What ships is what runs.
+
[ 04·4 ]

license-gated, fail-closed.

//
Signed license file controls activation. Expired licenses fail closed at the boundary, before any business logic runs.
+
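Fail-closed license gating can be sketched as: verify the signature, then the expiry, and deny on any failure before business logic runs. The HMAC scheme and field names below are assumptions for illustration — the product states only that the license file is signed.

```python
import hmac
import hashlib
import json
import time

# Sketch: every path that isn't a verified, unexpired license returns False.
def check_license(license_blob, signature, key, now=None):
    expected = hmac.new(key, license_blob, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, signature):
        return False                               # tampered or wrong key
    claims = json.loads(license_blob)
    now = time.time() if now is None else now
    return now < claims.get("expires_at", 0)       # missing expiry fails closed
```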
[ 04·5 ]

no outbound telemetry. by default.

//
Optional support tunnel — explicit opt-in, time-bounded, fully auditable. Off until you turn it on.
+
[ 05 ]  built for
// if you have more than a handful of LLM integrations

you have this problem.

[ A ]

security & compliance

Teams who need to say yes to AI without inheriting third-party data risk.

[ B ]

platform engineering

Companies that have outgrown "every team picks their own LLM API key."

[ C ]

regulated industries

Healthcare, finance, legal, government — where data sovereignty isn't optional.

[ D ]

on-prem mandates

IP-sensitive engineering, model evaluation, sovereign deployments.

[ E ]

self-hosted models

Teams running vLLM or Ollama who want one policy and audit layer across self-hosted and commercial.

[ 06 ]  // the next step

you'd build it yourself. now you don't have to.