Now live - dual-region EU + US

Stop prompt injection
before it reaches your AI

One API call scans text, images, documents, and audio simultaneously. Bordair detects 18+ attack types - from jailbreaks and indirect injection to cross-modal smuggling and adversarial encoding - in under 50ms.

Designed by a cybersecurity professional in financial services - where missing a real threat isn't an option and false alarms cost hours.

⚠ BLOCKED
# pip install bordair
from bordair import Bordair
client = Bordair()
result = client.scan("Ignore all previous in...")
Result
{
threat: "high",
confidence: 0.9842
}
<50ms
avg latency
-
total scans
-
threats detected
435M
model parameters
<0.1%
false positive rate
Bordair's Castle
Free to play

Think you can outsmart AI security?

Bordair's Castle is a multimodal prompt injection game. Craft attacks that slip past 5 kingdoms of AI defences, climb the leaderboard, and win prizes. No account needed to start.

Play now 5 kingdoms · 35 levels · monthly prizes

Features

Everything you need to protect your LLM

Sub-50ms detection

Purpose-built detection pipeline optimised for speed. Fast enough to sit inline as synchronous middleware - no async queues, no polling.

Purpose-built detection engine

A proprietary detection system built specifically for prompt injection, covering 18+ attack categories from direct overrides to multi-turn escalation and cross-modal smuggling. Continuously updated with new threat intelligence.

Dual-region, always on

Deployed to EU (London) and US (Virginia) with Route 53 latency routing. Your traffic automatically hits the nearest region.

Long-prompt safe

Full-coverage scanning across prompts up to 10,000 characters. Injections buried within or appended after legitimate content are reliably detected.

Multimodal in one call

Send text, image, document, and audio together in a single /scan/multi request. Each modality is routed through its own pipeline - one response, one verdict.

Scan analytics

Every scan is logged. Track threat rates, confidence scores, and method breakdown across your API key - visible in your dashboard.

Image scanning

Detects injections embedded within images before they reach your multimodal LLM. OCR, metadata extraction, and automatic steganography neutralisation included. 10 credits per scan.

Steganography neutralisation

Every image is automatically sanitised to destroy invisible payloads - LSB steganography, adversarial perturbations, and palette-based encoding - before text extraction. Visible content and OCR quality are fully preserved.

Audio scanning

Detect ultrasonic injection attacks, adversarial audio perturbations, and spoken prompt injections via automatic transcription. 15 credits per scan.

Waveform analysis

Every audio scan is inspected at the raw-signal level for tampering signatures that frequency-based detection can miss - included on every audio and multimodal scan at no extra cost.

Document scanning

Scan PDF, DOCX, XLSX, and PPTX files for embedded prompt injections across all content surfaces. 15 credits per scan.

Enforcement you control

Scan LLM outputs before they reach your users. Block, redact, or warn. Fine-tune with allow-lists and per-project policies to minimise false positives without weakening protection.

How it works

Three lines of code between you and attacks

1

User sends input

Your application receives a message or prompt from a user.

2

Bordair scans it

POST the input to /scan with your API key. The detector returns threat level and confidence in milliseconds.

3

Route or reject

If threat is "high", return an error to the user. If "low", forward to your LLM as normal.

# pip install bordair
from bordair import Bordair
client = Bordair(api_key=API_KEY)
# Single-turn scan
result = client.scan(user_input)
# Multi-turn: pass conversation history
result = client.scan(
user_input,
conversation_history=history, # last 3 turns scanned
)
if result["threat"] == "high":
raise ValueError("Request blocked")
Output scanning

Protect inputs and outputs

Input scanning stops attacks before they reach your LLM. Output scanning lets you define custom regex rules to block, redact, or flag sensitive content in model responses -before they reach your users.

Paid plans only
output_scan.pyOUTPUT
# Define rules, then scan LLM output
from bordair import Bordair
client = Bordair()
# Add rules (one-time setup)
client.add_output_rule(
"sk-[a-zA-Z0-9]{20,}", "block",
"Block leaked API keys"
)
# Scan the output
result = client.scan_output(llm_response)
if result["blocked"]:
return "Sorry, that response was blocked."
# Response
{
"action": "block",
"blocked": true,
"output": "",
"matched_rules": [...],
"rules_checked": 3
}

Per-rule actions

Each regex rule gets its own action -block, redact, warn, or log. Block leaked API keys, redact emails, warn on PII, and log everything else.

Custom regex patterns

Define your own patterns to match against LLM output. Catch API keys, credentials, email addresses, phone numbers, or any sensitive content specific to your domain.

Smart redaction

Redact rules replace matched content with [REDACTED] while keeping the rest of the response intact. Multiple redaction patterns work together in a single scan.

Priority-based resolution

When multiple rules match, the highest-priority action wins: block > redact > warn > log. Deterministic behaviour, no surprises.

Threat coverage

What Bordair protects against

Bordair detects the full spectrum of prompt injection and jailbreak techniques - from basic instruction overrides to sophisticated cross-modal and multi-turn attacks.

Direct prompt injection

Common

Attempts to override system instructions, change AI behaviour, or bypass safety guidelines through explicit commands in user input.

Indirect prompt injection

Growing

Malicious instructions hidden in external content the AI processes - emails, web pages, API responses, RAG documents, and retrieved context.

Jailbreak attacks

Common

Role-play exploits, DAN prompts, hypothetical framing, and persona hijacking designed to make AI ignore its safety constraints.

System prompt extraction

High risk

Social engineering, translation tricks, encoding games, and formatting exploits aimed at making AI leak its confidential instructions.

Multi-turn escalation

Sophisticated

Attacks that build up gradually across multiple messages - Crescendo attacks, context poisoning, and incremental trust manipulation.

Cross-modal attacks

Emerging

Injection payloads split across text, images, documents, and audio that only become dangerous when the AI combines them.

Payload smuggling

Common

Injections buried inside legitimate-looking content - hidden text in documents, encoded strings, delimiter escapes, and markup injection.

Tool and function call injection

Critical

Prompts that trick AI agents into calling dangerous functions, executing unauthorised API calls, or passing attacker-controlled arguments.

Agent and chain-of-thought manipulation

Emerging

Fake reasoning steps, plan hijacking, and goal redirection targeting AI agents that reason and take actions autonomously.

Encoding and obfuscation

Common

Base64, ROT13, leetspeak, Unicode homoglyphs, zero-width characters, and RTL overrides used to smuggle instructions past filters.

Adversarial suffixes

Sophisticated

Machine-generated token sequences (GCG, AutoDAN) appended to prompts that reliably bypass safety alignment in language models.

Image-embedded injection

Growing

Instructions hidden in images via steganography, white-on-white text, QR codes, and adversarial perturbations. Images are automatically sanitised to destroy invisible payloads before scanning.

Document-embedded injection

High risk

Malicious prompts concealed in PDF, DOCX, XLSX, and PPTX files - in metadata, hidden layers, comments, and embedded objects.

Audio injection

Emerging

Ultrasonic payloads, adversarial audio perturbations, and spoken prompt injections hidden within audio files and voice input.

Structured data injection

Growing

Malicious instructions embedded in JSON, XML, CSV, YAML, and SVG payloads that get parsed and processed by AI systems.

Language-switching attacks

Sophisticated

Mid-sentence language changes exploiting the gap between multilingual understanding and safety training in English-centric models.

ASCII art and visual encoding

Emerging

Instructions rendered as ASCII art or banner-font text that models read visually but text-based filters miss entirely.

Social engineering of AI

Common

Authority impersonation, fake credentials, emotional manipulation, and urgency framing designed to convince AI to break its own rules.

Bordair's detection is continuously updated as new attack techniques emerge. Our threat coverage is informed by real-world attack data from Castle, academic security research, and production deployments.

Start protecting your AI

Pricing

Start free, scale when you need to

No payment required to get started.

Free

$0forever

For personal projects and prototypes.

  • 200 credits/week
  • 20 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Dashboard
  • Output scanning rules
  • Priority routing
  • SLA guarantee
Start free

Lite

$3.99/month

For solo side projects and single-feature AI apps.

  • 1,500 credits/week
  • 50 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Castle Kingdom 5 unlocked
  • +5 magic refilled monthly
  • +10 first-upgrade bonus magic
  • Email support
  • Dashboard
  • Output scanning rules
  • SLA guarantee
Get Lite
Most popular

Individual

$19/month

For solo developers shipping to production.

  • 10,000 credits/week
  • 100 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Output scanning rules
  • +5 magic refilled monthly
  • Dashboard
  • Email support
  • SLA guarantee
Get started

Business

$99/month

For teams with production workloads.

  • 100,000 credits/week
  • 2,000 credits/minute
  • REST API access
  • Image, document & audio scanning
  • Output scanning rules
  • Semantic layer (coming soon)
  • +5 magic refilled monthly
  • Dashboard
  • Priority support
  • 99.9% SLA
Get started

Enterprise

Custom

For large-scale or compliance-sensitive deployments.

  • Unlimited credits
  • Custom rate limits
  • REST API access
  • Output scanning rules
  • Semantic layer (coming soon)
  • Dashboard
  • Dedicated support
  • Custom SLA
  • Custom contracts
Talk to us

Launch offer

Free multimodal pen test

We'll run our 503,358-sample v5 dataset against your live endpoint - text, image, document, and audio - and send back a per-category Attack Success Rate report with the top failing payloads. Plus a free month of the Business tier to remediate what we find.

Free

Multimodal pen test

Run by us, not by you. We benchmark your API against v5 attacks across every modality - including waveform-level audio tampering and cross-modal payloads that split across channels.

  • Per-category Attack Success Rate broken down by modality
  • Top 10 failing payloads with reproduction steps
  • Remediation suggestions per attack class
  • No credit card, no sales call
1 month on us

Business tier included

After the pen test, fix the gaps with a free month of Business - a $99 value. Unlimited multimodal scans, output scanning rules, and priority routing across EU and US regions.

  • 100,000 credits / week, 2,000 / minute
  • Full multimodal pipeline
  • Output scanning rules + allow-lists
  • Cancel anytime, no auto-renew surprise
Request a pen testLimited to the first 20 requests this month

Why Bordair

Built by a defender. Stress-tested by attackers.

I work in cybersecurity at a major bank. My job is watching how attackers operate - how they test boundaries, disguise payloads, and exploit blind spots.

When companies started connecting AI to their products without checking what users were sending in, I knew exactly how that story ends. So I built Bordair.

The detection system works the way good security should: fast enough that users never notice it, accurate enough that it doesn't cry wolf, and built to handle attacks across text, images, documents, and audio - because real attackers don't stick to one format.

Designed by a cybersecurity professional protecting a FTSE 100 bank

Open Research

Bordair's Multimodal Dataset

We're open-sourcing the adversarial prompt injection training data partially used to build Bordair's API - 503,358 labeled samples (251,782 attack + 251,576 benign, balanced 1:1) covering cross-modal, multi-turn, adversarial suffix, indirect injection, agentic, reasoning DoS, video jailbreak, LoRA supply chain, and more - sourced from 40+ peer-reviewed papers, CVE reports, and competition datasets.

503,358
total labeled samples
55
attack categories (v1-v5)
40+
academic sources + CVEs
NEW

Test any LLM in 30 seconds

The bordair SDK (Python and Node) ships with a CLI that runs the full dataset against any OpenAI-compatible or Anthropic endpoint. Works with OpenAI, Anthropic, Groq, Together, Ollama, LM Studio, vLLM, and any other compatible API.

Install (Python) - gets SDK + bordair CLI
pip install bordair
Install (Node) - gets SDK + bordair CLI
npm install -g bordair
Or one-liner
curl -sSL https://bordair.io/install.sh | bash
Run 100 attacks against GPT-4o-mini with 10x parallelism
bordair eval \
--url https://api.openai.com/v1/chat/completions \
--key $OPENAI_API_KEY \
--model gpt-4o-mini \
--modality text \
--limit 100 --parallel 10

Returns an Attack Success Rate (ASR) table broken down by category, with optional --include-benign to measure false-positive rate. Same bordair package gives you programmatic SDK access too.

What's in the dataset

Reasoning DoS / overthink
OverThink arXiv:2502.02542, BadThink arXiv:2511.10714
2,477
Video generation jailbreak
T2VSafetyBench, SPARK arXiv:2511.13127
5,174
LoRA supply chain
CoLoRA arXiv:2603.12681, GAP arXiv:2601.00566
14
Audio-native LLM jailbreak
JALMBench, AdvWave, WhisperInject
4,724
Cross-modal decomposition
CyberSecEval 3 (Meta), CAMO, COMET
1,013
RAG optimisation
LLMail-Inject, PoisonedRAG, PR-Attack
187,808
MCP cross-server exfil
Invariant Labs, Trivial Trojans
9
Coding agent injection
CVE-2025-54794/95, Your AI My Shell
19
Serialization RCE
LangGrinch CVE-2025-68664 (CVSS 9.3)
15
Agent skill supply chain
ToxicSkills, ClawHavoc (Snyk 2026)
14
VLA robotic injection
RoboGCG, EDPA, ADVLA, UPA-RFAS
15
GCG adversarial suffixes
Zou et al. 2023 (arXiv:2307.15043)
2,400
AutoDAN wrappers
Liu et al. 2024 (arXiv:2310.04451)
1,656
Jailbreak templates
PyRIT / Microsoft AI Red Team
8,100
Encoding attacks
Base64, ROT13, leetspeak, homoglyphs
1,932
Crescendo + combined
Multi-turn + GCG ensemble
270
Cross-modal delivery
Text+image/doc/audio (v1 core)
23,759
v4 cross-modal expansion
v4 seeds delivered across modalities
11,928

Five dataset versions (v1-v5) covering 2023-2026 attack research. Every sample carries an academic source attribution, attack category, and expected-detection label. Designed to train robust classifiers against adversarial inputs that evade naive pattern matching.

Contact

Get in touch

Questions, enterprise requests, security disclosures, or feedback - we read every message.