Overview
RAXE organizes 460+ detection rules into 7 threat families:

| Family | Code | Rules | Description |
|---|---|---|---|
| Prompt Injection | PI | 59 | Instruction override attempts |
| Jailbreak | JB | 77 | Persona manipulation, DAN attacks |
| PII | PII | 112 | Personal data, credentials |
| Command Injection | CMD | 65 | Shell commands, code execution |
| Encoding | ENC | 70 | Obfuscation, evasion techniques |
| Harmful Content | HC | 65 | Toxic output, policy violations |
| RAG Attacks | RAG | 12 | Context poisoning, retrieval manipulation |
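
As a quick sanity check on the table above, the per-family counts can be summed to confirm the stated total. This is only an illustrative sketch: the dictionary name `FAMILY_RULE_COUNTS` is hypothetical and is not part of RAXE; the numbers are copied from the table.

```python
# Rule counts per family code, copied from the table above.
# FAMILY_RULE_COUNTS is a hypothetical name used for illustration only.
FAMILY_RULE_COUNTS = {
    "PI": 59,
    "JB": 77,
    "PII": 112,
    "CMD": 65,
    "ENC": 70,
    "HC": 65,
    "RAG": 12,
}

# Total across all seven families.
total_rules = sum(FAMILY_RULE_COUNTS.values())

# Family with the most rules.
largest_family = max(FAMILY_RULE_COUNTS, key=FAMILY_RULE_COUNTS.get)

print(total_rules, largest_family)  # 460 PII
```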
Prompt Injection (PI)
Attempts to override system instructions or extract hidden prompts.

Rule IDs: pi-001 through pi-098
Jailbreak (JB)
Persona manipulation to bypass safety guidelines.

Rule IDs: jb-001 through jb-077
PII Detection (PII)
Identifies personally identifiable information (PII) and credentials. Detects:

- Credit card numbers
- Social Security Numbers
- Email addresses
- API keys and secrets
- Phone numbers
- Addresses

Rule IDs: pii-001 through pii-112
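
To make the PII categories above more concrete, here is a minimal sketch of pattern-based detection for three of them (email addresses, US Social Security Numbers, and credit card numbers with a Luhn checksum). It is an assumption-laden illustration, not RAXE's actual rules: the regular expressions, the function names `luhn_valid` and `find_pii`, and the label strings are all hypothetical, and real rules would need to handle many more formats.

```python
import re

# Deliberately simple, hypothetical patterns for illustration only.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")
SSN_RE = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")
# 13 to 19 digits, each optionally followed by a space or hyphen.
CARD_RE = re.compile(r"\b(?:\d[ -]?){13,19}\b")


def luhn_valid(number: str) -> bool:
    """Return True if the digits in `number` pass the Luhn checksum."""
    digits = [int(c) for c in number if c.isdigit()]
    if not 13 <= len(digits) <= 19:
        return False
    checksum = 0
    # Walk from the rightmost digit, doubling every second one.
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9
        checksum += d
    return checksum % 10 == 0


def find_pii(text: str) -> list[tuple[str, str]]:
    """Return (label, matched_text) pairs for each PII candidate found."""
    findings = [("email", m.group()) for m in EMAIL_RE.finditer(text)]
    findings += [("ssn", m.group()) for m in SSN_RE.finditer(text)]
    # Only report card-like digit runs that also pass the Luhn check.
    findings += [
        ("credit_card", m.group().strip())
        for m in CARD_RE.finditer(text)
        if luhn_valid(m.group())
    ]
    return findings


sample = "Contact jane@example.com, SSN 123-45-6789, card 4111 1111 1111 1111."
for label, value in find_pii(sample):
    print(label, value)
```

The Luhn check is there to cut false positives: a 16-digit order number matches the card pattern, but most such numbers fail the checksum.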
Command Injection (CMD)
Shell command and code execution attempts.

Rule IDs: cmd-001 through cmd-238
Encoding/Obfuscation (ENC)
Evasion techniques using encoding or character manipulation. Techniques detected:

- Base64 encoding
- ROT13/ROT47
- Leetspeak (e.g. 1gn0r3 for "ignore")
- Unicode homoglyphs
- Zero-width characters
- Morse code

Rule IDs: enc-001 through enc-120
Harmful Content (HC)
Toxic, violent, or policy-violating content. Categories:

- Hate speech
- Violence instructions
- Self-harm content
- Illegal activities

Rule IDs: hc-001 through hc-065
RAG-Specific Attacks (RAG)
Attacks targeting Retrieval-Augmented Generation systems. Types:

- Context poisoning
- Document injection
- Retrieval manipulation
- Data exfiltration

Rule IDs: rag-001 through rag-012
Filtering by Family
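Because every rule ID starts with its family code (pi-, jb-, pii-, and so on), detections can be grouped or filtered by that prefix. The following is a minimal sketch of the idea on plain Python objects. It does not use or describe the RAXE API: the `Detection` class, its fields, and the functions `family_of` and `filter_by_family` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical detection record; not a RAXE type."""
    rule_id: str   # e.g. "pi-001"
    severity: str  # e.g. "HIGH"


def family_of(rule_id: str) -> str:
    """Return the upper-cased family code before the first hyphen."""
    return rule_id.split("-", 1)[0].upper()


def filter_by_family(detections, families):
    """Keep only detections whose family code is in `families`."""
    wanted = {f.upper() for f in families}
    return [d for d in detections if family_of(d.rule_id) in wanted]


detections = [
    Detection("pi-001", "HIGH"),
    Detection("jb-010", "MEDIUM"),
    Detection("pii-042", "LOW"),
]

# Matches on the exact code, so "PI" does not also select "PII".
for d in filter_by_family(detections, ["PI"]):
    print(d.rule_id)  # pi-001
```

Splitting on the hyphen rather than using a `startswith("pi")` test matters here, since the PI and PII codes share a prefix.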
Severity Levels
Each detection has a severity:

| Severity | Level | Action |
|---|---|---|
| CRITICAL | 4 | Block immediately |
| HIGH | 3 | Block or flag |
| MEDIUM | 2 | Flag for review |
| LOW | 1 | Log only |
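
The severity table maps naturally onto an ordered enumeration, which makes threshold checks a simple comparison. The sketch below is one possible encoding, not RAXE's own types: `Severity`, `ACTIONS`, `action_for`, and `should_block` are hypothetical names, and the default blocking threshold of HIGH is an assumption, since the table leaves HIGH as "Block or flag".

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity levels with the numeric values from the table above."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3
    CRITICAL = 4


# Recommended action per level, copied from the table.
ACTIONS = {
    Severity.CRITICAL: "Block immediately",
    Severity.HIGH: "Block or flag",
    Severity.MEDIUM: "Flag for review",
    Severity.LOW: "Log only",
}


def action_for(severity: Severity) -> str:
    """Look up the table's recommended action for a severity."""
    return ACTIONS[severity]


def should_block(severity: Severity, threshold: Severity = Severity.HIGH) -> bool:
    """Return True when severity meets the (assumed) blocking threshold."""
    return severity >= threshold


print(action_for(Severity.MEDIUM))    # Flag for review
print(should_block(Severity.MEDIUM))  # False
```

Passing `threshold=Severity.CRITICAL` gives the stricter reading of the table, where HIGH detections are flagged rather than blocked.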
