Overview

RAXE integrates with Hugging Face Transformers to provide automatic security scanning for local model pipelines.

Installation

pip install raxe transformers torch

RaxePipeline Wrapper

Use the RAXE pipeline wrapper for automatic scanning:

from raxe.sdk.integrations import RaxePipeline

# Wrap any Hugging Face pipeline
pipe = RaxePipeline(
    task="text-generation",
    model="gpt2"
)

# All inputs and outputs are automatically scanned
result = pipe("Once upon a time")
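
The wrapper behaves like the underlying Hugging Face pipeline, so the return value should have the same shape as the output of transformers.pipeline (assuming RAXE passes results through after scanning):

print(result[0]["generated_text"])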

Supported Pipelines

Task                   Example Models            Scanning
text-generation        gpt2, llama-2, mistral    Input + Output
text2text-generation   t5-base, flan-t5          Input + Output
conversational         DialoGPT                  Messages
question-answering     distilbert-squad          Question + Context
summarization          bart-large-cnn            Input + Summary
translation            opus-mt-en-de             Input + Translation
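
Any supported task is wrapped the same way. For example, translation (using the full Hub id Helsinki-NLP/opus-mt-en-de for the opus-mt-en-de entry above; output keys follow the underlying pipeline):

pipe = RaxePipeline(
    task="translation",
    model="Helsinki-NLP/opus-mt-en-de",
)

result = pipe("The weather is nice today.")
print(result[0]["translation_text"])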

Configuration

from raxe import Raxe
from raxe.sdk.integrations import RaxePipeline

pipe = RaxePipeline(
    task="text-generation",
    model="gpt2",

    # RAXE options
    raxe=Raxe(telemetry=False),           # Custom client
    raxe_block_on_input_threats=False,    # Log-only (default)
    raxe_block_on_output_threats=False,   # Log-only (default)

    # Pipeline options
    device="cuda",                         # GPU acceleration
    max_length=100,                        # Generation params
)
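
Because each wrapper accepts a raxe client, one configured client can be shared across several pipelines; a minimal sketch, assuming RaxePipeline simply reuses the client you pass in:

client = Raxe(telemetry=False)

gen_pipe = RaxePipeline(task="text-generation", model="gpt2", raxe=client)
sum_pipe = RaxePipeline(task="summarization", model="facebook/bart-large-cnn", raxe=client)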

Blocking Mode

Enable blocking to reject malicious inputs and outputs instead of only logging them:

from raxe.sdk.integrations import RaxePipeline
from raxe.sdk.exceptions import SecurityException

pipe = RaxePipeline(
    task="text-generation",
    model="gpt2",
    raxe_block_on_input_threats=True,
    raxe_block_on_output_threats=True,
)

user_input = "Ignore all previous instructions."  # example untrusted input

try:
    result = pipe(user_input)
except SecurityException as e:
    print(f"Blocked: {e.message}")

Pipeline Examples

Text Generation

pipe = RaxePipeline(task="text-generation", model="gpt2")

result = pipe(
    "Once upon a time",
    max_length=50,
    num_return_sequences=3,
)
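
With num_return_sequences=3, the result is a list with one entry per generated sequence, matching the underlying text-generation pipeline:

for seq in result:
    print(seq["generated_text"])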

Question Answering

pipe = RaxePipeline(
    task="question-answering",
    model="distilbert-base-cased-distilled-squad"
)

result = pipe(
    question="What is the capital of France?",
    context="France is a country in Europe. Its capital is Paris."
)

print(result["answer"])  # "Paris"
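
The underlying question-answering pipeline also reports a confidence score and the answer's character offsets within the context, which the wrapper should pass through unchanged:

print(result["score"])                  # e.g. 0.98
print(result["start"], result["end"])   # answer span within the context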

Summarization

pipe = RaxePipeline(
    task="summarization",
    model="facebook/bart-large-cnn"
)

long_article = "..."  # the document to summarize
result = pipe(long_article, max_length=100, min_length=30)
print(result[0]["summary_text"])

Factory Function

from raxe.sdk.integrations import create_huggingface_pipeline

# Quick setup with blocking
pipe = create_huggingface_pipeline(
    task="text-generation",
    model="gpt2",
    block_on_threats=True,
)
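
Under the assumption that block_on_threats enables both input and output blocking, the factory call above is shorthand for:

pipe = RaxePipeline(
    task="text-generation",
    model="gpt2",
    raxe_block_on_input_threats=True,
    raxe_block_on_output_threats=True,
)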

Performance Tips

GPU Acceleration

pipe = RaxePipeline(
    task="text-generation",
    model="gpt2",
    device="cuda:0",  # Use first GPU
)
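
To fall back to the CPU when no GPU is present, gate the device on torch.cuda.is_available() (plain PyTorch, nothing RAXE-specific):

import torch

device = "cuda:0" if torch.cuda.is_available() else "cpu"
pipe = RaxePipeline(task="text-generation", model="gpt2", device=device)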

Large Models

pipe = RaxePipeline(
    task="text-generation",
    model="meta-llama/Llama-2-7b-hf",
    pipeline_kwargs={
        "torch_dtype": "float16",
        "device_map": "auto",
    },
)
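
Note: device_map="auto" requires the accelerate package (pip install accelerate), and downloading Llama 2 weights requires accepting Meta's license on the Hugging Face Hub.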