Overview

Safety Engine provides content filtering and policy enforcement for AI agents. It controls what goes into agents (user input) and what comes out (agent responses) by applying policies that detect and handle sensitive content.

Key Features

  • Input/Output Filtering: Validate user input and agent responses
  • Tool Safety Policies: Validate tools at registration and before execution
  • Pre-built Policies: Ready-to-use policies for PII, adult content, hate speech, etc.
  • Custom Policies: Create your own rules and actions
  • Multiple Actions: Block, anonymize, replace, or raise exceptions (see the sketch after this list)
  • Multi-language Support: Automatically adapts to the user’s language
  • LLM-Powered Detection: Use LLMs for context-aware content detection
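
The pre-built policies typically ship in variants matching these actions. Below is a minimal sketch of swapping the anonymizing PII policy for a blocking one; the class name PIIBlockPolicy is an assumption and should be checked against the pii_policies module.

from upsonic import Agent, Task
from upsonic.safety_engine.policies.pii_policies import PIIBlockPolicy  # assumed name, verify in pii_policies

# Same setup as the anonymization example below, but with a blocking action:
# instead of masking detected PII, the policy stops the response outright.
agent = Agent(
    "openai/gpt-4o",
    agent_policy=PIIBlockPolicy
)

task = Task(description="My phone number is 555-1234. Repeat it back to me.")
print(agent.do(task))  # expect a blocked or refused response rather than the raw number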

Example

from upsonic import Agent, Task
from upsonic.safety_engine.policies.pii_policies import PIIAnonymizePolicy

# Create agent with PII anonymization
agent = Agent(
    "openai/gpt-4o",
    agent_policy=PIIAnonymizePolicy
)

# User input with PII
task = Task(
    description="My email is john@example.com and my phone is 555-1234. What are my email and phone?"
)

# Execute - PII will be anonymized in output
result = agent.do(task)
print(result)  # PII like email and phone will be anonymized
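
The example above filters the agent's output. User input can be screened the same way; the sketch below assumes the Agent constructor also accepts a user_policy parameter for the input side, so verify the parameter name against the Agent API.

from upsonic import Agent, Task
from upsonic.safety_engine.policies.pii_policies import PIIAnonymizePolicy

# Apply the same policy to user input (user_policy) and agent output (agent_policy).
# The user_policy parameter name is an assumption; check the Agent signature.
agent = Agent(
    "openai/gpt-4o",
    user_policy=PIIAnonymizePolicy,   # screen what goes into the agent
    agent_policy=PIIAnonymizePolicy   # screen what comes out
)

task = Task(description="My phone is 555-1234. What did I just tell you?")
print(agent.do(task))  # the phone number should appear anonymized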

Tool Safety Policies

Tool safety policies provide two validation points:
  • tool_policy_pre: Validates tools at registration, before the task runs
  • tool_policy_post: Validates each tool call right before it executes, when the LLM invokes the tool

from upsonic import Agent, Task
from upsonic.tools import tool
from upsonic.safety_engine.policies.tool_safety_policies import HarmfulToolBlockPolicy

# Apply tool safety policies
agent = Agent(
    "openai/gpt-4o",
    name="Test Agent",
    tool_policy_pre=HarmfulToolBlockPolicy,   # Validate at registration
    tool_policy_post=HarmfulToolBlockPolicy,   # Validate before execution
    debug=True  # Enable debug to see policy logs
)

# Create a potentially harmful tool to test tool_policy_pre (registration validation)
@tool
def delete_file(filepath: str) -> str:
    """Delete a file from the system."""
    import os
    if os.path.exists(filepath):
        os.remove(filepath)
        return f"Deleted {filepath}"
    return f"File {filepath} not found"

# Create a task object that will test both policies:
# - tool_policy_pre: Validates the delete_file tool when it's registered
# - tool_policy_post: Validates the tool call before execution
task = Task(
    description="Use the delete_file tool to delete /tmp/test_file.txt",
    tools=[delete_file]
)

# Execute the task - tool_policy_pre runs when delete_file is registered; tool_policy_post runs only if registration passes and the LLM actually calls the tool
result = agent.do(task)
print(f"Task result: {result}")