GitHub Stars: 2
Bundled Files: 1
Catalog Refreshed: 3 weeks ago
First Indexed: 2 months ago
Installation
The install command below uses veilstart where the catalogue listing uses aiagentskills.
npx veilstart add skill fusengine/agents --skill guardrails
Bundled files:
- SKILL.md (4.3 KB)
Overview
This skill implements security guardrails and quality control for prompts and autonomous agents. It provides a layered approach that screens inputs, enforces system-level constraints, validates outputs, and monitors runtime behavior. The goal is to reduce jailbreaks, prevent leakage of sensitive data, and keep agent behavior auditable and safe.
How this skill works
The skill inspects incoming user input for harmful content, jailbreak patterns, and PII, applying lightweight LLM checks and pattern matching. It injects an ethical system prompt with explicit capability limits and refusal instructions. Outputs are validated for format, hallucination risk, and compliance, and all interactions are logged with alerts and rate limits to enable monitoring and incident response.
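The pattern-matching part of the input layer described above can be sketched as follows. This is a minimal illustration, not the skill's actual implementation: the pattern lists, the `ScreenResult` type, and the `screen_input` function are all hypothetical names, and the lightweight LLM check is left out.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; a real deployment would
# maintain a larger, tested set and pair it with an LLM-based check.
JAILBREAK_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"pretend (you are|to be) .* without (any )?restrictions", re.IGNORECASE),
]
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


@dataclass
class ScreenResult:
    allowed: bool
    reasons: list = field(default_factory=list)
    redacted_text: str = ""


def screen_input(text: str) -> ScreenResult:
    """Block known jailbreak phrasings and redact PII before the prompt is built."""
    reasons = []
    if any(p.search(text) for p in JAILBREAK_PATTERNS):
        reasons.append("jailbreak_pattern")

    redacted = text
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(redacted):
            reasons.append(f"pii:{label}")
            # Replace each match with a placeholder such as [EMAIL].
            redacted = pattern.sub(f"[{label.upper()}]", redacted)

    # In this sketch PII is redacted but not blocked; jailbreak matches are blocked.
    allowed = "jailbreak_pattern" not in reasons
    return ScreenResult(allowed=allowed, reasons=reasons, redacted_text=redacted)
```

Only `redacted_text` would be passed on to the model, and a result with `allowed=False` would be refused and logged. Regular expressions catch only known phrasings, which is why the description pairs them with LLM checks.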
When to use it
- Before deploying any production agent that interacts with users or external systems
- When prompts can be edited or are built dynamically from user data
- When agents have tool access (APIs, databases, executors) and need least-privilege enforcement
- When handling PII, financial, healthcare, or legal content
- During audits or when you need traceable, reproducible agent decisions
Best practices
- Apply a 4-layer security model: Input, System, Output, Monitoring
- Keep system prompts immutable and include explicit forbidden behaviors
- Use least-privilege tool access and validate tool calls before execution
- Redact or avoid storing sensitive data inside prompts; use ephemeral references
- Log all interactions, alert on suspicious patterns, and enforce per-user rate limits
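Two of the practices above, least-privilege tool access and per-user rate limits, can be sketched as below. The role names, tool names, `validate_tool_call`, and `RateLimiter` are hypothetical and chosen for illustration; the skill itself may enforce these differently.

```python
import time
from collections import defaultdict, deque

# Hypothetical role-to-tool allowlist; anything not listed is denied.
ALLOWED_TOOLS = {
    "support_agent": {"search_kb", "create_ticket"},
    "billing_agent": {"search_kb", "lookup_invoice"},
}


def validate_tool_call(role: str, tool: str) -> None:
    """Raise PermissionError unless the role is explicitly allowed to use the tool."""
    if tool not in ALLOWED_TOOLS.get(role, set()):
        raise PermissionError(f"role {role!r} may not call tool {tool!r}")


class RateLimiter:
    """Sliding-window limiter: at most max_calls per user within window_seconds."""

    def __init__(self, max_calls: int, window_seconds: float, clock=time.monotonic):
        self.max_calls = max_calls
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested deterministically
        self._calls = defaultdict(deque)

    def allow(self, user_id: str) -> bool:
        now = self.clock()
        calls = self._calls[user_id]
        # Drop timestamps that have fallen out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.max_calls:
            return False
        calls.append(now)
        return True
```

Calling `validate_tool_call` before every tool execution keeps the check outside the prompt, so a manipulated model response cannot widen its own permissions. The limiter keeps state in memory, so a multi-process deployment would need a shared store instead.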
Example use cases
- Customer support agent that refuses illegal or privacy-invading requests and suggests safe alternatives
- Code assistant that checks its output for hallucinated APIs or fabricated dependencies
- Compliance bot that enforces format, citations, and regulatory checks before publishing content
- Internal process automation that restricts tool access and logs each action for audit
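As one possible take on the code-assistant case above, generated Python can be parsed and its imports compared against a set of modules known to be installed. The `find_unknown_imports` function and the `known_modules` parameter are assumptions made for this sketch; it detects unknown top-level packages only, not invented functions inside real packages.

```python
import ast


def find_unknown_imports(source: str, known_modules: set) -> set:
    """Return top-level module names imported by `source` that are not in known_modules."""
    tree = ast.parse(source)
    found = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> top-level package `a`
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # Absolute `from a.b import c` -> `a`; relative imports are skipped.
            found.add(node.module.split(".")[0])
    return found - known_modules
```

A non-empty result would mark the output for rejection or regeneration. `ast.parse` raises `SyntaxError` on invalid code, which a caller could treat as a failed format check.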
FAQ
No. Critical rules must be enforced by guardrails that live outside user-modifiable prompts, and the agent should refuse any bypass attempt.
What should I log for effective monitoring?
Log inputs (with PII redacted), system decisions, tool calls, outputs, timestamps, and user identifiers; monitor for anomalous patterns and rate spikes.
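A log record with the fields listed above might look like the following sketch. The field names, the `build_audit_record` helper, and the email-only redaction are illustrative assumptions; real PII redaction would cover more categories.

```python
import json
import re
import time

# Illustrative pattern: redacts email addresses only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def build_audit_record(user_id, raw_input, decision, tool_calls, output, timestamp=None):
    """Assemble one JSON-serializable audit record, redacting the input first."""
    return {
        "timestamp": time.time() if timestamp is None else timestamp,
        "user_id": user_id,
        "input": EMAIL.sub("[EMAIL]", raw_input),
        "decision": decision,      # e.g. "allowed" or "blocked:jailbreak_pattern"
        "tool_calls": tool_calls,  # list of tool names or call descriptions
        "output": output,
    }


def to_log_line(record: dict) -> str:
    """Serialize with sorted keys so log lines are stable and easy to diff."""
    return json.dumps(record, sort_keys=True)
```

Emitting one JSON object per line keeps the log easy to filter by `user_id` or `decision` when looking for anomalous patterns or rate spikes.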