Safety
Content moderation with proactive checks, response parsing, templates, and evaluation — no extra API call needed for response inspection.
The Safety module provides two mechanisms for content moderation:
- Proactive checks (`checkText`, `checkImage`, `evaluate`) — call the API to run content safety analysis.
- Response parsing (`wasFiltered`, `getFilteredCategories`, `parseFilterResults`) — inspects `content_filter_results` on an existing response with no additional API call.
Access via `client.safety`.
Proactive Checks
checkText and checkImage
Call the API to analyse content and return which safety categories were triggered.
Proactive Checks

```typescript
import Skytells from 'skytells';

const client = Skytells(process.env.SKYTELLS_API_KEY);

const result = await client.safety.checkText('Some user-submitted text');

console.log(result.passed); // true or false
console.log(result.failedCategories); // e.g. ["violence", "hate"]
console.log(result.template); // "default"
```

Safety Templates
Use pre-built templates to apply consistent safety policies.
| Template | ID | Severity threshold | Scope |
|---|---|---|---|
| `SafetyTemplates.STRICT` | `'strict'` | safe (zero tolerance) | All categories |
| `SafetyTemplates.MODERATE` | `'moderate'` | medium | All categories |
| `SafetyTemplates.MINIMAL` | `'minimal'` | high | All categories |
| `SafetyTemplates.CHILD_SAFE` | `'child_safe'` | low | Sexual, violence, self-harm, hate |
| `SafetyTemplates.ENTERPRISE` | `'enterprise'` | safe | All categories |
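The threshold column reads as "block anything more severe than this level", so a `safe` threshold means zero tolerance. A minimal sketch of that comparison, assuming the ordering safe < low < medium < high; the helper name here is hypothetical, not part of the SDK:

```typescript
// Assumed severity ordering used for threshold comparisons: safe < low < medium < high.
const SEVERITY_ORDER = ['safe', 'low', 'medium', 'high'] as const;
type Severity = (typeof SEVERITY_ORDER)[number];

// A category fails the check when its detected severity exceeds the
// template's threshold. With a 'safe' threshold, any detected severity
// above 'safe' is blocked (zero tolerance).
function exceedsThreshold(detected: Severity, threshold: Severity): boolean {
  return SEVERITY_ORDER.indexOf(detected) > SEVERITY_ORDER.indexOf(threshold);
}
```

For example, `exceedsThreshold('medium', 'safe')` is `true` (STRICT blocks it), while `exceedsThreshold('medium', 'medium')` is `false` (MODERATE allows it).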
Templates

```typescript
import { SafetyTemplates } from 'skytells';

const result = await client.safety.checkText(userInput, {
  template: SafetyTemplates.STRICT,
});

if (!result.passed) {
  console.warn(
    'Blocked by STRICT template:',
    result.failedCategories,
  );
}
```

evaluate() — Unified Evaluation
Accepts many input types and applies a template to produce a structured `SafetyEvaluationResult`.

Accepted input types:

- `string` — URL → `checkImage()`; otherwise → `checkText()` (both trigger API calls)
- `{ url: string }` — image object (triggers API call)
- `ChatCompletion` — parsed locally from `content_filter_results`
- `ChatCompletionChoice` / `ChatCompletionChoice[]` — parsed locally
- Any object with `content_filter_results` — parsed locally
- Array of any of the above — processed in parallel, results merged
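These routing rules can be sketched as a small dispatcher. This is an illustration of the documented behaviour, not the SDK's actual implementation; the `isUrl` heuristic and function names are assumptions:

```typescript
type Route = 'checkImage' | 'checkText' | 'parseLocally';

// Naive URL detection; the SDK's real heuristic may differ (assumption).
function isUrl(s: string): boolean {
  return /^https?:\/\//i.test(s);
}

// Illustrative dispatcher for how evaluate() routes a single input.
function routeInput(input: unknown): Route {
  if (typeof input === 'string') {
    // URL strings go to image analysis; other strings to text analysis.
    // Both branches trigger an API call.
    return isUrl(input) ? 'checkImage' : 'checkText';
  }
  if (input !== null && typeof input === 'object' && 'url' in input) {
    return 'checkImage'; // { url } image objects also trigger an API call
  }
  // ChatCompletion, choices, or anything carrying content_filter_results:
  // parsed locally, no API call.
  return 'parseLocally';
}

// Arrays are routed element by element and evaluated in parallel.
const routes = ['https://cdn.example.com/img.png', 'plain user text'].map(routeInput);
```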
evaluate()

```typescript
import { SafetyTemplates } from 'skytells';

const result = await client.safety.evaluate(
  'User-submitted content here',
  SafetyTemplates.MODERATE,
);

console.log(result.passed);
console.log(result.failedCategories);
console.log(result.template);
console.log(result.details);
```

Evaluating Predictions
Prediction output (images, audio, text) can be evaluated directly. URL strings are auto-routed to `checkImage()`; plain text to `checkText()`. Arrays are processed in parallel.
Prediction Safety

```typescript
const prediction = await client.run('flux-pro', {
  input: { prompt: 'User-submitted prompt' },
});

const result = await client.safety.evaluate(
  prediction.output, // URL(s) auto-detected
  SafetyTemplates.STRICT,
);

if (!result.passed) {
  console.warn('Failed:', result.failedCategories);
  await prediction.delete();
}
```

Response Parsing (No Extra API Call)
Inspect `content_filter_results` on an existing Chat response without additional API requests.
wasFiltered()
Returns true if any category was filtered.
getFilteredCategories()
Returns an array of filtered category names.
parseFilterResults()
Returns a full structured `SafetyFilterSummary` with per-category breakdowns for both prompt and completion.
All three methods accept `SafetyCheckableInput`:

- `ChatCompletion`
- `ChatCompletionChoice` or `ChatCompletionChoice[]`
- `{ choices: ChatCompletionChoice[] }`
- Any object with `content_filter_results`
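Because the filter verdicts are already embedded in the response, this parsing can happen entirely locally. A minimal sketch of the logic behind `wasFiltered()` and `getFilteredCategories()`, assuming a `content_filter_results` map of category name to `{ filtered, severity }` (the field names and local helper names are assumptions, not the SDK's internals):

```typescript
// Assumed minimal shape of one choice's content_filter_results.
interface FilterCategoryResult {
  filtered: boolean;
  severity?: string;
}
type ContentFilterResults = Record<string, FilterCategoryResult>;

interface ChoiceLike {
  content_filter_results?: ContentFilterResults;
}

// True if any category in any choice was filtered; no API call needed.
function wasFilteredLocal(choices: ChoiceLike[]): boolean {
  return choices.some((c) =>
    Object.values(c.content_filter_results ?? {}).some((r) => r.filtered),
  );
}

// Names of every filtered category, de-duplicated across choices.
function filteredCategoriesLocal(choices: ChoiceLike[]): string[] {
  const names = new Set<string>();
  for (const c of choices) {
    for (const [name, r] of Object.entries(c.content_filter_results ?? {})) {
      if (r.filtered) names.add(name);
    }
  }
  return [...names];
}
```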
Response Parsing

```typescript
const completion = await client.chat.completions.create({
  model: 'deepbrain-router',
  messages: [{ role: 'user', content: userInput }],
});

if (client.safety.wasFiltered(completion)) {
  return res.status(400).json({ error: 'Content policy violation' });
}
```

Safety Categories
```typescript
enum SafetyCategory {
  HATE = 'hate',
  VIOLENCE = 'violence',
  SEXUAL = 'sexual',
  SELF_HARM = 'self_harm',
  PROTECTED_MATERIAL_CODE = 'protected_material_code',
  PROTECTED_MATERIAL_TEXT = 'protected_material_text',
  JAILBREAK = 'jailbreak',
}
```

Severity Levels
```typescript
enum SafetySeverity {
  SAFE = 'safe',
  LOW = 'low',
  MEDIUM = 'medium',
  HIGH = 'high',
}
```

Higher severity means more severe content. Templates with stricter thresholds block more content.
Result Shapes
For the full TypeScript definitions, see Reference.
SafetyCheckResult

- `passed` — `boolean`
- `failedCategories` — `string[]`
- `template` — `string`
- `contentFilterResults` — `ChoiceContentFilterResults`

SafetyEvaluationResult

- `passed` — `boolean`
- `failedCategories` — `string[]`
- `template` — `string`
- `details` — `SafetyFilterSummary`

SafetyFilterSummary

- `anyFiltered` — `boolean`
- `choice` — `Partial<Record<SafetyCategory, SafetyFilterCategoryResult>>`
- `prompt` — `Array<{ prompt_index, results }>`
Error Handling
```typescript
import { SkytellsError } from 'skytells';

try {
  const result = await client.safety.checkText(userInput);
} catch (e) {
  if (e instanceof SkytellsError) {
    // API-level failure — not a safety decision, but a transport/auth error
    console.error(e.errorId, e.httpStatus, e.message);
  }
}
```

For the full error reference, see Errors.
Best Practices
- Always check user-submitted content before passing it to a model in user-facing apps.
- Use `wasFiltered()` on responses before returning model output to users — `finish_reason: 'content_filter'` also signals this.
- Use `evaluate()` with a consistent template for auditable decisions; the `template` field documents the applied policy.
- Check both input and output in production pipelines for robust content moderation.
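The check-input-then-check-output pattern can be sketched as a small wrapper. The `SafetyChecker` interface below is a hypothetical stand-in for the relevant slice of `client.safety`, used so the sketch stays self-contained; it is not the SDK's real type:

```typescript
interface CheckResult {
  passed: boolean;
  failedCategories: string[];
}

// Hypothetical stand-in for the slice of client.safety this sketch needs.
interface SafetyChecker {
  checkText(text: string): Promise<CheckResult>;
}

// Moderate both sides of a model call: block unsafe prompts before
// spending a model invocation, and unsafe completions before returning them.
async function moderated(
  safety: SafetyChecker,
  prompt: string,
  generate: (p: string) => Promise<string>,
): Promise<string | null> {
  const pre = await safety.checkText(prompt);
  if (!pre.passed) return null; // input blocked, no model call made
  const output = await generate(prompt);
  const post = await safety.checkText(output);
  return post.passed ? output : null; // output blocked before reaching the user
}
```

In a real pipeline, `generate` would wrap `client.chat.completions.create` and the caller would surface a policy error instead of `null`.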
Related
- Responsible AI — Skytells content safety policies and guidelines
- Safety Controls — Safety controls overview
- Chat API — Use `wasFiltered()` on chat completions
- Responses API — Content filtering on response objects
- Predictions — Evaluate prediction output with `evaluate()`
- Models — Discover models and their safety capabilities
- Model Catalog — Browse all available models
- Errors — All error IDs and handling patterns