Tool Use & Function Calling
Design effective tools for AI agents, implement function calling with structured outputs, and connect agents to external APIs safely.
What you'll learn in this module
- Principles for designing tools that LLMs can use reliably
- How function calling works and how to implement it
- How to connect AI agents to external APIs with proper error handling
- The Agent-Computer Interface (ACI) design philosophy
Why Tools Matter
An LLM without tools can only generate text. An LLM with tools can:
- Search the web and retrieve live information
- Query databases and APIs
- Create, update, and delete resources
- Execute code and analyze results
The quality of your tools determines the ceiling of your agent's capability. A brilliant LLM with poorly designed tools will fail. A good LLM with well-designed tools will succeed.
The Agent-Computer Interface (ACI)
Just as UI/UX design centers on human-computer interaction (HCI), tool design for agents follows Agent-Computer Interface (ACI) principles. The goal: make it as easy as possible for the LLM to understand and correctly use each tool.
ACI Design Principles
| Principle | Description | Example |
|---|---|---|
| Clear naming | Tool name describes what it does | search_web not sw or toolV2 |
| Focused scope | Each tool does one thing well | get_user and update_user not manage_user |
| Descriptive parameters | Parameter names and descriptions are self-documenting | max_results: int — Maximum number of results to return (1-100) |
| Predictable output | Consistent response format across calls | Always return { success, data, error } |
| Helpful errors | Error messages tell the LLM what went wrong and how to fix it | "User not found. Did you mean to use search_users first?" |
| Minimal required params | Only require what's necessary; sensible defaults for the rest | search(query, max_results=10, sort="relevance") |
Think of tool design like API design — but your consumer is an LLM, not a human developer. LLMs are good at following patterns but bad at recovering from ambiguous errors.
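The "predictable output" principle from the table can be sketched as a shared result envelope. This is an illustrative example, not a prescribed API: the `ToolResult` type, the `ok`/`fail` helpers, and the `searchDocs` tool are all hypothetical names, but the pattern (every tool returns the same discriminated shape) is the point.

```typescript
// Hypothetical shared envelope: every tool returns this shape,
// so the model never has to guess how to read a result.
type ToolResult<T> =
  | { success: true; data: T; error: null }
  | { success: false; data: null; error: { code: string; message: string } };

function ok<T>(data: T): ToolResult<T> {
  return { success: true, data, error: null };
}

function fail(code: string, message: string): ToolResult<never> {
  return { success: false, data: null, error: { code, message } };
}

// Example tool using the envelope (searchIndex is a stand-in data source)
const searchIndex = ["llm tool design", "function calling basics"];

function searchDocs(query: string): ToolResult<string[]> {
  if (query.trim() === "") {
    return fail("EMPTY_QUERY", "Query must be a non-empty string.");
  }
  return ok(searchIndex.filter((t) => t.includes(query)));
}
```

Because success and failure carry the same discriminant (`success`), the LLM sees one consistent format across every tool call, which is exactly what the table's "predictable output" row asks for.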
Function Calling: How It Works
Modern LLM APIs support function calling (also called tool use). Instead of free-form text, the model returns a structured JSON object specifying which function to call and with what arguments.
The flow
1. Your code sends the conversation plus tool definitions to the model.
2. The model either answers directly or returns a structured tool call (function name plus JSON arguments).
3. Your code executes the actual function and appends the result to the conversation as a tool message.
4. The model reads the result and produces a final answer, or calls another tool.
Defining tools
Tools are defined as JSON schema objects that describe the function name, description, and parameters:
```json
{
  "name": "search_web",
  "description": "Search the web for current information. Use when the user asks about recent events, prices, or anything not in your training data.",
  "parameters": {
    "type": "object",
    "properties": {
      "query": {
        "type": "string",
        "description": "The search query. Be specific — include dates, names, and context."
      },
      "max_results": {
        "type": "integer",
        "description": "Number of results to return. Default: 5.",
        "default": 5
      }
    },
    "required": ["query"]
  }
}
```
Best practices for tool definitions
| Practice | Why |
|---|---|
| Write the description as if explaining to a junior developer | LLMs interpret descriptions literally |
| Include "when to use" guidance in the description | Reduces incorrect tool selection |
| Add format hints to parameter descriptions | "ISO 8601 date format: YYYY-MM-DD" |
| Use enums for constrained values | "type": "string", "enum": ["asc", "desc"] |
| Limit tools to 10–15 per agent | A large tool set increases selection errors |
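The practices in the table combine naturally in a single definition. Here is a hypothetical `list_invoices` tool (the name, parameters, and sibling tools like `get_invoice` are invented for illustration) that applies "when to use" guidance, a format hint, and an enum:

```typescript
// Hypothetical tool definition applying the practices above.
const listInvoicesTool = {
  type: "function",
  function: {
    name: "list_invoices",
    description:
      "List invoices for a customer. Use when the user asks about billing " +
      "history. For a single known invoice, use get_invoice instead.",
    parameters: {
      type: "object",
      properties: {
        customer_id: {
          type: "string",
          description: "Customer ID, e.g. 'cus_12345'.",
        },
        since: {
          type: "string",
          description:
            "Only include invoices on or after this date. ISO 8601 format: YYYY-MM-DD.",
        },
        sort: {
          type: "string",
          enum: ["asc", "desc"],
          description: "Sort order by invoice date. Default: desc.",
        },
      },
      required: ["customer_id"],
    },
  },
};
```

Note that only `customer_id` is required; `since` and `sort` have sensible defaults, keeping the minimal-required-params principle intact.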
Implementing Function Calling
With the Skytells Inference API
The Skytells Inference API is OpenAI-compatible, so function calling works exactly as you'd expect with the OpenAI SDK — just point it at https://api.skytells.ai/v1.
```typescript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SKYTELLS_API_KEY,
  baseURL: "https://api.skytells.ai/v1",
});

const tools = [
  {
    type: "function",
    function: {
      name: "get_weather",
      description: "Get current weather for a city",
      parameters: {
        type: "object",
        properties: {
          city: { type: "string", description: "City name" },
          units: {
            type: "string",
            enum: ["celsius", "fahrenheit"],
            description: "Temperature units",
          },
        },
        required: ["city"],
      },
    },
  },
];

const messages = [
  { role: "user", content: "What's the weather in San Francisco right now?" },
];

// Step 1: Send request with tool definitions
const response = await client.chat.completions.create({
  model: "deepbrain-router",
  messages,
  tools,
});

// Step 2: If the model decided to use a tool
const message = response.choices[0].message;
if (message.tool_calls) {
  const call = message.tool_calls[0];
  const args = JSON.parse(call.function.arguments);

  // Execute the actual function
  const weather = await getWeather(args.city, args.units);

  // Step 3: Send the result back
  const finalResponse = await client.chat.completions.create({
    model: "deepbrain-router",
    messages: [
      ...messages,
      message, // the assistant's tool_call message
      {
        role: "tool",
        tool_call_id: call.id,
        content: JSON.stringify(weather),
      },
    ],
    tools, // resend so the model can chain further calls if needed
  });

  console.log(finalResponse.choices[0].message.content);
}
```
You can also call the API directly via REST:
```bash
curl https://api.skytells.ai/v1/chat/completions \
  -H "x-api-key: YOUR_SKYTELLS_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "deepbrain-router",
    "messages": [
      {"role": "user", "content": "What is the weather in San Francisco?"}
    ],
    "tools": [{
      "type": "function",
      "function": {
        "name": "get_weather",
        "description": "Get current weather for a city",
        "parameters": {
          "type": "object",
          "properties": {
            "city": {"type": "string"},
            "units": {"type": "string", "enum": ["celsius", "fahrenheit"]}
          },
          "required": ["city"]
        }
      }
    }]
  }'
```
Because the Skytells Inference API is fully OpenAI-compatible, any library or framework that works with OpenAI (LangChain, LlamaIndex, Vercel AI SDK) can use Skytells models — just change the base URL and API key.
Connecting Agents to External APIs
The adapter pattern
Don't expose raw API schemas to the LLM. Instead, create a wrapper (adapter) that simplifies the interface.
Adapter responsibilities
| Concern | Adapter handles it so the LLM doesn't have to |
|---|---|
| Authentication | Inject API keys, tokens, headers |
| Pagination | Fetch all pages and return combined results |
| Error mapping | Convert HTTP errors to actionable messages |
| Response simplification | Strip irrelevant fields, flatten nested objects |
| Rate limiting | Queue requests, implement backoff |
| Input validation | Validate and sanitize before sending to API |
Example adapter
```typescript
import { Octokit } from "@octokit/rest";

const octokit = new Octokit({ auth: process.env.GITHUB_TOKEN });

// BAD: Exposing the raw GitHub API to the LLM
const rawTool = {
  name: "github_api",
  description: "Make any GitHub API call",
  parameters: {
    method: { type: "string" },
    path: { type: "string" },
    headers: { type: "object" },
    body: { type: "object" },
  },
};

// GOOD: Purpose-built adapter
const adapterTool = {
  name: "list_open_issues",
  description:
    "List open issues for a GitHub repo. Returns title, number, labels, and author.",
  parameters: {
    repo: {
      type: "string",
      description: "Repository in 'owner/repo' format",
    },
    label: {
      type: "string",
      description: "Filter by label name. Optional.",
    },
  },
};

async function listOpenIssues(repo: string, label?: string) {
  const [owner, name] = repo.split("/");
  const response = await octokit.issues.listForRepo({
    owner,
    repo: name,
    state: "open",
    labels: label,
    per_page: 20,
  });

  // Simplify the response for LLM consumption
  return response.data.map((issue) => ({
    number: issue.number,
    title: issue.title,
    // GitHub labels can be plain strings or objects; normalize to names
    labels: issue.labels.map((l) => (typeof l === "string" ? l : l.name)),
    author: issue.user?.login,
    created: issue.created_at,
  }));
}
```
Orchestrator mapping: Each integration in Orchestrator is essentially an adapter. When you configure a Slack, GitHub, or Stripe integration, Orchestrator handles authentication, error mapping, and response normalization. The action node exposes a simplified interface — the same ACI principle.
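The rate-limiting row of the adapter table can be sketched as a retry wrapper with exponential backoff. This is a minimal illustration, not a production implementation: `withBackoff` is a hypothetical helper, and a real adapter would retry only retryable errors (429, 503) and honor any `Retry-After` header.

```typescript
// Sketch of the adapter's rate-limiting duty: retry a tool call with
// exponential backoff instead of surfacing raw 429s to the LLM.
// `fn` is any async tool implementation.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt <= maxRetries; attempt++) {
    try {
      return await fn();
    } catch (err) {
      lastError = err;
      if (attempt === maxRetries) break;
      // Exponential delay: 500ms, 1s, 2s, ...
      const delay = baseDelayMs * 2 ** attempt;
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
  throw lastError;
}
```

A tool handler would then call, for example, `withBackoff(() => listOpenIssues("my-org/my-repo"))`, keeping the retry logic out of both the tool definition and the LLM's view.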
Error Handling for Tool Calls
LLMs will make mistakes when calling tools. Design your error handling to help the LLM self-correct:
Error response format
```json
{
  "success": false,
  "error": {
    "code": "INVALID_REPO_FORMAT",
    "message": "Repository must be in 'owner/repo' format. You provided 'my-repo'. Try 'your-org/my-repo' instead.",
    "suggestion": "Use list_user_repos to find the correct repository name."
  }
}
```
Error handling principles
| Principle | Bad | Good |
|---|---|---|
| Be specific | "Invalid input" | "The date parameter must be ISO 8601 format (YYYY-MM-DD). Received: 'next friday'" |
| Suggest fixes | "Not found" | "User 'jdoe' not found. Did you mean 'john-doe'? Use search_users to find the correct username" |
| Guide to alternatives | "Access denied" | "You don't have permission to delete this resource. Use get_resource to view it instead" |
| Include context | "Rate limited" | "Rate limited by GitHub API. Retry after 30 seconds or narrow your search query to reduce results" |
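These principles are easy to enforce with a small helper that builds the error envelope. The `toolError` and `parseIsoDate` names below are hypothetical, chosen to mirror the "be specific" and "suggest fixes" rows of the table:

```typescript
type ToolErrorResult = {
  success: false;
  error: { code: string; message: string; suggestion: string };
};
type ToolOkResult = { success: true; data: string };

// Hypothetical helper: a machine-readable code, a specific message,
// and a concrete next step for the model to take.
function toolError(
  code: string,
  message: string,
  suggestion: string,
): ToolErrorResult {
  return { success: false, error: { code, message, suggestion } };
}

// Validation that fails with an actionable error instead of "Invalid input"
function parseIsoDate(input: string): ToolOkResult | ToolErrorResult {
  if (!/^\d{4}-\d{2}-\d{2}$/.test(input)) {
    return toolError(
      "INVALID_DATE_FORMAT",
      `The date parameter must be ISO 8601 format (YYYY-MM-DD). Received: '${input}'.`,
      "Resolve relative dates like 'next friday' to a concrete date first.",
    );
  }
  return { success: true, data: input };
}
```

Routing every failure through one helper keeps error shapes consistent, so the LLM learns a single recovery pattern instead of one per tool.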
How Many Tools Is Too Many?
Research and practice show that tool selection accuracy degrades as the number of tools increases:
| Tool count | Typical accuracy | Recommendation |
|---|---|---|
| 1–5 | Very high | Ideal for focused agents |
| 6–10 | High | Good balance of capability and reliability |
| 11–15 | Moderate | Use clear descriptions to differentiate |
| 15+ | Declining | Split into multiple specialized agents |
If you need more than 15 tools, use the routing or orchestrator-workers pattern from previous modules to split across specialized agents.
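One way to picture the split is a registry that partitions tools across specialist agents, so any single request carries only a small slice. Everything below is illustrative: the domains, tool names, and the keyword-based `routeToAgent` stub are invented, and in practice the router would itself be an LLM call, as in the routing pattern from earlier modules.

```typescript
// Illustrative sketch: instead of one agent with 9+ tools, partition
// tools across specialist agents and send each request only one slice.
const toolSets: Record<string, string[]> = {
  billing: ["list_invoices", "get_invoice", "refund_payment"],
  support: ["search_tickets", "get_ticket", "reply_to_ticket"],
  code: ["list_open_issues", "get_pull_request", "search_code"],
};

// Stand-in for an LLM routing call; a real router would classify
// the request with a cheap model rather than keyword matching.
function routeToAgent(userMessage: string): keyof typeof toolSets {
  if (/invoice|refund|charge/i.test(userMessage)) return "billing";
  if (/ticket|help|support/i.test(userMessage)) return "support";
  return "code";
}

const agent = routeToAgent("Can I get a refund for invoice INV-42?");
const toolsForRequest = toolSets[agent]; // 3 tools, not 9
```

Each specialist agent now stays well inside the high-accuracy range of the table above, at the cost of one extra routing step per request.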
What you now understand
| Concept | Key takeaway |
|---|---|
| ACI design | Tools should be named clearly, focused, and return predictable formats |
| Function calling | LLM returns structured tool calls; your code executes them and returns results |
| Adapter pattern | Wrap external APIs to simplify the interface for the LLM |
| Error handling | Errors should be specific, suggest fixes, and guide to alternatives |
| Tool scaling | Keep under 15 tools per agent; split with routing for more |
Up next: Planning, Memory & Evaluation — how agents decompose tasks, maintain context, and evaluate their own outputs.