Scripting Guide
Lesson 11: AI in Your Agents (llm)
Connecting Groq, Mistral, or Gemini and calling models from agents with llm.chat: options, limits, and patterns.
Setup
AI calls run on your own API key. Run /llm, pick a provider, and paste a key from that provider's console. You can connect any or all of the three supported providers; keys are stored encrypted and never exposed to agent code. The llm helper is the third argument of your agent function.
| Provider | Default model | Notes |
|---|---|---|
| Groq | llama-3.3-70b-versatile | Fast and has a generous free tier; great default. |
| Mistral | mistral-small-latest | |
| Gemini | gemini-2.0-flash | |
The llm API
await llm.providers() returns the list of connected provider names, e.g. ["groq"].
await llm.chat(options) sends a chat completion request and returns the reply.
| chat option | Type | Notes |
|---|---|---|
| provider | string (required) | "groq", "mistral", or "gemini"; must be connected. |
| messages | array (required) | { role, content } objects; max 50 messages, 64,000 characters total. |
| model | string | Override the default model. |
| maxTokens | number | Default 512, capped at 4096. |
| temperature | number | Optional sampling temperature. |
| stop | string or array | Optional stop sequences. |
On success the result has text (the reply), finishReason, model, provider, and usage (token counts). On failure it has an error field instead; AI calls never throw.
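A minimal sketch of how the two calls fit together, assuming only the shapes described above: llm.providers() resolves to an array of names, and llm.chat() resolves to an object carrying either text or error. The askFirstProvider helper and the fakeLlm stand-in are hypothetical names used for illustration; in a real agent you would pass the llm argument your function receives.

```javascript
// Hypothetical helper: ask the first connected provider and normalize the outcome.
async function askFirstProvider(llm, messages, preferred = ["groq", "mistral", "gemini"]) {
  const connected = await llm.providers();
  const provider = preferred.find((name) => connected.includes(name));
  if (!provider) return { ok: false, reason: "no provider connected" };

  const res = await llm.chat({ provider, messages, maxTokens: 256 });
  // llm.chat does not throw; failures arrive as an `error` field.
  if (res.error) return { ok: false, reason: String(res.error) };
  return { ok: true, text: res.text, provider: res.provider };
}

// Stand-in for the real `llm` argument, used only to exercise the helper locally.
// The contents of `usage` are left empty because its exact shape is not shown here.
const fakeLlm = {
  providers: async () => ["mistral"],
  chat: async ({ provider }) => ({
    text: "pong",
    finishReason: "stop",
    model: "mistral-small-latest",
    provider,
    usage: {},
  }),
};

askFirstProvider(fakeLlm, [{ role: "user", content: "ping" }]).then((out) => {
  console.log(out);
});
```

Returning a small { ok, ... } object keeps the calling handler to a single branch, which suits an API that reports failures as data rather than exceptions.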
An AI question command

    export async function onMessage(message, db, llm) {
      if (message.author.bot) return;
      if (!message.content.startsWith("!ask ")) return;

      const question = message.content.slice(5).trim();
      if (!question) return;

      // Show a typing indicator while the model thinks.
      await message.channel.sendTyping();

      const res = await llm.chat({
        provider: "groq",
        messages: [
          {
            role: "system",
            content:
              "You are a helpful assistant in a Discord server. " +
              "Answer in under 150 words. Plain text only.",
          },
          { role: "user", content: question },
        ],
        maxTokens: 400,
      });

      if (res.error) {
        await message.reply("AI error: " + res.error);
        return;
      }

      // Discord messages cap at 2000 characters; slice to stay safe.
      await message.reply(res.text.slice(0, 1900));
    }

Expected behavior: !ask why is the sky blue shows the bot typing, then replies with a short model-written answer. If the key is invalid or the provider is down, the reply is AI error: ... instead of silence.
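Slicing drops whatever does not fit. If you would rather keep the whole answer, one option is to split it into several messages. The sketch below is a hypothetical splitForDiscord helper, not part of the llm API; it assumes Discord's 2,000-character limit for regular message content and prefers to break at whitespace.

```javascript
// Split text into chunks no longer than `limit`, preferring to break at whitespace.
function splitForDiscord(text, limit = 2000) {
  const chunks = [];
  let rest = text.trim();
  while (rest.length > limit) {
    // Find the last space or newline inside the window; fall back to a hard cut.
    let cut = Math.max(rest.lastIndexOf(" ", limit), rest.lastIndexOf("\n", limit));
    if (cut <= 0) cut = limit;
    chunks.push(rest.slice(0, cut).trimEnd());
    rest = rest.slice(cut).trimStart();
  }
  if (rest.length > 0) chunks.push(rest);
  return chunks;
}

const parts = splitForDiscord("word ".repeat(1000));
console.log(parts.length, parts.map((p) => p.length));
```

Each chunk would then need its own send, and every extra send takes time out of the same run, so for short answers the simple slice is usually enough.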
Timing matters
Each AI request is cut off after 5 seconds on Free (10 seconds with Premium). Runs in servers with an LLM provider connected get an extended execution window: about 8 seconds on Free (5 s + 3 s) and 15 seconds with Premium (10 s + 5 s). Budget for one AI call per run; chaining several can run past the execution window.
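The arithmetic behind "one AI call per run" can be sketched as a small check. This assumes the approximate windows quoted above and that your agent records its own start time with Date.now(); LIMITS, canStartCall, and the 500 ms safety margin are hypothetical, not something the runtime provides.

```javascript
// Limits quoted in this lesson, in milliseconds (the windows are approximate).
const LIMITS = {
  free: { perCallMs: 5000, windowMs: 8000 },
  premium: { perCallMs: 10000, windowMs: 15000 },
};

// Hypothetical helper: is there room for one more worst-case AI call in this run?
function canStartCall(startedAt, now, tier = "free", safetyMs = 500) {
  const { perCallMs, windowMs } = LIMITS[tier];
  const elapsed = now - startedAt;
  return elapsed + perCallMs + safetyMs <= windowMs;
}

const startedAt = Date.now();
// A call at the very start of a Free run fits: 0 + 5000 + 500 <= 8000.
console.log(canStartCall(startedAt, startedAt)); // true
// A second call after a 3 s first call does not: 3000 + 5000 + 500 > 8000.
console.log(canStartCall(startedAt, startedAt + 3000)); // false
```

The check is deliberately pessimistic: it assumes every call takes the full cutoff, which is the case you need to survive.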
Pattern: AI moderation assist

    export async function onMessage(message, db, llm) {
      if (message.author.bot) return;
      if (message.content.length < 12) return; // skip tiny messages

      const res = await llm.chat({
        provider: "groq",
        messages: [
          {
            role: "system",
            content:
              "Classify the message as SAFE or TOXIC. Reply with exactly one word.",
          },
          // Cap the input so one long message cannot eat the character budget.
          { role: "user", content: message.content.slice(0, 1000) },
        ],
        maxTokens: 3,
        temperature: 0,
      });

      if (res.error) return; // fail open: never punish on an API error

      if (res.text.trim().toUpperCase().startsWith("TOXIC")) {
        await message.delete();
        try {
          await message.author.send(
            "Your message was removed by the AI moderator. A human can review it on request."
          );
        } catch {
          // The user may have DMs closed; the removal still stands.
        }
      }
    }

An agent on messageCreate calls your provider on every message it does not filter out. That spends your API quota and counts against your GuildScript run limits. Filter aggressively (length checks, channel checks, prefixes) before calling llm.chat.
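The filters mentioned above can live in one small function that runs before any AI call. This is a sketch under assumptions: shouldCallModel and its option names are hypothetical, and it assumes the channel id is readable as message.channel.id, which this lesson does not confirm.

```javascript
// Hypothetical pre-filter: return true only for messages worth an AI call.
function shouldCallModel(message, options = {}) {
  const { minLength = 12, allowedChannelIds = null, prefix = null } = options;
  if (message.author.bot) return false;

  const content = message.content.trim();
  if (content.length < minLength) return false;
  if (prefix && !content.startsWith(prefix)) return false;

  // Assumption: the channel id is available as message.channel.id.
  if (allowedChannelIds && !allowedChannelIds.includes(message.channel.id)) return false;
  return true;
}

// Plain objects shaped like the fields the filter reads, for a quick local check.
const sample = {
  author: { bot: false },
  content: "is this message long enough to classify?",
  channel: { id: "123" },
};
console.log(shouldCallModel(sample, { allowedChannelIds: ["123"] }));
console.log(shouldCallModel(sample, { allowedChannelIds: ["999"] }));
```

Because every check is cheap and local, ordering them before llm.chat means rejected messages cost no quota at all.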
Common pitfalls
- Provider not connected. chat returns { error: "provider not connected" }; run /llm first or pick a provider from await llm.providers().
- Replying with raw model output. Models can exceed Discord's length limits; always slice.
- No persistence between runs. The model does not remember earlier conversations; build context yourself by storing past exchanges in db and replaying them in messages (mind the 50-message / 64,000-character caps).
- Acting on errors. For destructive actions (delete, ban) treat an API error as SAFE; never punish users because an API hiccuped.
- Slow interactions. In interactionCreate handlers call deferReply() before llm.chat, then editReply() with the answer.
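For the persistence pitfall, replaying stored exchanges means trimming them to the documented caps before each call. The sketch below assumes history is an array of { role, content } objects you have already loaded from db (the db calls themselves are not shown), and that the 64,000-character cap counts content lengths only. buildMessages is a hypothetical helper name.

```javascript
const MAX_MESSAGES = 50;
const MAX_CHARS = 64000;

// Hypothetical helper: system prompt + as much recent history as fits + the new question.
function buildMessages(systemPrompt, history, question) {
  const system = { role: "system", content: systemPrompt };
  const user = { role: "user", content: question };
  let chars = system.content.length + user.content.length;
  const kept = [];

  // Walk history from newest to oldest, keeping entries while both caps hold.
  for (let i = history.length - 1; i >= 0; i--) {
    const entry = history[i];
    if (kept.length + 2 >= MAX_MESSAGES) break; // +2 for the system and user messages
    if (chars + entry.content.length > MAX_CHARS) break;
    chars += entry.content.length;
    kept.unshift(entry);
  }
  return [system, ...kept, user];
}

const history = [
  { role: "user", content: "What is a closure?" },
  { role: "assistant", content: "A function bundled with the variables it captured." },
];
console.log(buildMessages("Answer briefly.", history, "Can you give an example?"));
```

Dropping the oldest entries first keeps the most recent context, which is usually what a follow-up question depends on. The sketch does not handle a system prompt and question that exceed the cap on their own.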
Exercise
Build !translate <text> that asks the model to translate the text to English and replies with the translation. Then extend it: store each user's last 4 !ask exchanges in db and include them as prior messages so the model can handle follow-up questions.