All custom logic lives in a single JS file. Load it with:
claudeye --evals ./my-evals.js
The file must call app.listen() at the end.
Evals
Evals return a pass/fail result and an optional 0-1 score.
Check turn count
app.eval('under-50-turns', ({ stats }) => ({
pass: stats.turnCount <= 50,
score: Math.max(0, 1 - stats.turnCount / 100),
message: `${stats.turnCount} turns`,
}));
app.eval('tool-success-rate', ({ entries }) => {
const toolResults = entries.filter(e =>
e.type === 'user' &&
Array.isArray(e.message?.content) &&
e.message.content.some(b => b.type === 'tool_result')
);
const errors = toolResults.filter(e =>
e.message?.content?.some(b => b.is_error === true)
);
const rate = toolResults.length > 0
? 1 - (errors.length / toolResults.length)
: 1;
return {
pass: rate >= 0.9,
score: rate,
message: `${errors.length} errors out of ${toolResults.length} tool calls`,
};
});
Check that the session completed
app.eval('has-completion', ({ entries }) => {
const last = [...entries].reverse().find(e => e.type === 'assistant');
const hasText = last?.message?.content?.some?.(b => b.type === 'text');
return {
pass: !!hasText,
score: hasText ? 1.0 : 0,
message: hasText ? 'Ended with text response' : 'No final text response',
};
});
Conditional evals
Use condition to skip an eval when it doesn’t apply. Skipped evals show as “skipped” in the UI rather than failing.
// Only run this eval for sessions with at least 5 turns
app.eval('under-budget',
({ stats }) => ({
pass: stats.turnCount <= 30,
score: Math.max(0, 1 - stats.turnCount / 60),
message: `${stats.turnCount} turns`,
}),
{ condition: ({ stats }) => stats.turnCount >= 5 }
);
Global condition
To skip all evals for certain sessions (e.g., empty or test sessions), use app.condition():
// Skip all evals for sessions with no entries
app.condition(({ entries }) => entries.length > 0);
// Skip all evals for test projects
app.condition(({ projectName }) => !projectName.includes('test'));
Calling app.condition() multiple times replaces the previous condition. Only the last one is active.
Subagent evals
To grade subagents separately from the parent session, set scope:
app.eval('subagent-depth',
({ entries }) => ({
pass: entries.length > 5,
score: Math.min(entries.length / 20, 1),
}),
{ scope: 'subagent', subagentType: 'Explore' }
);
scope options:
'session' (default): runs on the parent session only
'subagent': runs on each subagent
'both': runs on both
Enrichments
Enrichments add key-value metadata to a session. They appear as a table on the session page and can power dashboard filters.
app.enrich('session-summary', ({ stats }) => ({
'Turns': stats.turnCount,
'Tool Calls': stats.toolCallCount,
'Models': stats.models.join(', ') || 'none',
}));
// Token usage breakdown
app.enrich('token-usage', ({ entries }) => {
const input = entries.reduce((s, e) => s + (e.usage?.input_tokens || 0), 0);
const output = entries.reduce((s, e) => s + (e.usage?.output_tokens || 0), 0);
return {
'Input Tokens': input,
'Output Tokens': output,
'Total Tokens': input + output,
};
});
Actions
Actions are on-demand tasks you can trigger from the session page. They receive the full session context plus eval and enrichment results.
app.action('session-summary', ({ stats, evalResults }) => {
const passed = Object.values(evalResults).filter(r => r.pass).length;
const total = Object.keys(evalResults).length;
return {
output: [
`${stats.turnCount} turns, ${stats.toolCallCount} tool calls`,
`${passed}/${total} evals passed`,
].join('\n'),
status: 'success',
};
});
Complete example
import { createApp } from 'claudeye';
const app = createApp();
// Skip empty sessions globally
app.condition(({ entries }) => entries.length > 0);
// Evals
app.eval('under-50-turns', ({ stats }) => ({
pass: stats.turnCount <= 50,
score: Math.max(0, 1 - stats.turnCount / 100),
message: `${stats.turnCount} turns`,
}));
app.eval('has-completion', ({ entries }) => {
const last = [...entries].reverse().find(e => e.type === 'assistant');
const hasText = last?.message?.content?.some?.(b => b.type === 'text');
return { pass: !!hasText, score: hasText ? 1 : 0 };
});
// Enrichments
app.enrich('overview', ({ stats }) => ({
'Turns': stats.turnCount,
'Tool Calls': stats.toolCallCount,
'Models': stats.models.join(', ') || 'none',
}));
app.listen();
claudeye --evals ./my-evals.js