Skip to main content
Evals are functions that run against a session and return a pass/fail result plus an optional 0-1 score. Claudeye runs them when you open a session, caches the results, and shows them in a panel on the session page.

What evals give you

  • Pass/fail per session: see at a glance which sessions met your criteria
  • Score (0-1): track quality over time, not just binary outcomes
  • Message: a short human-readable explanation shown in the UI
  • Conditional skipping: skip evals for sessions where they don’t apply

Quick example

Create a file my-evals.js:
my-evals.js
import { createApp } from 'claudeye';

const app = createApp();

// Grade sessions by turn count
app.eval('under-50-turns', ({ stats }) => ({
  pass: stats.turnCount <= 50,
  score: Math.max(0, 1 - stats.turnCount / 100),
  message: `${stats.turnCount} turns`,
}));

app.listen();
Then run Claudeye with the --evals flag:
claudeye --evals ./my-evals.js
Open any session and you’ll see the under-50-turns result in the Evals panel.

What’s available in an eval

Each eval function receives a context object with:
FieldTypeDescription
statsEvalLogStatsComputed session stats (turns, tool calls, duration, models, subagents)
entriesobject[]Raw JSONL entries for the session
projectNamestringProject name
sessionIdstringSession ID
sourcestringEntry point identifier (useful for subagent evals)

stats fields

stats.turnCount       // Number of turns (user+assistant pairs)
stats.toolCallCount   // Total tool calls
stats.subagentCount   // Number of subagents spawned
stats.duration        // Session duration string
stats.models          // Array of model IDs used

What an eval returns

{
  pass: boolean,     // Required: did the session pass?
  score?: number,    // Optional: 0 to 1, clamped automatically
  message?: string,  // Optional: shown in the UI
}

Next steps