Installation

  • Node.js 20.16+ (22 LTS recommended for native TypeScript execution)
  • pnpm, npm, yarn, or bun package manager
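
To fail fast on an unsupported runtime, a project could check the Node.js version before running evals. The following is a minimal sketch and not part of agent-eval-kit: `meetsMinimum` and `MIN_NODE` are hypothetical names, and only the 20.16 floor comes from the prerequisites above.

```typescript
// Minimum Node.js version stated in the prerequisites (major, minor).
const MIN_NODE: readonly [number, number] = [20, 16];

// Returns true when `version` (e.g. "22.11.0" or "v20.16.0") is at least `min`.
// Hypothetical helper for illustration; the patch component is ignored.
function meetsMinimum(version: string, min: readonly [number, number] = MIN_NODE): boolean {
  const [major = 0, minor = 0] = version.replace(/^v/, "").split(".").map(Number);
  return major > min[0] || (major === min[0] && minor >= min[1]);
}

// Report whether the current runtime satisfies the documented minimum.
const current = process.versions.node;
console.log(
  meetsMinimum(current)
    ? `Node.js ${current} satisfies >= ${MIN_NODE.join(".")}`
    : `Node.js ${current} is older than ${MIN_NODE.join(".")}`,
);
```

A check like this could run at the top of a CI script so that version problems surface before any eval starts.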

Install the package as a dev dependency with your package manager:

# pnpm (recommended)
pnpm add -D agent-eval-kit
# npm
npm install --save-dev agent-eval-kit
# yarn
yarn add --dev agent-eval-kit
# bun
bun add -D agent-eval-kit

Verify the installation:

npx agent-eval-kit --help

You should see the available commands listed.

The init wizard creates a starter config, case files, and optionally a GitHub Actions workflow:

npx agent-eval-kit init

This creates:

  • eval.config.ts — main configuration file with a framework-specific target stub
  • cases/smoke.jsonl — 3 starter test cases
  • .eval-fixtures/.gitkeep — fixture directory
  • .github/workflows/evals.yml — CI workflow (optional)
  • AGENTS.md — AI agent boundaries file (optional)
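
After running init, one quick way to confirm the scaffold is to check for the files listed above. This sketch is an illustration and not something the package ships: `missing`, `REQUIRED`, and `OPTIONAL` are hypothetical names, and treating the first three paths as required is an assumption drawn from the list.

```typescript
import { existsSync } from "node:fs";
import { join } from "node:path";

// Paths taken from the list above. Treating these three as required is an assumption.
const REQUIRED = ["eval.config.ts", "cases/smoke.jsonl", ".eval-fixtures/.gitkeep"];
// Marked optional in the list, so they are reported but never treated as errors.
const OPTIONAL = [".github/workflows/evals.yml", "AGENTS.md"];

// Returns the entries of `paths` that do not exist under `root`.
// `exists` is injectable so the check can be exercised without touching the disk.
function missing(
  root: string,
  paths: string[],
  exists: (p: string) => boolean = existsSync,
): string[] {
  return paths.filter((rel) => !exists(join(root, rel)));
}

const root = process.cwd();
console.log("missing required:", missing(root, REQUIRED));
console.log("optional not present:", missing(root, OPTIONAL));
```

Run it from the project root; an empty "missing required" array suggests the wizard completed as described.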

The wizard auto-detects your framework (Vercel AI SDK, LangChain, Mastra, or custom) and package manager.

For non-interactive setup: npx agent-eval-kit init --yes

agent-eval-kit provides several subpath exports for targeted imports:

| Import | Description |
| --- | --- |
| agent-eval-kit | Main entry: defineConfig, runner, storage, comparison, caching utilities |
| agent-eval-kit/graders | All 20 graders, composition operators, scoring, grader types |
| agent-eval-kit/plugin | Plugin interface types (EvalPlugin, PluginHooks) |
| agent-eval-kit/reporters | resolveReporter, for custom reporter plugin integration |
| agent-eval-kit/comparison | compareRuns, formatComparisonReport |
| agent-eval-kit/fixtures | Fixture loading and management |
| agent-eval-kit/watcher | File watcher for watch mode |

Continue to Quick Start to run your first eval.