Installation

Prerequisites

Node.js 20.16+ (22 LTS recommended for native TypeScript execution)
pnpm, npm, yarn, or bun package manager
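
If you want to check the Node.js requirement from a script, a minimal sketch follows. The helper names (`parseMajorMinor`, `meetsMinimum`) are hypothetical and not part of agent-eval-kit; only the 20.16 minimum comes from the prerequisites above.

```typescript
// Minimum Node.js version stated in the prerequisites (20.16+).
const MIN_NODE: [number, number] = [20, 16];

// Parse a version string such as "v20.16.0" or "22.3.1" into [major, minor].
// Hypothetical helper, not part of agent-eval-kit.
function parseMajorMinor(version: string): [number, number] {
  const [major = "0", minor = "0"] = version.replace(/^v/, "").split(".");
  return [Number(major), Number(minor)];
}

// Return true when `version` is at least the required major.minor.
// Hypothetical helper, not part of agent-eval-kit.
function meetsMinimum(
  version: string,
  min: [number, number] = MIN_NODE,
): boolean {
  const [major, minor] = parseMajorMinor(version);
  return major > min[0] || (major === min[0] && minor >= min[1]);
}
```

In a Node.js script, calling `meetsMinimum(process.version)` checks the running runtime, since `process.version` holds a string such as `v22.11.0`.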

Install

# pnpm (recommended)
pnpm add -D agent-eval-kit

# npm
npm install --save-dev agent-eval-kit

# yarn
yarn add --dev agent-eval-kit

# bun
bun add -D agent-eval-kit

Verify installation

npx agent-eval-kit --help

You should see the available commands listed.

Initialize a project

The init wizard creates a starter config, case files, and optionally a GitHub Actions workflow:

npx agent-eval-kit init

This creates:

eval.config.ts — main configuration file with a framework-specific target stub
cases/smoke.jsonl — 3 starter test cases
.eval-fixtures/.gitkeep — fixture directory
.github/workflows/evals.yml — CI workflow (optional)
AGENTS.md — AI agent boundaries file (optional)

The wizard auto-detects your framework (Vercel AI SDK, LangChain, Mastra, or custom) and package manager.

For non-interactive setup, pass the --yes flag:

npx agent-eval-kit init --yes
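
The starter cases file has a .jsonl extension, which conventionally means one JSON value per line. To inspect it from your own tooling, a minimal sketch might look like the following. It assumes the file follows that JSONL convention and makes no assumption about the fields inside each case; `parseJsonl` is a hypothetical helper, not an agent-eval-kit API.

```typescript
// Parse JSONL text: one JSON value per non-empty line.
// Hypothetical helper; the shape of each case is left as `unknown`
// because the case schema is not described in this section.
function parseJsonl(text: string): unknown[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0)
    .map((line, index) => {
      try {
        return JSON.parse(line) as unknown;
      } catch (error) {
        // `index` counts non-empty records, not raw line numbers.
        throw new Error(`Invalid JSON in record ${index + 1}: ${String(error)}`);
      }
    });
}
```

To use it, read cases/smoke.jsonl as UTF-8 text (for example with `readFileSync` from `node:fs`) and pass the text to `parseJsonl`.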

Package exports

agent-eval-kit provides several subpath exports for targeted imports:

| Import | Description |
| --- | --- |
| `agent-eval-kit` | Main entry: `defineConfig`, runner, storage, comparison, caching utilities |
| `agent-eval-kit/graders` | All 20 graders, composition operators, scoring, grader types |
| `agent-eval-kit/plugin` | Plugin interface types (`EvalPlugin`, `PluginHooks`) |
| `agent-eval-kit/reporters` | `resolveReporter`, for custom reporter plugin integration |
| `agent-eval-kit/comparison` | `compareRuns`, `formatComparisonReport` |
| `agent-eval-kit/fixtures` | Fixture loading and management |
| `agent-eval-kit/watcher` | File watcher for watch mode |
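
As a rough illustration, targeted imports might look like the sketch below. It uses only the names listed in the table and assumes they are named exports of each subpath; check the package's type declarations for the exact export shapes.

```typescript
// Main entry: config helper (assumed to be a named export).
import { defineConfig } from "agent-eval-kit";

// Comparison utilities from their dedicated subpath.
import { compareRuns, formatComparisonReport } from "agent-eval-kit/comparison";

// Plugin interface types; imported as types only, since the table
// describes this subpath as exporting interface types.
import type { EvalPlugin, PluginHooks } from "agent-eval-kit/plugin";

// Reporter resolution for custom reporter plugin integration.
import { resolveReporter } from "agent-eval-kit/reporters";
```

Importing from a subpath rather than the main entry keeps each import limited to the area of the package you actually need.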

Next steps

Continue to Quick Start to run your first eval.