Evaliphy is currently in beta. It is not recommended for production use yet. Please try it out and share your feedback.

Configuration

Evaliphy is configured via an evaliphy.config.ts file in your project root. Use the defineConfig helper for full TypeScript support.

import { defineConfig } from 'evaliphy';

export default defineConfig({
  // ... configuration options
});
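
Helpers of this kind are often plain identity functions whose only purpose is to give the object literal a known type. The sketch below illustrates that pattern with a hypothetical `defineSketchConfig` and a trimmed-down `SketchConfig` shape; both names are assumptions for illustration, not Evaliphy's actual implementation.

```typescript
// Hypothetical, trimmed-down config shape; the real Evaliphy types are richer.
interface SketchConfig {
  evalDir?: string;
  testMatch?: string[];
  reporters?: string | string[];
}

// Identity helper: returns its argument unchanged. Because the parameter is
// typed, editors can autocomplete keys and flag typos in the object literal.
function defineSketchConfig(config: SketchConfig): SketchConfig {
  return config;
}

const sketchConfig = defineSketchConfig({
  evalDir: './evals',
  reporters: 'console',
});
```

With this pattern, a misspelled key such as `evalDirr` would be rejected at compile time rather than silently ignored at runtime.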

Configuration Options

Root Options

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `evalDir` | `string` | `'./evals'` | The root directory where Evaliphy searches for evaluation files. |
| `testMatch` | `string[]` | `['**/*.eval.ts', '**/*.eval.js']` | Glob patterns to identify evaluation files within the `evalDir`. |
| `testIgnore` | `string[]` | `['**/node_modules/**']` | Glob patterns to exclude when scanning for evaluation files. |
| `timeout` | `number` | - | Global timeout for individual test functions in milliseconds. |
| `reporters` | `string \| string[]` | `['console']` | List of reporters to use. Built-in options: `'console'`, `'html'`, `'json'`. |
| `http` | `HttpConfig` | - | Global settings for the built-in `httpClient` fixture. |
| `llmAsJudgeConfig` | `LLMJudgeConfig` | - | Configuration for the LLM used to evaluate assertions. |
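
To make the defaults concrete, here is a minimal sketch of how the root options could be resolved, including normalizing `reporters` from a single string to an array. The `resolveRootOptions` helper and its interfaces are hypothetical; only the option names and default values come from the table above.

```typescript
// Hypothetical shapes mirroring the root options table.
interface RootOptions {
  evalDir?: string;
  testMatch?: string[];
  testIgnore?: string[];
  timeout?: number;
  reporters?: string | string[];
}

interface ResolvedRootOptions {
  evalDir: string;
  testMatch: string[];
  testIgnore: string[];
  timeout: number | undefined; // no documented default
  reporters: string[];
}

// Fills in the documented defaults and always returns reporters as an array.
function resolveRootOptions(user: RootOptions = {}): ResolvedRootOptions {
  const reporters = user.reporters ?? ['console'];
  return {
    evalDir: user.evalDir ?? './evals',
    testMatch: user.testMatch ?? ['**/*.eval.ts', '**/*.eval.js'],
    testIgnore: user.testIgnore ?? ['**/node_modules/**'],
    timeout: user.timeout,
    reporters: Array.isArray(reporters) ? reporters : [reporters],
  };
}
```

For example, `resolveRootOptions({ reporters: 'html' })` keeps every other default and yields a one-element reporters array.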

llmAsJudgeConfig

Configures the LLM "Judge" that evaluates your assertions.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `model` | `string` | Required | The model identifier (e.g., `'gpt-4o-mini'`, `'claude-3-5-sonnet-20240620'`). |
| `provider` | `LLMProvider` | Required | Configuration for the LLM provider (see Providers below). |
| `temperature` | `number` | `0` | Controls randomness. Set to `0` for deterministic evaluation results. |
| `maxTokens` | `number` | `1000` | Maximum tokens allowed for the judge's reasoning and verdict. |
| `timeout` | `number` | - | Specific timeout for LLM judge calls, overriding the global timeout. |
| `promptsDir` | `string` | Prompts shipped with Evaliphy | Directory containing custom prompt templates for assertions. |
| `continueOnFailure` | `boolean` | `true` | If `true`, continues executing the test even after an assertion fails (Soft Assertions). If `false`, stops execution immediately (Hard Assertions). |
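
The difference between soft and hard assertions can be illustrated with a small, self-contained sketch. The `runChecks` function and `Check` type are hypothetical stand-ins for a test body with several assertions; they are not part of Evaliphy's API.

```typescript
// A check stands in for one assertion and its outcome.
type Check = { name: string; passed: boolean };

interface RunResult {
  executed: string[]; // checks that actually ran
  failures: string[]; // checks that ran and failed
}

// Walks the checks in order and applies the continueOnFailure semantics:
// soft mode records every failure, hard mode stops at the first one.
function runChecks(checks: Check[], continueOnFailure: boolean): RunResult {
  const executed: string[] = [];
  const failures: string[] = [];
  for (const check of checks) {
    executed.push(check.name);
    if (!check.passed) {
      failures.push(check.name);
      if (!continueOnFailure) break; // hard assertion: stop immediately
    }
  }
  return { executed, failures };
}
```

Given three checks where the second and third fail, soft mode runs all three and reports two failures, while hard mode stops after the second check and reports one.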

Providers

Evaliphy can connect to major LLM providers directly or through a gateway.

Direct Provider

{
  type: 'openai' | 'anthropic' | 'google' | 'mistral' | 'groq' | 'cohere',
  apiKey?: string // Falls back to process.env[PROVIDER_API_KEY]
}
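
The environment fallback mentioned in the comment can be sketched as follows. This assumes the variable name is the upper-cased provider type followed by `_API_KEY` (for example `OPENAI_API_KEY`); that naming rule and the `resolveApiKey` helper are assumptions for illustration, and Evaliphy's actual lookup may differ.

```typescript
type DirectProviderType = 'openai' | 'anthropic' | 'google' | 'mistral' | 'groq' | 'cohere';

interface DirectProvider {
  type: DirectProviderType;
  apiKey?: string;
}

// ASSUMPTION: the fallback variable is `<TYPE>_API_KEY`, e.g. OPENAI_API_KEY.
// The environment is passed in explicitly so the function is easy to test.
function resolveApiKey(
  provider: DirectProvider,
  env: Record<string, string | undefined>,
): string {
  const envName = `${provider.type.toUpperCase()}_API_KEY`;
  const key = provider.apiKey ?? env[envName];
  if (!key) {
    throw new Error(`No API key for '${provider.type}': set apiKey or ${envName}`);
  }
  return key;
}
```

In a Node.js project, this would typically be called as `resolveApiKey(provider, process.env)`. An explicit `apiKey` always takes precedence over the environment.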

Gateway Provider (e.g., OpenRouter, LiteLLM)

{
  type: 'gateway',
  url: string,    // The base URL of your gateway
  apiKey?: string,
  name?: string   // Optional name for logging/reporting
}
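
Because both shapes share a `type` field, they can be modeled as a discriminated union. The sketch below assumes `LLMProvider` is such a union and uses a hypothetical `providerLabel` helper to show how TypeScript narrows to the gateway shape; the OpenRouter URL is only an example value.

```typescript
type DirectProvider = {
  type: 'openai' | 'anthropic' | 'google' | 'mistral' | 'groq' | 'cohere';
  apiKey?: string;
};

type GatewayProvider = {
  type: 'gateway';
  url: string;
  apiKey?: string;
  name?: string;
};

// ASSUMPTION: LLMProvider is the union of the two shapes documented above.
type LLMProvider = DirectProvider | GatewayProvider;

// Hypothetical helper producing a short label for logs or reports.
function providerLabel(provider: LLMProvider): string {
  if (provider.type === 'gateway') {
    // Narrowed to GatewayProvider here, so `url` and `name` are available.
    return `${provider.name ?? 'gateway'} (${provider.url})`;
  }
  return provider.type;
}

const viaGateway: LLMProvider = {
  type: 'gateway',
  url: 'https://openrouter.ai/api/v1', // example gateway base URL
  name: 'openrouter',
};
```

Checking `type` first means the compiler rejects any access to `url` on a direct provider, which catches misconfigured objects early.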

http

Global settings for the httpClient fixture, allowing you to pre-configure your RAG API connection.

| Option | Type | Default | Description |
| --- | --- | --- | --- |
| `baseUrl` | `string` | Required | The base URL to prepend to all request paths in your tests. |
| `timeout` | `number` | `120000` (2 minutes) | Global request timeout in milliseconds. |
| `headers` | `Record<string, string>` | - | Common HTTP headers (e.g., `Authorization`) sent with every request. |
| `retry` | `object` | - | Retry logic for failed requests (5xx or network errors). |
| `retry.attempts` | `number` | - | Number of retry attempts. |
| `retry.delay` | `number` | - | Delay between retries in milliseconds. |
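
The retry behavior can be sketched as a small wrapper around a request function. This is an illustration under two assumptions: `attempts` counts retries after the initial request, and a thrown error represents a network failure. The `sendWithRetry` helper is hypothetical and not the `httpClient` implementation.

```typescript
interface RetryOptions {
  attempts: number; // retries after the first request (assumption)
  delay: number;    // milliseconds between tries
}

interface SimpleResponse {
  status: number;
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Retries on 5xx responses and thrown errors; 2xx to 4xx are returned as-is.
async function sendWithRetry<T extends SimpleResponse>(
  send: () => Promise<T>,
  retry: RetryOptions,
): Promise<T> {
  for (let attempt = 0; attempt <= retry.attempts; attempt++) {
    const isLast = attempt === retry.attempts;
    try {
      const response = await send();
      // On the last try, return whatever came back, even a 5xx.
      if (response.status < 500 || isLast) return response;
    } catch (error) {
      // A thrown error stands in for a network failure.
      if (isLast) throw error;
    }
    await sleep(retry.delay);
  }
  throw new Error('retry.attempts must be zero or greater');
}
```

With `{ attempts: 2, delay: 1000 }`, a request that fails twice with a 503 and then succeeds results in three sends in total and a successful response.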

Example Configuration

The following example combines discovery, LLM judge, HTTP, and reporting options in a single configuration file.

import { defineConfig } from 'evaliphy';

export default defineConfig({
  // Discovery
  evalDir: './evals',
  testMatch: ['**/*.eval.ts'],
  
  // LLM Judge Setup
  llmAsJudgeConfig: {
    model: 'gpt-4o-mini',
    provider: {
      type: 'openai',
      apiKey: process.env.OPENAI_API_KEY,
    },
    temperature: 0,
    continueOnFailure: true, // Soft assertions: record all failures but keep going
  },

  // API Connection
  http: {
    baseUrl: 'https://api.staging.example.com/v1',
    timeout: 30000,
    headers: {
      'X-Eval-Session': 'qa-regression',
    },
    retry: {
      attempts: 2,
      delay: 1000,
    },
  },

  // Reporting
  reporters: ['console', 'html'],
});
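
As a quick sanity check before running evaluations, the fields marked Required in the tables above can be verified with a small helper. The sketch below is hypothetical: `findMissingRequired` and `MinimalConfig` are illustrative names, and it assumes the `http` and `llmAsJudgeConfig` sections are themselves optional.

```typescript
// Hypothetical, loosely typed view of the config, used only for validation.
interface MinimalConfig {
  llmAsJudgeConfig?: {
    model?: string;
    provider?: { type?: string; url?: string };
  };
  http?: { baseUrl?: string };
}

// Returns the dotted paths of required fields that are absent or empty.
function findMissingRequired(config: MinimalConfig): string[] {
  const missing: string[] = [];
  const judge = config.llmAsJudgeConfig;
  if (judge) {
    if (!judge.model) missing.push('llmAsJudgeConfig.model');
    if (!judge.provider) {
      missing.push('llmAsJudgeConfig.provider');
    } else if (judge.provider.type === 'gateway' && !judge.provider.url) {
      // Gateway providers declare `url` as a non-optional field.
      missing.push('llmAsJudgeConfig.provider.url');
    }
  }
  if (config.http && !config.http.baseUrl) missing.push('http.baseUrl');
  return missing;
}
```

Applied to the example configuration above, this helper would return an empty list, since `model`, `provider`, and `baseUrl` are all set.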