Configuration
Evaliphy is configured via an evaliphy.config.ts file in your project root. Use the defineConfig helper for full TypeScript support.
import { defineConfig } from 'evaliphy';

export default defineConfig({
  // ... configuration options
});
Configuration Options
Root Options
| Option | Type | Default | Description |
|---|---|---|---|
| evalDir | string | './evals' | The root directory where Evaliphy searches for evaluation files. |
| testMatch | string[] | ['**/*.eval.ts', '**/*.eval.js'] | Glob patterns to identify evaluation files within the evalDir. |
| testIgnore | string[] | ['**/node_modules/**'] | Glob patterns to exclude when scanning for evaluation files. |
| timeout | number | - | Global timeout for individual test functions in milliseconds. |
| reporters | string \| string[] | ['console'] | List of reporters to use. Built-in options: 'console', 'html', 'json'. |
| http | HttpConfig | - | Global settings for the built-in httpClient fixture. |
| llmAsJudgeConfig | LLMJudgeConfig | - | Configuration for the LLM used to evaluate assertions. |
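To illustrate how testMatch and testIgnore interact, here is a minimal sketch of file discovery. It assumes simplified glob semantics (`**/` spans any number of directories, `*` stays within one path segment) and paths relative to evalDir. Evaliphy's actual glob engine is not described on this page, and `globToRegExp` and `selectEvalFiles` are hypothetical helpers, not part of its API.

```typescript
// Hypothetical helper: translate a simple glob into a RegExp.
// Supports only `**/`, `**`, and `*`; everything else is matched literally.
function globToRegExp(glob: string): RegExp {
  let source = '';
  let i = 0;
  while (i < glob.length) {
    if (glob.startsWith('**/', i)) {
      source += '(?:.*/)?'; // zero or more leading directories
      i += 3;
    } else if (glob.startsWith('**', i)) {
      source += '.*'; // anything, including slashes
      i += 2;
    } else if (glob.charAt(i) === '*') {
      source += '[^/]*'; // anything within a single path segment
      i += 1;
    } else {
      // Escape characters that are special in regular expressions.
      source += glob.charAt(i).replace(/[.+?^${}()|[\]\\]/g, '\\$&');
      i += 1;
    }
  }
  return new RegExp(`^${source}$`);
}

// Hypothetical helper: keep paths that match any testMatch pattern
// and do not match any testIgnore pattern.
function selectEvalFiles(
  paths: string[],
  testMatch: string[],
  testIgnore: string[],
): string[] {
  const include = testMatch.map(globToRegExp);
  const exclude = testIgnore.map(globToRegExp);
  return paths.filter(
    (path) =>
      include.some((re) => re.test(path)) &&
      !exclude.some((re) => re.test(path)),
  );
}

const discovered = selectEvalFiles(
  ['rag/answer.eval.ts', 'smoke.eval.js', 'helpers/util.ts', 'node_modules/pkg/x.eval.js'],
  ['**/*.eval.ts', '**/*.eval.js'],
  ['**/node_modules/**'],
);
console.log(discovered); // [ 'rag/answer.eval.ts', 'smoke.eval.js' ]
```

In this sketch an ignore pattern always wins over a match pattern, which is why the file under node_modules is dropped even though its name ends in .eval.js.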
llmAsJudgeConfig
Configures the LLM "Judge" that evaluates your assertions.
| Option | Type | Default | Description |
|---|---|---|---|
| model | string | Required | The model identifier (e.g., 'gpt-4o-mini', 'claude-3-5-sonnet-20240620'). |
| provider | LLMProvider | Required | Configuration for the LLM provider (see Providers below). |
| temperature | number | 0 | Controls randomness. Set to 0 for deterministic evaluation results. |
| maxTokens | number | 1000 | Maximum tokens allowed for the judge's reasoning and verdict. |
| timeout | number | - | Specific timeout for LLM judge calls, overriding the global timeout. |
| promptsDir | string | Prompts shipped with Evaliphy | Directory containing custom prompt templates for assertions. |
| continueOnFailure | boolean | true | If true, continues executing the test even after an assertion fails (Soft Assertions). If false, stops execution immediately (Hard Assertions). |
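The difference between soft and hard assertions can be sketched with a tiny runner. This is an illustration of the behavior described for continueOnFailure, not Evaliphy's implementation; the `Assertion` type and `runAssertions` function are hypothetical names introduced here.

```typescript
// Hypothetical shape of a single assertion: a name and a pass/fail check.
interface Assertion {
  name: string;
  check: () => boolean;
}

interface RunResult {
  executed: string[]; // assertions that actually ran
  failures: string[]; // assertions that ran and failed
}

// Runs assertions in order. With continueOnFailure = true (soft), every
// assertion runs and all failures are recorded. With false (hard), the run
// stops right after the first failing assertion.
function runAssertions(
  assertions: Assertion[],
  continueOnFailure: boolean,
): RunResult {
  const result: RunResult = { executed: [], failures: [] };
  for (const assertion of assertions) {
    result.executed.push(assertion.name);
    if (!assertion.check()) {
      result.failures.push(assertion.name);
      if (!continueOnFailure) break;
    }
  }
  return result;
}

const sampleAssertions: Assertion[] = [
  { name: 'faithful', check: () => true },
  { name: 'relevant', check: () => false },
  { name: 'grounded', check: () => false },
];

console.log(runAssertions(sampleAssertions, true).failures); // [ 'relevant', 'grounded' ]
console.log(runAssertions(sampleAssertions, false).failures); // [ 'relevant' ]
```

Soft mode gives a complete picture of what failed in one run, while hard mode avoids spending judge calls on a test that has already failed.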
Providers
Evaliphy can connect to major LLM providers directly or through a gateway.
Direct Provider
{
  type: 'openai' | 'anthropic' | 'google' | 'mistral' | 'groq' | 'cohere',
  apiKey?: string // Falls back to process.env[PROVIDER_API_KEY]
}
Gateway Provider (e.g., OpenRouter, LiteLLM)
{
  type: 'gateway',
  url: string, // The base URL of your gateway
  apiKey?: string,
  name?: string // Optional name for logging/reporting
}
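The two provider shapes form a discriminated union on the type field. The sketch below restates them as TypeScript types and shows one way the API key fallback could be resolved. The type names `DirectProvider` and `GatewayProvider`, the `resolveApiKey` helper, and the `<TYPE>_API_KEY` variable naming (inferred from the OPENAI_API_KEY example) are assumptions for illustration, not confirmed parts of Evaliphy's API.

```typescript
type DirectProviderType =
  | 'openai' | 'anthropic' | 'google' | 'mistral' | 'groq' | 'cohere';

// Assumed type names; the fields mirror the shapes documented above.
interface DirectProvider {
  type: DirectProviderType;
  apiKey?: string;
}

interface GatewayProvider {
  type: 'gateway';
  url: string;
  apiKey?: string;
  name?: string;
}

type LLMProvider = DirectProvider | GatewayProvider;

// Hypothetical helper: an explicit apiKey wins; otherwise a direct provider
// falls back to an environment variable named after its type (an assumed
// convention, e.g. 'openai' -> OPENAI_API_KEY). No fallback is documented
// for gateways, so none is applied here.
function resolveApiKey(
  provider: LLMProvider,
  env: Record<string, string | undefined>,
): string | undefined {
  if (provider.apiKey !== undefined) return provider.apiKey;
  if (provider.type === 'gateway') return undefined;
  return env[`${provider.type.toUpperCase()}_API_KEY`];
}

const exampleEnv = { OPENAI_API_KEY: 'sk-from-env' };
console.log(resolveApiKey({ type: 'openai' }, exampleEnv)); // sk-from-env
console.log(resolveApiKey({ type: 'openai', apiKey: 'sk-explicit' }, exampleEnv)); // sk-explicit
```

Passing the environment as a parameter, rather than reading process.env inside the function, keeps the sketch easy to test.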
http
Global settings for the httpClient fixture, allowing you to pre-configure your RAG API connection.
| Option | Type | Default | Description |
|---|---|---|---|
| baseUrl | string | Required | The base URL to prepend to all request paths in your tests. |
| timeout | number | 120000 (2m) | Global request timeout in milliseconds. |
| headers | Record<string, string> | - | Common HTTP headers (e.g., Authorization) sent with every request. |
| retry | object | - | Retry logic for failed requests (5xx or network errors). |
| retry.attempts | number | - | Number of retry attempts. |
| retry.delay | number | - | Delay between retries in milliseconds. |
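The retry options can be read as the following loop. This is a sketch of the described behavior, not Evaliphy's internal code: `requestWithRetry` and `ResponseLike` are hypothetical names, and it assumes that attempts counts retries after the initial request (so attempts: 2 allows up to three requests) and that a thrown error stands in for a network failure.

```typescript
interface RetryOptions {
  attempts: number; // retries after the first request (assumed meaning)
  delay: number; // milliseconds to wait between tries
}

// Minimal stand-in for an HTTP response; only the status code matters here.
interface ResponseLike {
  status: number;
}

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Retries on 5xx responses and on thrown (network) errors. Responses below
// 500 are returned immediately; once retries are exhausted, the last
// response is returned or the last error is rethrown.
async function requestWithRetry<T extends ResponseLike>(
  send: () => Promise<T>,
  retry: RetryOptions,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    const isLastTry = attempt >= retry.attempts;
    try {
      const response = await send();
      if (response.status < 500 || isLastTry) return response;
    } catch (error) {
      if (isLastTry) throw error;
    }
    await sleep(retry.delay);
  }
}
```

For example, with `{ attempts: 2, delay: 1000 }`, a request that returns 503 twice and then 200 resolves with the 200 response after roughly two seconds of waiting. A 4xx response is returned on the first try, since client errors are not listed as retryable.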
Example Configuration
The sample configuration below combines the options described above.
import { defineConfig } from 'evaliphy';

export default defineConfig({
  // Discovery
  evalDir: './evals',
  testMatch: ['**/*.eval.ts'],

  // LLM Judge Setup
  llmAsJudgeConfig: {
    model: 'gpt-4o-mini',
    provider: {
      type: 'openai',
      apiKey: process.env.OPENAI_API_KEY,
    },
    temperature: 0,
    continueOnFailure: true, // Soft assertions: record all failures but keep going
  },

  // API Connection
  http: {
    baseUrl: 'https://api.staging.example.com/v1',
    timeout: 30000,
    headers: {
      'X-Eval-Session': 'qa-regression',
    },
    retry: {
      attempts: 2,
      delay: 1000,
    },
  },

  // Reporting
  reporters: ['console', 'html'],
});