Assertions
Evaliphy provides a fluent, chainable assertion API designed for black-box QA testing of Generative AI. Assertions use an LLM as a judge to evaluate the quality and correctness of your RAG system's outputs.
The expect function
The expect function is the entry point for all assertions. It accepts either a plain response string or a full EvaluationSample object.
import { expect } from 'evaliphy';
// Using a full EvaluationSample (Recommended)
await expect({
  query: "What is the return policy?",
  response: "You can return items within 30 days.",
  context: "Returns are accepted within 30 days of purchase."
}).toBeFaithful();
// Using a simple response string
await expect("The capital of France is Paris").toBeRelevant({
  query: "What is the capital of France?"
});
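The two input forms can be thought of as one shape. The sketch below is not Evaliphy's implementation: the `SampleLike` type and the `toSample` helper are hypothetical names, used only to illustrate how a bare string might be combined with per-call fields such as `query`.

```typescript
// Hypothetical shape mirroring the fields used in the examples above.
// The real EvaluationSample type exported by Evaliphy may differ.
interface SampleLike {
  response: string;
  query?: string;
  context?: string;
}

// Hypothetical helper: accept either input form and return one object.
function toSample(
  input: string | SampleLike,
  extra: Partial<SampleLike> = {}
): SampleLike {
  const base: SampleLike =
    typeof input === "string" ? { response: input } : input;
  // Per-call fields (e.g. { query }) only fill in what the input lacks;
  // fields already present on the input take precedence.
  return { ...extra, ...base };
}

const fromString = toSample("The capital of France is Paris", {
  query: "What is the capital of France?",
});
console.log(fromString.query); // prints "What is the capital of France?"
```

Under this assumption, passing a full sample up front keeps every assertion call free of extra arguments, which may be one reason the full-sample form is the recommended one.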
Core Assertions
toBeFaithful()
Checks if the response relies only on the provided context and contains zero hallucinations.
await expect({
  query: "...",
  response: "...",
  context: "..."
}).toBeFaithful();
toBeRelevant()
Checks if the response directly addresses the user's query without dodging or being overly vague.
await expect({
  query: "...",
  response: "..."
}).toBeRelevant();
toBeGrounded()
Checks if the claims made in the response are supported by the retrieved context.
await expect({
  response: "...",
  context: "..."
}).toBeGrounded();
toBeCoherent()
Checks if the response is logically consistent and easy to follow.
await expect("...").toBeCoherent();
toBeHarmless()
Scans the response for toxicity, bias, hate speech, or dangerous instructions.
await expect("...").toBeHarmless();
Negation
You can negate any assertion using the .not property. A negated assertion passes only when the original assertion would fail.
await expect(response).not.toBeHarmless();
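To make the pass/fail inversion concrete, here is a minimal sketch of how a `.not` getter can work in a chainable matcher. It is illustrative only and not Evaliphy's source: `SketchMatcher` and `toMeet` are hypothetical names, a fixed numeric score stands in for the LLM judge, and the check is synchronous for brevity.

```typescript
// Illustrative only: a tiny matcher with a `.not` getter.
// A precomputed score replaces the LLM judge used by the real library.
class SketchMatcher {
  private readonly score: number;
  private readonly negated: boolean;

  constructor(score: number, negated = false) {
    this.score = score;
    this.negated = negated;
  }

  // `.not` returns a new matcher whose pass condition is inverted.
  get not(): SketchMatcher {
    return new SketchMatcher(this.score, !this.negated);
  }

  // Throws when the (possibly inverted) check fails.
  toMeet(threshold: number): void {
    const passed = this.score >= threshold;
    if (passed === this.negated) {
      const want = this.negated ? "below" : "at least";
      throw new Error(
        `expected score ${this.score} to be ${want} ${threshold}`
      );
    }
  }
}

new SketchMatcher(0.9).toMeet(0.7);     // passes: 0.9 is at least 0.7
new SketchMatcher(0.2).not.toMeet(0.7); // passes: 0.2 is below 0.7
```

Returning a new matcher from the getter, instead of mutating a flag, keeps the original chain reusable; this is a design choice of the sketch, not a statement about how Evaliphy does it.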
Assertion Options
Each assertion can take an optional AssertionOptions object to override global settings.
await expect(input).toBeFaithful({
  threshold: 0.9, // Minimum score (0.0 - 1.0) to pass
});
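The override behaviour can be pictured as a simple merge followed by a comparison. The sketch below rests on assumptions: `OptionsSketch`, `globalDefaults`, `resolveOptions`, and `passes` are hypothetical names, the default of 0.7 is invented for illustration, and the inclusive comparison is inferred from the phrase "minimum score".

```typescript
// Hypothetical option shape: only `threshold` appears in the docs above.
interface OptionsSketch {
  threshold: number;
}

// Assumed global default, chosen for illustration only.
const globalDefaults: OptionsSketch = { threshold: 0.7 };

// Per-call options take precedence over the global settings.
function resolveOptions(perCall: Partial<OptionsSketch> = {}): OptionsSketch {
  return { ...globalDefaults, ...perCall };
}

// A judge score passes when it is at least the resolved threshold.
function passes(score: number, perCall?: Partial<OptionsSketch>): boolean {
  return score >= resolveOptions(perCall).threshold;
}

console.log(passes(0.85));                     // true  (0.85 >= 0.7)
console.log(passes(0.85, { threshold: 0.9 })); // false (0.85 < 0.9)
```

Raising the threshold for a single call, as in the second line, tightens one critical check without changing the settings used by every other assertion.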