Interface: Evaluator<Input, Output, Expected, Metadata>

Type parameters

Name	Type
`Input`	`Input`
`Output`	`Output`
`Expected`	`Expected`
`Metadata`	extends `BaseMetadata` = `DefaultMetadataType`

Properties

data

• data: EvalData<Input, Expected, Metadata>

A function that returns a list of inputs, expected outputs, and metadata.
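As a rough illustration, a data function might look like the sketch below. The `EvalCase` shape and its field names (`input`, `expected`, `metadata`) are assumptions made for this example, not the library's definition of `EvalData`.

```typescript
// Hypothetical stand-in for one row returned by `data`; the field names are assumed.
interface EvalCase<Input, Expected, Metadata> {
  input: Input;
  expected?: Expected;
  metadata?: Metadata;
}

// A data function: returns the list of cases to evaluate.
function data(): EvalCase<string, string, { topic: string }>[] {
  return [
    { input: "2 + 2", expected: "4", metadata: { topic: "arithmetic" } },
    { input: "capital of France", expected: "Paris", metadata: { topic: "geography" } },
  ];
}
```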

experimentName

• Optional experimentName: string

An optional name for the experiment.

isPublic

• Optional isPublic: boolean

Whether the experiment should be public. Defaults to false.

metadata

• Optional metadata: Record<string, unknown>

Optional additional metadata for the experiment.

scores

• scores: EvalScorer<Input, Output, Expected, Metadata>[]

A set of functions that take an input, output, and expected value and return a score.
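A minimal sketch of one such scoring function follows. The `ScorerArgs` and `Score` shapes are hypothetical stand-ins introduced for this example; the actual `EvalScorer` type may accept and return something different.

```typescript
// Hypothetical argument and result shapes; the real `EvalScorer` type may differ.
interface ScorerArgs<Input, Output, Expected> {
  input: Input;
  output: Output;
  expected?: Expected;
}

interface Score {
  name: string;
  score: number;
}

// Exact-match scorer: 1 when the output equals the expected value, otherwise 0.
function exactMatch(args: ScorerArgs<string, string, string>): Score {
  return { name: "exactMatch", score: args.output === args.expected ? 1 : 0 };
}
```

Because `scores` is an array, several scorers like this one can be listed side by side, each contributing its own named score.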

task

• task: EvalTask<Input, Output>

A function that takes an input and returns an output.

trialCount

• Optional trialCount: number

The number of times to run the evaluator per input. This is useful for evaluating applications with non-deterministic behavior: repeated trials give you both a stronger aggregate measure and a sense of the variance in the results.
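To make the aggregate and variance concrete, the sketch below runs a task several times on one input and summarizes the resulting scores. The helpers `summarize` and `runTrials` are hypothetical names for this illustration and say nothing about how the library aggregates trials internally; the sketch also assumes `trialCount` is at least 1.

```typescript
// Mean and population variance of the scores collected across trials.
function summarize(scores: number[]): { mean: number; variance: number } {
  const mean = scores.reduce((acc, s) => acc + s, 0) / scores.length;
  const variance =
    scores.reduce((acc, s) => acc + (s - mean) ** 2, 0) / scores.length;
  return { mean, variance };
}

// Run a (possibly non-deterministic) task `trialCount` times on one input
// and score every run.
function runTrials<I, O>(
  input: I,
  task: (input: I) => O,
  score: (output: O) => number,
  trialCount: number,
): { mean: number; variance: number } {
  const scores: number[] = [];
  for (let i = 0; i < trialCount; i++) {
    scores.push(score(task(input)));
  }
  return summarize(scores);
}
```

A variance near zero suggests the task behaves consistently for that input, while a larger variance indicates that a single run would be an unreliable measure.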

update

• Optional update: boolean

Whether to update an existing experiment named `experimentName` if one exists. Defaults to false.
