kiln_ai.adapters.eval

Evals

This module contains the code for evaluating the performance of a model.

The submodules contain:

  • BaseEval: each eval technique implements this interface.
  • G-Eval: an eval implementation, that implements G-Eval and LLM as Judge.
  • EvalRunner: a class that runs an full evaluation (many smaller evals jobs). Includes async parallel processing, and the ability to restart where it left off.
  • EvalRegistry: a registry for all eval implementations.

The datamodel for Evals is in the kiln_ai.datamodel.eval module.

 1"""
 2# Evals
 3
 4This module contains the code for evaluating the performance of a model.
 5
 6The submodules contain:
 7
 8- BaseEval: each eval technique implements this interface.
 9- G-Eval: an eval implementation, that implements G-Eval and LLM as Judge.
10- EvalRunner: a class that runs an full evaluation (many smaller evals jobs). Includes async parallel processing, and the ability to restart where it left off.
11- EvalRegistry: a registry for all eval implementations.
12
13The datamodel for Evals is in the `kiln_ai.datamodel.eval` module.
14"""
15
16from . import (
17    base_eval,
18    eval_runner,
19    g_eval,
20    registry,
21)
22
23__all__ = [
24    "base_eval",
25    "eval_runner",
26    "g_eval",
27    "registry",
28]