kiln_ai.adapters.eval
Evals
This module contains the code for evaluating the performance of a model.
The submodules contain:
- BaseEval: each eval technique implements this interface.
- G-Eval: an eval implementation that implements G-Eval and LLM as Judge.
- EvalRunner: a class that runs a full evaluation (many smaller eval jobs). Includes async parallel processing and the ability to resume where it left off.
- EvalRegistry: a registry for all eval implementations.
The datamodel for Evals is in the kiln_ai.datamodel.eval module.
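The way these pieces fit together (a common interface, a registry of implementations, and a runner that processes many small jobs in parallel and can resume) can be illustrated with a minimal, self-contained sketch. Every name below (`SketchBaseEval`, `LengthEval`, `SKETCH_REGISTRY`, `run_all`) is hypothetical and chosen only for illustration; none of it is Kiln's actual API, and the real classes in `base_eval`, `eval_runner`, `g_eval`, and `registry` may be structured quite differently.

```python
import asyncio
from abc import ABC, abstractmethod
from typing import Dict, List, Type


class SketchBaseEval(ABC):
    """Hypothetical stand-in for the interface each eval technique implements."""

    @abstractmethod
    async def run_job(self, item: str) -> float:
        """Score one item; a full evaluation consists of many such jobs."""


class LengthEval(SketchBaseEval):
    """Toy technique (assumption for illustration): score is the input length."""

    async def run_job(self, item: str) -> float:
        await asyncio.sleep(0)  # yield to the event loop, as a model call would
        return float(len(item))


# Hypothetical registry mapping a technique name to its implementation class.
SKETCH_REGISTRY: Dict[str, Type[SketchBaseEval]] = {"length": LengthEval}


async def run_all(
    eval_name: str,
    items: List[str],
    done: Dict[str, float],
    concurrency: int = 2,
) -> int:
    """Run jobs for items not yet in `done`; return how many jobs were run."""
    evaluator = SKETCH_REGISTRY[eval_name]()
    semaphore = asyncio.Semaphore(concurrency)  # caps the number of parallel jobs

    async def worker(item: str) -> None:
        async with semaphore:
            score = await evaluator.run_job(item)
        done[item] = score  # recording each result is what makes resuming possible

    # Skip anything a previous (possibly interrupted) run already finished.
    pending = [item for item in items if item not in done]
    await asyncio.gather(*(worker(item) for item in pending))
    return len(pending)


# Pretend an earlier run was interrupted after finishing only the first job.
results: Dict[str, float] = {"a": 1.0}
ran = asyncio.run(run_all("length", ["a", "bb", "ccc"], results))
```

In this sketch `results` plays the role of persisted state: a second call with the same `results` finds nothing pending and does no work. A real runner would presumably persist completed jobs to storage rather than to an in-memory dict.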
"""
# Evals

This module contains the code for evaluating the performance of a model.

The submodules contain:

- BaseEval: each eval technique implements this interface.
- G-Eval: an eval implementation that implements G-Eval and LLM as Judge.
- EvalRunner: a class that runs a full evaluation (many smaller eval jobs). Includes async parallel processing and the ability to resume where it left off.
- EvalRegistry: a registry for all eval implementations.

The datamodel for Evals is in the `kiln_ai.datamodel.eval` module.
"""

from . import (
    base_eval,
    eval_runner,
    g_eval,
    registry,
)

__all__ = [
    "base_eval",
    "eval_runner",
    "g_eval",
    "registry",
]