R-Eval: A Unified Toolkit for Evaluating Domain Knowledge of Retrieval Augmented Large Language Models
[Slides]
R-Eval is a Python toolkit designed to streamline the evaluation of different RAG workflows in conjunction with LLMs. It supports popular built-in RAG workflows and allows users to incorporate customized testing data for their specific domains.

* User-friendly: R-Eval provides easy-to-use scripts for automatically running and analyzing experiments with the given models and datasets (see the usage sketch after this list).
* Modular: R-Eval is designed to be modular, so users can easily extend the framework with new models, datasets, and analysis tools.
* Extensible: The domain-agnostic design of R-Eval makes it easy to evaluate Retrieval Augmented Large Language Models on new domains within our framework.
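To make the first point concrete, here is a minimal sketch of what an evaluation run might look like. The package, class, and method names used here (`r_eval`, `REvalBenchmark`, `register_workflow`, `run`, `report`) are illustrative assumptions rather than the toolkit's confirmed API; consult the repository's scripts and documentation for the actual entry points.

```python
# Hypothetical usage sketch -- all names below are assumptions for illustration,
# not R-Eval's confirmed public API.

from r_eval import REvalBenchmark               # assumed module and class name
from my_project.workflows import MyRAGWorkflow  # a user-defined RAG workflow

# Select a built-in domain dataset and the tasks to evaluate on.
benchmark = REvalBenchmark(domain="wikipedia", tasks=["qa", "fact_checking"])

# Register a customized RAG workflow alongside the built-in ones.
benchmark.register_workflow("my_rag", MyRAGWorkflow(llm="gpt-3.5-turbo"))

# Run all registered workflows on the selected tasks and write a summary report.
results = benchmark.run()
benchmark.report(results, output_dir="./reports")
```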