Introduction
RAGE4j-Core is a Java library for evaluating Large Language Model (LLM) generations, inspired by the Python Ragas library. It provides a framework for assessing the quality of LLM responses across multiple dimensions.
Overview
The RAGE4j library consists of two main components:
- RAGE4j-Core: The core library providing evaluation tools and metrics
- RAGE4j-Assert: An extension offering a testing API for LLM evaluations
This section documents RAGE4j-Core.
Key Features
RAGE4j provides four built-in evaluators for assessing different aspects of LLM responses:
- Answer Relevance: Evaluates how well an answer addresses the original question by generating potential questions from the answer and comparing them to the original question.
- Answer Correctness: Measures the accuracy of an answer against the ground truth using an F1 score based on true positive, false positive, and false negative claims (see the first sketch after this list).
- Faithfulness: Assesses whether the claims in an answer can be supported by the provided context.
- Semantic Similarity: Computes the semantic similarity between the answer and the ground truth using embedding-based cosine similarity (see the second sketch after this list).
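To make the Answer Correctness computation concrete, here is a minimal sketch of an F1 score over claim counts. It is not part of the RAGE4j API; the claim counts are hypothetical inputs that, in the library, come from comparing claims extracted from the answer against the ground truth.

// Minimal sketch, not RAGE4j code: F1 over claim counts.
// truePositives  = answer claims supported by the ground truth
// falsePositives = answer claims absent from the ground truth
// falseNegatives = ground-truth claims missing from the answer
static double f1Score(int truePositives, int falsePositives, int falseNegatives) {
    if (truePositives == 0) return 0.0; // no supported claims, so the score is zero
    double precision = truePositives / (double) (truePositives + falsePositives);
    double recall = truePositives / (double) (truePositives + falseNegatives);
    return 2 * precision * recall / (precision + recall);
}

// Example: 3 supported claims, 1 unsupported claim, 1 missing claim
// f1Score(3, 1, 1) == 0.75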
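The Semantic Similarity metric reduces to a cosine similarity between two embedding vectors. A minimal sketch follows, assuming the answer and ground truth have already been embedded; the vectors shown are made up for illustration, not output of a real embedding model.

// Minimal sketch, not RAGE4j code: cosine similarity between two embeddings.
static double cosineSimilarity(double[] a, double[] b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (int i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional embeddings of the answer and the ground truth;
// semantically similar texts score close to 1.0:
// cosineSimilarity(new double[]{0.2, 0.7, 0.1}, new double[]{0.25, 0.6, 0.15}) ≈ 0.99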
Quick Example
// 1. Create an evaluator (chatModel and embeddingModel are pre-configured
//    chat and embedding model clients)
Evaluator answerRelevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);

// 2. Create a sample
Sample sample = Sample.builder()
        .withQuestion("What is Java?")
        .withAnswer("Java is a programming language.")
        .build();

// 3. Evaluate and get results
Evaluation result = answerRelevanceEvaluator.evaluate(sample);

// 4. Read the metric name and score
System.out.println("Metric name: " + result.getName());   // Metric name: Answer relevance
System.out.println("Metric score: " + result.getScore()); // Metric score: 1.0