Introduction
RAGE4j-Core is a Java library for evaluating Large Language Model (LLM) generations, inspired by the Python Ragas library. It provides a framework for assessing the quality of LLM responses across multiple dimensions.
Overview
The RAGE4j library consists of two main components:
- RAGE4j-Core: The core library providing evaluation tools and metrics
- RAGE4j-Assert: An extension offering a testing API for LLM evaluations
This section documents RAGE4j-Core.
Key Features
RAGE4j provides four built-in evaluators for assessing different aspects of LLM responses:
- Answer Relevance: Evaluates how well an answer addresses the original question by generating potential questions from the answer and comparing them to the original question.
- Answer Correctness: Measures the accuracy of an answer against the ground truth using an F1 score based on true positive, false positive, and false negative claims (see the first sketch after this list).
- Faithfulness: Assesses whether the claims in an answer can be supported by the provided context.
- Semantic Similarity: Computes the semantic similarity between the answer and the ground truth using embedding-based cosine similarity (see the second sketch after this list).
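To make the Answer Correctness computation concrete, here is a minimal sketch of an F1 score over claim counts. It is not part of the RAGE4j API; the claim counts are hypothetical inputs that, in the library, come from comparing claims extracted from the answer against the ground truth.

// Minimal sketch, not RAGE4j code: F1 over claim counts.
// truePositives  = answer claims supported by the ground truth
// falsePositives = answer claims absent from the ground truth
// falseNegatives = ground-truth claims missing from the answer
static double f1Score(int truePositives, int falsePositives, int falseNegatives) {
    if (truePositives == 0) return 0.0; // no supported claims, so the score is zero
    double precision = truePositives / (double) (truePositives + falsePositives);
    double recall = truePositives / (double) (truePositives + falseNegatives);
    return 2 * precision * recall / (precision + recall);
}

// Example: 3 supported claims, 1 unsupported claim, 1 missing claim
// f1Score(3, 1, 1) == 0.75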
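The Semantic Similarity metric reduces to a cosine similarity between two embedding vectors. A minimal sketch follows, assuming the answer and ground truth have already been embedded; the vectors shown are made up for illustration, not output of a real embedding model.

// Minimal sketch, not RAGE4j code: cosine similarity between two embeddings.
static double cosineSimilarity(double[] a, double[] b) {
    double dot = 0.0, normA = 0.0, normB = 0.0;
    for (int i = 0; i < a.length; i++) {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional embeddings of the answer and the ground truth;
// semantically similar texts score close to 1.0:
// cosineSimilarity(new double[]{0.2, 0.7, 0.1}, new double[]{0.25, 0.6, 0.15}) ≈ 0.99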
Quick Example
// 1. Create an evaluator (chatModel and embeddingModel are pre-configured
//    chat and embedding model clients)
Evaluator answerRelevanceEvaluator = new AnswerRelevanceEvaluator(chatModel, embeddingModel);

// 2. Create a sample
Sample sample = Sample.builder()
        .withQuestion("What is Java?")
        .withAnswer("Java is a programming language.")
        .build();

// 3. Evaluate and get results
Evaluation result = answerRelevanceEvaluator.evaluate(sample);

// 4. Read the metric name and score
System.out.println("Metric name: " + result.getName());   // Metric name: Answer relevance
System.out.println("Metric score: " + result.getScore()); // Metric score: 1.0