Image support
RAGE4j evaluators can pass images to the judging LLM alongside the textual context. This is intended for RAG systems where the answer was produced from a mix of text and images (e.g. diagrams, charts, screenshots, photographs) and the evaluator needs to "see" the same images to make a fair judgment.
When to use
Images are part of the context — what the system under test had to work with — not part of the question. Today they are forwarded by:
FaithfulnessEvaluator– checks each answer claim against text context plus images.ContextRelevanceLlmEvaluator– scores how relevant the combined text-and-image context is to the question.
Other evaluators (AnswerCorrectness, AnswerRelevance, BLEU, ROUGE,
SemanticSimilarity) deliberately ignore images. Their metrics are either
purely textual (correctness vs. ground truth) or numeric (n-gram / embedding
based) and would not benefit from a visual signal.
Attaching images to a Sample
Rage4jImage exposes three factory methods. The image name is required for
persistence and is auto-derived where possible.
import dev.rage4j.model.Rage4jImage;
import java.nio.file.Path;
Rage4jImage fromFile = Rage4jImage.fromPath(Path.of("eiffel-tower.jpg"));
Rage4jImage fromUrl = Rage4jImage.fromUrl("https://example.com/paris-map.png");
Rage4jImage fromBytes = Rage4jImage.fromBytes(bytes, "image/png", "louvre.png");
Sample sample = Sample.builder()
.withQuestion("What landmarks are mentioned in the document?")
.withContext("Paris is the capital of France and home to many landmarks.")
.withImages(List.of(fromFile, fromUrl, fromBytes))
.withAnswer(answer)
.build();
fromPath reads the file eagerly and derives the MIME type from the extension
(.png, .jpg/.jpeg, .gif, .webp, .bmp).
Vision-capable models
The judging ChatModel must support multimodal input (e.g. gpt-4o,
gpt-4o-mini). LangChain4j 1.x does not expose a vision capability flag on
ChatModel, so the evaluator cannot detect this automatically. You opt in
explicitly:
ChatModel visionModel = OpenAiChatModel.builder()
.apiKey(apiKey)
.modelName("gpt-4o-mini")
.build();
FaithfulnessEvaluator evaluator = new FaithfulnessEvaluator(visionModel, true);
ContextRelevanceLlmEvaluator ctx = new ContextRelevanceLlmEvaluator(visionModel, true);
If a sample contains images but the evaluator was constructed without the
vision flag, an UnsupportedOperationException is thrown before any LLM call:
Faithfulness evaluator received a Sample with 3 image(s) but was not
configured for vision. Pass a vision-capable ChatModel (e.g. gpt-4o)
and use the constructor variant that takes supportsVision=true.
The text-only constructors (new FaithfulnessEvaluator(model)) keep their
original behaviour and are still the right choice for samples without images.
End-to-end example
ChatModel visionModel = OpenAiChatModel.builder()
.apiKey(apiKey)
.modelName("gpt-4o-mini")
.build();
Sample sample = Sample.builder()
.withQuestion("What landmarks are mentioned in the document?")
.withContext("Paris is the capital of France and home to many landmarks.")
.withImages(List.of(
Rage4jImage.fromPath(Path.of("eiffel-tower.jpg")),
Rage4jImage.fromPath(Path.of("louvre.png")),
Rage4jImage.fromPath(Path.of("notre-dame.jpg"))))
.withAnswer(answer)
.withGroundTruth("Eiffel Tower, Louvre, and Notre-Dame are among the famous landmarks of Paris.")
.build();
FaithfulnessEvaluator faithfulness =
new FaithfulnessEvaluator(visionModel, true);
ContextRelevanceLlmEvaluator relevance =
new ContextRelevanceLlmEvaluator(visionModel, true);
Evaluation faithfulnessScore = faithfulness.evaluate(sample);
Evaluation contextScore = relevance.evaluate(sample);
Persistence
When samples are written through the persist module, only image names are stored – the bytes never reach the JSONL file:
{
"sample": {
"question": "What landmarks are mentioned in the document?",
"context": "Paris is the capital of France and home to many landmarks.",
"images": ["eiffel-tower.jpg", "louvre.png", "notre-dame.jpg"]
},
"metrics": { "Faithfulness": 0.83, "Context relevance LLM": 1.0 }
}
If you need to re-run evaluations from a stored record, keep the original
images on disk and re-attach them via Rage4jImage.fromPath(...) using the
name as a lookup key.