InspectorRAGet

Welcome to InspectorRAGet

An introspection platform for evaluating LLM-based systems — benchmark performance, inspect individual results, and characterize your dataset.

Try it out
Visualize
Upload your evaluation data and explore results across models, metrics, and tasks.
- Aggregate performance breakdowns
- Per-instance inspection
- Multi-metric comparison
- Annotator qualification
Examples
Browse pre-loaded datasets to see what InspectorRAGet can surface before uploading your own data.
- RAG evaluation
- Text generation
- Tool calling
- Agentic traces
