An introspection platform for evaluating LLM-based systems: benchmark performance, inspect individual results, and characterize your dataset.
Upload your evaluation data and explore results across models, metrics, and tasks.
Browse pre-loaded datasets to see what InspectorRAGet can surface before uploading your own data.