Cookbook

Practical recipes for evaluating text outputs with AutoRubric. Each recipe solves a specific real-world scenario with focused code snippets and complete runnable examples.

Recipe Index

Tier 1: Foundation

Start here if you're new to AutoRubric.

Recipe                 Domain          What You'll Learn
Your First Evaluation  Tech Support    Basic rubric creation and grading
Managing Datasets      Medical Triage  Loading, saving, and splitting datasets

Tier 2: Reliability

Improve grading consistency and accuracy.

Recipe                  Domain            What You'll Learn
Few-Shot Calibration    Legal Contracts   Calibrating judges with labeled examples
Ensemble Judging        Job Applications  Multi-judge voting for high-stakes decisions
Handling CANNOT_ASSESS  RAG Responses     Strategies for uncertain verdicts

Tier 3: Advanced Evaluation

Sophisticated evaluation techniques.

Recipe                Domain                What You'll Learn
Multi-Choice Rubrics  Restaurant Reviews    Ordinal/nominal scales with Likert ratings
Extended Thinking     Security Assessments  Deep reasoning for complex evaluations
Length Penalty        Executive Summaries   Penalizing verbose responses

Tier 4: Validation & Production

Deploy with confidence.

Recipe                        Domain                What You'll Learn
Evaluating Rubric Quality     Peer Review           Meta-rubrics to validate and improve rubrics
Automated Rubric Improvement  EV Analysis           LLM-driven iterative refinement of rubrics
Judge Validation              Content Moderation    Measuring agreement with human labels
Synthetic Ground Truth        Product Descriptions  Bootstrapping labels from strong models
Batch Evaluation              Customer Feedback     Checkpointing, resumption, and cost tracking

Tier 5: Specialized

Advanced patterns for specific needs.

Recipe                    Domain              What You'll Learn
Per-Item Rubrics          Coding Interviews   Different rubrics for different items
Cost Optimization         News Fact-Checking  Caching and model selection strategies
Configuration Management  Academic Papers     Sharing reproducible configs across teams

Quick Start

If you haven't installed AutoRubric yet:

pip install autorubric

Set up your API key for your preferred provider:

export OPENAI_API_KEY=your_key_here
# or
export ANTHROPIC_API_KEY=your_key_here
# or
export GEMINI_API_KEY=your_key_here
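
To confirm that a key is visible to Python before running a recipe, a small check like the following may help. This is only a sketch: it uses the three environment variable names shown above, and the `configured_providers` helper is a hypothetical name introduced for illustration, not part of AutoRubric.

```python
import os

# Environment variable names taken from the export commands above.
PROVIDER_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "GEMINI_API_KEY")


def configured_providers(env=None):
    """Return the provider key names that are set to a non-empty value.

    Hypothetical helper for illustration; not part of AutoRubric.
    """
    env = os.environ if env is None else env
    return [name for name in PROVIDER_KEYS if env.get(name, "").strip()]


if __name__ == "__main__":
    found = configured_providers()
    if found:
        print("API keys found for:", ", ".join(found))
    else:
        print("No API key found; export one of:", ", ".join(PROVIDER_KEYS))
```

Passing a plain dictionary as `env` makes the helper easy to test without touching the real environment.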

Then jump into Your First Evaluation to get started.

Recipe Format

Each recipe follows a consistent structure:

  1. The Scenario - A realistic problem you might face
  2. What You'll Learn - Key features and concepts covered
  3. The Solution - Step-by-step implementation with focused code snippets
  4. Key Takeaways - Summary of important points
  5. Appendix: Complete Code - Full runnable script you can copy-paste

Prerequisites

All recipes assume:

  • Python 3.10+
  • AutoRubric installed (pip install autorubric)
  • An API key for at least one supported provider
  • Basic familiarity with async/await (recipes use asyncio.run() for simplicity)
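
For readers new to async/await, the general shape of an `asyncio.run()` entry point looks roughly like the sketch below. The `grade_stub` coroutine is a hypothetical placeholder standing in for an asynchronous grading call; it is not AutoRubric's API, and only the standard-library `asyncio` calls are real.

```python
import asyncio


async def grade_stub(text: str) -> dict:
    """Hypothetical stand-in for an async grading call; not AutoRubric's API."""
    await asyncio.sleep(0)  # yields control to the event loop, as a network call would
    return {"text": text, "length": len(text)}


async def main() -> list[dict]:
    # Run several evaluations concurrently; gather returns results in input order.
    texts = ["first response", "second response"]
    return await asyncio.gather(*(grade_stub(t) for t in texts))


if __name__ == "__main__":
    # asyncio.run() starts an event loop, runs main() to completion, and closes the loop.
    for result in asyncio.run(main()):
        print(result)
```

In a script, `asyncio.run()` is typically called once at the top level; inside an environment that already runs an event loop (such as a notebook), `await main()` is used instead.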