Based on Anthropic’s “Building Effective Agents” framework.

Evaluator-optimizer uses iterative cycles of generation and evaluation to improve output quality. One component generates content while another evaluates it and provides feedback, creating a loop that continues until the evaluator is satisfied or a maximum number of iterations is reached. The pattern trades computational cost for higher-quality results.

[Sequence diagram: the Client sends a request to the Agent; the Generator produces content and the Evaluator returns feedback; iteration control decides whether to continue or stop; the final result is returned to the Client.]
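
Distilled to its control flow, the loop is simple. Here is a minimal sketch with hypothetical generate and evaluate functions (the full Pickaxe implementation appears below):

declare function generate(topic: string, feedback?: string): Promise<string>;
declare function evaluate(draft: string): Promise<{ complete: boolean; feedback: string }>;

async function evaluatorOptimizer(topic: string, maxIterations = 3): Promise<string> {
  let draft = "";
  let feedback: string | undefined;

  for (let i = 0; i < maxIterations; i++) {
    draft = await generate(topic, feedback); // generation phase
    const verdict = await evaluate(draft);   // evaluation phase
    if (verdict.complete) break;             // evaluator satisfied: stop early
    feedback = verdict.feedback;             // carry feedback into the next cycle
  }

  return draft; // last attempt, whether approved or not
}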

When to Use

Use evaluator-optimizer when output quality can be measurably improved through iteration and when you have clear evaluation criteria. It’s ideal for tasks like content generation and code optimization, or any scenario where a first attempt can be systematically improved. Avoid it when the cost of multiple iterations outweighs the quality gains, or when evaluation criteria are subjective and inconsistent.

Implementation

This example demonstrates iterative social media post creation: a generator drafts the post and an evaluator provides feedback until the post meets quality standards or the maximum number of iterations is reached.

Agent Code

import { pickaxe } from "@hatchet-dev/pickaxe";
import z from "zod";
import { generatorTool } from "@tools/generator.tool";
import { evaluatorTool } from "@tools/evaluator.tool";

const EvaluatorOptimizerAgentInput = z.object({
  topic: z.string(),
  targetAudience: z.string(),
});

const EvaluatorOptimizerAgentOutput = z.object({
  post: z.string(),
  iterations: z.number(),
});

export const evaluatorOptimizerAgent = pickaxe.agent({
  name: "evaluator-optimizer-agent",
  executionTimeout: "2m",
  inputSchema: EvaluatorOptimizerAgentInput,
  outputSchema: EvaluatorOptimizerAgentOutput,
  description: "Iteratively improves content through evaluation and optimization cycles",
  fn: async (input, ctx) => {
    let currentPost = "";
    let previousFeedback = "";
    const maxIterations = 3;

    // ITERATIVE IMPROVEMENT LOOP
    for (let iteration = 1; iteration <= maxIterations; iteration++) {
      // GENERATION PHASE: Create or improve content
      const { post } = await generatorTool.run({
        topic: input.topic,
        targetAudience: input.targetAudience,
        previousPost: currentPost || undefined,
        previousFeedback: previousFeedback || undefined,
      });

      currentPost = post;

      // EVALUATION PHASE: Assess quality and get feedback
      const { complete, feedback } = await evaluatorTool.run({
        post: currentPost,
        topic: input.topic,
        targetAudience: input.targetAudience,
      });

      // TERMINATION CHECK: Stop if evaluator is satisfied
      if (complete) {
        return {
          post: currentPost,
          iterations: iteration,
        };
      }

      previousFeedback = feedback;
    }

    // SAFETY CHECK: the generator should always produce a post, but
    // guard against an empty result before returning it
    if (!currentPost) {
      throw new Error("Failed to generate any post after maximum iterations");
    }

    // FALLBACK: return the last attempt once max iterations are reached
    return {
      post: currentPost,
      iterations: maxIterations,
    };
  },
});
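
The agent imports generatorTool and evaluatorTool, whose implementations aren’t shown here. As a rough sketch of what the evaluator might look like, assuming pickaxe.tool mirrors the agent API above and using the Vercel AI SDK’s generateObject for structured output (both are assumptions, not details taken from this example):

import { pickaxe } from "@hatchet-dev/pickaxe";
import z from "zod";
import { generateObject } from "ai";
import { anthropic } from "@ai-sdk/anthropic";

const verdictSchema = z.object({
  complete: z.boolean(),
  feedback: z.string(),
});

export const evaluatorTool = pickaxe.tool({
  name: "evaluator-tool",
  description: "Judges a post and returns structured feedback",
  inputSchema: z.object({
    post: z.string(),
    topic: z.string(),
    targetAudience: z.string(),
  }),
  outputSchema: verdictSchema,
  fn: async (input) => {
    // Ask the model for a structured verdict: is the post good enough,
    // and if not, what specifically should change?
    const { object } = await generateObject({
      model: anthropic("claude-3-5-sonnet-latest"),
      schema: verdictSchema,
      prompt:
        `Evaluate this social media post about "${input.topic}" ` +
        `for an audience of ${input.targetAudience}. Set complete to true ` +
        `only if it needs no further changes; otherwise give concrete, ` +
        `actionable feedback.\n\n${input.post}`,
    });
    return object;
  },
});

The key design choice is the structured verdict: complete gives the loop an unambiguous termination signal, while feedback gives the generator something concrete to act on in the next cycle.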

The pattern implements a controlled iteration loop with two clear termination criteria: the evaluator signals completion, or the maximum iteration count is reached. Each cycle builds on the previous attempt, producing progressively better output by systematically incorporating feedback.
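
Invoking the agent is then a single call (assuming agents expose the same .run method the tools use above):

const result = await evaluatorOptimizerAgent.run({
  topic: "launching our new developer CLI",
  targetAudience: "backend engineers",
});

console.log(`Final post after ${result.iterations} iteration(s):`);
console.log(result.post);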

This pattern combines well with parallelization when multiple evaluators provide different perspectives, and can be enhanced with routing to direct different content types to specialized generators and evaluators.
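
For instance, the single evaluatorTool.run call inside the loop could fan out to several evaluators, each judging a different dimension. A sketch, where toneEvaluatorTool, factEvaluatorTool, and the @tools/evaluators module are hypothetical names rather than part of this example:

import { toneEvaluatorTool, factEvaluatorTool } from "@tools/evaluators";

async function evaluateInParallel(post: string, topic: string, targetAudience: string) {
  // Fan out to both evaluators concurrently.
  const evaluations = await Promise.all([
    toneEvaluatorTool.run({ post, topic, targetAudience }),
    factEvaluatorTool.run({ post, topic, targetAudience }),
  ]);

  // The post is complete only when every evaluator approves; otherwise
  // merge all outstanding feedback for the next generation cycle.
  return {
    complete: evaluations.every((e) => e.complete),
    feedback: evaluations
      .filter((e) => !e.complete)
      .map((e) => e.feedback)
      .join("\n"),
  };
}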