Creates a judge function compatible with adaptive_rank_run_live() by wrapping llm_compare_pair() and converting provider responses into adaptive binary outcomes (Y in {0,1}).

Usage

make_adaptive_judge_llm(
  backend = c("openai", "anthropic", "gemini", "together", "ollama"),
  model,
  trait = "overall_quality",
  trait_name = NULL,
  trait_description = NULL,
  prompt_template = set_prompt_template(),
  endpoint = "chat.completions",
  api_key = NULL,
  include_raw = FALSE,
  text_col = "text",
  judge_args = list()
)

Arguments

backend

Backend passed to llm_compare_pair().

model

Model identifier passed to llm_compare_pair().

trait

Built-in trait key used when no custom trait is supplied. Ignored when both trait_name and trait_description are supplied.

trait_name

Optional custom trait display name.

trait_description

Optional custom trait definition.

prompt_template

Prompt template string. Defaults to set_prompt_template().

endpoint

Endpoint family passed to llm_compare_pair(). Only used when backend = "openai"; ignored otherwise.

api_key

Optional API key passed to llm_compare_pair().

include_raw

Logical; forwarded to llm_compare_pair().

text_col

Name of the text column expected in adaptive item rows.

judge_args

Named list of additional fixed arguments forwarded to llm_compare_pair(). Use this for provider-specific controls such as reasoning, service_tier, temperature, top_p, logprobs, host, or include_thoughts.

Value

A function judge(A, B, state, ...) returning a list with fields is_valid, Y, and invalid_reason.

Details

The returned function has signature judge(A, B, state, ...) and enforces the adaptive transactional contract: it returns is_valid = TRUE with Y in {0,1} when the model response identifies one of the two presented items, and returns is_valid = FALSE otherwise.
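A minimal sketch of that contract (the item rows A and B and the state object here are illustrative placeholders; only the text column name must match text_col):

```r
# Hypothetical item rows; the text column must match text_col ("text" by default)
A <- list(id = 1, text = "First candidate answer.")
B <- list(id = 2, text = "Second candidate answer.")

res <- judge(A, B, state)

# If the model response identifies one of the two presented items:
#   res$is_valid is TRUE and res$Y is 0 or 1
# Otherwise:
#   res$is_valid is FALSE and res$invalid_reason records why
```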

Model configuration is split into:

- the named arguments of make_adaptive_judge_llm() (backend, model, trait, trait_name, trait_description, prompt_template, endpoint, api_key, include_raw), which configure the core comparison; and

- judge_args, a named list of additional fixed arguments forwarded verbatim to llm_compare_pair() on every call.

Collectively this supports all llm_compare_pair() options, including backend-specific parameters such as OpenAI reasoning and service_tier.

Examples

judge <- make_adaptive_judge_llm(
  backend = "openai",
  model = "gpt-5.1",
  endpoint = "responses",
  judge_args = list(
    reasoning = "low",
    service_tier = "flex",
    include_thoughts = FALSE
  )
)