Build an LLM judge function for adaptive ranking
Source: R/adaptive_rank.R
Creates a judge function compatible with adaptive_rank_run_live() by
wrapping llm_compare_pair() and converting provider responses into
adaptive binary outcomes (Y in {0,1}).
Usage
make_adaptive_judge_llm(
backend = c("openai", "anthropic", "gemini", "together", "ollama"),
model,
trait = "overall_quality",
trait_name = NULL,
trait_description = NULL,
prompt_template = set_prompt_template(),
endpoint = "chat.completions",
api_key = NULL,
include_raw = FALSE,
text_col = "text",
judge_args = list()
)
Arguments
- backend
Backend passed to llm_compare_pair().
- model
Model identifier passed to llm_compare_pair().
- trait
Built-in trait key used when no custom trait is supplied. Ignored when
both trait_name and trait_description are supplied.
- trait_name
Optional custom trait display name.
- trait_description
Optional custom trait definition.
- prompt_template
Prompt template string. Defaults to set_prompt_template().
- endpoint
Endpoint family passed to llm_compare_pair(). Only used when
backend = "openai"; ignored otherwise.
- api_key
Optional API key passed to llm_compare_pair().
- include_raw
Logical; forwarded to llm_compare_pair().
- text_col
Name of the text column expected in adaptive item rows.
- judge_args
Named list of additional fixed arguments forwarded to
llm_compare_pair(). Use this for provider-specific controls such as
reasoning, service_tier, temperature, top_p, logprobs, host, or
include_thoughts.
Value
A function judge(A, B, state, ...) returning a list with fields
is_valid, Y, and invalid_reason.
Details
The returned function has signature judge(A, B, state, ...) and enforces
the adaptive transactional contract:
it returns is_valid = TRUE with Y in {0,1} when the model response
identifies one of the two presented items, and returns is_valid = FALSE
otherwise.
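The transactional contract can be illustrated with a minimal hand-written judge. This is a sketch, not the wrapped LLM judge: the coin-flip outcome and the list-with-$text item shape are assumptions for illustration (mirroring the default text_col = "text").

```r
# Minimal sketch of a contract-conforming judge (no LLM call; the
# coin-flip outcome is purely illustrative).
mock_judge <- function(A, B, state, ...) {
  # Invalid transaction: is_valid = FALSE, no binary outcome.
  if (is.null(A$text) || is.null(B$text)) {
    return(list(is_valid = FALSE, Y = NA_integer_,
                invalid_reason = "missing item text"))
  }
  # Valid transaction: Y must be 0 or 1.
  list(is_valid = TRUE, Y = rbinom(1, 1, 0.5), invalid_reason = NULL)
}

res <- mock_judge(list(text = "A"), list(text = "B"), state = NULL)
```

Any judge returning this shape, whether LLM-backed or not, can be dropped into the same adaptive loop.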
Model configuration is split into three layers:
- fixed build-time options via judge_args,
- per-run overrides via judge_call_args in adaptive_rank(),
- optional per-step overrides via ... passed through adaptive_rank_run_live().
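The layers can be sketched as follows. The judge_args and judge_call_args names come from this page; the other arguments to adaptive_rank() shown in comments are assumptions, not the documented signature.

```r
# Build time: options fixed for the life of the judge.
judge <- make_adaptive_judge_llm(
  backend    = "openai",
  model      = "gpt-5.1",
  judge_args = list(temperature = 0)
)

# Per run: overrides supplied when the ranking is launched
# (items and judge argument names are assumptions):
# adaptive_rank(items, judge = judge,
#               judge_call_args = list(service_tier = "flex"))
```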
Together, these layers support every llm_compare_pair() option, including
backend-specific parameters such as OpenAI's reasoning and service_tier.
Examples
judge <- make_adaptive_judge_llm(
backend = "openai",
model = "gpt-5.1",
endpoint = "responses",
judge_args = list(
reasoning = "low",
service_tier = "flex",
include_thoughts = FALSE
)
)
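A single comparison with the judge built above might look like the sketch below. The data.frame item shape follows the default text_col = "text"; passing state = NULL and the placeholder texts are assumptions, and the call is not run because it requires a live API key.

```r
A <- data.frame(text = "First response ...")
B <- data.frame(text = "Second response ...")

# Requires an OpenAI API key; not run here.
# res <- judge(A, B, state = NULL)
# res$is_valid   # TRUE when the model identified one of the two items
# res$Y          # 0 or 1, the adaptive binary outcome
```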