Run adaptive ranking end-to-end from data and model settings
Source: R/adaptive_rank.R
High-level workflow wrapper that reads sample data, constructs an LLM judge,
starts or resumes adaptive state, runs adaptive_rank_run_live(), and
returns state plus summary outputs.
Usage
adaptive_rank(
  data,
  id_col = 1,
  text_col = 2,
  backend = c("openai", "anthropic", "gemini", "together", "ollama"),
  model = NULL,
  trait = "overall_quality",
  trait_name = NULL,
  trait_description = NULL,
  prompt_template = set_prompt_template(),
  endpoint = "chat.completions",
  api_key = NULL,
  include_raw = FALSE,
  judge_args = list(),
  judge_call_args = list(),
  n_steps = 1L,
  fit_fn = NULL,
  adaptive_config = NULL,
  btl_config = NULL,
  session_dir = NULL,
  persist_item_log = FALSE,
  resume = TRUE,
  seed = 1L,
  progress = c("all", "refits", "steps", "none"),
  progress_redraw_every = 10L,
  progress_show_events = TRUE,
  progress_errors = TRUE,
  save_outputs = FALSE,
  output_file = NULL,
  judge = NULL
)
Arguments
- data
Data source: a data frame/tibble, a file path (.csv, .tsv, .txt, .rds), or a directory containing .txt files.
- id_col
ID column selector for tabular inputs. Passed to read_samples_df().
- text_col
Text column selector for tabular inputs. Passed to read_samples_df().
- backend
Backend passed to make_adaptive_judge_llm().
- model
Model passed to make_adaptive_judge_llm().
- trait
Built-in trait key used when no custom trait is supplied. Ignored when both trait_name and trait_description are supplied.
- trait_name
Optional custom trait display name.
- trait_description
Optional custom trait definition.
- prompt_template
Prompt template string. Defaults to set_prompt_template().
- endpoint
Endpoint family passed to make_adaptive_judge_llm(). Only used when backend = "openai"; ignored otherwise.
- api_key
Optional API key passed to make_adaptive_judge_llm().
- include_raw
Logical; forwarded to make_adaptive_judge_llm().
- judge_args
Named list of fixed additional arguments forwarded to llm_compare_pair() by the generated judge.
- judge_call_args
Named list of additional arguments forwarded to the judge at run time through adaptive_rank_run_live().
- n_steps
Maximum number of attempted adaptive steps to execute in this call. The run may return earlier due to candidate starvation or BTL stop criteria. Attempted invalid steps also count toward this limit.
- fit_fn
Optional fit override passed to adaptive_rank_run_live().
- adaptive_config
Optional named list passed to adaptive_rank_start() and adaptive_rank_run_live() to control adaptive controller behavior. Supported fields: global_identified_reliability_min, global_identified_rank_corr_min, p_long_low, p_long_high, long_taper_mult, long_frac_floor, mid_bonus_frac, explore_taper_mult, boundary_k, boundary_window, boundary_frac, p_star_override_margin, and star_override_budget_per_round. Unknown fields and invalid values abort with actionable errors.
- btl_config
Optional named list passed to adaptive_rank_run_live() to control BTL refit cadence, stopping diagnostics, and selected round-log diagnostics. Supported fields: refit_pairs_target, model_variant, ess_bulk_min, ess_bulk_min_near_stop, max_rhat, divergences_max, eap_reliability_min, stability_lag, theta_corr_min, theta_sd_rel_change_max, rank_spearman_min, near_tie_p_low, and near_tie_p_high (near_tie_* affects round logging only, not stop decisions). Defaults are resolved from the current item count and merged with user overrides.
- session_dir
Optional session directory for persistence/resume.
- persist_item_log
Logical; write per-refit item logs when TRUE.
- resume
Logical; when TRUE and session_dir contains a valid session, resume from disk; otherwise initialize a new state.
- seed
Integer seed used when creating a new adaptive state.
- progress
Progress mode for adaptive_rank_run_live().
- progress_redraw_every
Redraw interval for progress output.
- progress_show_events
Logical; show step events.
- progress_errors
Logical; show invalid-step events.
- save_outputs
Logical; when TRUE, save returned outputs as .rds.
- output_file
Optional output .rds path. If NULL and save_outputs = TRUE, defaults to file.path(session_dir, "adaptive_outputs.rds") when session_dir is set, otherwise to a temporary file.
- judge
Optional prebuilt judge function with contract judge(A, B, state, ...). If supplied, model/trait/template options are ignored and this function is used directly.
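The judge(A, B, state, ...) contract can be illustrated with a small offline function. This is a minimal sketch, not package code: it assumes A and B arrive as one-row records carrying the item's columns and that the return value is a list with is_valid, Y, and invalid_reason, mirroring the example on this page. The name length_judge, the length-based rule, and the "missing_text" reason string are hypothetical.

```r
# Hypothetical offline judge: prefers the longer text (illustration only).
length_judge <- function(A, B, state, ...) {
  text_a <- as.character(A$text[[1]])
  text_b <- as.character(B$text[[1]])
  # Assumed shape for an invalid result; the reason label is made up here.
  if (is.na(text_a) || is.na(text_b)) {
    return(list(is_valid = FALSE, Y = NA_integer_,
                invalid_reason = "missing_text"))
  }
  # Y = 1 when A is preferred, 0 when B is preferred.
  y <- as.integer(nchar(text_a) >= nchar(text_b))
  list(is_valid = TRUE, Y = y, invalid_reason = NA_character_)
}

A <- data.frame(ID = "s1", text = "A longer writing sample.",
                stringsAsFactors = FALSE)
B <- data.frame(ID = "s2", text = "Short.", stringsAsFactors = FALSE)
res <- length_judge(A, B, state = NULL)
res$Y
```

A function like this could be passed as judge = length_judge for a dry run that makes no API calls.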
Value
A list with:
- state
Final adaptive_state.
- summary
Run-level summary from summarize_adaptive().
- refits
Per-refit summary from summarize_refits().
- items
Item summary from summarize_items().
- logs
Canonical logs from adaptive_get_logs().
- output_file
Saved output path when save_outputs = TRUE, otherwise NULL.
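The documented rule for the output_file value can be sketched as a small resolver. The helper resolve_output_file() is hypothetical and not part of the package, and the temporary file naming is an assumption. Only the session_dir default path comes from the argument description.

```r
# Hypothetical helper mirroring the documented default; not a package function.
resolve_output_file <- function(output_file = NULL,
                                save_outputs = FALSE,
                                session_dir = NULL) {
  if (!isTRUE(save_outputs)) {
    return(NULL)                      # nothing is saved, so no path is reported
  }
  if (!is.null(output_file)) {
    return(output_file)               # explicit path wins
  }
  if (!is.null(session_dir)) {
    return(file.path(session_dir, "adaptive_outputs.rds"))
  }
  tempfile(fileext = ".rds")          # assumed naming for the temporary file
}

session_path <- resolve_output_file(save_outputs = TRUE,
                                    session_dir = "runs/demo")
session_path
```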
Details
This helper is designed for end users who want one entry point for adaptive runs. It supports:
- data input from a data frame, file (.csv, .tsv, .txt, .rds), or a directory of .txt files;
- model/backend configuration through make_adaptive_judge_llm();
- all adaptive runtime controls exposed by adaptive_rank_run_live();
- resumability via session_dir and resume;
- optional saving of run outputs to an .rds artifact.
Model options:
use judge_args (fixed) and judge_call_args (per-run overrides) to pass
any additional llm_compare_pair() arguments, including provider-specific
controls such as reasoning, service_tier, temperature, top_p,
logprobs, include_thoughts, or host.
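One way to picture the fixed/per-run split is a named-list merge in which per-run values take precedence. The sketch below uses base R's modifyList() purely as an illustration. How the package actually combines the two lists is an assumption here, and the argument values are examples only.

```r
# Fixed arguments, as they might be given in judge_args.
judge_args <- list(reasoning = "low", service_tier = "flex", temperature = 0)

# Per-run overrides, as they might be given in judge_call_args.
judge_call_args <- list(temperature = 0.2)

# Assumed precedence: per-run values replace fixed values of the same name.
effective_args <- modifyList(judge_args, judge_call_args)
str(effective_args)
```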
Adaptive options:
all key controls from adaptive_rank_run_live() are available directly:
n_steps, fit_fn, adaptive_config, btl_config, progress,
progress_redraw_every, progress_show_events, progress_errors,
session_dir, and persist_item_log.
Use adaptive_config for identifiability-gated controller behavior and
btl_config for inference/diagnostics cadence only.
Selection semantics: pair selection is TrueSkill-driven in one-pair transactional steps. Rolling anchors are refreshed from current score proxies and anchor-link routing compares exactly one anchor endpoint with one non-anchor endpoint. Long/mid-link routing excludes anchor-anchor and anchor-non-anchor pairs, while local-link routing admits same-stratum pairs and anchor-involving pairs according to stage bounds.
Wrapper-visible defaults include top-band refinement
(top_band_pct = 0.10, top_band_bins = 5) with top-band size computed as
ceiling(top_band_pct * N).
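As a quick worked check of the top-band size formula, the snippet below evaluates ceiling(top_band_pct * N) for a few item counts. The helper name top_band_size() is hypothetical.

```r
# Hypothetical helper that applies the documented formula.
top_band_size <- function(n_items, top_band_pct = 0.10) {
  as.integer(ceiling(top_band_pct * n_items))
}

# Item counts of 8, 12, and 25 with the default top_band_pct = 0.10.
sizes <- vapply(c(8L, 12L, 25L), top_band_size, integer(1))
sizes
#> [1] 1 2 3
```

Because of the ceiling, even small item sets get a top band of at least one item.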
Exposure and repeat routing:
under-represented routing is degree-based (deg <= D_min + 1), while
repeat-pressure gating is based on recent exposure (bottom-quantile
recent_deg with quantile default 0.25) and per-endpoint repeat slot
accounting.
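The two exposure criteria can be sketched on toy data. This assumes D_min is the minimum degree across items and that "bottom-quantile" means at or below the 0.25 quantile of recent_deg. How the package turns these masks into routing decisions, and its repeat slot accounting, are not shown. The degree vectors are made up.

```r
# Toy degrees: total comparisons per item and recent comparisons per item.
deg        <- c(a = 2L, b = 3L, c = 5L, d = 2L, e = 7L, f = 4L)
recent_deg <- c(a = 0L, b = 1L, c = 3L, d = 0L, e = 4L, f = 2L)

# Degree-based under-representation: deg <= D_min + 1 (D_min assumed = min(deg)).
d_min <- min(deg)
under_represented <- deg <= d_min + 1L

# Recent-exposure criterion: items at or below the 0.25 quantile of recent_deg.
recent_cutoff <- quantile(recent_deg, probs = 0.25, names = FALSE)
low_recent_exposure <- recent_deg <= recent_cutoff

names(deg)[under_represented]
names(deg)[low_recent_exposure]
```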
Inference separation: BTL refits are used for posterior inference, diagnostics, and stopping only. They are not used to choose the next pair.
Resume behavior:
when resume = TRUE and session_dir already contains adaptive artifacts,
failed session loads abort with an actionable error instead of starting a
fresh run silently.
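The fail-loudly resume rule can be sketched as follows. This is an illustration under stated assumptions: the helper start_or_resume(), the artifact name "state.rds", and the error wording are hypothetical and do not describe the package's actual session layout.

```r
# Hypothetical sketch of "resume or abort"; not the package implementation.
start_or_resume <- function(session_dir = NULL, resume = TRUE) {
  state_path <- if (is.null(session_dir)) NULL else file.path(session_dir, "state.rds")
  has_artifacts <- !is.null(state_path) && file.exists(state_path)

  # No artifacts, or resume disabled: initialize a new state.
  if (!isTRUE(resume) || !has_artifacts) {
    return(list(mode = "fresh", state = NULL))
  }

  # Artifacts exist: a failed load aborts instead of silently starting fresh.
  state <- tryCatch(
    readRDS(state_path),
    error = function(e) {
      stop("Could not load session from '", session_dir, "': ",
           conditionMessage(e),
           ". Repair or remove the session directory, or set resume = FALSE.",
           call. = FALSE)
    }
  )
  list(mode = "resumed", state = state)
}

demo_dir <- tempfile("adaptive-session-")
dir.create(demo_dir)
start_or_resume(demo_dir)$mode   # empty directory, so a fresh start
```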
Examples
data("example_writing_samples", package = "pairwiseLLM")
out <- adaptive_rank(
  data = example_writing_samples[1:8, c("ID", "text", "quality_score")],
  id_col = "ID",
  text_col = "text",
  model = "gpt-5.1",
  judge = function(A, B, state, ...) {
    y <- as.integer(A$quality_score[[1]] >= B$quality_score[[1]])
    list(is_valid = TRUE, Y = y, invalid_reason = NA_character_)
  },
  n_steps = 4,
  progress = "none"
)
out$summary
#> # A tibble: 1 × 6
#> n_items steps_attempted committed_pairs n_refits last_stop_decision
#> <int> <int> <int> <int> <lgl>
#> 1 8 4 4 0 FALSE
#> # ℹ 1 more variable: last_stop_reason <chr>
head(out$logs$step_log)
#> # A tibble: 4 × 51
#> step_id timestamp pair_id i j A B Y status
#> <int> <dttm> <int> <int> <int> <int> <int> <int> <chr>
#> 1 1 2026-02-11 04:39:08 1 1 4 1 4 0 ok
#> 2 2 2026-02-11 04:39:08 2 4 8 4 8 0 ok
#> 3 3 2026-02-11 04:39:08 3 8 2 8 2 1 ok
#> 4 4 2026-02-11 04:39:08 4 2 6 2 6 0 ok
#> # ℹ 42 more variables: round_id <int>, round_stage <chr>, pair_type <chr>,
#> # used_in_round_i <int>, used_in_round_j <int>, is_anchor_i <lgl>,
#> # is_anchor_j <lgl>, stratum_i <int>, stratum_j <int>, dist_stratum <int>,
#> # stage_committed_so_far <int>, stage_quota <int>, is_explore_step <lgl>,
#> # explore_mode <chr>, explore_reason <chr>, explore_rate_used <dbl>,
#> # local_priority_mode <chr>, long_gate_pass <lgl>, long_gate_reason <chr>,
#> # star_override_used <lgl>, star_override_reason <chr>, …
if (FALSE) { # \dontrun{
  # Live run with OpenAI gpt-5.1 + flex priority.
  live <- adaptive_rank(
    data = example_writing_samples[1:12, c("ID", "text")],
    backend = "openai",
    model = "gpt-5.1",
    endpoint = "responses",
    judge_args = list(
      reasoning = "low",
      service_tier = "flex",
      include_thoughts = FALSE
    ),
    btl_config = list(
      refit_pairs_target = 20L,
      ess_bulk_min = 500,
      eap_reliability_min = 0.92
    ),
    adaptive_config = list(
      explore_taper_mult = 0.40,
      star_override_budget_per_round = 2L
    ),
    n_steps = 120,
    session_dir = file.path(tempdir(), "adaptive-live"),
    persist_item_log = TRUE,
    resume = TRUE,
    progress = "all",
    save_outputs = TRUE
  )
  print(live$state)
  live$summary
} # }