This function fits an Elo-based paired-comparison model using the
EloChoice package. It is intended to complement
fit_bt_model by providing an alternative scoring framework
based on Elo ratings rather than Bradley–Terry models.
Arguments
- elo_data
A data frame or tibble containing winner and loser columns. Typically produced using build_elo_data.
- runs
Integer number of randomizations to use in EloChoice::elochoice. Default is 5.
- verbose
Logical. If TRUE (default), show any messages/warnings emitted by the underlying fitting functions. If FALSE, suppress noisy output to keep examples and reports clean.
- ...
Additional arguments passed to EloChoice::elochoice().
Value
A named list with components:
- engine
Character scalar identifying the scoring engine ("EloChoice").
- fit
The "elochoice" model object.
- elo
A tibble with columns ID and elo.
- reliability
Numeric scalar: mean unweighted reliability index.
- reliability_weighted
Numeric scalar: mean weighted reliability index.
Details
The input elo_data must contain two columns:
- winner: ID of the winning sample in each pairwise trial
- loser: ID of the losing sample in each trial
These can be created from standard pairwise comparison output using
build_elo_data.
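When the comparisons do not come from build_elo_data, the same two-column layout can be assembled by hand. The sketch below uses base R only; the sample IDs and the column check are illustrative assumptions, not part of the package.

```r
# Hypothetical pairwise outcomes: each row is one trial.
elo_data <- data.frame(
  winner = c("S02", "S03", "S03", "S01"),
  loser  = c("S01", "S01", "S02", "S02"),
  stringsAsFactors = FALSE
)

# Confirm the two required columns are present before fitting.
required <- c("winner", "loser")
missing_cols <- setdiff(required, names(elo_data))
if (length(missing_cols) > 0) {
  stop("elo_data is missing column(s): ", paste(missing_cols, collapse = ", "))
}
```

A data frame of this shape can then be passed as the elo_data argument.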
Internally, this function calls:
- elochoice: to estimate Elo ratings using repeated randomization of trial order;
- reliability: to compute unweighted and weighted reliability indices as described in Clark et al. (2018).
If the EloChoice package is not installed, the function stops with an error message explaining how to install it.
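The steps described above can be sketched roughly as follows. This is not the package's actual source: sketch_fit_elo is a hypothetical helper, the simulated comparisons are made up for illustration, and the use of EloChoice::ratings() to extract mean ratings is an assumption about how the elo tibble might be built.

```r
# Illustrative helper (hypothetical name; not part of pairwiseLLM).
sketch_fit_elo <- function(elo_data, runs = 5) {
  # Fail early with installation advice if EloChoice is unavailable.
  if (!requireNamespace("EloChoice", quietly = TRUE)) {
    stop("Package 'EloChoice' is required. ",
         "Install it with install.packages(\"EloChoice\").", call. = FALSE)
  }
  fit <- EloChoice::elochoice(
    winner = elo_data$winner,
    loser  = elo_data$loser,
    runs   = runs
  )
  # Mean rating per sample across randomizations (named numeric vector).
  r <- EloChoice::ratings(fit, show = "mean", drawplot = FALSE)
  # One row per randomization; per the EloChoice documentation the first
  # column is the unweighted index and the second the weighted index.
  rel <- EloChoice::reliability(fit)
  list(
    elo = data.frame(ID = names(r), elo = as.numeric(r)),
    reliability = mean(rel[[1]]),
    reliability_weighted = mean(rel[[2]])
  )
}

# Simulated comparisons: higher-numbered samples win about 80% of the time.
set.seed(1)
ids <- paste0("S", 1:6)
strength <- setNames(seq_along(ids), ids)
draws <- t(replicate(60, sample(ids, 2)))
a <- draws[, 1]
b <- draws[, 2]
a_wins <- runif(60) < ifelse(strength[a] > strength[b], 0.8, 0.2)
sim <- data.frame(
  winner = ifelse(a_wins, a, b),
  loser  = ifelse(a_wins, b, a),
  stringsAsFactors = FALSE
)

# Only run the fit when EloChoice is actually installed.
if (requireNamespace("EloChoice", quietly = TRUE)) {
  res <- sketch_fit_elo(sim, runs = 5)
  print(res$elo)
}
```

Averaging over the rows returned by reliability() is what turns the per-randomization indices into the single mean values reported by this function.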
The returned object mirrors the structure of fit_bt_model
for consistency across scoring engines:
- engine: always "EloChoice".
- fit: the raw "elochoice" object returned by EloChoice::elochoice().
- elo: a tibble with columns:
  - ID: sample identifier
  - elo: estimated Elo rating
  (Unlike Bradley–Terry models, EloChoice does not provide standard errors for these ratings, so none are returned.)
- reliability: the mean unweighted reliability index across randomizations (based on the proportion of “upsets”).
- reliability_weighted: the mean weighted reliability index across randomizations (the weighted version of the upset measure).
References
Clark AP, Howard KL, Woods AT, Penton-Voak IS, Neumann C (2018). "Why rate when you could compare? Using the 'EloChoice' package to assess pairwise comparisons of perceived physical strength." PLOS ONE, 13(1), e0190393. doi:10.1371/journal.pone.0190393.
Examples
data("example_writing_pairs", package = "pairwiseLLM")
elo_data <- build_elo_data(example_writing_pairs)
fit <- fit_elo_model(elo_data, runs = 5, verbose = FALSE)
fit$elo
#> # A tibble: 20 × 2
#> ID elo
#> <chr> <dbl>
#> 1 S01 -369.
#> 2 S02 -295.
#> 3 S03 -378.
#> 4 S04 -340.
#> 5 S05 -262.
#> 6 S06 -196.
#> 7 S07 -144.
#> 8 S08 -59.8
#> 9 S09 -50
#> 10 S10 -34.6
#> 11 S11 7.2
#> 12 S12 11
#> 13 S13 193.
#> 14 S14 145.
#> 15 S15 196.
#> 16 S16 203.
#> 17 S17 275
#> 18 S18 422.
#> 19 S19 271.
#> 20 S20 405.
fit$reliability
#> [1] 0.8251686
fit$reliability_weighted
#> [1] 0.9202833