This function extracts ID and text columns from a data frame and enforces that IDs are unique. By default, it assumes the first column is the ID and the second column is the text.
Value
A tibble with columns:
ID: character ID for each sample
text: character string of the writing sample
Any remaining columns in df are retained unchanged.
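The behaviour described above can be approximated in base R as a minimal sketch. The helper name `sketch_read_samples` and its error messages are hypothetical and are not the package's implementation; stopping with an error on duplicate IDs and returning a plain data frame rather than a tibble are assumptions made only for illustration.

```r
# Hypothetical helper, NOT the package's code: it only mirrors the
# documented behaviour (select ID/text, require unique IDs, keep the rest).
sketch_read_samples <- function(df, id_col = 1, text_col = 2) {
  # Accept either a column name or a column position
  resolve <- function(col) {
    if (is.character(col)) match(col, names(df)) else as.integer(col)
  }
  id_idx <- resolve(id_col)
  text_idx <- resolve(text_col)
  if (anyNA(c(id_idx, text_idx))) {
    stop("id_col or text_col not found in df")
  }

  # Coerce IDs to character and require uniqueness (assumed to be an error)
  ids <- as.character(df[[id_idx]])
  if (anyDuplicated(ids) > 0) {
    stop("IDs must be unique")
  }

  out <- data.frame(
    ID = ids,
    text = as.character(df[[text_idx]]),
    stringsAsFactors = FALSE
  )

  # Retain any remaining columns unchanged, placed after ID and text
  rest <- df[, -c(id_idx, text_idx), drop = FALSE]
  cbind(out, rest)
}

demo <- data.frame(
  StudentID = c("S1", "S2"),
  Response = c("This is sample 1.", "This is sample 2."),
  Grade = c(8, 9),
  stringsAsFactors = FALSE
)
res <- sketch_read_samples(demo, id_col = "StudentID", text_col = "Response")
names(res)
```

In this sketch the selected columns are renamed to ID and text, and a repeated ID stops with an error instead of being silently dropped.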
Examples
df <- data.frame(
  StudentID = c("S1", "S2"),
  Response = c("This is sample 1.", "This is sample 2."),
  Grade = c(8, 9),
  stringsAsFactors = FALSE
)
samples <- read_samples_df(df, id_col = "StudentID", text_col = "Response")
samples
#> # A tibble: 2 × 3
#>   ID    text              Grade
#>   <chr> <chr>             <dbl>
#> 1 S1    This is sample 1.     8
#> 2 S2    This is sample 2.     9
# Using the built-in example dataset
data("example_writing_samples")
samples2 <- read_samples_df(
  example_writing_samples[, c("ID", "text")],
  id_col = "ID",
  text_col = "text"
)
head(samples2)
#> # A tibble: 6 × 2
#>   ID    text
#>   <chr> <chr>
#> 1 S01   "Writing assessment is hard. People write different things. It is\n …
#> 2 S02   "It is hard to grade writing. Some are long and some are short. I do no…
#> 3 S03   "Assessing writing is difficult because everyone writes differently and…
#> 4 S04   "Grading essays is tough work. You have to read a lot. Sometimes the\n …
#> 5 S05   "Writing assessment is challenging because teachers must judge ideas,\n…
#> 6 S06   "It is difficult to assess writing because it is subjective. One teache…