The ai_salience() function rates how salient each of a set of
predefined topics is in each document. The function uses a predefined
type_object() specification from ellmer to structure the LLM’s
response, producing a salience score for every topic in every document.
It is particularly useful for analysing large corpora where manual
classification would be impractical. Users provide the documents (a
character vector or, as in the example below, a quanteda corpus) and a
character vector of topics. The LLM then analyses each document and
assigns a salience score to each topic, indicating how relevant the
document is to that topic.
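
To make the idea of a structured response more concrete, the sketch below builds an ellmer type specification with one numeric field per topic. It is only an illustration under stated assumptions: the field names, the descriptions and the overall shape of `salience_type` are hypothetical and are not taken from the package's internal definition, which may be organised differently.

```r
library(ellmer)

topics <- c("economy", "environment", "healthcare")

# Hypothetical schema: one numeric salience field per topic.
# This is NOT the specification used inside ai_salience().
topic_fields <- lapply(topics, function(topic) {
  type_number(paste0("Salience of the topic '", topic, "' in the document"))
})
names(topic_fields) <- paste0("salience_", topics)

# Combine the fields into a single object specification
salience_type <- do.call(
  type_object,
  c(list(.description = "Salience scores for each topic"), topic_fields)
)
```

A specification of this kind tells the model which fields to return and of what type, which is what lets the output be collected into one row per document.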
Loading packages and data
library(quanteda)
## Package version: 4.3.1
## Unicode version: 15.1
## ICU version: 74.2
## Parallel computing: disabled
## See https://quanteda.io for tutorials and examples.
library(quanteda.llm)
## Loading required package: ellmer
data_corpus_inaugural <- data_corpus_inaugural[57:60]

Using ai_salience() for salience rating of topics
# define the topics for salience classification
topics <- c("economy", "environment", "healthcare")
result <- data_corpus_inaugural %>%
  ai_salience(topics, chat_fn = chat_openai, model = "gpt-4o",
              api_args = list(temperature = 0, seed = 42))
## Using `chat_fn()` with model "gpt-4o"
## ✔ Returned 4 documents (4 successful, 0 with NAs)
result

| id | salience_economy | salience_environment | salience_healthcare | 
|---|---|---|---|
| 2013-Obama | 0.3 | 0.3 | 0.4 | 
| 2017-Trump | 0.6 | 0.1 | 0.3 | 
| 2021-Biden | 0.2 | 0.3 | 0.5 | 
| 2025-Trump | 0.5 | 0.2 | 0.3 |
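
As a possible follow-up step, the scores can be summarised with base R. The sketch below recreates the table above as a plain data frame (so it runs without an API call), checks whether the scores in each row sum to 1, and picks the most salient topic per document. It assumes the object returned by ai_salience() behaves like a data frame with the columns shown; `score_cols`, `top_topic` and `summary_tbl` are names introduced here for illustration.

```r
# Recreate the printed result as a plain data frame
# (values copied from the table above)
result <- data.frame(
  id = c("2013-Obama", "2017-Trump", "2021-Biden", "2025-Trump"),
  salience_economy = c(0.3, 0.6, 0.2, 0.5),
  salience_environment = c(0.3, 0.1, 0.3, 0.2),
  salience_healthcare = c(0.4, 0.3, 0.5, 0.3)
)

# Keep only the score columns as a numeric matrix
score_cols <- grep("^salience_", names(result), value = TRUE)
scores <- as.matrix(result[, score_cols])

# Check whether each row sums to 1, allowing for floating-point error
sums_to_one <- abs(rowSums(scores) - 1) < 1e-8

# Most salient topic per document (the first column wins in case of a tie)
top_topic <- sub("^salience_", "",
                 score_cols[max.col(scores, ties.method = "first")])

summary_tbl <- data.frame(id = result$id, top_topic, sums_to_one)
summary_tbl
```

For the four speeches shown here, healthcare comes out as the top topic for 2013-Obama and 2021-Biden and economy for the two Trump addresses, and every row sums to 1.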