
The ai_salience() function rates how relevant each document is to a set of predefined topics. It uses a predefined type_object specification from ellmer to structure the LLM’s response, so that every document comes back with a salience score for each topic. This is particularly useful for analysing large corpora where manual classification would be impractical. You provide the documents (a character vector, or a quanteda corpus as in the example below) and a vector of topics. The LLM then analyses each document and assigns each topic a salience score indicating how relevant the document is to that topic.
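To make the idea of a structured response concrete, here is a minimal sketch of what an ellmer type specification for salience scores could look like. It is not the definition that ai_salience() uses internally; the object name salience_type and the field names topics, topic and score are assumptions chosen for illustration.

```r
library(ellmer)

# Hypothetical schema, for illustration only: the response holds one entry
# per topic, and each entry pairs a topic name with a numeric score.
salience_type <- type_object(
  .description = "Salience of each predefined topic in one document",
  topics = type_array(
    description = "One entry per predefined topic",
    items = type_object(
      topic = type_string(description = "Name of the topic"),
      score = type_number(description = "Salience score for this topic")
    )
  )
)
```

Constraining the response to a schema of this kind is what lets the scores be collected into one row per document, rather than being parsed out of free text.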

Loading packages and data

library(quanteda)
## Package version: 4.3.1
## Unicode version: 15.1
## ICU version: 74.2
## Parallel computing: disabled
## See https://quanteda.io for tutorials and examples.
library(quanteda.llm)
## Loading required package: ellmer
# keep only the last four inaugural addresses (2013 to 2025)
data_corpus_inaugural <- data_corpus_inaugural[57:60]

Using ai_salience() for salience rating of topics

# define the topics for salience classification
topics <- c("economy", "environment", "healthcare")
result <- data_corpus_inaugural %>%
  ai_salience(topics, chat_fn = chat_openai, model = "gpt-4o",
              api_args = list(temperature = 0, seed = 42))
## Using `chat_fn()` with model "gpt-4o"
## ■■■■■■■■■                        1/4 |  25% | ETA: 25s | 2013-Obama
## ■■■■■■■■■■■■■■■■■■■■■■■          3/4 |  75% | ETA:  4s | 2021-Biden
## ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  4/4 | 100% | ETA:  0s | 2025-Trump
##  Processed 4 documents successfully
result
id          salience_economy  salience_environment  salience_healthcare
2013-Obama               0.4                   0.3                  0.3
2017-Trump               0.6                   0.1                  0.3
2021-Biden               0.2                   0.3                  0.5
2025-Trump               0.5                   0.2                  0.3
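
The scores can be post-processed with base R. The sketch below rebuilds the table shown above as a data frame (it is assumed here that the result behaves like one), then finds the most salient topic per document and checks the row totals. In this output each document's scores sum to 1, but whether that holds in general is an assumption, not something the output alone guarantees. The names score_cols, scores, row_totals and top_topic are illustrative.

```r
# Rebuild the output shown above so the example runs without an API call.
result <- data.frame(
  id = c("2013-Obama", "2017-Trump", "2021-Biden", "2025-Trump"),
  salience_economy = c(0.4, 0.6, 0.2, 0.5),
  salience_environment = c(0.3, 0.1, 0.3, 0.2),
  salience_healthcare = c(0.3, 0.3, 0.5, 0.3)
)

# Select the score columns by their "salience_" prefix.
score_cols <- grep("^salience_", names(result), value = TRUE)
scores <- as.matrix(result[score_cols])
rownames(scores) <- result$id

# Total salience per document (1 for every row of this particular output).
row_totals <- rowSums(scores)

# Most salient topic per document; ties go to the first topic listed.
top_topic <- sub("^salience_", "",
                 score_cols[max.col(scores, ties.method = "first")])
names(top_topic) <- result$id

print(data.frame(top_topic = top_topic, total = row_totals))
```

Here the economy is the top topic for three of the four addresses, and healthcare for 2021-Biden.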