Example: Salience rating of topics

The ai_salience() function rates how relevant each document is to a set of predefined topics. It uses a predefined type_object from ellmer to structure the LLM’s response, so that each document comes back with one salience score per topic. This is particularly useful for analysing large corpora where manual classification would be impractical. You provide the documents (a character vector or, as in the example below, a quanteda corpus) and a character vector of topics. The LLM then analyses each document and assigns each topic a score indicating how relevant the document is to that topic.

Loading packages and data

library(quanteda)

## Package version: 4.3.1
## Unicode version: 15.1
## ICU version: 74.2

## Parallel computing: disabled

## See https://quanteda.io for tutorials and examples.

library(quanteda.llm)

## Loading required package: ellmer

# keep only the four most recent inaugural addresses (2013 to 2025)
data_corpus_inaugural <- data_corpus_inaugural[57:60]

Using `ai_salience()` for salience rating of topics

# define the topics for salience classification
topics <- c("economy", "environment", "healthcare")
result <- data_corpus_inaugural %>%
  ai_salience(topics, chat_fn = chat_openai, model = "gpt-4o",
              api_args = list(temperature = 0, seed = 42))

## Using `chat_fn()` with model "gpt-4o"
## ✔ Returned 4 documents (4 successful, 0 with NAs)

result

id	salience_economy	salience_environment	salience_healthcare
2013-Obama	0.3	0.3	0.4
2017-Trump	0.6	0.1	0.3
2021-Biden	0.2	0.3	0.5
2025-Trump	0.5	0.2	0.3
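
The scores can be inspected with base R. The following is a minimal sketch, not part of the package: it rebuilds the table printed above as a plain data frame (values copied from that output), assumes the `salience_` column prefix shown there, and then picks the highest-scoring topic for each document. In this output each row happens to sum to 1; whether ai_salience() guarantees that is not stated here, so the check is illustrative only.

```r
# Rebuild the printed result as a plain data frame (values copied from the output above)
result <- data.frame(
  id = c("2013-Obama", "2017-Trump", "2021-Biden", "2025-Trump"),
  salience_economy = c(0.3, 0.6, 0.2, 0.5),
  salience_environment = c(0.3, 0.1, 0.3, 0.2),
  salience_healthcare = c(0.4, 0.3, 0.5, 0.3)
)

# Select the score columns by their (assumed) "salience_" prefix
score_cols <- grep("^salience_", names(result), value = TRUE)
scores <- as.matrix(result[score_cols])

# Check whether the scores of each document sum to 1 (up to floating-point tolerance)
row_totals <- rowSums(scores)
sums_to_one <- isTRUE(all.equal(unname(row_totals), rep(1, nrow(result))))

# Find the most salient topic per document; ties go to the first topic listed
top_topic <- sub("^salience_", "", score_cols[max.col(scores, ties.method = "first")])
names(top_topic) <- result$id

print(sums_to_one)
print(top_topic)
```

With real output from ai_salience(), the same steps apply directly to the returned object, provided it keeps the column layout shown above.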
