
The package structures responses from LLMs in a way that is compatible with quanteda’s corpus principles and useful for common text analysis tasks, so you can integrate LLM-generated data directly into your text analysis workflows. For example, you can ask an LLM to summarize every document in a corpus (ai_summary()), classify documents into topics (ai_salience()), or scale them against predefined criteria (ai_scale()), and store the results as document variables.

If you need more flexibility in how the LLM generates its output, use the ai_text() function to define custom prompts and response structures. Using ai_text() together with the type_object() function from the ellmer package, you can specify how the LLM should format its output, including which fields the response contains and the type of each field. This lets you tailor the LLM’s output to your analysis requirements.
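As a minimal sketch of what such a response structure can look like, the block below builds one with ellmer's type_*() constructors. The task (topic coding) and the field names topic, confidence, and quotes are hypothetical and chosen only for illustration; they are not part of any package.

```r
library(ellmer)

# Hypothetical response structure for a topic-coding task.
# The field names and topic labels are illustrative assumptions.
topic_coding <- type_object(
  # a single label restricted to a fixed set of values
  topic = type_enum(
    values = c("economy", "security", "environment", "other"),
    description = "The single dominant topic of the document."
  ),
  # a numeric field; the description tells the model how to fill it
  confidence = type_number(
    description = "Confidence in the topic label, from 0 to 1."
  ),
  # a list of strings
  quotes = type_array(
    items = type_string(),
    description = "Up to three short quotations supporting the label."
  )
)
```

An object like this would be passed to ai_text() through its type_object argument, in the same way as the policy_scores structure used in the scoring example. The descriptions are optional, but they give the model a hint about what each field should contain.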

Loading packages and data

library(quanteda)
## Package version: 4.3.1
## Unicode version: 15.1
## ICU version: 74.2
## Parallel computing: disabled
## See https://quanteda.io for tutorials and examples.
library(quanteda.llm)
## Loading required package: ellmer
# keep the four most recent inaugural addresses (2013 to 2025)
data_corpus_inaugural <- data_corpus_inaugural[57:60]

Using ai_text() for scoring documents

prompt <- "Score the following document on a scale of how much it aligns
with the political left. The political left is defined as groups which
advocate for social equality, government intervention in the economy,
and progressive policies. Use the following metrics:
SCORING METRIC:
3 : extremely left
2 : very left
1 : slightly left
0 : not at all left"

# define the structure of the response
policy_scores <- type_object(
  score = type_integer(),
  evidence = type_string()
)

result <- ai_text(data_corpus_inaugural, chat_fn = chat_openai, model = "gpt-4o", 
                  type_object = policy_scores,
                  system_prompt = prompt,
                  api_args = list(temperature = 0, seed = 42)) 
## Using `chat_openai()` with model "gpt-4o"
## ■■■■■■■■■                        1/4 |  25% | ETA: 17s | 2013-Obama
## 
## ■■■■■■■■■■■■■■■■                 2/4 |  50% | ETA:  8s | 2017-Trump
## 
## ■■■■■■■■■■■■■■■■■■■■■■■          3/4 |  75% | ETA:  4s | 2021-Biden
## 
## ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■  4/4 | 100% | ETA:  0s | 2025-Trump
## 
## 
## 
##  Processed 4 documents successfully
# score and evidence are created as new docvars in the corpus
library(kableExtra)
result %>%
  kable("html", escape = FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  column_spec(3)
| id | score | evidence |
|---|---|---|
| 2013-Obama | 2 | The speech emphasizes themes that align with the political left, such as social equality, government intervention, and progressive policies. It advocates for collective action to address modern challenges, supports the idea of a rising middle class, and emphasizes the importance of social safety nets like Medicare, Medicaid, and Social Security. The speech also highlights the need for climate change action and sustainable energy, which are typically left-leaning priorities. Additionally, it calls for equal rights for women and the LGBTQ+ community, and for immigration reform, all of which are progressive issues. However, it also acknowledges skepticism of central authority and the importance of personal responsibility, which tempers the alignment slightly. |
| 2017-Trump | 0 | The speech emphasizes nationalism, protectionism, and a focus on ‘America first’ policies, which are not aligned with the political left’s advocacy for social equality, government intervention in the economy, and progressive policies. The speech criticizes the political establishment and emphasizes a return to traditional values and national pride, which are more characteristic of right-wing populism. There is no mention of progressive policies or social equality initiatives that are typically associated with the political left. |
| 2021-Biden | 2 | The speech emphasizes themes of unity, democracy, and addressing systemic issues such as racial justice, climate change, and economic inequality, which align with progressive policies typically associated with the political left. The call for government intervention to address the pandemic, economic challenges, and racial justice further supports a left-leaning perspective. However, the speech also focuses on unity and bipartisanship, which tempers the alignment with the political left, resulting in a score of 2. |
| 2025-Trump | 0 | The speech emphasizes nationalism, border security, and a strong military, which are typically associated with right-wing politics. It criticizes government intervention in the economy, such as the Green New Deal and electric vehicle mandates, and promotes energy independence through drilling, which aligns with conservative economic policies. The speech also opposes government censorship and promotes a colorblind, merit-based society, which are not aligned with progressive policies. Overall, the speech does not advocate for social equality, government intervention in the economy, or progressive policies, which are key tenets of the political left. |
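Before using the scores in further analysis, it can be worth checking that they respect the scoring metric in the prompt. type_integer() only asks for an integer and does not by itself limit the value to the 0 to 3 range. The sketch below uses base R only and a small stand-in data frame, result_check (a hypothetical name), holding the id and score values from the table above; with a live result you would run the same checks on it directly.

```r
# Stand-in for the scored output: ids and scores copied from the table above.
# The evidence column is omitted for brevity.
result_check <- data.frame(
  id = c("2013-Obama", "2017-Trump", "2021-Biden", "2025-Trump"),
  score = c(2L, 0L, 2L, 0L)
)

# Confirm every score is present and is one of the values
# allowed by the scoring metric in the prompt (0, 1, 2, or 3).
allowed <- 0:3
in_range <- !is.na(result_check$score) & result_check$score %in% allowed
stopifnot(all(in_range))

# Mean score by president, taken from the "year-name" document ids.
president <- sub("^[0-9]{4}-", "", result_check$id)
mean_by_president <- tapply(result_check$score, president, mean)
print(mean_by_president)
```

If stopifnot() fails, the offending rows can be inspected with result_check[!in_range, ] and the corresponding documents rescored or coded by hand.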