Example: Structuring LLM responses for text analysis
Source: vignettes/pkgdown/examples/structuring.Rmd
The package lets you structure LLM responses so that they are compatible
with quanteda's corpus principles and useful for common text analysis
tasks, which makes it easy to integrate LLM-generated data into your text
analysis workflows. For example, you can ask an LLM to summarize all
documents in a corpus (ai_summary()) and store the summaries as document
variables, or you can classify documents into topics (ai_salience()) or
scale them on predefined criteria (ai_scale()) and store the results as
document variables.
If you need more flexibility in how the LLM generates its output, you
can use the ai_text() function to define custom prompts and response
structures. With ai_text() and the type_object() function from the ellmer
package, you can define how the LLM should format its output, for example
which fields the response includes and the type of each field. This lets
you tailor the LLM's output to your analysis requirements.
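
As a minimal sketch of such a response structure, the block below builds a specification from ellmer's type constructors. The field names (topic, confidence, keywords) and the category labels are hypothetical and chosen only for illustration. Arguments are passed by name so the sketch does not depend on argument order.

```r
library(ellmer)

# Hypothetical response structure: field names and labels are illustrative only
topic_spec <- type_object(
  # one label drawn from a fixed set of values
  topic = type_enum(
    values = c("economy", "security", "environment", "other"),
    description = "Main topic of the document"
  ),
  # numeric confidence reported by the model
  confidence = type_number(
    description = "Confidence in the label, from 0 to 1"
  ),
  # a list of strings
  keywords = type_array(
    items = type_string(),
    description = "Up to five keywords supporting the label"
  )
)
```

An object like topic_spec would be passed through the type_object argument of ai_text(), in the same way as policy_scores in the scoring example.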
Loading packages and data
library(quanteda)
## Package version: 4.3.1
## Unicode version: 15.1
## ICU version: 74.2
## Parallel computing: disabled
## See https://quanteda.io for tutorials and examples.
library(quanteda.llm)
## Loading required package: ellmer
# subset the corpus to the four addresses from 2013 to 2025
data_corpus_inaugural <- data_corpus_inaugural[57:60]
Using ai_text() for scoring documents
prompt <- "Score the following document on a scale of how much it aligns
with the political left. The political left is defined as groups which
advocate for social equality, government intervention in the economy,
and progressive policies. Use the following metrics:
SCORING METRIC:
3 : extremely left
2 : very left
1 : slightly left
0 : not at all left"
# define the structure of the response
policy_scores <- type_object(
  score = type_integer(),
  evidence = type_string()
)
result <- ai_text(
  data_corpus_inaugural,
  chat_fn = chat_openai,
  model = "gpt-4o",
  type_object = policy_scores,
  system_prompt = prompt,
  api_args = list(temperature = 0, seed = 42)
)
## Using `chat_openai()` with model "gpt-4o"
## ■■■■■■■■■ 1/4 | 25% | ETA: 17s | 2013-Obama
## ■■■■■■■■■■■■■■■■ 2/4 | 50% | ETA: 8s | 2017-Trump
## ■■■■■■■■■■■■■■■■■■■■■■■ 3/4 | 75% | ETA: 4s | 2021-Biden
## ■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■ 4/4 | 100% | ETA: 0s | 2025-Trump
## ✔ Processed 4 documents successfully
# score and evidence are created as new docvars in the corpus
library(kableExtra)
result %>%
  kable("html", escape = FALSE) %>%
  kable_styling(bootstrap_options = c("striped", "hover")) %>%
  column_spec(3)
id | score | evidence |
---|---|---|
2013-Obama | 2 | The speech emphasizes themes that align with the political left, such as social equality, government intervention, and progressive policies. It advocates for collective action to address modern challenges, supports the idea of a rising middle class, and emphasizes the importance of social safety nets like Medicare, Medicaid, and Social Security. The speech also highlights the need for climate change action and sustainable energy, which are typically left-leaning priorities. Additionally, it calls for equal rights for women and the LGBTQ+ community, and for immigration reform, all of which are progressive issues. However, it also acknowledges skepticism of central authority and the importance of personal responsibility, which tempers the alignment slightly. |
2017-Trump | 0 | The speech emphasizes nationalism, protectionism, and a focus on ‘America first’ policies, which are not aligned with the political left’s advocacy for social equality, government intervention in the economy, and progressive policies. The speech criticizes the political establishment and emphasizes a return to traditional values and national pride, which are more characteristic of right-wing populism. There is no mention of progressive policies or social equality initiatives that are typically associated with the political left. |
2021-Biden | 2 | The speech emphasizes themes of unity, democracy, and addressing systemic issues such as racial justice, climate change, and economic inequality, which align with progressive policies typically associated with the political left. The call for government intervention to address the pandemic, economic challenges, and racial justice further supports a left-leaning perspective. However, the speech also focuses on unity and bipartisanship, which tempers the alignment with the political left, resulting in a score of 2. |
2025-Trump | 0 | The speech emphasizes nationalism, border security, and a strong military, which are typically associated with right-wing politics. It criticizes government intervention in the economy, such as the Green New Deal and electric vehicle mandates, and promotes energy independence through drilling, which aligns with conservative economic policies. The speech also opposes government censorship and promotes a colorblind, merit-based society, which are not aligned with progressive policies. Overall, the speech does not advocate for social equality, government intervention in the economy, or progressive policies, which are key tenets of the political left. |
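
Because the scoring metric only allows the integers 0 to 3, it can be worth checking the returned scores before analysing them. The sketch below does this in base R on a stand-in data frame that copies the id and score columns from the table above (the evidence column is omitted for brevity). The helper check_scores() is hypothetical and not part of the package.

```r
# Stand-in for the id and score columns of `result` shown in the table above
result_demo <- data.frame(
  id = c("2013-Obama", "2017-Trump", "2021-Biden", "2025-Trump"),
  score = c(2L, 0L, 2L, 0L),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE for each score that is a whole number in [min, max];
# missing values count as invalid
check_scores <- function(score, min = 0, max = 3) {
  !is.na(score) & score == round(score) & score >= min & score <= max
}

valid <- check_scores(result_demo$score)
stopifnot(all(valid))  # stop early if any score falls outside the metric

# Mean score by president (the surname is the part of the id after the year)
president <- sub("^[0-9]+-", "", result_demo$id)
mean_by_president <- tapply(result_demo$score, president, mean)
```

With the real output, the same check would be applied to result$score. Rows that fail it are candidates for re-running or manual review.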