Text Analysis Using R


Kenneth Benoit

Stefan Müller


This is the draft version of Text Analysis Using R. This book offers a comprehensive practical guide to text analysis and natural language processing using the R language. We have pitched the book at those already familiar with some R, but we also provide a gentle enough introduction that it is also suitable for newcomers to R.

You’ll learn how to prepare your texts for analysis, how to analyse texts for insight using statistical methods and machine learning, and how to present those results using graphical methods. Each chapter covers a distinct topic, fist presenting the methodology underlying each topic, and then providing practical examples using R. We also discuss advanced issues facing each method and its application. Finally, for those engaged in self-learning or wishing to use this book for instruction, we provide practical exercises for each chapter.

The book is organised into parts, starting with a review of R and especially the R packages and functions relevant to text analysis. If you are already comfortable with R you can skim or skip that section and proceed straight to part two.


This book is a work in progress. We will publish chapters as we write them, and open up the GitHub source repository for readers to make comments or even suggest corrections. You can view this at this https://github.com/quanteda/Text-Analysis-Using-R.


We will eventually seek a publisher for this book, but want to write it first. In the meantime we are retaining full copyright and licensing it only to be free to read.

© Kenneth Benoit and Stefan Müller all rights reserved.