I routinely give workshops both on digital humanities methods in Chinese studies and on text mining more generally. I post abstracts and materials here for anyone who is interested in seeing them.


October 2018 (University of Virginia): Visualizing Stylometric and Intertextual Relationships in Large Textual Corpora

In this workshop, Paul will demonstrate how to perform and visualize two important techniques for exploratory document analysis. First, he will introduce how to conduct stylistic analysis using principal component analysis (useful for detecting authorship and genre-based stylistic differences). Then he will show a workflow for detecting and visualizing intertextuality between two or more works. We will work with a demonstration corpus of English-language texts. By the end of the workshop, you will be able to visualize general stylistic similarities, as well as both exact and fuzzy quotation using adjustable criteria, which will allow you to quickly study a corpus of documents. If you would like to participate, please install the Anaconda distribution of the Python programming language. This is free software available at https://www.anaconda.com/download/. No experience with programming is assumed, so all are welcome! The corpus and scripts we will use will be available at https://www.github.com/vierth/uvaworkshop
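To give a flavor of what "exact and fuzzy quotation using adjustable criteria" can mean in practice, here is a minimal sketch using only Python's standard library. It is not the workshop script: the function name `shared_passages` and the parameters `window` and `threshold` are hypothetical names chosen for illustration, and the brute-force comparison is only practical for short texts.

```python
from difflib import SequenceMatcher


def shared_passages(text_a, text_b, window=3, threshold=0.8):
    """Compare every `window`-word span of text_a with every span of text_b.

    Returns a list of (index_in_a, index_in_b, similarity) tuples for
    spans whose character-level similarity is at least `threshold`.
    A threshold of 1.0 keeps only exact quotation; lower values also
    admit fuzzy quotation.
    """
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()

    # Build overlapping word windows for each text.
    spans_a = [" ".join(words_a[i:i + window])
               for i in range(len(words_a) - window + 1)]
    spans_b = [" ".join(words_b[j:j + window])
               for j in range(len(words_b) - window + 1)]

    matches = []
    for i, span_a in enumerate(spans_a):
        for j, span_b in enumerate(spans_b):
            score = SequenceMatcher(None, span_a, span_b).ratio()
            if score >= threshold:
                matches.append((i, j, round(score, 2)))
    return matches


text_a = "the quick brown fox jumps over the lazy dog"
text_b = "a quick brown fox leaps over the lazy dog today"

exact = shared_passages(text_a, text_b, window=3, threshold=1.0)
fuzzy = shared_passages(text_a, text_b, window=3, threshold=0.7)
print(len(exact), "exact matches;", len(fuzzy), "matches when fuzzy ones are allowed")
```

Lowering `threshold` or shortening `window` loosens the criteria and returns more candidate passages, so in a real corpus these two values are the knobs to adjust first.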


October 2018 (Freiburg University), Maoist legacy workshop: Text-mining Chinese corpora: A workflow to analyze and visualize large document collections

In this workshop, I demonstrated a basic workflow that allows users to easily use stylometry to analyze a self-created corpus. Materials are available at https://www.github.com/vierth/freiburg.


Summer 2017: Introduction to Natural Language Processing and Text Mining with Python

I taught a week-long course in the Netherlands Graduate School of Linguistics's 2017 Summer School on Language and the Digital Humanities. The materials can be found in this Google Drive Zip folder.


Summer 2016: Text Mining and Quantitative Analysis for Chinese Studies

I taught a short workshop on how to conduct stylometric analysis on late imperial Chinese texts as part of Leiden University's 2016 Chinese Digital Humanities Summer School. The materials were very similar to the ones I used for the workshop I taught at Stanford, which can be found here.


February 2016: Stylometrics and Genre Research in Imperial Chinese Studies

As part of the DHAsia program, I taught a workshop on stylometric analysis at Stanford. In this workshop, I walked participants through all of the necessary steps, from corpus building to cleaning texts to conducting both hierarchical cluster analysis and principal component analysis. The materials I used in the workshop are all publicly available in a GitHub repository.
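For readers who want a sense of the hierarchical cluster analysis step, the following is a small, self-contained sketch and not the code from the workshop repository. It assumes relative token frequencies as features, Manhattan distance, and average linkage; the function names and the `tokenize` parameter are hypothetical. For Chinese texts one would typically pass `tokenize=list` to count characters instead of whitespace-separated words.

```python
from collections import Counter
from itertools import combinations


def frequency_profiles(texts, top_n=50, tokenize=lambda t: t.lower().split()):
    """Map each document name to its relative frequencies of the
    `top_n` most common tokens in the whole corpus."""
    counts = {name: Counter(tokenize(text)) for name, text in texts.items()}
    corpus_counts = Counter()
    for doc_counts in counts.values():
        corpus_counts.update(doc_counts)
    vocab = [token for token, _ in corpus_counts.most_common(top_n)]

    profiles = {}
    for name, doc_counts in counts.items():
        total = sum(doc_counts.values())
        profiles[name] = [doc_counts[token] / total for token in vocab]
    return profiles


def manhattan(u, v):
    """Sum of absolute differences between two frequency vectors."""
    return sum(abs(a - b) for a, b in zip(u, v))


def average_linkage(profiles):
    """Agglomerative clustering: repeatedly merge the two clusters with
    the smallest mean pairwise distance. Returns the merge history as
    (cluster_1, cluster_2, distance) tuples."""
    def cluster_distance(c1, c2):
        pairs = [(a, b) for a in c1 for b in c2]
        return sum(manhattan(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)

    clusters = [(name,) for name in profiles]
    merges = []
    while len(clusters) > 1:
        c1, c2 = min(combinations(clusters, 2),
                     key=lambda pair: cluster_distance(*pair))
        merges.append((c1, c2, cluster_distance(c1, c2)))
        clusters = [c for c in clusters if c not in (c1, c2)] + [c1 + c2]
    return merges


# Toy corpus: two "authors" with clearly different vocabularies.
texts = {
    "a1": "the cat and the dog and the bird",
    "a2": "the dog and the cat and the fish",
    "b1": "lo behold thou art lo behold thy",
    "b2": "behold thou lo thy art behold lo",
}
merges = average_linkage(frequency_profiles(texts))
for left, right, distance in merges:
    print(left, "+", right, "at distance", round(distance, 3))
```

The merge history is the information a dendrogram draws; in practice a library such as SciPy would handle both the clustering and the plotting, but the hand-rolled version makes the logic of each merge visible.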