Text Mining for Digital Humanities

Vrije Universiteit Amsterdam

Course Description

Course Name

Text Mining for Digital Humanities
Host University

Vrije Universiteit Amsterdam
Location

Amsterdam, The Netherlands
Area of Study

Linguistics, Research
Language Level

Taught In English
Course Level Recommendations

Upper

ISA offers course level recommendations in an effort to facilitate the determination of course levels by credential evaluators. We advise each institution to have its own credentials evaluator make the final decision regarding course levels.

Hours & Credits

ECTS Credits

Recommended U.S. Semester Credits

Recommended U.S. Quarter Units

Overview

COURSE OBJECTIVE
In this course, students are trained in systematic text analysis. In particular, we explore the process of identifying and annotating information in historical and contemporary texts such as novels, lyrics, letters, newspaper articles, movie scripts, blogs and other social media texts, using manual and automatic methods. Students will learn what this implies for the theoretical models and concepts they are familiar with from their own discipline. They will work on a research project of their choice and annotate its texts in an interdisciplinary context using different tools and methods. They will apply expert and crowd annotations, develop codebooks and compare the results. Finally, they will use a machine-learning program for analyzing text and reflect on the performance of the automatic annotation. We will focus on high-level semantic annotations of, for example, (historic) events, entities and emotions that are of interest to a broad range of humanities, social science and computer science students. Students present their findings in a research paper.

COURSE CONTENT
This module addresses the process of systematic text analysis through human and automatic annotation. Annotations make information that is implicit in data explicit, allowing researchers to search their data systematically. This kind of research forces humanities scholars and social scientists to represent their interpretation of texts in a data structure. Computer science students will learn how text mining technologies can be applied in the humanities and social sciences. Annotation requires the use of some type of interpretation model, and it results in an analysis that can be compared across annotators. As such, annotation can be seen as an important step towards the formalization of the humanities and social sciences as disciplines. The degree to which annotators agree or disagree (the so-called inter-annotator agreement) tells us something about the reproducibility of the interpretation process, the maturity of theoretical notions and the criteria used to apply them to real data. Annotators with different backgrounds will produce different types of annotations. Linguists, (cultural) historians, social scientists and literary scholars will consider sources and data differently and consequently arrive at different annotations of the same source or data. The same holds for experts and non-experts. The former are traditionally involved in assigning metadata to sources; the latter do the same in crowdsourcing initiatives. Finally, annotated data can be used to train machines to do the same. How does this work? Can a machine do better than humans? How do you evaluate this?
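As an illustration of the two measurement questions raised above, the sketch below computes Cohen's kappa, one common inter-annotator agreement measure for two annotators, and precision, recall and F1 for an automatic annotation compared against a human reference. The course description does not say which measures or tools are used in class, so the choice of measures, the function names and the emotion labels are all assumptions made for the sake of the example.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two annotators who labelled the same items."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two equally long, non-empty label lists")
    n = len(labels_a)
    # Observed agreement: share of items given the same label by both.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement, estimated from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both annotators used one and the same label throughout
    return (observed - expected) / (1 - expected)


def precision_recall_f1(gold, predicted, target):
    """Score predictions for one target label against a gold reference."""
    pairs = list(zip(gold, predicted))
    tp = sum(g == target and p == target for g, p in pairs)
    fp = sum(g != target and p == target for g, p in pairs)
    fn = sum(g == target and p != target for g, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical emotion annotations of eight sentences (invented data).
expert = ["joy", "anger", "joy", "neutral", "anger", "joy", "neutral", "joy"]
crowd = ["joy", "anger", "neutral", "neutral", "anger", "joy", "joy", "joy"]
machine = ["joy", "anger", "joy", "joy", "anger", "neutral", "neutral", "joy"]

print("expert vs. crowd kappa:", round(cohen_kappa(expert, crowd), 3))
print("machine on 'joy' (P, R, F1):", precision_recall_f1(expert, machine, "joy"))
```

Kappa corrects raw agreement for the agreement expected by chance, which is why it is often reported instead of a plain percentage. Treating the expert labels as the reference for the machine is itself an interpretive choice: when human agreement is low, a machine's score against any single annotator should be read with that in mind.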

TEACHING METHODS
Lecture, Seminar

TYPE OF ASSESSMENT
Paper

RECOMMENDED BACKGROUND KNOWLEDGE
Course: From Object to Data

Course Disclaimer

Courses and course hours of instruction are subject to change.

Some courses may require additional fees.
