Weekly Programme

Week 1. 31 January 2017


Topics

  • Course objectives
  • What is Text and Data Mining?
  • Introduction to the Perl programming language
  • Slides

Practical work

Week 2. 7 February 2017


Homework

  • Submit the solution to Coding Challenge 1. Send by mail to bdms.staff@gmail.com before Monday 6 February, 18:00.
  • Read the tutorial on Perl Basics, p. 1-7 (including the section about regular expressions).
  • Create a text corpus for your individual research project. Texts may be taken from existing research corpora. Your corpus should consist of at least ten texts, each of 5000 words or more. The texts need to be saved in plain text (.txt) format.
  • Read Martin Mueller, ‘Digital Shakespeare, or towards a literary informatics’, in: Shakespeare, 4, 3, September 2008, pp. 284–301. URL

Topics

  • Stylometrics
  • Regular Expressions
  • Slides

Practical work

Week 3. 15 February 2017

 

This class will take place on Wednesday 15 February, from 10:00 to 13:00, in Lipsius 126-A. Apologies for any inconvenience.

 

Homework

  • Read the full tutorial on Perl Basics. Present any questions that you have about this text in class.
  • Submit the solution to Coding Challenge 2. Send by mail to bdms.staff@gmail.com before Monday 13 February, 18:00.
  • Read Kathryn Schulz, ‘What Is Distant Reading?’, in: New York Times, June 24, 2011. URL
  • Read Shawna Ross, “In Praise of Overstating the Case: A review of Franco Moretti, Distant Reading (London: Verso, 2013)”, in: Digital Humanities Quarterly, 008, 1, 2014. URL

Topics

  • Tokenisation and frequency lists
  • Copyright issues connected to research based on Text Mining
  • Source Criticism
  • Slides

Practical work

Further Reading

  • Franco Moretti, ‘Conjectures on World Literature’, in: New Left Review, 1, 2000. URL
  • Franco Moretti, ‘The Slaughterhouse of Literature’, in: Modern Language Quarterly, 61, 1, 2000, pp. 207–227. URL

 

All students who have questions about the topics covered during the first three weeks of DTDP are very welcome to attend a remedial class, scheduled for Monday 20 February at 13:30. The location is Eyckhof 1 / 003A. No new material will be covered during this class!

 

 

Week 4. 21 February 2017

 

Homework

Topics

  • Working with stopwords
  • Collocation
  • Distribution graphs
  • Type-token ratios
  • Slides


Practical work

Week 5. 28 February 2017


Homework

Topics

  • Introduction to the R statistical package
  • Variables and data structures in R
  • Data visualisation
  • Introduction to ggplot2
  • Slides

Practical work

Week 6. 7 March 2017

Homework

Topics

  • Natural Language Processing
  • Slides

Practical work

Study week

 

Homework

  • Write a brief text (max. 500 words) about your individual research project. Answer the following questions: (1) Which texts have you selected for your corpus? (2) Which research question do you intend to answer? (3) Which types of analyses will be most useful for your research question?
  • The text about the final assignment for this course suggests a number of topics that you can focus on in the theoretical section of your essay. If you want to focus on a topic which is not listed, give a brief explanation of the question that you want to answer instead.
  • Send by mail to bdms.staff@gmail.com before Monday 13 March, 18:00.

 

Week 7. 21 March 2017


Homework

  • Submit the solution to Coding Challenge 5. Send by mail to bdms.staff@gmail.com before Monday 20 March, 18:00.
  • Read the full R Tutorial
  • Read Stephen Ramsay & Geoffrey Rockwell, “Developing Things: Notes Towards an Epistemology of Building in the Digital Humanities”, in: Matthew K. Gold (ed.), Debates in the Digital Humanities (Minneapolis: University of Minnesota Press, 2012), pp. 75–84.

Topics

  • Building tools as a theoretical activity
  • Sentence segmentation and readability metrics
  • Topic Modelling
  • Principal Component Analysis
  • Slides

Practical work

Further Reading

 

Week 8. 28 March 2017


Topics

  • Requirements for final essay
  • Other topics in Text and Data Mining (e.g. network analysis, mapping in R)
  • Demonstration of TDM analyses on the basis of a case study
  • Slides

 

Practical work

 

 

 

Final assignment

The final essay for DTDP needs to be submitted before Friday 28 April 2017, 18:00. Send by mail to bdms.staff@gmail.com.

The essay needs to consist of two sections:

  1. A description of your individual research project (2000 words)
  2. A critical reflection on digital humanities research (2000 words). You may choose a topic from the list that is provided, but you are also free to focus on another topic.