Weekly Programme

Week 1. 6 February 2018


Topics

  • Course objectives
  • What is Text and Data Mining?
  • Introduction to the Python Programming language
  • Slides

Practical work

 

Week 2. 13 February 2018


Homework

  • Submit the solution to Coding Challenge 1. Send by mail to bdms.staff@gmail.com before Monday 12 February, 18:00. N.B. Do NOT spend more than two hours on this exercise!
  • Read Part 1 of the tutorial on Python Basics, and Part 2 on Tokenisation
  • Write about your experiences in learning about these topics, or about anything which remains unclear, in a comment on the discussion forum.
  • Create a text corpus for your individual research project. Texts may be taken from existing research corpora. Your own corpus should consist of at least ten texts, of 5000 words or more. The texts need to be saved in plain text (.txt) format.

Further Reading

  • Martin Mueller, ‘Digital Shakespeare, or towards a literary informatics’, in: Shakespeare, 4, 3, September 2008, pp. 284–301.

Topics

  • Types and tokens
  • Stylometrics
  • Authorship recognition
  • Slides

Practical work

  • Tokenisation
    • Lists
    • Dictionaries
    • Iteration
    • Reading a file
    • Tokenisation
    • Frequency counts
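
As a rough illustration of how the practical topics above fit together, the following sketch tokenises a short string, stores the tokens in a list, and counts their frequencies in a dictionary by iterating over the list. The function names `tokenise` and `count_frequencies` and the sample sentence are assumptions made for this example and are not taken from the course materials. To work with a file, the sample string could be replaced by the contents of a .txt file read with `open(...).read()`.

```python
import re


def tokenise(text):
    """Split text into lower-cased word tokens using a simple regular expression."""
    # \w+ matches runs of letters, digits and underscores; punctuation is dropped.
    return re.findall(r"\w+", text.lower())


def count_frequencies(tokens):
    """Count how often each token occurs, using a plain dictionary."""
    freq = {}
    for token in tokens:
        freq[token] = freq.get(token, 0) + 1
    return freq


# Hypothetical sample text, used only for this illustration.
sample = "The cat sat on the mat. The mat was flat."

tokens = tokenise(sample)
frequencies = count_frequencies(tokens)

# Print the words from most to least frequent, ties in alphabetical order.
for word, count in sorted(frequencies.items(), key=lambda item: (-item[1], item[0])):
    print(word, count)
```

This simple pattern splits contractions such as "don't" into two tokens; a more careful tokeniser would treat such cases separately.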

 

Week 3. 20 February 2018

 

Homework

  • Read Part 3 of the tutorial, on Regular Expressions
  • Read Franco Moretti, ‘The Slaughterhouse of Literature’, in: Modern Language Quarterly, 61, 1, 2000, pp. 207–227.
  • Read Shawna Ross, “In Praise of Overstating the Case: A review of Franco Moretti, Distant Reading (London: Verso, 2013)”, in: Digital Humanities Quarterly, 8, 1, 2014.

Topics

  • Tokenisation and frequency lists
  • Source Criticism
  • Slides

Practical work

 

Further Reading

Week 4. 27 February 2018

 

Homework

Topics

  • Natural Language Processing
  • Working with stopwords
  • Collocation
  • Distribution graphs
  • Type-token ratios
  • Slides
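
As a small illustration of one of the topics above, the type-token ratio divides the number of distinct word forms (types) in a text by the total number of words (tokens). The sketch below is a minimal example under stated assumptions: the function name `type_token_ratio`, the regex-based tokenisation and the sample sentence are all hypothetical and not taken from the course materials.

```python
import re


def type_token_ratio(text):
    """Return the number of distinct tokens divided by the total number of tokens."""
    tokens = re.findall(r"\w+", text.lower())
    if not tokens:
        # Avoid division by zero for empty input.
        return 0.0
    return len(set(tokens)) / len(tokens)


# Hypothetical sample: 6 tokens, 5 types ("the" occurs twice).
sample = "the cat sat on the mat"
ttr = type_token_ratio(sample)
print(round(ttr, 3))
```

Because longer texts tend to repeat words more often, ratios are usually only comparable between texts of similar length.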


Practical work

Week 5. 6 March 2018


Homework

Topics

  • Introduction to the R statistical package
  • Variables and data structures in R
  • Data visualisation
  • Introduction to GGPlot
  • Slides

Practical work

Week 6. 13 March 2018

Homework

Topics

  • Natural Language Processing
  • Slides

Practical work

Study week

 

Homework

  • Write a brief text (max. 500 words) about your individual research project. Answer the following questions: (1) Which texts have you selected for your corpus? (2) Which research question do you intend to answer? (3) Which types of analyses will be most useful for your research question?
  • The text about the final assignment for this course suggests a number of topics that you can focus on in the theoretical section of your essay. If you want to focus on a topic which is not listed, give a brief explanation of the question that you want to answer instead.
  • Send by mail to bdms.staff@gmail.com before Monday 13 March, 18:00.

 

Week 7. 27 March 2018


Homework

  • Submit the solution to Coding Challenge 5. Send by mail to bdms.staff@gmail.com before Monday 20 March, 18:00.
  • Read the full R Tutorial
  • Read Stephen Ramsay & Geoffrey Rockwell, “Developing Things: Notes Towards an Epistemology of Building in the Digital Humanities”, in: Matthew K. Gold (ed.), Debates in the Digital Humanities, (Minneapolis: University of Minnesota Press 2012), pp. 75–84.

Topics

  • Building tools as a theoretical activity
  • Sentence segmentation and readability metrics
  • Topic Modelling
  • Principal Component Analysis
  • Slides

Practical work

Further Reading

 

Week 8. 3 April 2018


Topics

  • Requirements for final essay
  • Other topics in Text and Data Mining (e.g. network analysis, mapping in R)
  • Demonstration of TDM analyses on the basis of a case study
  • Slides

 

Practical work

 

 

 

Final assignment

The final essay for DTDP needs to be submitted before Friday 28 April 2017, 18:00. Send by mail to bdms.staff@gmail.com

The essay needs to consist of two sections:

  1. A description of your individual research project (2000 words)
  2. A critical reflection on digital humanities research (2000 words). You may choose a topic from the list that is provided, but you are also free to focus on another topic.