The Course

Python for Digital Humanities.

Forty lessons in ten parts, numbered 00 through 09. The course assumes no prior programming knowledge and moves from first variables through scraping and SQL to the iteration tools that make Python code shorter and faster.

Part 00

Introduction

  1. Lesson 01 Introduction to Python for DH Set up a modern Python environment so you can follow along with the rest of the course.

Part 01

Working with Data

  1. Lesson 02 Storing Data in Python A tour of the six built-in types you'll use most often as a digital humanist: integers, floats, strings, lists, tuples, and dictionaries.
  2. Lesson 03 Python Strings Strings are the chief form of data the digital humanist works with. Learn the built-in functions that operate on them.
  3. Lesson 04 Python Integers and Floats How numbers behave in Python — arithmetic, conversion, comparison — and why a humanist needs them.
  4. Lesson 05 Capstone — Build a Citation Generator Tie Part 1 together by writing a tiny program that takes the bits of a citation — author, title, year, place — and prints out a properly formatted reference.
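
For a sense of where Part 01 ends up, here is a minimal sketch of a citation built from strings and an integer. The variable names and the "Author. Title. Place, Year." layout are assumptions made for illustration; the capstone may format its reference differently.

```python
# Citation details stored in the basic types from Part 01.
author = "Shelley, Mary"   # string
title = "Frankenstein"     # string
place = "London"           # string
year = 1818                # integer

# An f-string joins strings and an integer into one formatted reference.
# The layout below is an assumed style, not a formal citation standard.
citation = f"{author}. {title}. {place}, {year}."

print(citation)  # Shelley, Mary. Frankenstein. London, 1818.
```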

Part 02

Data Structures

  1. Lesson 06 Python Tuples Immutable, ordered collections — when to reach for a tuple instead of a list, and what indexing actually means.
  2. Lesson 07 Python Lists The mutable cousin of the tuple — and the workhorse data structure of every Python project.
  3. Lesson 08 Python Dictionaries Key-value pairs — the structure that backs every JSON record, every CSV row, and most of the data a humanist will touch.
  4. Lesson 09 Capstone — A Mini Library Catalog Use tuples, lists, and dictionaries together to build a small library catalog — the data shape every DH project eventually adopts.

Part 03

Interacting with Data Structures

  1. Lesson 10 Python Conditionals How a script makes decisions — if, elif, else, and the Boolean logic that ties them together.
  2. Lesson 11 Python Loops Loops are how you do the same thing to a thousand items without writing it a thousand times — the single most important pattern in any DH script.
  3. Lesson 12 Python Functions Functions package a block of code under a name so you can reuse it — the foundation of clean, readable scripts.
  4. Lesson 13 Python Classes Classes bundle data with the functions that operate on it — the move from collections of values to objects that know what they are.
  5. Lesson 14 Capstone — A Word Frequency Counter Combine conditionals, loops, functions, and classes to build a real text-analysis tool — the foundational move behind every distant-reading project.

Part 04

Working with Text Data

  1. Lesson 15 Python and Text Files Open, read, and write plain text files — the canonical Python pattern, plus pathlib and a few habits that save you bugs later.
  2. Lesson 16 Python Modules and Libraries Install and import third-party libraries — the engine behind almost every digital humanities project.
  3. Lesson 17 Python and Regex (Part 01) Use regular expressions to extract structured data — like dates — from messy strings.
  4. Lesson 18 Python and Regex (Part 02) Apply regex to a text file — read it in, transform every match, and write the result back out.
  5. Lesson 19 Capstone — Mining a Public-Domain Text Pull a chapter of Frankenstein into a script, extract structured information with regex, and write the results back out — the loop every text-mining project follows.
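
The regex lessons in Part 04 describe extracting dates from messy strings and transforming every match. A minimal sketch of both moves might look like the following. The sample sentence is made up, and the "day month year" date format is an assumption rather than the pattern the lessons use.

```python
import re

# Hypothetical sample sentence for illustration only.
text = "The first letter is dated 11 December 1717 and the fourth 5 August 1717."

# One or two digits, a capitalized month name, then a four-digit year.
date_pattern = re.compile(r"\b(\d{1,2}) ([A-Z][a-z]+) (\d{4})\b")

# Extract: findall returns one (day, month, year) tuple per match.
dates = date_pattern.findall(text)

# Transform: rewrite each match as "Month day, year" using the captured groups.
rewritten = date_pattern.sub(r"\2 \1, \3", text)

print(dates)
print(rewritten)
```

Compiling the pattern once and reusing it for both `findall` and `sub` keeps the extraction and the rewrite in agreement about what counts as a date.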

Part 05

Working with Tabular Data

  1. Lesson 20 Introduction to Pandas Meet pandas — the library every working researcher uses for tabular data. Read CSVs and Excel files, inspect a DataFrame, and pick out the rows and columns you need.
  2. Lesson 21 Filtering and Querying with Pandas Pull the rows you want out of a DataFrame — Boolean masks, .query, sorting, and the GroupBy that does most analytics work.
  3. Lesson 22 Cleaning and Exporting with Pandas Real data is messy. Handle missing values, clean up stringy columns, rename and drop, merge two tables, and write the result back out as CSV or Excel.
  4. Lesson 23 Capstone — Analyzing a Real CSV Dataset End-to-end pandas workflow on a small but realistic dataset of Enlightenment correspondence — load, clean, filter, group, summarize, and export.

Part 06

Web Data

  1. Lesson 24 Finding HTML Code Before scraping a website, learn to read its HTML — find the tags, classes, and structure that hold the data you actually want.
  2. Lesson 25 Python and the Requests Module Use the requests library to fetch raw HTML from the web — and learn the small set of habits that make scraping reliable and polite.
  3. Lesson 26 Python and BeautifulSoup Parse HTML with BeautifulSoup and pull out the tags, text, and attributes you actually want.
  4. Lesson 27 Capstone — Scraping a Structured Archive Page Walk through fetching a structured archive listing, parsing it with BeautifulSoup, and turning every entry into a clean record — the workflow behind most DH datasets built from the open web.

Part 07

Storing Data

  1. Lesson 28 Storing Data — Text, CSV, and JSON Write your data out to plain text, CSV, and JSON — and pick the right format for the shape you have.
  2. Lesson 29 Storing Data in XML Files Write and read structured data in XML using Python's standard library — and know when XML is actually the right format.
  3. Lesson 30 Storing Data in a SQL Database Use Python's built-in sqlite3 to store, query, and grow a real database — the format that keeps working when your CSV outgrows itself.
  4. Lesson 31 Capstone — Scrape, Clean, Store, Query The final capstone: build a tiny end-to-end research pipeline — parse HTML, clean with pandas, store in SQLite, and query the database for answers.
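
The store-and-query step that closes Part 07 can be sketched with the standard library's sqlite3 alone. The table name, column names, and rows below are hypothetical placeholders, and the database lives in memory so nothing is written to disk.

```python
import sqlite3

# An in-memory database keeps the sketch self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE letters (sender TEXT, recipient TEXT, year INTEGER)")

# Hypothetical placeholder rows for illustration only.
rows = [
    ("Sender A", "Recipient X", 1740),
    ("Sender A", "Recipient Y", 1757),
    ("Sender B", "Recipient Z", 1759),
]

# The ? placeholders let sqlite3 handle quoting of the inserted values.
conn.executemany("INSERT INTO letters VALUES (?, ?, ?)", rows)
conn.commit()

# Query: how many letters did each sender write?
counts = conn.execute(
    "SELECT sender, COUNT(*) FROM letters GROUP BY sender ORDER BY sender"
).fetchall()
conn.close()

print(counts)  # [('Sender A', 2), ('Sender B', 1)]
```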

Part 08

Iteration Tools

  1. Lesson 32 Welcome to Intermediate Python You've cleared the basics. Here's what changes: smaller, sharper tools for the patterns you've already been writing the long way.
  2. Lesson 33 The Enumerate Function When you need both the item and its position, enumerate hands you both at once — no manual counter, no off-by-one bug.
  3. Lesson 34 The Zip Function Walk two or more sequences in parallel, one tuple at a time — the cleanest way to pair authors with works, names with dates, or any two related lists.
  4. Lesson 35 Comprehension with Sequences List, set, and dict comprehensions collapse a three-line loop into one readable line — the most-used Python idiom you'll write.
  5. Lesson 36 Generators Generators produce values one at a time without ever holding the full sequence in memory — the difference between processing a corpus and crashing on it.
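
As a taste of Part 08, the four tools can be sketched side by side. The two lists are sample data chosen for illustration, and the generator function `word_lengths` is a hypothetical helper, not something taken from the lessons.

```python
# Sample data for illustration only.
authors = ["Mary Shelley", "Jane Austen", "Bram Stoker"]
titles = ["Frankenstein", "Emma", "Dracula"]

# enumerate: each item together with its position, counting from 1.
numbered = [f"{i}. {title}" for i, title in enumerate(titles, start=1)]

# zip: walk two lists in parallel, one (author, title) tuple at a time.
pairs = list(zip(authors, titles))

# Dict comprehension: build a lookup table in a single line.
title_by_author = {author: title for author, title in zip(authors, titles)}

# Generator: yields one value at a time instead of building a full list.
def word_lengths(words):
    for word in words:
        yield len(word)

lengths = word_lengths(titles)
first_length = next(lengths)  # only the first title has been processed so far
remaining = list(lengths)     # the generator resumes where it left off

print(numbered)      # ['1. Frankenstein', '2. Emma', '3. Dracula']
print(pairs[0])      # ('Mary Shelley', 'Frankenstein')
print(first_length)  # 12
print(remaining)     # [4, 7]
```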

Part 09

Functional Python

  1. Lesson 37 Lambda Functions Anonymous one-line functions you write inline — perfect for sort keys, filter conditions, and small transformations that don't deserve their own def.
  2. Lesson 38 The Map Function Apply a function to every item in an iterable in one line — a tidy alternative to a for loop when all you're doing is transforming each element.
  3. Lesson 39 The Filter Function Keep only the items in a sequence that pass a test — the same shape as map, but for selection rather than transformation.
  4. Lesson 40 Counter from Collections A dictionary subclass that counts things for you — the cleanest answer to every frequency question in DH work.
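
The four tools in Part 09 can be sketched together on one small word list. The sample sentence is made up for illustration, and the three-letter threshold in the filter is an arbitrary assumption.

```python
from collections import Counter

# Hypothetical sample text for illustration only.
words = "the monster and the maker and the ice".split()

# lambda as a sort key: order unique words by length, then alphabetically.
by_length = sorted(set(words), key=lambda w: (len(w), w))

# map: transform every item (here, uppercase each word).
shouted = list(map(str.upper, words))

# filter: keep only the items that pass a test (more than three letters).
long_words = list(filter(lambda w: len(w) > 3, words))

# Counter: a dict subclass that tallies how often each word appears.
counts = Counter(words)
top_two = counts.most_common(2)

print(by_length)   # ['and', 'ice', 'the', 'maker', 'monster']
print(long_words)  # ['monster', 'maker']
print(top_two)     # [('the', 3), ('and', 2)]
```

Both `map` and `filter` return lazy iterators in Python 3, which is why the sketch wraps them in `list` before printing.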