Lesson 11
Python Loops
Loops are how you do the same thing to a thousand items without writing it a thousand times — the single most important pattern in any DH script.
If you only learn one thing well from this course, make it loops. They show up in almost every Python project, and they’re the reason a script can do in three seconds what would take a graduate student a month: examine ten thousand letters, count words across a corpus, hit an API once for every record in a list.
Python has two loops: for and while. They can both express the same logic, but they answer different questions:
- for: “do this once for each item in a collection.” Use it when you have a list (or anything iterable) and want to walk through it.
- while: “do this as long as some condition is true.” Use it when you don’t know in advance how many iterations you need.
In practice, for is what you reach for 95% of the time.
The for loop
The basic form: the keyword for, a variable name to hold each item, the keyword in, an iterable, and a colon. Anything indented underneath belongs to the loop:
correspondents = ["Voltaire", "Émilie", "Diderot"]
for name in correspondents:
    print(name)
Each pass through the loop, name is bound to the next item — first "Voltaire", then "Émilie", then "Diderot". When the list runs out, the loop ends.
The choice of variable name (name here, often i or item in toy examples) is up to you. Pick a name that describes what one thing is, not what the collection is. for letter in letters: reads better than for x in letters: and costs only a few extra keystrokes.
Iterating with the index — enumerate
Sometimes you want both the item and its position. Don’t track a counter manually; use enumerate:
correspondents = ["Voltaire", "Émilie", "Diderot"]
for i, name in enumerate(correspondents):
    print(f"{i}: {name}")
# 0: Voltaire
# 1: Émilie
# 2: Diderot
Pass start=1 if you want human-friendly numbering: enumerate(correspondents, start=1).
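As a small sketch of the start=1 variant, the loop below collects numbered lines instead of printing them one at a time. The names position and numbered are illustrative choices, not part of the lesson's project.

```python
correspondents = ["Voltaire", "Émilie", "Diderot"]

numbered = []
# start=1 changes only the counter; the items themselves are untouched
for position, name in enumerate(correspondents, start=1):
    numbered.append(f"{position}. {name}")

print("\n".join(numbered))
# 1. Voltaire
# 2. Émilie
# 3. Diderot
```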
Iterating over a dictionary
Three loops you’ll write often:
person = {"name": "Ada", "born": 1815, "field": "mathematics"}
for key in person:
    print(key)  # iterates the keys
for value in person.values():
    print(value)
for key, value in person.items():
    print(f"{key}: {value}")
.items() is the workhorse — it gives you the key and value at the same time, unpacked into two names.
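To see why .items() earns that title, here is a rough comparison with looking each value up by key. Both loops build the same list; the names by_key and by_items are made up for this sketch.

```python
person = {"name": "Ada", "born": 1815, "field": "mathematics"}

# Looking each value up by key works, but repeats the dictionary access
by_key = []
for key in person:
    by_key.append(f"{key}={person[key]}")

# .items() hands over the key and the value together
by_items = []
for key, value in person.items():
    by_items.append(f"{key}={value}")

print(by_items)
# ['name=Ada', 'born=1815', 'field=mathematics']
print(by_key == by_items)
# True
```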
Looping a fixed number of times — range
range(n) produces the numbers 0, 1, 2, ..., n-1:
for i in range(5):
    print(i)
# 0 1 2 3 4
range(start, stop) and range(start, stop, step) give you more control:
for year in range(1800, 1900, 10):
    print(year)
# 1800 1810 1820 ... 1890
The end value is exclusive — same as slicing.
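A quick way to convince yourself of the exclusive end is to turn a range into a list and compare it with a slice. This is only an illustrative sketch; the letters list is invented for the comparison.

```python
# Wrapping a range in list() shows every number it will produce
decades = list(range(1800, 1900, 10))
print(decades[0], decades[-1], len(decades))
# 1800 1890 10

# Slicing follows the same rule: index 1 is included, index 4 is not
letters = ["a", "b", "c", "d", "e"]
print(letters[1:4])
# ['b', 'c', 'd']
print(list(range(1, 4)))
# [1, 2, 3]
```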
The while loop
Use while when the loop should stop on a condition rather than at the end of a collection:
text = "Voltaire wrote letters in many languages."
i = 0
while i < len(text) and text[i] != " ":
    i += 1
print(f"first word ends at index {i}")
# first word ends at index 8
Two things you must always check on a while loop:
- Something inside the loop changes the condition. The classic infinite-loop bug is forgetting to advance the counter. If you write while x < 10: and never modify x, the loop never ends.
- The condition can actually become false. Always have an exit.
For most cases where you’re tempted to write a while over an index, a for loop with enumerate is cleaner and safer.
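As one possible rewrite, the first-word search from the while example could look like this with for and enumerate. It relies on break, which the next section covers, and the name first_space is an illustrative choice.

```python
text = "Voltaire wrote letters in many languages."

# Default for a text with no space at all, where the while version
# would stop at len(text)
first_space = len(text)
for i, char in enumerate(text):
    if char == " ":
        first_space = i
        break  # stop at the first space; no counter to advance by hand

print(f"first word ends at index {first_space}")
# first word ends at index 8
```

There is no counter to forget here, so the infinite-loop bug described above cannot happen.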
break and continue
These two keywords let you fine-tune the flow inside a loop.
break exits the loop immediately:
correspondents = ["Voltaire", "Émilie", "Diderot"]
for name in correspondents:
    if name == "Diderot":
        print("found him")
        break
    print(f"checking {name}...")
continue skips the rest of this iteration and moves to the next:
correspondents = ["Voltaire", "Émilie", "Diderot"]
for name in correspondents:
    if name.startswith("É"):  # skip names that start with É
        continue
    print(name)
# Voltaire
# Diderot
Building things up inside a loop
The most common DH pattern: walk through a collection, do something to each item, collect the results. You’ll write some version of this in every project.
correspondents = [
    {"name": "Voltaire", "letters": 21000},
    {"name": "Émilie", "letters": 430},
    {"name": "Diderot", "letters": 3500},
]
big_writers = []
for person in correspondents:
    if person["letters"] > 1000:
        big_writers.append(person["name"])
print(big_writers)
# ['Voltaire', 'Diderot']
For this exact pattern — “filter a list and collect what passes” — Python has a more compact form, the list comprehension:
correspondents = [
    {"name": "Voltaire", "letters": 21000},
    {"name": "Émilie", "letters": 430},
    {"name": "Diderot", "letters": 3500},
]
big_writers = [p["name"] for p in correspondents if p["letters"] > 1000]
print(big_writers)
Comprehensions and explicit for loops are interchangeable for one-line transformations. Use the comprehension when it fits on one readable line; use the explicit loop when there’s logic worth showing or when you want to do several things per iteration.
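A rough illustration of that rule of thumb, using the same correspondents data: a single transformation fits a comprehension, while a loop that both filters and keeps a running total reads better spelled out. The names names and total_letters are assumptions made for this sketch.

```python
correspondents = [
    {"name": "Voltaire", "letters": 21000},
    {"name": "Émilie", "letters": 430},
    {"name": "Diderot", "letters": 3500},
]

# One simple transformation: a comprehension reads well
names = [p["name"] for p in correspondents]

# Several things per iteration: the explicit loop keeps each step visible
big_writers = []
total_letters = 0
for person in correspondents:
    total_letters += person["letters"]
    if person["letters"] > 1000:
        big_writers.append(person["name"])

print(names)
# ['Voltaire', 'Émilie', 'Diderot']
print(big_writers, total_letters)
# ['Voltaire', 'Diderot'] 24930
```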
Counting in a loop
Frequency counts are pure muscle memory:
words = "the quick brown fox jumped over the lazy dog the fox".split()
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1
print(counts)
Or with collections.Counter, which we’ll meet again in Lesson 16:
from collections import Counter
words = "the quick brown fox jumped over the lazy dog the fox".split()
counts = Counter(words)
print(counts.most_common(2)) # [('the', 3), ('fox', 2)]
Loops and large datasets — a note on performance
A few habits to start now, before they become problems:
- Avoid building a giant list when you only need to walk through it once. Python’s for loop will iterate over a generator, file, or iterator just as happily as a list, and doing so uses a fraction of the memory.
- Read files line by line. for line in open("big.txt"): doesn’t load the whole file.
- Don’t loop over a list checking membership in another list. if x in some_list: is slow inside a loop. Convert the lookup target to a set first: if x in some_set:. For hashable items such as strings and numbers the behavior is the same, and the speed is dramatically better.
These don’t matter on three records. They matter on three million.
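The three habits can be sketched together in a few lines. To keep the example self-contained, io.StringIO stands in for a real file (it is iterated line by line the same way), and the names fake_file and known_names are invented for the illustration.

```python
import io

# Stand-in for a large file; a real script would open a file on disk instead
fake_file = io.StringIO("Voltaire\nÉmilie\nDiderot\nVoltaire\n")

# Build the set once, outside the loop
known_names = {"Voltaire", "Diderot"}

matches = 0
for line in fake_file:  # one line in memory at a time
    if line.strip() in known_names:  # set lookup, not a list scan
        matches += 1

# A generator expression feeds sum() one value at a time, with no intermediate list
total_length = sum(len(name) for name in known_names)

print(matches)
# 3
print(total_length)
# 15
```

On data this small the difference is invisible; the point is only that the shape of the code stays the same when the input grows.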
When you’re comfortable iterating, advance to Lesson 12: Python Functions.
Running the code
Save any snippet from this lesson to a file — say try.py — and run it from your project folder:
uv run try.py
uv run uses the project’s Python and dependencies automatically; no virtualenv to activate. If you haven’t set the project up yet, Lesson 01 walks through it.