---
title: "Repeating Actions with Loops"
abstract: |
  Learn to use for loops in Python to efficiently repeat operations on collections of data. Master loop syntax, variable naming, and common loop patterns for automating repetitive tasks.
date: last-modified
format:
  html: default
authors-ipa:
- "[Author Name](https://poverty-action.org/people/author_name)"
contributors:
- "[Contributor Name](https://poverty-action.org/people/contributor_name)"
keywords: ["Python", "For Loops", "Iteration", "Automation", "Control Flow", "Tutorial"]
license: "CC BY 4.0"
---
::: {.callout-note}
## Learning Objectives
- Explain what a `for` loop does.
- Correctly write `for` loops to repeat simple calculations.
- Trace changes to a loop variable as the loop runs.
- Trace changes to other variables as they are updated by a `for` loop.
## Questions
- How can I do the same operations on many different values?
:::
In the episode about visualizing data,
we wrote Python code that plots values of interest from our first
inflammation dataset (`inflammation-01.csv`), which revealed some suspicious features in it.

We have a dozen data sets right now and potentially more on the way if Dr. Maverick
can keep up their surprisingly fast clinical trial rate. We want to create plots for all of
our data sets with a single statement. To do that, we'll have to teach the computer how to
repeat things.
An example task that we might want to repeat is accessing numbers in a list,
which we will do by printing each number on a line of its own.
```python
odds = [1, 3, 5, 7]
```
In Python, a list is an ordered collection of elements, and every
element has a unique number associated with it, called its index. This means that
we can access elements in a list using their indices.
For example, we can get the first number in the list `odds`
by using `odds[0]`. One way to print each number is to use four `print` statements:
```python
print(odds[0])
print(odds[1])
print(odds[2])
print(odds[3])
```
```output
1
3
5
7
```
This is a bad approach for three reasons:
1. **Not scalable**. Imagine you need to print a list that has hundreds
of elements: you would have to type one `print` statement for every element.
2. **Difficult to maintain**. If we want to decorate each printed element with an
asterisk or any other character, we would have to change four lines of code. While
this might not be a problem for small lists, it would definitely be a problem for
longer ones.
3. **Fragile**. If we use it with a list that has more elements than what we initially
envisioned, it will only display part of the list's elements. A shorter list, on
the other hand, will cause an error because it will be trying to display elements of the
list that do not exist.
```python
odds = [1, 3, 5]
print(odds[0])
print(odds[1])
print(odds[2])
print(odds[3])
```
```output
1
3
5
```
```error
---------------------------------------------------------------------------
IndexError                                Traceback (most recent call last)
<ipython-input-3-7974b6cdaf14> in <module>()
      3 print(odds[1])
      4 print(odds[2])
----> 5 print(odds[3])

IndexError: list index out of range
```
Here's a better approach, a [for loop](../learners/reference.md#for-loop):
```python
odds = [1, 3, 5, 7]
for num in odds:
    print(num)
```
```output
1
3
5
7
```
This is shorter --- certainly shorter than something that prints every number in a
hundred-number list --- and more robust as well:
```python
odds = [1, 3, 5, 7, 9, 11]
for num in odds:
    print(num)
```
```output
1
3
5
7
9
11
```
The improved version uses a [for loop](../learners/reference.md#for-loop)
to repeat an operation --- in this case, printing --- once for each thing in a sequence.
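The loop also helps with the maintenance problem described earlier. As a small sketch (the `decorated` list is a hypothetical helper added here only so the result can be inspected afterwards), decorating every printed element with an asterisk means changing a single line, however long the list is:

```python
odds = [1, 3, 5, 7]

# Hypothetical helper list: keeps each decorated line for later inspection.
decorated = []
for num in odds:
    line = '* ' + str(num)  # the only line that controls the decoration
    decorated.append(line)
    print(line)
```

Each number is printed on its own line with `* ` in front of it. Switching to a different decoration only requires editing the line that builds `line`.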
The general form of a loop is:
```python
for variable in collection:
    # do things using variable, such as print
```
Using the odds example above, the loop might look like this:

[Figure: diagram of the loop over `odds`, showing the value of `num` and the loop cycle in which each number is printed.]

where each number (`num`) in the list `odds` is looped through and printed, one number after
another. The other numbers in the diagram denote which loop cycle the number was printed in (1
being the first loop cycle, and 6 being the final loop cycle).
We can call the [loop variable](../learners/reference.md#loop-variable) anything we like, but
there must be a colon at the end of the line starting the loop, and we must indent anything we
want to run inside the loop. Unlike many other languages, there is no command to signify the end
of the loop body (e.g. `end for`); everything indented after the `for` statement belongs to the loop.
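To make the role of indentation concrete, here is a minimal sketch. The counter `printed_inside` is a hypothetical name introduced for illustration. The two indented lines form the loop body and run once per element, while the unindented `print` runs only once, after the loop has finished:

```python
odds = [1, 3, 5]
printed_inside = 0  # hypothetical counter, used only for this illustration

for num in odds:
    # Indented: part of the loop body, runs once per element.
    print('inside the loop:', num)
    printed_inside = printed_inside + 1
# Not indented: runs a single time, after the loop is done.
print('after the loop, the body ran', printed_inside, 'times')
```

Indenting the last `print` would make it part of the loop body, so it would run three times instead of once.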
::: {.callout-note}
## What's in a name?
In the example above, the loop variable was given the name `num` as a mnemonic;
it is short for 'number'.
We can choose any name we want for variables. We might just as easily have chosen the name
`banana` for the loop variable, as long as we use the same name when we invoke the variable inside
the loop:
```python
odds = [1, 3, 5, 7, 9, 11]
for banana in odds:
    print(banana)
```
```output
1
3
5
7
9
11
```
It is a good idea to choose variable names that are meaningful, otherwise it would be more
difficult to understand what the loop is doing.
:::
Here's another loop that repeatedly updates a variable:
```python
length = 0
names = ['Curie', 'Darwin', 'Turing']
for value in names:
    length = length + 1
print('There are', length, 'names in the list.')
```
```output
There are 3 names in the list.
```
It's worth tracing the execution of this little program step by step.
Since there are three names in `names`,
the statement on line 4 will be executed three times.
The first time around,
`length` is zero (the value assigned to it on line 1)
and `value` is `Curie`.
The statement adds 1 to the old value of `length`,
producing 1,
and updates `length` to refer to that new value.
The next time around,
`value` is `Darwin` and `length` is 1,
so `length` is updated to be 2.
After one more update,
`length` is 3;
since there is nothing left in `names` for Python to process,
the loop finishes
and the `print` function on line 5 tells us our final answer.
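One way to check a trace like this is to let Python report the values itself. The sketch below is the same program with one extra `print` inside the loop, added purely for illustration, showing `value` and `length` at the start of each pass:

```python
length = 0
names = ['Curie', 'Darwin', 'Turing']
for value in names:
    # Extra line for tracing: show the state before length is updated.
    print('value is', value, 'and length is', length)
    length = length + 1
print('There are', length, 'names in the list.')
```

The tracing line reports `length` as 0, 1, and 2 on the three passes, matching the walkthrough above, before the final line reports 3.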
Note that a loop variable is a variable that is being used to record progress in a loop.
It still exists after the loop is over,
and we can re-use variables previously defined as loop variables as well:
```python
name = 'Rosalind'
for name in ['Curie', 'Darwin', 'Turing']:
    print(name)
print('after the loop, name is', name)
```
```output
Curie
Darwin
Turing
after the loop, name is Turing
```
Note also that finding the length of an object is such a common operation
that Python actually has a built-in function to do it called `len`:
```python
print(len([0, 1, 2, 3]))
```
```output
4
```
`len` is much faster than any function we could write ourselves,
and much easier to read than a two-line loop;
it will also give us the length of many other things that we haven't met yet,
so we should always use it when we can.
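As a quick check, the sketch below counts the names with a loop and compares the result with `len`. The variable name `counted` and the example string are chosen here only for illustration:

```python
names = ['Curie', 'Darwin', 'Turing']

# Count the elements by hand, as in the earlier example.
counted = 0
for value in names:
    counted = counted + 1

# len gives the same answer in a single call.
print(counted, len(names))

# len also works on other kinds of values, such as strings.
print(len('Curie'))
```

Both approaches report 3 for the list, and `len('Curie')` reports 5, the number of characters in the string.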
::: {.callout-note}
## From 1 to N
Python has a built-in function called `range` that generates a sequence of numbers. `range` can
accept 1, 2, or 3 parameters.
- If one parameter is given, `range` generates a sequence of that length,
starting at zero and incrementing by 1.
For example, `range(3)` produces the numbers `0, 1, 2`.
- If two parameters are given, `range` starts at
the first and ends just before the second, incrementing by one.
For example, `range(2, 5)` produces `2, 3, 4`.
- If `range` is given 3 parameters,
it starts at the first one, ends just before the second one, and increments by the third one.
For example, `range(3, 10, 2)` produces `3, 5, 7, 9`.
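The three forms above can be checked directly. This is a small sketch: `range` does not display its numbers when printed on its own, so each call is wrapped in `list()` to show them, and the variable names are illustrative only:

```python
# Wrap each range in list() so that all of its numbers are visible.
one_parameter = list(range(3))
two_parameters = list(range(2, 5))
three_parameters = list(range(3, 10, 2))

print(one_parameter)     # [0, 1, 2]
print(two_parameters)    # [2, 3, 4]
print(three_parameters)  # [3, 5, 7, 9]
```

In every case the second value (or the only value, when one parameter is given) is where the sequence stops, and it is never included.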
Using `range`,
write a loop that prints the first 3 natural numbers:
```output
1
2
3
```
::: {.callout-tip collapse="true"}
## Solution: Range Function
```python
for number in range(1, 4):
    print(number)
```
:::
:::
::: {.callout-note}
## Understanding the loops
Given the following loop:
```python
word = 'oxygen'
for letter in word:
    print(letter)
```
How many times is the body of the loop executed?
- 3 times
- 4 times
- 5 times
- 6 times
::: {.callout-tip collapse="true"}
## Solution: Loop Execution Count
The body of the loop is executed 6 times.
:::
:::
::: {.callout-note}
## Computing Powers With Loops
Exponentiation is built into Python:
```python
print(5 ** 3)
```
```output
125
```
Write a loop that calculates the same result as `5 ** 3` using
multiplication (and without exponentiation).
::: {.callout-tip collapse="true"}
## Solution: Exponential Calculation
```python
result = 1
for number in range(0, 3):
    result = result * 5
print(result)
```
:::
:::
::: {.callout-note}
## Summing a list
Write a loop that calculates the sum of elements in a list
by adding each element and printing the final value,
so `[124, 402, 36]` prints 562.
::: {.callout-tip collapse="true"}
## Solution: List Summation
```python
numbers = [124, 402, 36]
summed = 0
for num in numbers:
    summed = summed + num
print(summed)
```
:::
:::
::: {.callout-note}
## Computing the Value of a Polynomial
The built-in function `enumerate` takes a sequence (e.g. a [list](04-lists.md)) and
generates a new sequence of the same length. Each element of the new sequence is a pair composed
of the index (0, 1, 2,...) and the value from the original sequence:
```python
for idx, val in enumerate(a_list):
    # Do something using idx and val
```
The code above loops through `a_list`, assigning the index to `idx` and the value to `val`.
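As a concrete illustration of that pattern, here is a minimal sketch using a list of names instead of `a_list`. The `pairs` list is a hypothetical helper that records each index and value so they can be inspected after the loop:

```python
names = ['Curie', 'Darwin', 'Turing']

# Hypothetical helper list: records each (index, value) pair.
pairs = []
for idx, val in enumerate(names):
    print(idx, val)
    pairs.append((idx, val))
```

The loop prints `0 Curie`, `1 Darwin`, and `2 Turing`, one pair per line, so the index always starts at zero and increases by one for each element.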
Suppose you have encoded a polynomial as a list of coefficients in
the following way: the first element is the constant term, the
second element is the coefficient of the linear term, the third is the
coefficient of the quadratic term, where the polynomial is of the form $ax^0 + bx^1 + cx^2$.
```python
x = 5
coefs = [2, 4, 3]
y = coefs[0] * x**0 + coefs[1] * x**1 + coefs[2] * x**2
print(y)
```
```output
97
```
Write a loop using `enumerate(coefs)` which computes the value `y` of any
polynomial, given `x` and `coefs`.
::: {.callout-tip collapse="true"}
## Solution: Polynomial Evaluation
```python
y = 0
for idx, coef in enumerate(coefs):
    y = y + coef * x**idx
```
:::
:::
## Key Points
- Use `for variable in sequence` to process the elements of a sequence one at a time.
- The body of a `for` loop must be indented.
- Use `len(thing)` to determine the length of something that contains other values.