3 Functions and Loops
3.1 Functions
In mathematics, a function is an assignment of each element of one set to an element of another set. Here we give a more “practical” definition.
In the realm of programming, a function is a reusable piece of code that performs a specific task. Think of it as a mini-program within your program, a tool you create to do a job so you don’t have to. And as a program does, you give it an input, it does some voodoo magic, and it spits out an output.
What’s the point of a function you might ask? Well, in programming, we use functions when we have tasks that must be performed again and again. When we notice a repetitive task occurring, it is a good idea to write a function that performs this task. If you find yourself copying and pasting the same piece of code more than three times, it’s time to stop and think: “Should I write a function for this?” Remember, in the world of coding, being lazy is often a good thing! We aim for efficiency, not repetitive strain injury.
Imagine if, every time you needed to compute a square root, you had to write the code to compute it all over again. In practice, you don’t have to! There’s a function that already does the job for you: sqrt.
In fact, it won’t come as a surprise that we’ve already encountered many functions in the past weeks. For instance:
- Mathematical functions like sqrt() or sin() in R, and math.sqrt() or math.sin() in Python.
- Functions like np.array() or c() to create vectors.
- Functions like length() in R or len() in Python to get information about our data structures.
Again, each function I mentioned has an input and an output: sqrt takes a number as input and returns the square root of that number; length or len take a whole vector as input and return its length…
Another incredibly useful function is the help() function, available in both R and Python. This function takes a function as input and provides us with information about said function as output! For example:
python
Whenever you meet a built-in function, or a function from a documented external library, you can call help on it to get an explanation! Documentation is a requirement in R libraries, so pretty much all the functions you will meet are documented. Unfortunately, this is not the case for Python, where documenting functions is not a requirement, so help will not always return something useful.
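As a minimal sketch of how help can be called in Python, the snippet below asks for the documentation of the built-in len and of math.sqrt. The choice of these two functions and the value 16 are just for illustration.

```python
import math

# Ask Python for the documentation of a built-in function
help(len)

# The same works for functions that live in a module, such as math.sqrt
help(math.sqrt)

# Once we know what the function does, we can use it
root = math.sqrt(16)
print(root)
```

The text printed by help depends on the Python version installed, so it may look slightly different on your machine.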
3.1.1 Writing new functions
The real power of functions comes when we start creating our own. We can write new functions that perform specific tasks, tailored to our needs. This allows us to do more complex, interesting, and fun things with our code. So let’s dive in and learn how to create our own functions! We can wrap our code in a function, and every time the function is called, this code is run. This is incredibly useful for tasks we need to perform multiple times. Let’s build for example a simple function that converts pounds (lbs) to grams (g).
Mathematically, this is simply done by the formula: \[ g = 453.5924 * lbs \]
In R, we define a function using the function() command. The arguments of the function are placed within the parentheses. Here’s how we can create a function in R:
In this function, lbs is the input (or argument), and the function returns the equivalent weight in grams. In R, we create a new function with the function statement, and then we assign it to a variable, which will contain our function. The return() statement is used to specify the result that the function should output.
In Python, we define a function using the def keyword. The arguments of the function are placed within the parentheses. Again, we need to use indentation: this is crucial in Python, as it determines the code blocks. Here’s how we can create a function in Python:
In this function, lbs is the input (or argument), and the function returns the equivalent weight in grams. The def keyword is followed by the function name and then, in parentheses, the argument. The return statement is used to specify the result that the function should output.
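A minimal Python sketch of the definition described above, using the conversion factor from the formula and the names lbs_to_grams, lbs and grams used throughout this section. The inputs 2 and 3, and the call with a named argument, are assumptions chosen for illustration.

```python
def lbs_to_grams(lbs):
    """Convert a weight in pounds (lbs) to grams."""
    grams = 453.5924 * lbs
    return grams


# Call the function with a positional argument...
print(lbs_to_grams(2))
print(lbs_to_grams(3))
# ...or by naming the argument explicitly (same result as the line above)
print(lbs_to_grams(lbs=3))
```

Note that the body of the function is indented: everything indented under the def line belongs to the function.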
Now, once we have made the new function, we can call it with:
R
## [1] 907.1848
## [1] 1360.777
## [1] 1360.777
python
## 907.1848
## 1360.7772
## 1360.7772
By creating functions like these, we can make our code more efficient and easier to read. Plus, it saves us from having to remember the conversion rate each time we want to convert pounds to grams!
The name of the function lbs_to_grams and its argument lbs are just names that I chose. There are a few guidelines that could be useful when naming functions:
- Names should be lowercase.
- Use an underscore, _, to separate words within a name.
- Strive for names that are concise and meaningful (this is not easy!).
- Avoid existing function names in R and Python, such as length() or len().
Also, you might have noticed we made a variable within the function code, grams. But what does it mean to have variables within functions? The variables used inside a function are local to that function. Think of your function as a guarded sandbox, where no child leaves unless you specifically tell them to. This sandbox is called an environment. Let’s cover this concept formally!
3.1.1.1 Environments
In programming, an environment refers to a structure that holds variables. When you create a variable in a program, the environment is where this variable lives. The environment keeps track of the variable’s name and its current value.
There are two types of environments: global and local.
The global environment is the default environment where your variables live unless you specify otherwise. When you create a variable at the top level of your script, it’s stored in the global environment.
A local environment is created when you call a function. Each time a function is called, a new local environment is created for that function call. This environment holds the variables that are created within the function. These variables are only accessible within the function call and cease to exist once the function call is over.
Let’s consider the lbs_to_grams
function we made:
R
lbs_to_grams <- function(lbs) {
  grams = 453.5924 * lbs
  return(grams)
}
# Create a variable in the global environment
glo_grams <- 2
# Try to access the local variable "grams"
print(grams)
# Access the global variable "glo_grams"
print(glo_grams)
## Error in eval(expr, envir, enclos): object 'grams' not found
## [1] 2
python
def lbs_to_grams(lbs):
    grams = 453.5924 * lbs
    return grams
# Create a variable in the global environment
glo_grams = 2
# Try to access the local variable "grams"
print(grams)
# Access the global variable "glo_grams"
print(glo_grams)
## NameError: name 'grams' is not defined
## 2
In these functions, lbs and grams are variables in the local environment of the function. They are created when the function is called and cease to exist when the function call is over. If you try to access grams outside of the function, you’ll get an error because grams is not in the global environment.
On the other hand, glo_grams is in the global environment because it’s created at the top level of the script, not within a function. You can access glo_grams anywhere in your script, even within a function.
This distinction between global and local variables helps keep our code clean and reduces the chance of errors. It ensures that the function does its job without interfering with the rest of our script! However, it can be prone to errors too. Say for instance, I make a typo in the argument name of my function above, and, for whatever reason I created a lbs variable in the global… Like for instance:
R
# I specified in the global lbs at some point
lbs <- 3
# and here I made a typo!
#                            V
bug_lbs_to_grams <- function(lb) {
  grams = 453.5924 * lbs
  return(grams)
}
# Then, no matter what I call, I will always get the same result!
# e.g.: 453.5924 * 3
bug_lbs_to_grams(2)
## [1] 1360.777
## [1] 1360.777
## [1] 1360.777
What happened above is that I pass the lb argument (without the s) as input, but it is never used in the function: inside the code, at grams = 453.5924 * lbs, I refer to lbs. Since R can’t find any lbs within the local environment, it falls back to the lbs I specified in the global environment. Hence, no matter what I feed to the function, it will return the evaluation with the global lbs. For this reason, try not to give the variables inside your functions the same names as those outside, in the rest of your script. What happens if I do give a local variable and a global variable the same name? Well, in this case, the function will use the local lbs variable, and the rest of the script will use the global lbs variable, but this is still risky and prone to bugs.
R
## [1] 907.1848
## [1] 4535.924
## [1] 19050.88
python
## 907.1848
## 4535.924
## 19050.8808
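The same lookup rules apply in Python. The sketch below puts the two situations side by side: a function whose argument shares its name with a global variable, and a function with the typo discussed above. The global value lbs = 10 and the inputs are assumptions chosen for illustration.

```python
def lbs_to_grams(lbs):
    # "lbs" here is the local argument: it hides any global "lbs"
    grams = 453.5924 * lbs
    return grams


def bug_lbs_to_grams(lb):
    # Typo: the argument is "lb", but the body reads "lbs".
    # Python finds no local "lbs", so it falls back to the global one.
    grams = 453.5924 * lbs
    return grams


lbs = 10  # a global variable that happens to share the argument's name

print(lbs_to_grams(2))      # uses the local lbs (2)
print(lbs_to_grams(lbs))    # the global value 10 is passed in explicitly
print(bug_lbs_to_grams(2))  # ignores its input and uses the global lbs (10)
```

The last two calls print the same value, which is a good hint that the input of bug_lbs_to_grams is being ignored.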
For those who are newer to programming, don’t worry if the idea of
dealing with bugs and errors seems daunting. One of the most effective
ways to understand what’s happening in your code is to use print
statements. If you were confused by this section, try to add those
print
statements to display the value of variables lbs
and grams
within the functions above! In fact, by strategically placing print
statements in your code, you can see the values of variables at
different points in your program’s execution. This can help you
understand how your code is working and where potential issues might be.
As you gain more experience and confidence, you’ll naturally start to
develop more advanced strategies for managing and resolving bugs in your
code. Remember, everyone was a beginner once, and every expert has made
plenty of mistakes along the way. So don’t be discouraged by bugs in
your code - they’re just opportunities to learn and improve!
NOTE for the more experienced users. It’s worth noting that debugging tools can be incredibly helpful for identifying and resolving issues in your code. A debugger is a program that helps you inspect what’s happening in your program while it’s running. It allows you to pause your program, inspect the values of variables in a given environment at any point in time, and step through your code line by line. This can be particularly useful for identifying issues like the one we’ve discussed above. RStudio includes a built-in debugger that can be a great help in these situations. You can learn more about debugging in RStudio in this guide.
3.1.1.2 Multiple arguments
Say we now want to convert pounds to milliliters. To convert from mass to volume, we need one additional piece of information: the specific mass. In physics, the specific mass (also known as the volumetric mass density) of a substance is the mass per unit volume.
Fortunately for us, functions can take multiple arguments, allowing us to give more elements to the local environment of the function. This means we can customize the function’s behavior based on these inputs. Let’s see how we can add more arguments to a function.
We’ll create a new function to convert pounds to milliliters. This function will take a second argument: specific_mass. We’ll use the function we created earlier to convert pounds to grams, and then, using the specific_mass, we’ll convert grams to milliliters.
Here’s how we can do this in both R and Python:
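A minimal Python sketch of such a function, assuming the specific mass is expressed in g/ml so that the volume is the mass in grams divided by the specific mass. The name ml for the intermediate variable is an assumption.

```python
def lbs_to_grams(lbs):
    """Convert pounds to grams."""
    return 453.5924 * lbs


def lbs_to_ml(lbs, specific_mass):
    """Convert pounds to milliliters, given the specific mass in g/ml."""
    grams = lbs_to_grams(lbs)
    # volume = mass / specific mass
    ml = grams / specific_mass
    return ml
```

Reusing lbs_to_grams inside lbs_to_ml means the conversion factor is written in one place only.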
To run the functions we just made:
R
# Specific mass of water and oil
water_mass = 1
oil_mass = 0.92
# Convert 2 lbs of water and oil to ml
lbs_to_ml(2, specific_mass=water_mass)
lbs_to_ml(2, specific_mass=oil_mass)
## [1] 907.1848
## [1] 986.0704
python
# Specific mass of water and oil
water_mass = 1
oil_mass = 0.92
# Convert 2 lbs of water and oil to ml
lbs_to_ml(2, specific_mass=water_mass)
lbs_to_ml(2, specific_mass=oil_mass)
## 907.1848
## 986.0704347826087
Let’s say that 90% of the time when we are doing these calculations, they are relative to using water. Then, to be more efficient, rather than continuously having to give the mass of water, we could set the arguments to have these as default values.
This is where default arguments in functions come into play. They are incredibly useful as they allow us (and any potential future user) to make certain parameters optional. This makes the function easier to use and the code more readable in general!
Here’s how we can do this in both R and Python:
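A minimal Python sketch of a default argument, assuming water (1 g/ml) as the default specific mass. The value 0.92 for oil is the one used earlier in this section.

```python
def lbs_to_grams(lbs):
    """Convert pounds to grams."""
    return 453.5924 * lbs


def lbs_to_ml(lbs, specific_mass=1):
    """Convert pounds to milliliters; specific_mass defaults to water (1 g/ml)."""
    return lbs_to_grams(lbs) / specific_mass


print(lbs_to_ml(2))                      # no specific mass given: water is assumed
print(lbs_to_ml(2, specific_mass=0.92))  # override the default for oil
```

The default is written directly in the function signature, after an equals sign. Arguments with defaults must come after the arguments without them.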
As you can see, default arguments make our function more flexible and easier to use. They allow the function to handle a wider range of scenarios while keeping the code clean and readable.
3.1.2 Exercise: cups to grams converter
So I once watched this movie with a small rat chef named Remy. Like it’s a very popular one, but I can’t say the title for copyright reasons. Anyway, let’s say that one day, Remy decides to leave his home in France and set sail for the culinary world of the United States.
But as soon as Remy gets onto American soil, and he starts to explore American recipes, he encounters a problem. All the measurements are in cups! Back in France, he was used to grams and liters. “Mon Dieu!” he exclaimed, “How am I supposed to cook with these cup measurements?” [Imagine this phrase with a French accent].
But Remy is not a rat to be easily defeated: To make his life easier, he decides to hire a programmer on Fiverr to convert cups to grams. You are that programmer.
As we did with the functions we created before, you should:
- Create a new function cups_to_ml that takes as input the number of cups and returns as output the corresponding value in milliliters. Use the relation \[ ml = 236.588 * cups \]
- Create a new function called ml_to_grams. This function, as above, will need the specific weight, but now the relation is given by the inverse: \[ g = ml * \rho, \] where \(\rho\) is the specific weight.
- Create a last function called cups_to_grams. This function should:
  - Take as input two arguments: the number of cups, and a string that specifies an ingredient, e.g. "flour" or "water". Default this second argument to "water".
  - Convert the amount of cups to milliliters using the function cups_to_ml.
  - With an if-else if-else statement, check the argument ingredient, and set a specific_weight variable based on the ingredient. You can find the values in the table below:

Ingredient | Specific Weight (g/ml) |
---|---|
water | 1 |
flour | 0.53 |
oil | 0.92 |
oat milk | 1.03 |

  - Call the ml_to_grams function with the milliliters and specific weight computed above, and return the result.
- Run the function to convert the following:
  - 2 cups of water
  - Half a cup of oil
  - 3 cups of oat milk
  - 2 cups of flour plus one cup of water
3.2 State and For Loop
Normally, if you have to run some operations on multiple objects, you would store these objects in a vector, and then work with the vector directly. This is what we’ve been doing in the past weeks, and it works most of the time. For instance, let’s say we have a vector called measurements in lbs and we want to convert all these measurements to grams. Then, we can simply take the lbs_to_grams function we made above and run it on the measurements vector:
R
# Create a vector of measurements in lbs
measurements <- c(1, 2, 3)
# Convert measurements to grams
lbs_to_grams(measurements)
## [1] 453.5924 907.1848 1360.7772
python
# importing numpy
import numpy as np
# Create an array of measurements in lbs
measurements = np.array([1, 2, 3])
# Convert measurements to grams
lbs_to_grams(measurements)
## array([ 453.5924, 907.1848, 1360.7772])
This will run the function on each element of the vector independently.
More formally: in the first two weeks, we learned how to work with vectors and vectorised functions, designed to operate on whole vectors directly. Technically, we say that vectorised functions are trivially parallelizable because there’s no dependency between elements (the value of an element of the output vector does not depend on any other element).
However, while you should use vectorised functions as much as you can, they are not suitable for all programming tasks, particularly when the computation of an element depends on the previous ones, i.e., when there is a state involved.
A state is a scenario where certain steps in your code must be executed in a specific order because the output of one step is the input to the next step. In these circumstances, we find an answer in looping. Loops allow us to execute a block of code multiple times, which is exactly what we want in these scenarios.
The Fibonacci sequence is a classic example of a state. If you’re not familiar with it, the Fibonacci sequence is a series of numbers in which each number is the sum of the two preceding ones, usually starting with 0 and 1. The first 10 values in the series will be:
\(r_{i}\) | \(r_{0}\) | \(r_{1}\) | \(r_{2}\) | \(r_{3}\) | \(r_{4}\) | \(r_{5}\) | \(r_{6}\) | \(r_{7}\) | \(r_{8}\) | \(r_{9}\) |
---|---|---|---|---|---|---|---|---|---|---|
Value | 0 | 1 | 1 | 2 | 3 | 5 | 8 | 13 | 21 | 34 |
More formally, the Fibonacci sequence is defined by the recurrence relation:
\[ r_{i} = r_{i - 1} + r_{i - 2} \quad \text{for}\quad i \geq 2, \]
with initial conditions \(r_{0} = 0\) and \(r_{1} = 1\).
Unsurprisingly, you have already met “for” statements in mathematics! Just as a “for” in mathematics lets us define a relation over a range of indices, a “for” loop in programming is a control flow statement that allows code to be executed repeatedly, once for each value in a sequence.
Therefore, to generate the Fibonacci sequence using a “for” loop, we could use the following algorithm:
- Start by defining the first two numbers in the sequence, 0 and 1.
- For a given number of iterations, do the following:
  - Calculate the next number in the sequence as the sum of the previous two numbers.
  - Update the previous two numbers to be the last number and the newly calculated number.
We will return to the Fibonacci example after having introduced the for loop syntax.
3.2.1 The syntax
As the syntax in R and python for looping is quite different, R and python chunks will be separate. You should be able to read each language independently.
We can code a for loop as follows.
In R:
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
In this R code, to explain the syntax:
- for (i in 1:10) is the start of the for loop. The iterator i goes from 1 to 10.
- 1:10 creates a vector: this is the sequence of values we want to iterate over.
- print(i) is the code chunk that we want to repeat. In this case it prints the current value of i.
- i is the object that stores our current index. It gets updated at each step over the sequence of values from 1 to 10.
- The for loop ends when it has exhausted the sequence, i.e., when i has taken all values from 1 to 10.
We can also use the for loop to iterate over the elements of a vector directly, rather than their indices. For example, a for loop that prints the elements of a vector:
## [1] "apple"
## [1] "banana"
## [1] "cherry"
In this case, fruit is the iterator that gets updated at each step over the sequence with the current element of the sequence. The for loop ends when it has exhausted the sequence.
Try to edit both code chunks above, by adding a new variable called iter_counts. Initialize this variable at 0, e.g. iter_counts <- 0, and update it in the cycle with iter_counts <- iter_counts + 1. Print it at the end: what’s the value for the first cycle? And for the second?
In Python:
In Python, we can print the numbers from 1 to 10 using a for loop as follows:
## 1
## 2
## 3
## 4
## 5
## 6
## 7
## 8
## 9
## 10
In this Python code, to explain the syntax:
- for i in range(1, 11) is the start of the for loop. The iterator i goes from 1 to 10. This is the sequence we want to iterate over. We use the range function to generate this sequence: this function is analogous to np.arange from numpy, but generates a different object called “range” that is designed for iteration. Note that the end value, 11, is excluded.
- print(i) is the code chunk that we want to repeat. It prints the current value of i.
- i is the object that stores our current index. It gets updated at each step over the sequence.
- The for loop ends when it has exhausted the sequence, i.e., when i has taken all values from 1 to 10.
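Putting the pieces described above together, the loop can be sketched as follows:

```python
# Print the numbers from 1 to 10.
# range(1, 11) starts at 1 and stops before 11.
for i in range(1, 11):
    print(i)
```

The print(i) line is indented: in Python, the indentation is what tells the interpreter which lines belong to the body of the loop.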
We can also use the for loop to iterate over the elements of a numpy array directly, rather than their indices. For example, a for loop that prints the elements of a numpy array:
## apple
## banana
## cherry
In this Python code:
- arr = np.array(["apple", "banana", "cherry"]) creates a numpy array with the elements “apple”, “banana”, and “cherry”.
- for fruit in arr: is the start of the for loop. The iterator fruit goes over each element in the numpy array arr.
- print(fruit) is the code chunk that we want to repeat. It prints the current value of fruit.
- fruit is the object that stores the current element of the array. It gets updated at each step over the sequence.
- The for loop ends when it has exhausted the sequence, i.e., when fruit has taken all values in the numpy array arr.
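Iterating over elements works in the same way for other Python sequences. As a minimal sketch, the loop below uses a plain Python list named fruits instead of a numpy array. This is an assumption made here to keep the example free of external libraries.

```python
fruits = ["apple", "banana", "cherry"]

# fruit takes the value of each element in turn
for fruit in fruits:
    print(fruit)
```

Replacing the list with np.array(["apple", "banana", "cherry"]) leaves the loop itself unchanged.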
Try to edit both code chunks above, by adding a new variable called iter_counts. Initialize this variable at 0, e.g. iter_counts = 0, and update it in the cycle with iter_counts = iter_counts + 1. Print it at the end: what’s the value for the first cycle? And for the second?
3.2.2 Coding The Fibonacci sequence
Having covered how to write a for loop, we come back to our Fibonacci example. Unlike the loops above, in order to implement our algorithm, we will need to update a variable within our loop. In our case, this variable is going to be a vector storing our Fibonacci sequence, called fibonacci. At the first iteration, the vector will be 2 elements long (the first two numbers of the sequence), but as we go through the loop, we will be adding more and more elements to the vector.
R
Python
import numpy as np
# Initialize the first two numbers in the sequence
fibonacci = [0, 1]
# Generate the next 18 numbers in the sequence
for i in range(2, 20):
    fibonacci.append(fibonacci[i - 1] + fibonacci[i - 2])
    # alternatively
    # fibonacci = fibonacci + [fibonacci[i - 1] + fibonacci[i - 2]]
# converting the list into a numpy array
# (useful if we want to use vectorised operations later!)
fibonacci = np.array(fibonacci)
In the R code:
- fibonacci <- c(0, 1) initializes the first two numbers in the Fibonacci sequence.
- for (i in 3:20) is the start of the for loop. The iterator i goes from 3 to 20. This is the sequence we want to iterate over.
- fibonacci[i] <- fibonacci[i - 1] + fibonacci[i - 2] is the code chunk that we want to repeat. It calculates the i-th number in the Fibonacci sequence as the sum of the two preceding numbers and appends it to the fibonacci vector. Given that at iteration i we have no element i in the vector yet (it is still i-1 long), R will automatically extend the vector by one extra element. Alternatively, you can use c(fibonacci, fibonacci[i - 1] + fibonacci[i - 2]) to concatenate an extra vector (with one element) to the existing vector.
In the Python code:
- fibonacci = [0, 1] initializes the first two numbers in the Fibonacci sequence.
- for i in range(2, 20) is the start of the for loop. The iterator i goes from 2 to 19. This is the sequence we want to iterate over.
- fibonacci.append(fibonacci[i - 1] + fibonacci[i - 2]) is the code chunk that we want to repeat. It calculates the element at index i of the Fibonacci sequence as the sum of the two preceding numbers and appends it to the fibonacci list using the append() method. The append method extends our list, adding a new element at the end of the existing one. Alternatively, you can use fibonacci + [fibonacci[i - 1] + fibonacci[i - 2]] to concatenate an extra list (with one element) to the existing list.
Our vectors, where we stored our results at every iteration, should be 20 elements long, containing the sequence.
R
## [1] 0 1 1 2 3 5 8 13 21 34 55
## [12] 89 144 233 377 610 987 1597 2584 4181
Python
## array([ 0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55,
## 89, 144, 233, 377, 610, 987, 1597, 2584, 4181])
Before you proceed make sure you understand what’s going on inside the for loop. Add extra print statements to understand what’s happening at every iteration, e.g. try to add: print(i), print(fibonacci[i]) and print(fibonacci[i-1]) before the computation, or print(fibonacci) after the computation has been done. print statements are extremely useful for debugging for loops!
3.2.2.1 Safer for loops
The for loop above is straightforward and works fine for the specific task of computing the Fibonacci sequence alone. However, it could potentially lead to issues if you wanted to modify or extend the code. For example, if you wanted to change the calculation logic or use it in a different context, you would need to modify the loop itself. This could introduce errors and make the code harder to maintain.
However, there is an approach that is generally safer and more robust: that of working with functions within your loops.
Have a look at the code below:
R
# define a function to update the Fibonacci sequence
update_fibonacci <- function(R) {
  n <- length(R)
  r_t <- R[n] + R[n - 1]
  return(c(R, r_t))
}
# define a function to compute the Fibonacci sequence
compute_fibonacci <- function(n) {
  # Initialize the first two numbers in the sequence
  fibonacci <- c(0, 1)
  # Generate the next n-2 numbers in the sequence
  for (i in 3:n) {
    fibonacci <- update_fibonacci(fibonacci)
  }
  return(fibonacci)
}
compute_fibonacci(20)
## [1] 0 1 1 2 3 5 8 13 21 34 55
## [12] 89 144 233 377 610 987 1597 2584 4181
Python
# Define a function to update the Fibonacci sequence
def update_fibonacci(R):
    n = len(R)
    r_t = R[n - 1] + R[n - 2]
    return R + [r_t]
# Define a function to compute the Fibonacci sequence
def compute_fibonacci(n):
    # Initialize the first two numbers in the sequence
    fibonacci = [0, 1]
    # Generate the next n-2 numbers in the sequence
    for i in range(3, n + 1):
        fibonacci = update_fibonacci(fibonacci)
    return fibonacci
compute_fibonacci(20)
## [0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, 144, 233, 377, 610, 987, 1597, 2584, 4181]
This solution is better in a lot of ways. The calculation logic of the Fibonacci sequence is encapsulated within its own function, update_fibonacci. Now, suppose you found a better method to deal with that: you’d only need to change that part (spoiler: you can, ask a GPT model if it has ways to improve on update_fibonacci).
Also, the computation of the full Fibonacci sequence is encapsulated in its own function, compute_fibonacci! This makes the code more organized and less prone to errors due to variables defined elsewhere, as it does not touch global variables.
For this reason, you can easily reuse the compute_fibonacci function in other parts of your code or even in different programs! For instance, this function might turn out to be useful in the next exercise, where you’ll have to call it with n=10.
3.2.3 Exercise
Now write a new function my_cusum to compute the cumulative sum of a list of numbers, using a for loop. Use the function to calculate the cumulative sum of the first 10 elements of the Fibonacci sequence.
The function should:
- Take as input a vector
- Initialize a new vector to store the cumulative sum
- Then, in a loop, update this vector with the cumulative sum of the elements of the input vector
- The output vector should be of the same length as the input vector
- The first element of the output vector should be the same as the first element of the input vector
Then:
- Using the code above, generate a Fibonacci sequence of 10 numbers
- Run the function my_cusum(fibonacci) to obtain the cumulative sum of the first ten numbers of the Fibonacci sequence.
- Compare the output of my_cusum with the built-in function cumsum (in R) or numpy’s np.cumsum (in Python).
3.3 More about looping
In this section we will give a few extra details and concepts you should know about looping. If you got familiar with the for loop above, they should be fairly straightforward to understand!
3.3.1 Nested loops
Nested loops are useful when we have to repeat a block of code for each combination of elements from two or more sequences. This can be incredibly useful in many situations, such as when we want to perform an operation for each pair of elements in two lists or vectors.
A simple example, for instance, could be a nested loop that prints the multiplication table from 1 to 4.
R
## [1] "1 x 1 = 1"
## [1] "1 x 2 = 2"
## [1] "1 x 3 = 3"
## [1] "1 x 4 = 4"
## [1] "2 x 1 = 2"
## [1] "2 x 2 = 4"
## [1] "2 x 3 = 6"
## [1] "2 x 4 = 8"
## [1] "3 x 1 = 3"
## [1] "3 x 2 = 6"
## [1] "3 x 3 = 9"
## [1] "3 x 4 = 12"
## [1] "4 x 1 = 4"
## [1] "4 x 2 = 8"
## [1] "4 x 3 = 12"
## [1] "4 x 4 = 16"
Python
## 1 x 1 = 1
## 1 x 2 = 2
## 1 x 3 = 3
## 1 x 4 = 4
## 2 x 1 = 2
## 2 x 2 = 4
## 2 x 3 = 6
## 2 x 4 = 8
## 3 x 1 = 3
## 3 x 2 = 6
## 3 x 3 = 9
## 3 x 4 = 12
## 4 x 1 = 4
## 4 x 2 = 8
## 4 x 3 = 12
## 4 x 4 = 16
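A minimal Python sketch of a nested loop that prints the multiplication table above. The variable names i and j follow the note below.

```python
for i in range(1, 5):        # outer loop: first factor, from 1 to 4
    for j in range(1, 5):    # inner loop: runs in full for every value of i
        print(f"{i} x {j} = {i*j}")
```

The inner loop is indented one level further than the outer one. For each value of i, j goes through all of its values before i moves on.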
NOTE for Python users: In the print statement print(f"{i} x {j} = {i*j}") we used f-string formatting, which was introduced in Python 3.6 (the code above will break on previous versions). It’s a way to embed expressions (like i*j) inside strings, using curly braces {}. The expressions will be replaced with their values when the string is created. The letter f at the beginning of the string tells Python to allow these embedded expressions.
In our example, {i} and {j} will be replaced by the values of the variables i and j, and {i*j} will be replaced by the result of the expression i*j. So if i is 2 and j is 3, the string would become "2 x 3 = 6". F-string formatting can be very useful for debugging because it allows you to easily insert the values of variables into strings, which you can then print to see what’s happening in your loops!
3.3.2 While loops
Similarly to for loops, we have while loops. They allow us to repeat a block of code as long as a certain condition is true, that is, until that condition stops being met. This can be incredibly useful in many situations, such as when we want to perform an operation until a certain threshold is reached.
We can see how a while loop works below:
R
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 5
## [1] 6
## [1] 7
## [1] 8
## [1] 9
## [1] 10
In these code chunks:
- i <- 1 and i = 1 initialize the counter at 1.
- while (i <= 10) and while i <= 10: start the while loop. The condition i <= 10 is what we check at each step. If it’s true, we execute the code chunk inside the loop.
- print(i) (in both R and Python) is the code chunk that we want to repeat. It prints the current value of i.
- i <- i + 1 and i += 1 are crucial. They update i at each step, ensuring that our condition will eventually be false. Without these lines, i would always be 1, the condition would always be true, and the while loop would run indefinitely.
- The while loop ends when the condition i <= 10 is no longer met, i.e., when i is greater than 10.
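The Python version of the loop described above can be sketched as:

```python
i = 1             # initialize the counter
while i <= 10:    # the condition is checked before every iteration
    print(i)
    i += 1        # update the counter, otherwise the loop never ends
```

When the loop is over, i is equal to 11: this is the first value for which the condition i <= 10 is false.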
We are missing one last ingredient to looping! Break and next.
3.3.3 Break and next/continue
Break and next/continue are control flow statements that can be used to alter the flow of a loop. They can be used when we want to stop a loop if a certain condition is met or skip an iteration if a certain condition is met.
To see how they work, we write a loop that prints the numbers from 1 to 10, but skips the number 5 and stops after the number 8.
R
for (i in 1:10) {
  if (i == 5) {
    next  # Skip the rest of the iteration
          # and continue with the next iteration
  }
  if (i > 8) {
    break  # Immediately terminate the loop
  }
  print(i)  # Print the current value of i
}
## [1] 1
## [1] 2
## [1] 3
## [1] 4
## [1] 6
## [1] 7
## [1] 8
Let’s break it down:
- if (i == 5) { next } (R) and if i == 5: continue (Python) are the next/continue statements. If i is equal to 5, they skip the rest of the loop body and continue with the next iteration. 5 won’t be printed!
- if (i > 8) { break } and if i > 8: break are the break statements. If i is greater than 8, they immediately terminate the loop, regardless of the loop condition. As a result, we stop at 8, and don’t continue up to 10.
break could be an alternative stopping condition to a while loop.
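A minimal Python sketch of the same loop, using continue in place of R’s next. The list printed is a hypothetical helper, added here only to keep a record of which values were printed.

```python
printed = []  # hypothetical helper: records the values that get printed

for i in range(1, 11):
    if i == 5:
        continue  # skip the rest of this iteration and move on to i = 6
    if i > 8:
        break     # immediately terminate the loop
    print(i)
    printed.append(i)
```

At the end, printed contains every number from 1 to 8 except 5.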
Using break and next (or continue in Python) statements in loops can sometimes make the code harder to understand and debug, because they can lead to unexpected jumps in the control flow. This is especially true in more complex loops where it’s not immediately clear when or if the loop will be prematurely terminated or an iteration skipped.
There are usually alternatives to break. Let’s consider a simple example where we want to find the first number in a list that is divisible by a certain number. Using break:
R
numbers <- c(15, 18, 21, 24, 27, 30)
divisor <- 4
for (num in numbers) {
  if (num %% divisor == 0) {
    print(paste(num, "is the first number divisible by", divisor))
    break
  }
}
## [1] "24 is the first number divisible by 4"
Python
numbers = [15, 18, 21, 24, 27, 30]
divisor = 4
for num in numbers:
    if num % divisor == 0:
        print(f"{num} is the first number divisible by {divisor}")
        break
## 24 is the first number divisible by 4
In this case, as soon as we find a number that meets our condition, we break out of the loop. However, if we want to avoid using break, we could rewrite the loop as a while loop with a boolean flag that indicates whether we’ve found a suitable number:
R
numbers <- c(15, 18, 21, 24, 27, 30)
divisor <- 4
found <- FALSE
i <- 1
while (!found && i <= length(numbers)) {
  if (numbers[i] %% divisor == 0) {
    print(paste(numbers[i], "is the first number divisible by", divisor))
    found <- TRUE
  }
  i <- i + 1
}
## [1] "24 is the first number divisible by 4"
Python
numbers = [15, 18, 21, 24, 27, 30]
divisor = 4
found = False
i = 0
while not found and i < len(numbers):
    if numbers[i] % divisor == 0:
        print(f"{numbers[i]} is the first number divisible by {divisor}")
        found = True
    i = i + 1
## 24 is the first number divisible by 4
In this version, instead of breaking out of the loop, we use the found variable to keep track of whether we’ve found a number that meets our condition. Once we have, the loop condition becomes false, so the remaining iterations are not executed, without explicitly using a break statement.
Craving More Complexity?
Finding MATH245 a Walk in the Park?
Dive into our Extra Coding Challenges! You’ve mastered the basics, now it’s time to test your mettle. Venture into the final Chapter 11 where a nasty bunch of medium to hard coding exercises awaits. These brain-teasers are designed to stretch your coding skills to their limits. Participation is optional, but if you’re up for a challenge and keen to level up your skills, I highly recommend giving them a go. Start with the Hats in a Line Challenge.