In the previous article, we saw that defining a function inside another function creates an enclosing scope, through which the inner function can access names from the local scope of the outer function. Closures are an application of this concept, so if you haven't read the previous article, please go through it first before reading further.
The inner function, together with its enclosing environment, is called a closure.
Creating a Closure
To define a closure, we need to take 3 steps:
1. Create an inner function.
2. Reference variables from the enclosing function inside the inner function.
3. Return the inner function.
The closure captures the local names of the outer function that the inner function references and keeps them around, so the inner function can still use them even after the outer function has returned and finished executing. Let's examine this in code:
def outer():
    outer_value = 10

    def inner():
        print(outer_value)

    return inner

inner = outer()
inner()  # 10
print(inner.__code__.co_freevars)  # ('outer_value',)
print(inner.__closure__[0].cell_contents)  # 10
In the above example, outer_value was defined in the local scope of outer(), and we were able to access it in inner() even after outer() finished executing. This is because, when the inner function references variables from the enclosing scope, Python keeps those variables alive in cell objects attached to the inner function, and that is what creates the closure. Such variables are called free variables. Their names are listed in the co_freevars attribute of the inner function's __code__ object, and their values can be retrieved through the function's __closure__ attribute, as shown in the example.
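To make the mechanics a little more concrete, here is a small sketch suggesting that each call to the outer function produces its own cell, and that a function with no free variables has no closure at all. The names make_greeter() and plain() are made up for this illustration.

```python
def make_greeter(greeting):
    # greeting is a local name of make_greeter (hypothetical example function)
    def greet(name):
        return f"{greeting}, {name}!"

    return greet

hello = make_greeter("Hello")
howdy = make_greeter("Howdy")

# Each closure keeps its own cell for the free variable
print(hello.__code__.co_freevars)          # ('greeting',)
print(hello.__closure__[0].cell_contents)  # Hello
print(howdy.__closure__[0].cell_contents)  # Howdy

def plain(name):
    # References no names from an enclosing function
    return f"Hi, {name}!"

print(plain.__closure__)  # None
```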
Use Cases: Managing State and Data Hiding
One of the primary use cases of closures is data hiding. With a closure, the variables from the enclosing scope can be reached by name only from inside inner(), which restricts modifications from outside and makes them effectively private variables.
Let's say we have a function square() which returns the square of a number, and we want to keep track of the number of times this function is called. A simple solution would be to keep a counter variable in the global scope and increment it every time we make a call to square().
def square(num):
    return num * num

square_counter = 0

square(1)
square_counter += 1
square(3)
square_counter += 1

print(f"square() was called {square_counter} times")
# square() was called 2 times
Yes, this is a solution, but not the best one. Since square_counter is in the global scope, it can be modified from anywhere in the module, which makes the final result unreliable. Let's try to improve this solution with what we have learned so far:
We can view the counter as the state, and this state has to be updated on each call made to square(). We need to keep this state private to prevent modifications from outside.
So, if we use a closure, we can declare the state variable in outer()'s local scope, making it private. Then, inside inner(), we can refer to this state variable and update it, which gives us a way to retain the state between calls to inner(). Let's implement this in code:
def square_factory():
    square_counter = 0

    def square(num):
        nonlocal square_counter
        square_counter += 1
        print(f'square() was called {square_counter} times')
        return num * num

    return square

square = square_factory()
square(1)  # square() was called 1 times
square(3)  # square() was called 2 times
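A side note on the nonlocal line: without it, the assignment square_counter += 1 would make square_counter a local name of the inner function, and the call would fail because the name is read before it is assigned. A minimal sketch of that failure mode, using the hypothetical names broken_factory() and broken_square:

```python
def broken_factory():
    square_counter = 0

    def square(num):
        # No nonlocal declaration: the augmented assignment below makes
        # square_counter local to square(), so it is read before assignment
        square_counter += 1
        return num * num

    return square

broken_square = broken_factory()

try:
    broken_square(2)
except UnboundLocalError as error:
    print(f"Failed as expected: {type(error).__name__}")
    # Failed as expected: UnboundLocalError
```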
Now we have a better solution. But there is still a problem: the count is printed on every call to square(). We don't want that; the count should be shown only when we ask for it. We also want some mechanism to reset the count. Let's take these requirements forward and see how we can meet them using another use case of closures:
Use Cases: getter() and setter()
We already saw that a closure hides its state variables from the outside world, so they can only be accessed from the inner functions. So, how can we meet our requirements? Yes, you guessed it right: we will have more inner functions, one to get the value of our state (the counter) and another to update the state (reset). But we won't be returning all 3 functions from our factory function. Instead, we will apply what we learned in the article Function as Objects to add getter and setter methods to the square() object before returning it.
def square_factory():
    square_counter = 0

    def square(num):
        nonlocal square_counter
        square_counter += 1
        return num * num

    def __get():
        print(f'square() was called {square_counter} times')

    def __set(value):
        nonlocal square_counter
        square_counter = value

    square.show_counter = __get
    square.reset_counter = lambda: __set(0)
    return square

square = square_factory()
square(1)
square(3)
square.show_counter()  # square() was called 2 times
square.reset_counter()
square.show_counter()  # square() was called 0 times
square(4)
square.show_counter()  # square() was called 1 times
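Since the counter lives in the local scope of the factory, each call to the factory should give back a function with its own, separate counter. A small sketch of that behaviour follows; counter_factory() and the get_count attribute are illustrative names, not part of the example above.

```python
def counter_factory():
    count = 0

    def square(num):
        nonlocal count
        count += 1
        return num * num

    # Expose a read-only view of the private state (hypothetical name)
    square.get_count = lambda: count
    return square

square_a = counter_factory()
square_b = counter_factory()

square_a(2)
square_a(3)
square_b(4)

print(square_a.get_count())  # 2
print(square_b.get_count())  # 1
```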
Use Cases: Rate Limiting
Let's bring back our upload() example from Function as Objects and see how we can improve it using a closure:
def create_uploader():
    # Private state
    uploaded = False

    def upload(data, destination):
        nonlocal uploaded
        if uploaded:
            print("Already uploaded. Skipping..")
            return
        try:
            # Perform actual upload
            pass
        except Exception:
            print("An error occurred during upload. Try Again..")
        else:
            uploaded = True
            print(f"Uploaded data to {destination}")

    return upload

upload = create_uploader()
upload("stars", "https://github.com/mochatek")
# Uploaded data to https://github.com/mochatek
upload("followers", "https://github.com/mochatek")
# Already uploaded. Skipping..
Now that you have the basic idea, can you create a throttler and a debouncer using a closure?
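If you want something to compare your attempt against, here is one possible sketch of a throttler, assuming "throttle" means letting a function run at most once per time interval. The names make_throttler(), save() and the clock parameter are assumptions made for this illustration; clock is injectable only so the behaviour is easy to test. The debouncer is left as the exercise.

```python
import time

def make_throttler(func, interval, clock=time.monotonic):
    # Hypothetical helper: lets func run at most once per `interval` seconds
    last_called = None  # private state: time of the last accepted call

    def throttled(*args, **kwargs):
        nonlocal last_called
        now = clock()
        if last_called is not None and now - last_called < interval:
            print("Throttled. Skipping..")
            return None
        last_called = now
        return func(*args, **kwargs)

    return throttled

def save(data):
    # Stand-in for some real work
    print(f"Saving {data}")
    return True

throttled_save = make_throttler(save, interval=0.5)
throttled_save("draft 1")  # Saving draft 1
throttled_save("draft 2")  # Throttled. Skipping..
time.sleep(0.6)            # wait until the interval has passed
throttled_save("draft 3")  # Saving draft 3
```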
Use Cases: Caching
A cache is a software or hardware component aimed at storing data so that future requests for the same data can be served faster.
Suppose you have an expensive function, let's say square() for the sake of this example, which takes approximately 1 minute to compute the square of a number. You call this function once to get the square of 5 and, after waiting for a minute, get 25 as the result. You do some processing with that result, and after some time you need the square of 5 again. If we call square(5) now, we have to wait another minute just to get a result that we already computed a short while ago.
As you can see, this is clearly a waste of computation and time. If we trade some memory for storing each result against the number for which we computed the square, then for future calls with that number we can return the square directly from this store instead of computing it again. This is what caching means, and the store where the results are kept is called a cache.
Typically, a cache is implemented using a key-value data structure, where the key is the input and the value is the computed result. The dictionary is an excellent choice for implementing a cache in Python. So, if we keep the cache (a dictionary) as our private state, we can improve the performance of our square(). Let's try this in code:
def square_factory():
    cache = {}

    def square(num):
        # If the square of num was computed earlier and is
        # available in the cache, then take the result from the cache
        if num in cache:
            print("Cache Hit! Instantly returning from cache")
            result = cache[num]
        # Otherwise, compute the square of num and
        # add the num-result pair to the cache
        else:
            print("Cache Miss! Computing result in a minute..")
            result = num * num
            cache[num] = result
        return result

    return square

square = square_factory()
square(5)  # Cache Miss! Computing result in a minute..
square(5)  # Cache Hit! Instantly returning from cache
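The same pattern is not tied to square(). As a rough sketch, the cache can wrap any single-argument function whose input is hashable. Here make_cached() and slow_square() are hypothetical names, and the short sleep merely stands in for the one-minute computation.

```python
import time

def make_cached(func):
    cache = {}  # private state: maps input -> computed result

    def cached(arg):
        if arg not in cache:
            cache[arg] = func(arg)  # computed only on a cache miss
        return cache[arg]

    return cached

def slow_square(num):
    time.sleep(0.1)  # stand-in for an expensive computation
    return num * num

fast_square = make_cached(slow_square)

start = time.perf_counter()
fast_square(5)  # cache miss: pays the computation cost
miss_time = time.perf_counter() - start

start = time.perf_counter()
fast_square(5)  # cache hit: returned straight from the dictionary
hit_time = time.perf_counter() - start

print(f"miss: {miss_time:.3f}s, hit: {hit_time:.6f}s")
```

For everyday use, the standard library's functools.lru_cache offers a ready-made memoization facility along the same lines.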
I hope you understood the concept of caching and how we can use a closure to implement it. If it is still unclear to you, don't worry, we will discuss it again when we look at decorators in Python.
Well, that's all for this article. Thanks for reading!
I hope you liked this article, and if you have any questions or suggestions, please drop them in the comments. Happy Coding..