Context Managers: Managing resources in Python the right way

The What

If you've worked with Python for any length of time, chances are that you might have come across the term context manager. Or if you're a beginner getting your feet wet, you might have seen the with keyword being used in tutorials, most likely to programmatically open a file. Ever wondered what that is? What exactly is going on under the hood when the with keyword is used? Then put on your miner's hat and let's dig deep.

"What's in a name?" Famous words by a famous man. Clearly, Shakespeare never had to read any source code. Here, it's the name of the game: context manager. Essentially, it's managing something, specifically a context. So what exactly is the context here? It's the scope in which a resource is in use, and the context manager is the object that sets that resource up and tears it down again. The resource can be a file (such as text or JSON), a network socket, a database connection, a thread lock, etc. When I say a file, technically it's a file descriptor, but for the sake of simplicity we'll refer to it as a file.

with_open_file_the_python_life.png

The image above describes how you would open a text file using a context manager and print out its contents.
The with statement calls the built-in open() function to open the file in read-only mode (since we've specified the mode with the "r" string). The file object that open() returns is the context manager here, and the with statement assigns it to the txt variable. We then call the readlines() method on the file, which returns a list of all the lines in the text file. In the end, we simply loop through the list and print out all the lines.
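
Since the original code lives in an image, here is a minimal sketch of what the description above amounts to. The file name and its contents are hypothetical; the snippet writes a small sample file to a temporary directory first so that it runs on its own.

```python
import os
import tempfile

# Hypothetical sample file, created only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

# Open the file in read-only mode; the file object is bound to txt.
with open(path, "r") as txt:
    lines = txt.readlines()  # list of lines, newline characters included

# The file is already closed at this point; we only use the list we read.
for line in lines:
    print(line, end="")
```

Note that nothing in the snippet calls close(). Leaving the with block is what closes the file.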


The Why

There are two primary questions here:
Q1. Why do you need to manage resources?
A1. Consider any resource in your life. Resources are finite, and anything finite needs to be managed efficiently or eventually you're going to run out. And what happens when you run out of resources on your computer? Nothing pleasant. If you're unlucky (or reckless), you end up losing your work. In the case of your code, if your program crashes while operating on a particular file or any data, you may even end up corrupting it.

To further emphasize, think of it this way. When you manually open files in a text editor, do you leave them open after you're done reading the contents, or do you close them? I'm going to take the safer bet here and assume the latter. Then why would you write code that doesn't follow the same principle?

Q2. Why use a context manager to manage your resources?
A2. Well, for starters, writing code that handles it explicitly is a lot more verbose. When you write more code, you have to maintain more code. It increases the chances of human error and you risk introducing more bugs.

Context managers abstract away the same functionality and let you write the same snippet of code with a much cleaner syntax, in my opinion. Not only does the code look cleaner, but it also eliminates the need to explicitly release a particular resource every time you "acquire" it. This helps prevent something known as a resource leak.

open_file_the_python_life_blue.png

The above image describes how you would open the same file and print its contents without using a context manager.
As you can see, it is more verbose. And if you look at the code in the finally block, you need to explicitly close the file in order to release the file descriptor or "handle", which is the resource that gets acquired when you open a file. Now imagine having to do that every single time you acquire a resource. And if your code deals with sockets or database connections and you forget to explicitly close the connection, you have set up your program to fail as soon as it runs out of available connections or descriptors, which a long-running program eventually will.
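
For comparison, the manual version might look something like the sketch below. It is a reconstruction under the same assumptions as before (a hypothetical sample file in a temporary directory), not necessarily the exact code from the image.

```python
import os
import tempfile

# Hypothetical sample file, created only so the example is self-contained.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")
with open(path, "w") as out:
    out.write("first line\nsecond line\n")

# Acquire the resource by hand.
txt = open(path, "r")
try:
    lines = txt.readlines()
    for line in lines:
        print(line, end="")
finally:
    # Runs whether or not the code above raised, so the file
    # descriptor is always released.
    txt.close()
```

The behaviour is the same as the with version, but the release step is now something you have to remember to write.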


The How

Now that you have an understanding (hopefully) of what a context manager is and why it is imperative to use it, that brings us to the next stage: the how.
The open() function returns a file object that is already a context manager, which is why I did not need to "process" it in any way in order to use it in a with statement. I just went ahead and did it. Can I do that with any function or object? The simple answer is no. That is because an object needs to follow a particular protocol that involves a setup and teardown process in order to qualify as a context manager. Not sure what that means? It simply means the object must follow a pattern where it acquires the resource (setup) and then releases it (teardown) after you're done getting the data you need from the resource. Let's look at a real-world example of creating a context manager that will clarify what I mean further.
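
To see the "simple answer is no" in action, here is a small sketch. The helper supports_with() and the function plain_function() are names made up for this illustration; the check simply looks for the two special methods that the with statement relies on.

```python
import io


def supports_with(obj):
    """Return True if obj's type defines both halves of the protocol."""
    return hasattr(type(obj), "__enter__") and hasattr(type(obj), "__exit__")


def plain_function():
    # An ordinary function returning an ordinary string.
    return "just a string"


print(supports_with(io.StringIO()))     # an in-memory file object
print(supports_with(plain_function()))  # a plain string

try:
    with plain_function() as value:
        pass
except (AttributeError, TypeError) as error:
    # Which of the two exceptions is raised depends on the Python version.
    failure = type(error).__name__
    print(f"with statement rejected the object: {failure}")
```

File-like objects pass the check, while the string returned by an ordinary function does not, so the with statement refuses it.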

So how do we create a context manager? There are two ways we can approach it: a class-based approach and a generator-based approach that uses a decorator. If you do not know what a decorator is, I will be doing a follow-up article on decorators. For now, just understand that it is syntactic sugar, applied using the @ symbol in Python, that modifies a function. Opening and closing a file is a fairly simplistic example, so for our next example we will create something a little more complex: a context manager that handles the connection to a database. Personally, I prefer using the SQLAlchemy library to work with databases, so I will be using its Session object in the examples.

1. Class-based
In the class-based approach, a context manager is created by defining a class that contains the __enter__ and __exit__ dunder methods (dunder is a contraction of the words "double underscore").

class_based_context_manager_the_python_life.png using_class_based_context_manager_the_python_life.png

Explanation:

  • The Session object, which will interact with the database, is instantiated in the __init__ method.

  • The __enter__ method returns the above session object. The object returned from this method is what gets assigned to the connector variable that is created in the with statement.

  • The __exit__ method takes four parameters, including self. You can name these parameters as per your preference, but the order is important and must be maintained. If an exception occurs in the code nested under the with statement, its type, value, and traceback are passed to the __exit__ method through these parameters; if no exception occurs, all three are None. What they contain is pretty self-explanatory from their naming, so you should endeavor to name them similarly if you do choose to rename them.

  • In the code under the __exit__ method, we're checking whether the exc_type variable is None. If it is, the code under the with statement was executed successfully and the data is persisted to the database using the commit() method. If exc_type does have a value, an exception was raised, so we print out the type of the exception and roll back any changes we tried to make to the database using the rollback() method. In case you wish to suppress the exception, the __exit__ method should return True at the end.

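  • The __exit__ method should not reraise the passed-in exception. That is the caller's responsibility: if __exit__ returns a false value (including the implicit None), the with statement lets the exception continue to propagate on its own.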
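
The points above can be sketched roughly as follows. To keep the example runnable without extra dependencies, it swaps SQLAlchemy's Session for a connection from the standard library's sqlite3 module, which also offers commit() and rollback(). The class name DBConnect, the in-memory database, and the users table are assumptions made for this illustration, not the code from the image.

```python
import sqlite3


class DBConnect:
    """Class-based context manager; sqlite3 stands in for a SQLAlchemy Session."""

    def __init__(self, path=":memory:"):
        # The object that talks to the database is created up front.
        self.session = sqlite3.connect(path)

    def __enter__(self):
        # Whatever is returned here is bound to the name after "as".
        return self.session

    def __exit__(self, exc_type, exc_value, traceback):
        if exc_type is None:
            # The with block finished cleanly: persist the changes.
            self.session.commit()
        else:
            # The with block raised: report the type and undo the changes.
            print(f"Rolling back because of {exc_type.__name__}")
            self.session.rollback()
        # A false value lets any exception propagate to the caller.
        return False


manager = DBConnect()

with manager as connector:
    connector.execute("CREATE TABLE users (name TEXT)")
    connector.execute("INSERT INTO users VALUES ('alice')")

try:
    with manager as connector:
        connector.execute("INSERT INTO users VALUES ('bob')")
        raise ValueError("something went wrong")
except ValueError:
    pass  # __exit__ rolled back the insert, then the error reached us

rows = manager.session.execute("SELECT name FROM users").fetchall()
print(rows)
```

Only the row from the successful block survives; the insert made in the failing block is rolled back before the exception reaches the caller.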

2. Generator-based

contextlib_contextmanager_purple.png

using_generator_based_context_manager_the_python_life.png

Explanation:

  • The generator-based approach utilizes the contextmanager decorator, which is part of the contextlib module in the standard library, to convert the db_connect() function into a context manager.

  • Since this is a generator-based approach, the session object is handed to the with block using the yield keyword in the try block. Any database operation-specific exceptions are caught and the changes are rolled back using the rollback() method. If the try block is successful, the code moves on to the else block and the data is persisted to the database using the commit() method. In the end, regardless of the outcome of the upper blocks, the finally block is executed and the connection to the database is closed.

  • With this approach, it is necessary to wrap the yield in a try block in order to properly free up the resource if an exception is raised under the with statement. The contextmanager() decorator collects the exception from the with statement and re-raises it inside the generator at the point of the yield. Without the try block, any code after the yield statement would not be executed when an exception is raised, so the cleanup would be skipped.
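
Here is a rough sketch of the generator-based version described above. As before, a sqlite3 connection stands in for SQLAlchemy's Session so the example runs with the standard library alone. The temporary database file and the users table are assumptions for this illustration, and re-raising the error after the rollback is my own choice here, since the description does not say what the original does at that point.

```python
import os
import sqlite3
import tempfile
from contextlib import contextmanager


@contextmanager
def db_connect(path):
    session = sqlite3.connect(path)  # setup: acquire the resource
    try:
        yield session                # the with block runs here
    except sqlite3.Error:
        session.rollback()           # a database error: undo the changes
        raise                        # assumption: let the caller see it
    else:
        session.commit()             # no error: persist the changes
    finally:
        session.close()              # teardown: always release the resource


# Hypothetical database file in a temporary directory.
db_path = os.path.join(tempfile.mkdtemp(), "demo.db")

with db_connect(db_path) as session:
    session.execute("CREATE TABLE users (name TEXT)")
    session.execute("INSERT INTO users VALUES ('alice')")

try:
    with db_connect(db_path) as session:
        session.execute("INSERT INTO users VALUES ('bob')")
        # This table does not exist, so sqlite3 raises OperationalError.
        session.execute("INSERT INTO missing_table VALUES (1)")
except sqlite3.OperationalError:
    pass

with db_connect(db_path) as session:
    names = [row[0] for row in session.execute("SELECT name FROM users")]

print(names)
```

The first block is committed, the second is rolled back as a whole, and in every case the connection is closed by the finally block.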


And that concludes my thoughts on context managers in Python. I hope it was helpful to you in some way. Which method do you find more to your liking? Let me know in the comments!

You can download the sample code for the examples above here: Sample Code