Or, Why You Need to Understand Transaction Isolation Levels
August 25 2013
Tags: django | mysql | python | programming

When I went to deploy a Django + MySQL project awhile ago, transitioning off of the Django test server started producing confusing results. The most obvious place I saw it was with logins. I would login, the page returned would appear logged in. But then on the next request it would appear as though I was not logged in. For awhile I thought I had screwed something up when customizing the login flow for my site, but couldn’t find any issues.

It turned out to be a problem with the default settings of MySQL (specifically isolation level) combined with using multiple persistent database connections and Django’s (default) auto-commit mode. Surprisingly, this issue is never mentioned anywhere in the Django documentation, despite there being a section discussing isolation levels in PostgreSQL, where the defaults actually work with Django.

At the time, I found a post that explained the problem and solution very clearly. But since it’s no longer available, I’m going to write it up here.

Isolation Levels

The source of the problem is that by default, MySQL uses a higher isolation level than Django expects. What does isolationlevel mean? It indicates how isolated, or protected, transactions are from other, concurrent transactions. Those other transactions might make changes which affect queries in your transaction. The isolation level controls when these changes become visible to your transaction (or connection).

MySQL defaults to REPEATABLE-READ, which means that once a transaction reads some data, the same values will be returned (or used in evaluating a query) for the rest of the transaction. For example, if we SELECT a user’s entry from a session table, until the transaction is closed we will continue to read the same data. Even if another process connects and executes a transaction that changes the data, we’ll still see the same values. We should only see new values if we wrote them ourselves.

Alternatively, we can weaken the isolation level to READ-COMMITTED, which allows changes from other transactions to become visible mid-transaction. This is weaker than REPEATABLE-READ and could be inconvenient. For example, if you’re running a blog and SELECT the set of stories in a time range, then SELECT the set of topics in the same time range via a join of the entries and topics tables, you can see inconsistent results under READ-COMMITTED. For example, another transaction could have changed all entries’ topics to “none” and the second SELECT would see those changes, resulting in an empty list of topics.

(In fact, even REPEATABLE-READ is weaker than the ideal SERIALIZABLE setting. For example, REPEATABLE-READ will fail in this case: UPDATE all blog entries within a time range and then SELECT them to list the titles of the entries you updated. If another transaction adds a blog entry in that date range, the set of entries modified and the set listed can differ under REPEATABLE-READ. They won’t under SERIALIZABLE.)

The tradeoff between isolation levels is performance vs. simplicity of code using the database. SERIALIZABLE is what most people think of when they think of a traditional ACID database. But it’s not the default for most databases because it’s expensive – it requires either more/longer locking or multiple version concurrency control (i.e. copying data). SERIALIZABLE makes it very easy to reason about the logic in your application, but often has poor performance.

Django and Isolation Levels

So how does this relate to Django? Django’s code assumes you are using a particular isolation level: READ-COMMITTED. It assumes, for example, that if it tries to SELECT to read some data, and then repeats it, that if data matching the query is added that it will show up. For example, get_or_create() relies on this property.

Just generally, the isolation level is critical to writing correct code, even within a single transaction. The examples above demonstrated why – depending on the isolation level, different orders of events can cause different types of errors.

However, since MySQL is in REPEATABLE-READ by default, Django misbehaves (at least in Django < 1.5 with pseudo-auto-commit mode, see final notes section for details). Django puts all queries into an explicit transaction and reuses transactions for multiple queries. Databases generally implement auto-commit where any statements not in a transaction are treated as their own transaction. But Python implements a different approach by default where transactions are reused (and kept open) until a write command is issued (i.e. an INSERT or UPDATE).

Unfortunately, this means that when you’re just doing reads all the time (hopefully the majority of your requests) you’ll continue to see the same data if you’ve already read it in the past. Once data is read, since the transaction is held open, future reads repeat the same value.

But even that’s not a problem until you start using multiple connections (via multiple processes). If you only have one Django process, all writes will go through it and all reads will see the updated state. But when you switch to two or more processes, you’ll start seeing what seem like inconsistencies. Yet another example of why making your testing environment look as close to deployment is important (i.e. don’t use the Django test server).

OK, so how do we fix it? Unless you want to rewrite a lot of Django (and third-party apps), we need to set the database into the mode which Django expects. There are a couple of ways to do so. I prefer trying to keep my deployment as simple as possible, avoiding changes to default configurations as necessary. The easiest fix, given this, is to adjust the setting on a per-connection basis in Django. Add something like this to your settings.py

DATABASES = {
    'default' : {
        ...
        'OPTIONS' : {
            'init_command': 'SET SESSION TRANSACTION ISOLATION LEVEL READ COMMITTED'
        }
    }
}

This just tells Django to run some raw SQL upon initiating each connection, in this case MySQL lets you set the isolation level on a per-connection basis.

Alternatively, you can just run MySQL with the “correct” isolation level by default. Add the following to your my.cnf:

transaction-isolation = READ-COMMITTED

And of course make sure you restart MySQL.

Final Notes

You can learn more about the isolation level settings in MySQL and generally about isolation levels (and some problems that can arise using different levels) on Wikipedia.

This particular problem really is Django-specific, but you should be aware of isolation levels regardless. If you’re already writing your code in a way that assumes READ-COMMITTED, you might as well get the performance gain of putting the database in that mode. If you have additional services interacting with the database (e.g. cron jobs, task queues, other standalone services), you need to be especially careful since the different code may make different assumptions and can operate with different isolation levels. Ideally, you can choose one level and commit to it (har har), but if you’re using frameworks or third-party libraries you can’t always do that.

I considered submitting a documentation patch to Django, but it appears to be moot at this point. Apparently in Django 1.6 the approach to autocommit will change so it is enabled via the database layer rather than simulated. This should get rid of this particular issue. However, you should still be aware of the differences between isolation levels because they can still have an impact on how you write transactions that work with Django and the database. Conveniently, it looks like the entire Django database transaction management API is getting an upgrade that should make it much easier to customize it to your needs.

Finally, there’s another way to handle all of this: don’t use MySQL. PostgreSQL seems like it might be a better option anyway. However, given the number of Django+MySQL installations that I assume exist, I’m surprised how difficult it was to track down this problem and its solution. And regardless, it’s important to understand isolation level issues and what level you’re supposed to be operating at since you can rarely write code which behaves properly under different isolation levels unless you write the lowest-common-denominator, assumes-no-isolation version.