I use SQLite as the database for my personal projects in Python. It is lightweight, reliable, well-documented, and better than the filesystem for persistent storage. I'd like to share a few lessons I have learned on using SQLite effectively in Python.
The official documentation for Python's sqlite3
module already has a section on "Using sqlite3
efficiently". It's worth reading that first, as this post covers different topics.
Foreign key constraints are not enforced by default in SQLite. If you want the database to prevent you from inserting invalid foreign keys, then you must run PRAGMA foreign_keys = 1
to turn enforcement on. Note that this pragma command must be run outside of a transaction; if you run it while a transaction is active, it will silently do nothing.
Since I prefer for my database to detect invalid foreign keys for me, and since (as we'll see below) the Python's sqlite3
module will sometimes open transactions implicitly, I run PRAGMA foreign_keys = 1
right after I open the connection to the database.
By default, the underlying SQLite library operates in autocommit
mode, in which changes are committed immediately unless a transaction has been opened with BEGIN
or SAVEPOINT
. You can verify this by opening the same database file with the sqlite3
command-line shell in two different terminals at the same time, and observing that, e.g., a row inserted in one terminal will be returned by a SELECT
statement run in the other. Once you open a transaction with BEGIN
, however, subsequent changes will not be visible to the other terminal until you commit the transaction with COMMIT
.
Python's sqlite3
module does not operate in autocommit
mode. Instead, it will start a transaction before data manipulation language (DML) statements1 such as INSERT
and UPDATE
, and, until Python 3.6, data definition language (DDL) statements such as CREATE TABLE
.
Opening a transaction in SQLite has several implications:
BEGIN
.These consequences can come as a surprise when sqlite3
has silently opened a transaction without your knowledge. Even worse, the Connection.close
method will not commit an open transaction, so you have to manually commit the transaction that sqlite3
automatically opened.
I prefer to manage my transactions explicitly. To do so, pass isolation_level=None
as an argument to sqlite3.connect
, which will leave the database in the default autocommit
mode and allow you to issue BEGIN
, COMMIT
, and ROLLBACK
statements yourself.
Python's sqlite3
module allows you to register adapters to convert Python objects to SQLite values, and converters to convert SQLite values to Python objects (based on the type of the column). sqlite3
automatically registers converters for DATE
and TIMESTAMP
columns, and corresponding adapters for Python date
and datetime
objects. Adapters are enabled by default, while converters must be explicitly enabled with the detect_types
parameter to sqlite3.connect
.
In addition to the default converters, I register my own for DECIMAL
, BOOLEAN
, and TIME
columns, to convert them to decimal.Decimal
, bool
, and datetime.time
values, respectively.
Python's default TIMESTAMP
converter ignores UTC offsets in the database row and always returns a naive datetime object. If your TIMESTAMP
rows contain UTC offsets, you can register your own converter to return aware datetime objects:2
import datetime
import sqlite3
sqlite3.register_converter("TIMESTAMP", datetime.datetime.fromisoformat)
Keep in mind that it is generally considered better practice to store time zone information as a string identifier from the IANA time zone database in a separate column, rather than use UTC offsets, which change often (e.g., due to daylight saving time).
Adapters and converters are registered globally, not per-database. Be warned that some Python libraries, like Django, register their own adapters and converters which will apply even if you use the raw sqlite3
interface instead of, e.g., Django's ORM.
SQLite lets you declare columns with any type that you want (or none at all). This can work nicely with Python's converters and adapters; for example, in one of my projects, I had columns of type CSV
and used a converter and an adapter to transparently convert them to Python lists and back.
Although SQLite is flexible with typing, ultimately it must choose a storage class for data, either TEXT
, NUMERIC
, INTEGER
, REAL
, or BLOB
. Columns have a "type affinity" which determines the preferred storage class for a column through a somewhat arbitrary set of rules. This ensures that inserting a string into an INT
column will convert the string to an integer, for compatibility with other, rigidly-typed database engines.
A corollary of SQLite's flexible typing is that different values in the same column can have different type affinities:
In SQLite, the datatype of a value is associated with the value itself, not with its container.
This can cause problems. I once wanted to copy some rows from one table to another. My rows had TIMESTAMP
columns, and since, as we saw, Python will silently drop UTC offsets, I replaced Python's TIMESTAMP
converter with one that simply returns the bytes object unchanged:
sqlite3.register_converter("TIMESTAMP", lambda b: b)
Unfortunately, this converter resulted in the new TIMESTAMP
columns having BLOB
affinity instead of TEXT
. This was a problem, because some SQL operations are sensitive to the affinities of their operands. One of them is LIKE
, which does not work on blob values:
sqlite> SELECT 'a' LIKE 'a';
1
sqlite> SELECT X'61'; -- 0x61 is the hexadecimal value of ASCII 'a'
a
sqlite> SELECT X'61' LIKE 'a';
0
Consequently, the query SELECT * FROM table WHERE date LIKE '2019%'
did not return any of the inserted rows because they all had BLOB
affinity and the LIKE
comparison always failed. Only when I ran SELECT typeof(date) FROM table
did I discover that some of the values in the same column had different affinities.
The correct procedure would have been to register the converter as lambda b: b.decode()
so that Python would insert string values with TEXT
affinity.3
Because I use SQLite in Python so often, I wrote my own library, isqlite, that handles most of these issues for me, and also provides a more convenient Python API and many other useful features. You can read about isqlite in next week's blog post. ∎
The sqlite3
docs use the term "Data Modification Language", but it appears that "data manipulation language" is the standard term. ↩
datetime.fromisoformat
was added in Python 3.7, so if you are using an older version of Python you will have to write the converter function yourself. You can take a look at how the sqlite3
module implements the naive datetime converter, and adapt it to also read the UTC offset if present. Or you can copy the implementation of datetime.fromisoformat
. ↩
You might reasonably wonder why I had enabled converters in the first place if I knew that they were not going to work for my TIMESTAMP
columns. In this case, I was using a library that wrapped sqlite3.connect
and enabled converters for me. ↩