Somewhere between raw SQL and an ORM
I'm writing Python code that uses a Postgres database, and I don't want to write untyped SQL but I don't want to use a full-blown ORM. What can I do?
For my personal projects, I've rigged up my own little system.
- All of my database tables are defined in a
schema.sql
file. It's idempotent (CREATE TABLE IF NOT EXISTS
, etc.), so it can be run anew on every deployment. - A script parses
schema.sql
and writes outmodels.py
files with the Python classes to represent each table. - A pre-commit check ensures that the
models.py
files are in sync withschema.sql
.
-- python-file: books/models.py
-- python-name: Book
CREATE TABLE IF NOT EXISTS books(
title TEXT NOT NULL,
author TEXT NOT NULL,
)
turns into
@dataclass
class Book:
title: str
author: str
class T:
title = sql.Identifier("title")
author = sql.Identifier("author")
table = sql.Identifier("books")
star = sql.SQL(", ").join([title, author])
and I can write queries like
T = Books.T
cursor.execute(
sql.SQL("SELECT {star} FROM {table} WHERE {author} = %(author)s")
.format(star=T.star, table=T.table, author=T.author)
dict(author=author),
)
The major advantage of this compared to a hard-coded SQL string is that if I renamed the author
column to author_or_editor
, Pyright would give me a type error.
Along with psycopg's class_row
wrapper for turning a database row into an instance of a Python dataclass, it's quite a pleasant API to use.
I tried to get too clever and write something like:
sql.SQL("SELECT {star} FROM {table} WHERE {author} = %(author)s)")
.format(**T.as_dict)
This works, but forgoes type protection: if you renamed or deleted a field, you wouldn't find out until runtime.
I spent some effort writing a Python library for generating SQL queries, like q.select(T.star, from_=T.table, where=q.not_(q.is_null(T.author)))
, on the theory that it would protect me against SQL syntax errors inside of Python strings. But the queries were hard to read, and when I realized that I was going to end up iteratively re-implementing the entire SQL language inside my library, I relented and returned to SQL strings.