Row-Level Security with SQLAlchemy and Alembic: A Complete Guide
This guide covers RLS implementation as of PostgreSQL 15+. Earlier versions may have different syntax requirements and is out of the scope here.
Introduction
Row-Level Security (RLS) is a PostgreSQL feature (They popularized the term and has robust native support) that restricts access to specific rows in a table based on user roles or session context. When combined with SQLAlchemy ORM and Alembic migrations, RLS provides a robust solution for multi-tenant applications where data isolation is critical.
This guide covers implementation from database migrations through application-level integration, addressing common pitfalls and security considerations.
Design Patterns Within the Architecture
The implementation incorporates several well-known patterns:
- Context Variable Pattern – Using
contextvarsto thread tenant context through request lifecycle - Event Listener Pattern – SQLAlchemy events for automatic session configuration
- Migration Pattern – Treating RLS policies as versioned schema changes via Alembic
- Defense-in-Depth Pattern – Layering database security alongside application authorization
Implementation Methods
These are the tactical techniques used to realize the architecture:
- PostgreSQL session variables (
SET app.current_tenant) - Alembic
op.execute()for raw SQL in migrations - SQLAlchemy event hooks (
before_cursor_execute) - Role-based policy definitions in PostgreSQL
PostgreSQL RLS Fundamentals
RLS operates at the database level, meaning policies are enforced regardless of how queries reach the database. This provides defense-in-depth security that complements application-layer authorization.
Key Concepts:
- Policy: Defines conditions for row access using USING (for SELECT/UPDATE/DELETE) and WITH CHECK (for INSERT/UPDATE)
- Session Variables: Custom settings like app.current_tenant that policies reference
- Permissive vs. Restrictive: Permissive policies allow access if any policy matches; restrictive policies require all policies to match
Enabling RLS via Alembic Migrations
Since RLS configuration is database-level, it belongs in Alembic migrations rather than Python code.
Migration Template
"""enable_rls_policies
Revision ID: enable_rls_v1
Revises: previous_revision
Create Date: 2024-01-15 10:00:00.000000
"""
from alembic import op
def upgrade():
# Enable RLS on the table
op.execute("ALTER TABLE your_table_name ENABLE ROW LEVEL SECURITY")
# Drop existing policies if they exist (for idempotency)
op.execute("DROP POLICY IF EXISTS tenant_isolation ON your_table_name")
op.execute("DROP POLICY IF EXISTS tenant_isolation_insert ON your_table_name")
op.execute("DROP POLICY IF EXISTS tenant_isolation_update ON your_table_name")
# Create SELECT policy
op.execute("""
CREATE POLICY tenant_isolation ON your_table_name
FOR SELECT
USING (tenant_id = current_setting('app.current_tenant')::integer)
""")
# Create INSERT policy
op.execute("""
CREATE POLICY tenant_isolation_insert ON your_table_name
FOR INSERT
WITH CHECK (tenant_id = current_setting('app.current_tenant')::integer)
""")
# Create UPDATE policy
op.execute("""
CREATE POLICY tenant_isolation_update ON your_table_name
FOR UPDATE
USING (tenant_id = current_setting('app.current_tenant')::integer)
WITH CHECK (tenant_id = current_setting('app.current_tenant')::integer)
""")
# Create DELETE policy
op.execute("""
CREATE POLICY tenant_isolation_delete ON your_table_name
FOR DELETE
USING (tenant_id = current_setting('app.current_tenant')::integer)
""")
def downgrade():
op.execute("DROP POLICY IF EXISTS tenant_isolation ON your_table_name")
op.execute("DROP POLICY IF EXISTS tenant_isolation_insert ON your_table_name")
op.execute("DROP POLICY IF EXISTS tenant_isolation_update ON your_table_name")
op.execute("DROP POLICY IF EXISTS tenant_isolation_delete ON your_table_name")
op.execute("ALTER TABLE your_table_name DISABLE ROW LEVEL SECURITY")
SQLAlchemy Model Definitions
Here’s the corrected SQLAlchemy model structure with proper syntax:
from sqlalchemy import Column, Integer, String, ForeignKey, event
from sqlalchemy.orm import declarative_base, relationship, Session
from contextvars import ContextVar
Base = declarative_base()
# Thread-safe tenant context using contextvars
current_tenant_id: ContextVar[int | None] = ContextVar("current_tenant_id", default=None)
class Tenant(Base):
__tablename__ = 'tenants'
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
users = relationship('User', back_populates='tenant')
class User(Base):
__tablename__ = 'users'
id = Column(Integer, primary_key=True)
name = Column(String, nullable=False)
tenant_id = Column(Integer, ForeignKey('tenants.id'), nullable=False)
tenant = relationship('Tenant', back_populates='users')
class Document(Base):
__tablename__ = 'documents'
id = Column(Integer, primary_key=True)
title = Column(String, nullable=False)
content = Column(String)
tenant_id = Column(Integer, ForeignKey('tenants.id'), nullable=False)
created_by = Column(Integer, ForeignKey('users.id'))
Tenant Context in SQLAlchemy
The tenant context must be set per database session, ideally at the start of each request.
Option 1: Using SQLAlchemy Event Listeners
from sqlalchemy import event
@event.listens_for(Session, "before_cursor_execute")
def receive_before_cursor_execute(session, cursor, context, statement, parameters,
executemany, context):
tenant_id = current_tenant_id.get()
if tenant_id is not None:
cursor.execute(f"SET app.current_tenant = {tenant_id}")
Option 2: Using Middleware & FastAPI
A FastAPI example:
from fastapi import FastAPI, Depends, HTTPException
from sqlalchemy.orm import Session
from contextlib import asynccontextmanager
app = FastAPI()
@asynccontextmanager
async def tenant_context(tenant_id: int):
token = current_tenant_id.set(tenant_id)
try:
yield
finally:
current_tenant_id.reset(token)
@app.get("/documents")
async def list_documents(db: Session = Depends(get_db)):
tenant_id = current_tenant_id.get()
if tenant_id is None:
raise HTTPException(status_code=401, detail="Tenant context not set")
# RLS will automatically filter results
documents = db.query(Document).all()
return documents
Option 3: Using SQLAlchemy 2.0
from sqlalchemy import text
def set_tenant_context(session: Session, tenant_id: int):
# with Execution Options
session.execute(text(f"SET app.current_tenant = {tenant_id}"))
session.commit() # Required for SET commands
Security Considerations
Admin/Superuser Bypass
Administrators may need to bypass RLS for maintenance. PostgreSQL supports this through role-based policies:
CREATE POLICY admin_bypass ON your_table_name
FOR ALL
TO admin_role
USING (true)
WITH CHECK (true);
Or disable RLS temporarily (risky):
ALTER TABLE your_table_name DISABLE ROW LEVEL SECURITY;
-- Perform admin operations...
ALTER TABLE your_table_name ENABLE ROW LEVEL SECURITY;
Force Row Level Security
By default, RLS is not applied to table owners. To force RLS even for owners:
ALTER TABLE your_table_name FORCE ROW LEVEL SECURITY;
This is recommended for production multi-tenant systems.
Permissive vs Restrictive Policies
-- Permissive (default): Any matching policy grants access
CREATE POLICY policy_a ON table_name FOR SELECT USING (...);
CREATE POLICY policy_b ON table_name FOR SELECT USING (...);
-- Access granted if policy_a OR policy_b matches
-- Restrictive: All policies must match
-- Requires explicit policy definition with RESTRICTIVE keyword
Testing RLS Implementation
Verify RLS works correctly before deploying:
-- Test 1: Verify RLS is enabled
\d+ your_table_name
-- Look for "Row Security: yes"
-- Test 2: Switch to tenant user role
SET ROLE tenant_user;
SELECT * FROM your_table_name;
-- Should return only tenant's rows
-- Test 3: Attempt cross-tenant access
SET app.current_tenant = 999; -- Different tenant
SELECT * FROM your_table_name;
-- Should return empty result
-- Test 4: Verify admin bypass works
SET ROLE admin_role;
SELECT * FROM your_table_name;
-- Should return all rows
Performance Considerations
RLS adds overhead to every query. Optimize with:
- Index
tenant_idcolumns:CREATE INDEX idx_documents_tenant_id ON documents(tenant_id); - Use connection pooling: Each connection maintains its own session variables
- Monitor query plans:
EXPLAIN ANALYZE SELECT * FROM your_table_name; - Consider denormalization for frequently accessed tenant-scoped data
Common Pitfalls
| Issue | Solution |
|---|---|
| RLS not filtering results | Ensure ENABLE ROW LEVEL SECURITY is set |
| Admin users seeing all data | Use FORCE ROW LEVEL SECURITY |
| Session variable not persisting | Set it per connection, not per query |
| Migration fails on existing tables | Drop policies first with IF EXISTS |
| Performance degradation | Add indexes on tenant_id columns |
Key Findings & Takeaways
If you’re building modern multi-tenant architectures, these are the non-negotiable insights from our analysis:
- The “Defense-in-Depth” Imperative: RLS isn’t a replacement for application logic; it’s the final line of defense. Even if your API layer has a bug, the database will silently drop unauthorized rows. This centralizes security logic, reducing the surface area for human error in Python code.
- Alembic is Your Friend, Not Your Enemy: Many teams fear managing RLS policies via migrations. We prove that treating RLS policies as first-class schema objects (dropping/recreating them in upgrade() and downgrade()) ensures your security posture evolves exactly as your schema does. No more manual SQL scripts on production servers.
- Context is King and Queen: The magic of RLS relies on session variables (e.g.,
app.current_tenant). A thread-safe pattern using Python’scontextvarsand SQLAlchemy event listeners to automatically set this context at the start of every request, ensuring zero boilerplate in your business logic. - The Admin Bypass Paradox: A common failure point: locking out your own admins. We outline the specific PostgreSQL role configurations and FORCE ROW LEVEL SECURITY settings required to allow superusers to bypass policies for maintenance without disabling RLS entirely.
- Performance Reality: Check. Does RLS slow you down? Yes, slightly. But with proper indexing on tenant_id columns and the right query planning, the overhead is negligible compared to the cost of a data breach.
References
- PostgreSQL RLS Documentation
- SQLAlchemy ORM Documentation
- Alembic Migration Guide
- Github Discussion: Row Level Security / Column Level Security on SQLAlchemy Level
AI-generated images