2.1

OLTP Principles

What makes a database "OLTP" and why it matters

💡
OLTP vs OLAP in one sentence: OLTP is many users doing small, fast operations (place an order, update a cart). OLAP is a few analysts doing massive aggregations (total revenue by region this year). Most apps you build are OLTP.

The 5 Core OLTP Characteristics

High Transaction Volume
Thousands to millions of queries per second. Every row matters, every millisecond counts.
Small Queries
Queries touch a handful of rows — not millions. WHERE user_id = 42, not GROUP BY country.
Fast Inserts
New orders, messages, events — must write in under 5ms. Avoid heavy triggers on write paths.
Fast Updates
Update inventory, change order status, increment counters — single-row updates, always indexed.
ACID Transactions
Atomicity, Consistency, Isolation, Durability. Money never disappears; orders never half-create.

OLTP vs OLAP — Side by Side

PropertyOLTPOLAP
Query typePoint lookups, small updatesAggregations, full scans
Data volume per queryFew rowsMillions of rows
Concurrent usersThousandsFew analysts
Schema styleNormalized (3NF)Denormalized (star schema)
Primary concernLatency, throughputQuery speed on big data
ExamplesPostgreSQL, MySQLBigQuery, Redshift, Snowflake
2.2

Designing OLTP Systems

The 7-step process used by senior engineers

Follow these 7 steps every time you design a new OLTP system. We'll use an e-commerce app as our running example throughout.

Step 1
Entities
Step 2
Relationships
Step 3
Normalize
Step 4
Transactions
Step 5
Indexes
Step 6
Audit Cols
Step 7
Soft Deletes
📋
Step 1: Identify Business Entities — List every noun in the system. For e-commerce, that's users, products, orders, payments, etc.
User
Address
Product
Category
Inventory
Cart
Order
OrderItem
Payment
-- E-Commerce Entity List -- Ask: "What are the things (nouns) in this domain?" -- User management users → people who buy addresses → where to ship (user has many addresses) -- Catalog categories → Electronics, Clothing, Books… products → items for sale inventory → stock levels per product (maybe per warehouse) -- Shopping carts → temporary basket before checkout cart_items → products in the cart -- Fulfillment orders → confirmed purchase intent order_items → individual line items in an order payments → money transaction records
🔗
Step 2: Define Relationships — Map how entities connect. Draw the chain from user to money.
-- Relationship tree (read top to bottom): User (1) ├── has many → Addresses (N) ├── has many → Orders (N) │ │ │ └── has many → OrderItems (N) │ │ │ └── belongs to → Product (1) │ │ │ └── has one → Inventory └── has many → Payments (N) │ └── belongs to → Order (1) -- Key relationships: -- users 1:N orders -- orders 1:N order_items -- products 1:N order_items (product appears in many orders) -- orders 1:1 payments (each order has one payment record) -- products M:N categories (via product_categories junction)
✂️
Step 3: Normalize to 3NF — Split data so each fact lives in exactly one place. Remove all redundancy.
-- ❌ BAD: storing customer info directly on the order orders_bad ┌──────────┬───────────────┬───────────────────┬──────────────┐ │ order_id │ customer_name │ customer_email │ customer_city│ └──────────┴───────────────┴───────────────────┴──────────────┘ -- Problem: if email changes, every order row must be updated! -- Problem: city depends on user, not the order (transitive dep) -- ✅ GOOD: normalize — each fact lives in one table CREATE TABLE users ( user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, email TEXT UNIQUE NOT NULL, first_name TEXT NOT NULL, last_name TEXT NOT NULL ); CREATE TABLE orders ( order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, user_id BIGINT NOT NULL REFERENCES users(user_id), status TEXT NOT NULL DEFAULT 'pending', total NUMERIC(12,2) NOT NULL, created_at TIMESTAMPTZ DEFAULT NOW() ); -- Customer info is fetched via JOIN — never duplicated
🔒
Step 4: Define Transaction Boundaries — Group related writes into a single atomic transaction so they all succeed or all fail together.

Example: Placing an order involves 4 operations. All must succeed or none should commit:

1
Create Order — insert into orders table with status = 'pending'
2
Create Order Items — insert each product line into order_items
3
Reduce Inventory — decrement stock in inventory for each product
4
Create Payment Record — insert into payments with status = 'pending'
-- All 4 operations in a SINGLE transaction BEGIN; -- Step 1: Create the order INSERT INTO orders (user_id, status, total) VALUES (42, 'pending', 350.00) RETURNING order_id; -- capture the new order_id -- Step 2: Add order items INSERT INTO order_items (order_id, product_id, quantity, price) VALUES (1001, 5, 2, 100.00), (1001, 9, 1, 150.00); -- Step 3: Reduce inventory (with a check!) UPDATE inventory SET stock = stock - 2 WHERE product_id = 5 AND stock >= 2; -- prevents negative stock -- Step 4: Create payment record INSERT INTO payments (order_id, amount, status, method) VALUES (1001, 350.00, 'pending', 'card'); COMMIT; -- all or nothing! -- If anything fails above, run ROLLBACK instead: -- ROLLBACK; → undoes ALL 4 operations atomically
🚨
Never split related writes across multiple requests. If your server crashes between step 2 and step 3, you'd have an order with items but no inventory deduction. Always wrap in a transaction.
Step 5: Index Strategy — Add indexes on columns you filter or join on. But don't over-index — every index slows down writes.
-- ✅ Index columns you search by, filter on, or JOIN with -- users: look up by email (login) CREATE UNIQUE INDEX idx_users_email ON users (email); -- orders: look up all orders for a user (very common) CREATE INDEX idx_orders_user_id ON orders (user_id); -- orders: look up recent orders (dashboard, listing) CREATE INDEX idx_orders_created_at ON orders (created_at DESC); -- order_items: look up items for a given order (always needed) CREATE INDEX idx_order_items_order_id ON order_items (order_id); -- payments: look up payment for an order CREATE INDEX idx_payments_order_id ON payments (order_id); -- ❌ DON'T index columns you never filter on: -- first_name, last_name, description, notes -- ❌ DON'T index low-cardinality columns: -- status (only 'pending','paid','cancelled') → not selective enough -- ✅ EXCEPTION: partial index for low-cardinality but important subset CREATE INDEX idx_orders_pending ON orders (created_at) WHERE status = 'pending'; -- only index pending orders
🕒
Step 6: Audit Columns — Every OLTP table should track who created/changed a row and when. This is essential for debugging, compliance, and data lineage.
-- Standard audit columns — add to EVERY table CREATE TABLE orders ( order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, user_id BIGINT NOT NULL REFERENCES users(user_id), total NUMERIC(12,2), status TEXT DEFAULT 'pending', -- ✅ Audit columns created_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), updated_at TIMESTAMPTZ NOT NULL DEFAULT NOW(), created_by BIGINT REFERENCES users(user_id), -- who placed it updated_by BIGINT REFERENCES users(user_id) -- who last touched it ); -- Auto-update updated_at using a trigger: CREATE OR REPLACE FUNCTION set_updated_at() RETURNS TRIGGER AS $$ BEGIN NEW.updated_at = NOW(); RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER trg_orders_updated_at BEFORE UPDATE ON orders FOR EACH ROW EXECUTE FUNCTION set_updated_at(); -- Now every UPDATE automatically sets updated_at = NOW()
Pro tip: Create a reusable trigger function once and attach it to every table. Some teams use a base migration that adds these columns automatically to all new tables.
🗑️
Step 7: Soft Deletes — Don't physically delete rows in OLTP systems. Mark them as deleted instead. This preserves history, enables undo, and prevents FK violation errors.
-- Add soft delete columns to every table ALTER TABLE users ADD COLUMN is_deleted BOOLEAN NOT NULL DEFAULT FALSE, ADD COLUMN deleted_at TIMESTAMPTZ, ADD COLUMN deleted_by BIGINT REFERENCES users(user_id); -- ✅ Soft delete: mark as deleted, don't remove UPDATE users SET is_deleted = TRUE, deleted_at = NOW(), deleted_by = 1 -- the admin who deleted WHERE user_id = 42; -- ❌ NEVER do this in OLTP: -- DELETE FROM users WHERE user_id = 42; -- This would cascade-delete all their orders, payments, etc. -- Filter out deleted records in every query: SELECT * FROM users WHERE is_deleted = FALSE; -- ✅ Better: create a VIEW that hides deleted rows CREATE VIEW active_users AS SELECT * FROM users WHERE is_deleted = FALSE; -- App queries active_users instead of users directly SELECT * FROM active_users WHERE email = 'alice@example.com'; -- ✅ Partial index for performance (only index non-deleted rows) CREATE INDEX idx_users_email_active ON users (email) WHERE is_deleted = FALSE;
2.3

Common OLTP Patterns

Real schemas used at scale: Auth, E-Commerce, Ride Sharing, AI SaaS

Click any pattern to see its full production schema with SQL:

🔐 User Management
Auth, roles, permissions
users roles permissions user_roles
🛒 E-Commerce
Orders, products, payments
products orders order_items payments
🚗 Ride Sharing
Drivers, trips, locations
drivers riders trips locations
🤖 AI SaaS
Conversations, embeddings, RAG
conversations messages documents embeddings
🔐
User Management Pattern — RBAC (Role-Based Access Control) used by virtually every production app. Users have roles, roles have permissions.
users
user_idPK BIGINT
emailUQ TEXT
password_hashTEXT
is_activeBOOL
created_atTIMESTAMPTZ
roles
role_idPK BIGINT
nameUQ TEXT
descriptionTEXT
permissions
permission_idPK BIGINT
nameUQ TEXT
resourceTEXT
actionTEXT
user_roles
user_idFK BIGINT
role_idFK BIGINT
granted_atTIMESTAMPTZ
granted_byFK BIGINT
role_permissions
role_idFK BIGINT
permission_idFK BIGINT
-- Full RBAC schema CREATE TABLE users ( user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, email TEXT UNIQUE NOT NULL, password_hash TEXT NOT NULL, is_active BOOLEAN NOT NULL DEFAULT TRUE, is_deleted BOOLEAN NOT NULL DEFAULT FALSE, created_at TIMESTAMPTZ DEFAULT NOW(), updated_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE roles ( role_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name TEXT UNIQUE NOT NULL, -- 'admin', 'editor', 'viewer' description TEXT ); CREATE TABLE permissions ( permission_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name TEXT UNIQUE NOT NULL, -- 'orders:read', 'users:delete' resource TEXT NOT NULL, -- 'orders', 'users', 'products' action TEXT NOT NULL -- 'read', 'write', 'delete' ); -- Junction: users ↔ roles (M:N) CREATE TABLE user_roles ( user_id BIGINT REFERENCES users(user_id) ON DELETE CASCADE, role_id BIGINT REFERENCES roles(role_id) ON DELETE CASCADE, granted_at TIMESTAMPTZ DEFAULT NOW(), granted_by BIGINT REFERENCES users(user_id), PRIMARY KEY (user_id, role_id) ); -- Junction: roles ↔ permissions (M:N) CREATE TABLE role_permissions ( role_id BIGINT REFERENCES roles(role_id) ON DELETE CASCADE, permission_id BIGINT REFERENCES permissions(permission_id) ON DELETE CASCADE, PRIMARY KEY (role_id, permission_id) ); -- Check if a user has a specific permission: SELECT COUNT(*) > 0 AS has_permission FROM user_roles ur JOIN role_permissions rp ON ur.role_id = rp.role_id JOIN permissions p ON rp.permission_id = p.permission_id WHERE ur.user_id = 42 AND p.name = 'orders:read';
🛒
E-Commerce Pattern — The canonical OLTP schema. Used by Flipkart, Amazon-style apps. Notice how inventory is separate from products to allow per-warehouse tracking.
products
product_idPK
nameTEXT
priceNUMERIC
skuUQ
inventory
product_idFK
stockINT
reservedINT
orders
order_idPK
user_idFK
statusTEXT
totalNUMERIC
order_items
order_idFK
product_idFK
quantityINT
priceNUMERIC
payments
payment_idPK
order_idFK
amountNUMERIC
statusTEXT
gateway_refTEXT
-- Complete E-Commerce OLTP Schema CREATE TABLE categories ( category_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name TEXT NOT NULL, parent_id BIGINT REFERENCES categories(category_id) -- self-ref for hierarchy ); CREATE TABLE products ( product_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name TEXT NOT NULL, description TEXT, sku TEXT UNIQUE NOT NULL, -- Stock Keeping Unit price NUMERIC(10,2) NOT NULL, is_active BOOLEAN NOT NULL DEFAULT TRUE, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE inventory ( product_id BIGINT PRIMARY KEY REFERENCES products(product_id), stock INT NOT NULL DEFAULT 0 CHECK (stock >= 0), reserved INT NOT NULL DEFAULT 0, -- in-progress orders updated_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE orders ( order_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, user_id BIGINT NOT NULL REFERENCES users(user_id), shipping_address JSONB, -- snapshot of address at order time status TEXT NOT NULL DEFAULT 'pending', total NUMERIC(12,2) NOT NULL, created_at TIMESTAMPTZ DEFAULT NOW(), updated_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE order_items ( order_id BIGINT NOT NULL REFERENCES orders(order_id), product_id BIGINT NOT NULL REFERENCES products(product_id), quantity INT NOT NULL CHECK (quantity > 0), price NUMERIC(10,2) NOT NULL, -- price at time of order (snapshot) PRIMARY KEY (order_id, product_id) ); CREATE TABLE payments ( payment_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, order_id BIGINT UNIQUE NOT NULL REFERENCES orders(order_id), amount NUMERIC(12,2) NOT NULL, status TEXT NOT NULL DEFAULT 'pending', method TEXT, -- 'upi', 'card', 'netbanking', 'cod' gateway_ref TEXT, -- Razorpay/Stripe transaction ID paid_at TIMESTAMPTZ, created_at TIMESTAMPTZ DEFAULT NOW() );
🚗
Ride Sharing Pattern — Used by Uber/Ola. Key challenge: real-time location tracking and matching drivers to riders. Location history is append-only (never updated).
drivers
driver_idPK
user_idFK
vehicle_typeTEXT
is_availableBOOL
ratingNUMERIC
trips
trip_idPK
driver_idFK
rider_idFK
statusTEXT
fareNUMERIC
locations
location_idPK
driver_idFK
lat / lngNUMERIC
recorded_atTIMESTAMPTZ
payments
payment_idPK
trip_idFK
amountNUMERIC
methodTEXT
-- Ride Sharing OLTP Schema CREATE TABLE drivers ( driver_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, user_id BIGINT UNIQUE NOT NULL REFERENCES users(user_id), vehicle_type TEXT NOT NULL, -- 'auto', 'bike', 'mini', 'sedan' license_num TEXT UNIQUE NOT NULL, is_available BOOLEAN NOT NULL DEFAULT FALSE, rating NUMERIC(3,2) DEFAULT 5.0, -- running average created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE trips ( trip_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, driver_id BIGINT NOT NULL REFERENCES drivers(driver_id), rider_id BIGINT NOT NULL REFERENCES users(user_id), status TEXT NOT NULL DEFAULT 'requested', -- status: requested → accepted → in_progress → completed / cancelled pickup_lat NUMERIC(9,6), pickup_lng NUMERIC(9,6), dropoff_lat NUMERIC(9,6), dropoff_lng NUMERIC(9,6), distance_km NUMERIC(8,2), fare NUMERIC(8,2), started_at TIMESTAMPTZ, completed_at TIMESTAMPTZ, created_at TIMESTAMPTZ DEFAULT NOW() ); -- Location pings from driver app (append-only, NEVER update) CREATE TABLE driver_locations ( location_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, driver_id BIGINT NOT NULL REFERENCES drivers(driver_id), lat NUMERIC(9,6) NOT NULL, lng NUMERIC(9,6) NOT NULL, recorded_at TIMESTAMPTZ NOT NULL DEFAULT NOW() ); -- This table grows FAST. Partition by day/week (covered in Topic 11) -- Get latest driver location: SELECT DISTINCT ON (driver_id) driver_id, lat, lng, recorded_at FROM driver_locations WHERE driver_id = 7 ORDER BY driver_id, recorded_at DESC;
🤖
AI SaaS Pattern — Used for apps like ChatGPT, Notion AI, or your own FastAPI + AI agent backend. Multi-tenant with conversation history and vector embeddings for RAG.
organizations
org_idPK
nameTEXT
planTEXT
conversations
conv_idPK
user_idFK
titleTEXT
modelTEXT
messages
msg_idPK
conv_idFK
roleTEXT
contentTEXT
tokensINT
documents
doc_idPK
org_idFK
filenameTEXT
contentTEXT
embeddings
embed_idPK
doc_idFK
chunk_textTEXT
vectorVECTOR
-- AI SaaS Multi-Tenant Schema -- Requires: pgvector extension for embeddings CREATE EXTENSION IF NOT EXISTS vector; CREATE TABLE organizations ( org_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, name TEXT NOT NULL, plan TEXT NOT NULL DEFAULT 'free', -- 'free','pro','enterprise' created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE users ( user_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, org_id BIGINT NOT NULL REFERENCES organizations(org_id), email TEXT UNIQUE NOT NULL, created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE conversations ( conv_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), user_id BIGINT NOT NULL REFERENCES users(user_id), org_id BIGINT NOT NULL REFERENCES organizations(org_id), title TEXT, model TEXT NOT NULL DEFAULT 'gpt-4', created_at TIMESTAMPTZ DEFAULT NOW(), updated_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE messages ( msg_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, conv_id UUID NOT NULL REFERENCES conversations(conv_id), role TEXT NOT NULL, -- 'user', 'assistant', 'system' content TEXT NOT NULL, tokens INT, -- track usage for billing created_at TIMESTAMPTZ DEFAULT NOW() ); CREATE TABLE documents ( doc_id UUID PRIMARY KEY DEFAULT gen_random_uuid(), org_id BIGINT NOT NULL REFERENCES organizations(org_id), filename TEXT NOT NULL, mime_type TEXT, content TEXT, created_at TIMESTAMPTZ DEFAULT NOW() ); -- RAG: store chunked text + vector embedding CREATE TABLE embeddings ( embed_id BIGINT GENERATED ALWAYS AS IDENTITY PRIMARY KEY, doc_id UUID NOT NULL REFERENCES documents(doc_id), chunk_index INT NOT NULL, chunk_text TEXT NOT NULL, vector VECTOR(1536) NOT NULL, -- 1536 dims for OpenAI ada-002 created_at TIMESTAMPTZ DEFAULT NOW() ); -- Semantic search: find most similar chunks to a query vector SELECT chunk_text, 1 - (vector <=> '[0.1, 0.2, ...]'::VECTOR) AS similarity FROM embeddings WHERE doc_id = '...' ORDER BY vector <=> '[0.1, 0.2, ...]'::VECTOR LIMIT 5;
⚠️
Multi-tenancy tip: Always store org_id on every table and add it to every query. Use Row Level Security (RLS, covered in Topic 17) to enforce tenant isolation automatically at the database level.

Topic 2: OLTP Database Design — all sections covered.

Ready for Topic 3: PostgreSQL Data Types?

🎉

Topic 2 approved! Reply "Next" in the chat to get Topic 3: PostgreSQL Data Types.