Database Design
The foundation of everything. Before you learn partitioning or replication, you must know how to design a database properly — entities, keys, relationships, and normalization.
4 sub-sections • click each to expand
Data Modeling
Entities, attributes, relationships & cardinality
Entities
An entity is a real-world thing your database needs to track. Think of it as a noun — a person, a product, an order.
Attributes
Attributes are the properties or columns of an entity. There are three kinds:
NOT NULL. Example: email for a userRelationships
Relationships describe how entities connect to each other. There are three types:
Cardinality
Cardinality defines how many instances of one entity relate to another. It also tells you whether a relationship is mandatory or optional.
| Type | Notation | Example | DB Implementation |
|---|---|---|---|
| Mandatory 1:1 | ||—|| | Employee must have exactly 1 contract | NOT NULL + UNIQUE FK |
| Optional 1:1 | |o—|| | User may have a profile (optional) | UNIQUE FK (nullable) |
| Mandatory 1:N | ||—< | Order must belong to a user | NOT NULL FK |
| Optional 1:N | |o—< | Category may have 0 products | Nullable FK |
| M:N | >—< | Products ↔ Tags | Junction table |
Keys
Primary, foreign, candidate, alternate & composite keys
Primary Keys
A primary key uniquely identifies every row. There are several strategies:
Foreign Keys
A foreign key enforces referential integrity — it ensures a child row can't exist without a valid parent.
is_deleted = true) over physical deletes in production.Candidate, Alternate & Composite Keys
| Key Type | Definition | Example |
|---|---|---|
| Candidate Key | Any minimal set of columns that uniquely identifies a row | email, passport_num, user_id |
| Primary Key | The chosen candidate key — only one per table | user_id |
| Alternate Key | Candidate keys not chosen as PK (enforced via UNIQUE) | |
| Composite Key | PK or UNIQUE constraint spanning multiple columns | (order_id, product_id) |
Normalization
1NF → 2NF → 3NF → BCNF → 4NF → 5NF
Normalization is the process of organizing tables to eliminate redundancy and ensure data integrity. Each normal form adds stricter rules. Click each form below to explore it:
1NF — First Normal Form: Atomic Values & No Repeating Groups
Every column must contain a single, indivisible value. No lists, no arrays as column values, no duplicate rows.
2NF — Second Normal Form: No Partial Dependencies
Applies when you have a composite primary key. Every non-key column must depend on the entire composite key, not just part of it.
3NF — Third Normal Form: No Transitive Dependencies
No non-key column should depend on another non-key column. All columns must depend directly on the primary key.
BCNF — Boyce-Codd Normal Form: Strict 3NF
Every determinant (anything that determines another column's value) must be a candidate key. Catches edge cases that 3NF misses when there are multiple overlapping candidate keys.
4NF — Fourth Normal Form: No Multi-Valued Dependencies
A table must not have more than one independent multi-valued fact about an entity.
5NF — Fifth Normal Form: No Join Dependencies
A table must not be decomposable into smaller tables without losing information, unless the join dependencies are implied by the candidate keys. This is rare in practice.
Denormalization
Controlled redundancy for read performance
Read Optimization Patterns
There are three common denormalization patterns used in production:
When to Normalize vs Denormalize
| Normalized (3NF) | Denormalized | |
|---|---|---|
| Read speed | Slower (JOINs needed) | Faster (pre-computed) |
| Write speed | Faster (one place) | Slower (update multiple) |
| Storage | Less | More |
| Consistency | Automatic | Manual maintenance |
| Best for | OLTP (transactions) | Reporting, dashboards |