Skip to content

@tank/relational-db-mastery

1.1.0
Skill

Description

Relational database performance for PostgreSQL and MySQL. Indexing (B-tree, GIN, GiST, BRIN, composite, partial, covering), query optimization (JOINs, CTEs, pagination, N+1), EXPLAIN ANALYZE, schema design, PostgreSQL tuning (VACUUM, pg_stat_statements), MySQL/InnoDB tuning, monitoring.

Triggered by

slow querydatabase performanceindexEXPLAIN ANALYZEquery planschema design
Download
Review Recommended
tank install @tank/relational-db-mastery

Relational Database Mastery

Core Philosophy

  1. Measure before optimizing — Run EXPLAIN ANALYZE before changing anything. Intuition about query performance is unreliable.
  2. Index for queries, not columns — Design indexes based on actual query patterns, not schema structure.
  3. Normalize first, denormalize with evidence — Start with 3NF. Denormalize only when measurements prove a JOIN is the bottleneck.
  4. The database is not a black box — Understand the planner's decisions. Read query plans. Monitor cache hit ratios.
  5. PostgreSQL and MySQL are different engines — Tuning that works for one may hurt the other. Know which you're targeting.

Quick-Start: Common Problems

"My query is slow"

  1. Get the query → Run EXPLAIN (ANALYZE, BUFFERS) on it
  2. Read the plan bottom-to-top → Find the node with highest actual time
  3. Is it a Seq Scan on a large table? → Add an index (see references/indexing-strategies.md)
  4. Are row estimates wildly wrong? → Run ANALYZE on the table
  5. Is it a disk-bound Sort? → Increase work_mem or add matching index → See references/explain-analyze.md for full plan reading guide

"Which index should I create?"

  1. What's the query pattern? Equality, range, full-text, JSONB, geometric?
  2. Apply the Index Type Selection table → B-tree, GIN, GiST, or BRIN
  3. Composite index? → Apply E-R-S rule: Equality columns → Range columns → Sort columns
  4. Large table, skewed data? → Use partial index
  5. Need index-only scans? → Use covering index with INCLUDE → See references/indexing-strategies.md

"Database is generally slow"

  1. Check cache hit ratio → Below 99%? Increase shared_buffers
  2. Check active connections → Near max? Add connection pooling
  3. Check dead tuple ratio → Above 20%? Tune autovacuum
  4. Check pg_stat_statements → Find top queries by total time
  5. Check for lock contention → Long-running transactions blocking others → See references/monitoring-diagnostics.md and references/postgresql-tuning.md

"Should I normalize or denormalize?"

  1. Is this a new schema? → Normalize to 3NF
  2. Are JOINs proven slow via EXPLAIN? → Consider denormalization
  3. Is it read-heavy with complex aggregations? → Materialized views
  4. Is it write-heavy with consistency needs? → Stay normalized → See references/schema-design.md

Decision Trees

Index Type Selection

Query PatternIndex TypeExample
Equality (=)B-treeWHERE status = 'active'
Range (<, >, BETWEEN)B-treeWHERE created_at > '2024-01-01'
Pattern match (LIKE 'prefix%')B-treeWHERE name LIKE 'John%'
Full-text searchGINWHERE to_tsvector('english', body) @@ query
JSONB containmentGINWHERE metadata @> '{"type":"admin"}'
Array containsGINWHERE tags @> ARRAY['python']
Geometric/spatialGiSTWHERE ST_DWithin(geom, point, 1000)
Range overlapGiSTWHERE tsrange @> now()
Large time-series tableBRINWHERE created_at > '2024-01-01' (billions of rows)

Query Pattern Selection

SituationUseAvoid
Direct table relationshipJOINCorrelated subquery
Existence checkEXISTS (SELECT 1 ...)IN (SELECT ...) on large sets
Multi-step transformationCTE (PG 12+ inlines)Nested subqueries
Top-N per groupLATERAL JOIN or ROW_NUMBER()Correlated subquery per row
Large dataset paginationKeyset/cursor paginationOFFSET on large offsets
Bulk data loadingCOPY (PG) / LOAD DATA (MySQL)Individual INSERT statements

PostgreSQL vs MySQL Tuning

ConcernPostgreSQLMySQL
Main memory settingshared_buffers (25% RAM)innodb_buffer_pool_size (70-80% RAM)
Dead row cleanupVACUUM / autovacuumInnoDB purge thread (automatic)
Connection overheadHeavy (process per conn) → use PgBouncerLight (thread per conn)
Slow query identificationpg_stat_statementsSlow query log + pt-query-digest
Online index creationCREATE INDEX CONCURRENTLYALTER TABLE ... ADD INDEX (online by default)
SSD optimizationrandom_page_cost = 1.1innodb_io_capacity = 2000+

Anti-Patterns Quick Reference

Anti-PatternProblemFix
No index on foreign keysSlow JOINs, slow CASCADE deletesAlways index FKs
SELECT * everywhereExtra I/O, blocks index-only scansSelect only needed columns
WHERE function(column)Kills index usageExpression index or rewrite query
OFFSET 10000 LIMIT 20Scans and discards 10000 rowsKeyset pagination
N+1 queries in ORM1 + N round trips to databaseEager loading or batch fetch
FLOAT for moneyRounding errorsNUMERIC(precision, scale)
UUID v4 primary keysRandom B-tree inserts, fragmentationUUIDv7 or BIGSERIAL + public UUID
Over-indexingWrite amplification, disk wasteAudit with pg_stat_user_indexes
Skipping ANALYZEStale statistics, bad plansRun after bulk data changes

Reference Files

FileContents
references/indexing-strategies.mdAll index types, selection decision tree, composite index column order (E-R-S rule), GIN/GiST/BRIN deep dives, anti-patterns, PG vs MySQL differences
references/query-optimization.mdJOIN optimization, subquery vs CTE vs JOIN, N+1 problem, pagination strategies, batch operations, window functions, query rewriting
references/explain-analyze.mdReading EXPLAIN output, scan types, join types, cost model, red flags, estimation errors, MySQL differences, optimization workflow
references/schema-design.mdNormalization/denormalization, data type selection (UUID vs serial, JSONB vs columns), partitioning, temporal patterns, schema anti-patterns
references/postgresql-tuning.mdMemory config, SSD settings, VACUUM/autovacuum, connection pooling (PgBouncer), lock management, pg_stat_statements, WAL tuning
references/mysql-tuning.mdInnoDB architecture, buffer pool, flush settings, slow query log, optimizer hints, character sets, replication, PG vs MySQL comparison
references/monitoring-diagnostics.mdKey metrics dashboard, diagnostic queries (PG + MySQL), cache hit ratio, index usage, table bloat, alerting thresholds, diagnostic workflow

Command Palette

Search packages, docs, and navigate Tank