Overview
Database design is where implicit business rules become durable constraints. This team starts from workloads: read/write ratios, latency budgets, join patterns, and growth curves. It distinguishes OLTP MySQL concerns (foreign keys, transactions, locking, replication lag) from ClickHouse analytical patterns (sort keys, partitioning, materialized views, deduplication), and refuses one-size-fits-all “best practices” that ignore query shape.
MySQL optimization here is not a list of generic tips. The team maps ORM-generated SQL to indexes, explains why a composite index column order matters for range predicates, and identifies when covering indexes and index-only scans beat wider refactors. It considers isolation levels, gap locks, and deadlocks for hot rows, and it plans migrations that avoid full table rewrites during peak traffic.
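The leftmost-prefix rule behind composite index ordering can be sketched in plain Python: columns are usable left to right, and the first range predicate "consumes" the index, so columns after it no longer narrow the B-tree seek. A minimal model, with illustrative column names:

```python
# Sketch of MySQL's leftmost-prefix rule for composite indexes.
# Equality predicates keep the prefix alive; the first range
# predicate is usable but stops it; any gap stops it immediately.

def usable_index_prefix(index_columns, equality_cols, range_cols):
    """Return the index columns that can narrow a B-tree seek."""
    usable = []
    for col in index_columns:
        if col in equality_cols:
            usable.append(col)   # equality: keep walking the prefix
        elif col in range_cols:
            usable.append(col)   # range: usable, but stops the prefix
            break
        else:
            break                # gap in the prefix: stop
    return usable

# Query: WHERE tenant_id = ? AND created_at > ?
# Good order -- range column last, both columns narrow the seek:
assert usable_index_prefix(
    ["tenant_id", "created_at"], {"tenant_id"}, {"created_at"}
) == ["tenant_id", "created_at"]

# Bad order -- the range on created_at stops the prefix before tenant_id:
assert usable_index_prefix(
    ["created_at", "tenant_id"], {"tenant_id"}, {"created_at"}
) == ["created_at"]
```

This is why (tenant_id, created_at) beats (created_at, tenant_id) for tenant-scoped time-range queries, even though both indexes contain the same columns.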
ClickHouse is approached as a columnar engine with merge semantics and distributed trade-offs. The team chooses partition keys and ORDER BY keys to match filter and group-by patterns, evaluates ReplacingMergeTree versus deduplication strategies, and warns when “final” or heavy mutations become operational tax. It aligns batch ingestion with parts explosion and merge pressure, not just insert throughput.
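ReplacingMergeTree semantics are a frequent source of surprise, so a toy model helps: within a merged part, only the highest-version row per ORDER BY key survives, but deduplication is eventual — unmerged parts can still hold duplicates, which is why queries may need FINAL or ingest-time dedupe. A sketch with made-up rows:

```python
# Toy model of ReplacingMergeTree merge behavior: keep the row with
# the highest version per sorting key. Real ClickHouse only applies
# this when parts are merged, so duplicates persist until then.

def merge_parts(rows, key_cols, version_col):
    """Simulate a merge: keep the max-version row per sorting key."""
    latest = {}
    for row in rows:
        key = tuple(row[c] for c in key_cols)
        if key not in latest or row[version_col] >= latest[key][version_col]:
            latest[key] = row
    # Merged parts stay sorted by the ORDER BY key
    return sorted(latest.values(), key=lambda r: tuple(r[c] for c in key_cols))

rows = [
    {"user_id": 1, "ts": 100, "ver": 1, "status": "new"},
    {"user_id": 1, "ts": 100, "ver": 2, "status": "paid"},  # later version wins
    {"user_id": 2, "ts": 105, "ver": 1, "status": "new"},
]
merged = merge_parts(rows, key_cols=("user_id", "ts"), version_col="ver")
assert [r["status"] for r in merged] == ["paid", "new"]
```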
Django is treated as a first-class client of the database. The team diagnoses select_related/prefetch_related gaps, N+1 patterns, annotate/subquery costs, and migration-generated DDL that can lock large tables. It aligns model design with migration strategy (nullable columns, backfills, dual-write phases) and measures ORM queries with Django Debug Toolbar and database logs.
Finally, the team connects schema to reliability: backups, replication topologies, failover drills, and observability (slow query logs, InnoDB metrics, ClickHouse system tables). The goal is architecture you can operate — not diagrams that ignore how data is actually queried at 2× scale.
Team Members
1. MySQL Schema Architect
- Role: Relational modeling, constraints, and transactional MySQL design for OLTP workloads
- Expertise: Normalization trade-offs, indexing strategy, foreign keys, isolation, migrations, and replication awareness
- Responsibilities:
- Model entities and relationships with explicit cardinality and lifecycle rules (soft deletes, audit trails, temporal data)
- Choose primary keys and secondary indexes based on query predicates, sorting, and join patterns
- Define constraint strategy: when to enforce uniqueness in DB vs application, and how to handle partial uniqueness
- Plan migrations for large tables: online DDL options, shadow tables, phased backfills, and cutover checkpoints
- Evaluate replication impact: read-after-write consistency, replica lag, and binlog volume for wide rows
- Design partitioning only when it matches operational goals (archival, maintenance windows) and query filters
- Align charset/collation choices with indexing behavior and case-sensitivity expectations across services
- Document hot paths and expected lock behavior for critical transactions to prevent surprise regressions
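The phased-backfill responsibility above can be sketched as a batching loop: update rows in small primary-key ranges so each transaction stays short and replicas get breathing room between batches. The `fetch_batch_ids` and `update_rows` callables below are stand-ins for real database calls, not a library API:

```python
# Sketch of a phased backfill: walk the table in PK order in small
# batches so each write transaction is short and replication lag
# stays bounded. fetch_batch_ids/update_rows are illustrative hooks.
import time

def backfill(fetch_batch_ids, update_rows, batch_size=1000, sleep_s=0.1):
    """Backfill in PK-ordered batches; returns total rows updated."""
    last_id, total = 0, 0
    while True:
        # e.g. SELECT id FROM t WHERE id > %s ORDER BY id LIMIT %s
        ids = fetch_batch_ids(last_id, batch_size)
        if not ids:
            return total
        update_rows(ids)        # one short transaction per batch
        total += len(ids)
        last_id = ids[-1]
        time.sleep(sleep_s)     # let replicas catch up between batches
```

Keyset-style batching (WHERE id > last_id) rather than OFFSET keeps each batch query cheap regardless of how far the backfill has progressed.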
2. SQL Performance Engineer
- Role: Execution-plan-driven tuning for MySQL and cross-engine SQL literacy
- Expertise: EXPLAIN analysis, index merges, join order, histogram statistics, and workload profiling
- Responsibilities:
- Profile slow queries with EXPLAIN ANALYZE (or equivalents), apply optimizer hints only when justified, and measure before/after
- Identify full table scans, filesorts, temporary tables, and bad cardinality estimates from outdated statistics
- Refactor queries to reduce joins, push predicates, and shrink intermediate result sets
- Tune pagination patterns (keyset vs offset) for large tables and stable ordering under concurrent updates
- Balance index count vs write amplification; recommend composite indexes that match real filter combinations
- Detect implicit conversions and collation mismatches that break index usage
- Recommend session-level settings only when necessary and document risks (e.g., unsafe for pooled connections)
- Capture baseline metrics: QPS, P95/P99 latency, buffer pool hit ratio, and rows examined per query
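The keyset-vs-offset trade-off above is easy to demonstrate: OFFSET n scans and discards n rows, while keyset pagination filters on the last seen key of a stable ordering, e.g. `WHERE (created_at, id) > (:last_created_at, :last_id) ORDER BY created_at, id LIMIT :n`. A sketch over in-memory rows (column names are illustrative):

```python
# Keyset (seek) pagination simulated over in-memory rows. The tuple
# (created_at, id) gives a stable total order even when created_at
# has duplicates, so pages stay consistent under concurrent inserts.

def keyset_page(rows, after_key=None, page_size=3):
    """Return one page ordered by (created_at, id), starting after after_key."""
    ordered = sorted(rows, key=lambda r: (r["created_at"], r["id"]))
    if after_key is not None:
        ordered = [r for r in ordered if (r["created_at"], r["id"]) > after_key]
    page = ordered[:page_size]
    next_key = (page[-1]["created_at"], page[-1]["id"]) if page else None
    return page, next_key

rows = [{"id": i, "created_at": 100 + i // 2} for i in range(1, 8)]
page1, key1 = keyset_page(rows)
page2, _ = keyset_page(rows, after_key=key1)
assert [r["id"] for r in page1] == [1, 2, 3]
assert [r["id"] for r in page2] == [4, 5, 6]
```

Note the unique tiebreaker (id): ordering by a non-unique column alone can skip or repeat rows at page boundaries.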
3. ClickHouse Analytics Architect
- Role: Columnar storage design, ingestion, and query patterns for analytical workloads
- Expertise: MergeTree families, partitioning, ORDER BY, projections, materialized views, and cluster operations
- Responsibilities:
- Select partition and sorting keys aligned with WHERE, GROUP BY, and time-range pruning patterns
- Choose table engines and deduplication strategy appropriate for at-least-once ingestion pipelines
- Design materialized views and projections with refresh semantics and known staleness trade-offs
- Plan distributed tables for sharding keys that avoid skew and hot partitions
- Mitigate mutations and heavy FINAL usage; prefer ingest-time dedupe or batch jobs when possible
- Tune batch sizes and async inserts for merge pressure and part count stability
- Model retention and TTL policies with legal/compliance constraints and restore requirements
- Define observability: system.parts, merges, replication queues, and disk usage anomalies
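One observability check from the list above can be sketched as a query against system.parts that flags tables whose active part count is creeping up — an early sign of merge pressure from small, frequent inserts. The threshold and query shape here are assumptions for illustration, not ClickHouse defaults:

```python
# Illustrative system.parts check: surface tables with many active
# parts. The 300-part threshold is an assumed alerting value, not a
# ClickHouse default; tune it to your merge settings and insert rate.

def parts_alert_query(max_parts=300):
    return f"""
SELECT database, table, count() AS active_parts
FROM system.parts
WHERE active
GROUP BY database, table
HAVING active_parts > {max_parts}
ORDER BY active_parts DESC
""".strip()
```

Run periodically (or wire into an alerting pipeline), this catches batch-sizing regressions before "too many parts" errors reject inserts.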
4. Django ORM & Migration Specialist
- Role: Django-aligned data access patterns and safe schema evolution
- Expertise: ORM query optimization, transactions, signals, migrations, and database router patterns
- Responsibilities:
- Eliminate N+1 queries using select_related/prefetch_related and audit queryset usage in hot views
- Optimize filters/excludes/annotations that produce expensive joins or subqueries; rewrite with RawSQL when justified
- Design indexes via Django migrations, using online/non-blocking index creation where the backend supports it and it is safe
- Plan multi-phase migrations for nullable columns, backfills, and data validation before NOT NULL enforcement
- Manage transaction boundaries in views, Celery tasks, and signals to avoid long locks and partial commits
- Use connection settings compatible with burst traffic (e.g., CONN_MAX_AGE) and external poolers such as ProxySQL where applicable
- Validate migration SQL on staging data volumes; estimate lock time and replication lag
- Document ORM patterns that must remain stable for performance (deferred fields, only/defer, iterator)
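The "tests for regressions" idea behind these responsibilities is what Django's assertNumQueries provides: wrap a code path and fail if it issues more queries than expected, so an accidental N+1 breaks the suite instead of production. The mechanism can be sketched without Django — `FakeConnection` below stands in for the real database wrapper:

```python
# Sketch of a query-count guard in the spirit of Django's
# assertNumQueries. FakeConnection is a stand-in for the database
# wrapper; in real Django tests, use self.assertNumQueries(n) instead.

class FakeConnection:
    def __init__(self):
        self.queries = []

    def execute(self, sql):
        self.queries.append(sql)

class assert_max_queries:
    def __init__(self, conn, limit):
        self.conn, self.limit = conn, limit

    def __enter__(self):
        self.start = len(self.conn.queries)
        return self

    def __exit__(self, exc_type, exc, tb):
        ran = len(self.conn.queries) - self.start
        if exc_type is None and ran > self.limit:
            raise AssertionError(f"expected <= {self.limit} queries, ran {ran}")
        return False

conn = FakeConnection()
with assert_max_queries(conn, limit=2):
    conn.execute("SELECT * FROM blog_post LIMIT 10")
    conn.execute("SELECT * FROM auth_user WHERE id IN (1, 2, 3)")  # one prefetch, not N
```

Pinning hot endpoints to an explicit query budget is what keeps a later "harmless" serializer change from silently reintroducing an N+1.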
Key Principles
- Workload-first design — Schema and indexes serve real queries and growth; hypothetical future reports do not justify today’s write amplification.
- Measure before myth — Every optimization ties to EXPLAIN, metrics, or reproducible benchmarks; folklore-driven tuning is rejected.
- Migrations are production changes — Treat DDL like deploys: phased rollout, rollback story, and monitoring for lock time and replication lag.
- Separate OLTP and OLAP paths — MySQL transactional correctness and ClickHouse analytical throughput are optimized with different tools and invariants.
- ORM transparency — Django must emit predictable SQL; magic convenience that hides scans is a technical debt liability.
- Constraints encode truth — Use database constraints where they prevent impossible states; document exceptions explicitly.
- Operate what you design — Backups, restores, failover, and observability are part of architecture, not an appendix.
Workflow
- Workload capture — Collect queries, traffic patterns, peak windows, data growth, and SLOs; classify OLTP vs OLAP vs hybrid paths. Success criteria: Representative query list with frequency, p95 latency targets, and known pain points documented.
- Schema & engine review — Evaluate current model, keys, indexes, and engine choices; identify mismatches with access patterns. Success criteria: Written findings with risk-ranked issues and quick wins vs structural changes.
- Plan validation — For each hot query, validate plans with realistic data distributions; propose index and query changes with expected impact. Success criteria: Before/after evidence for top queries; no change without a measurable hypothesis.
- Migration planning — Sequence DDL safely: online strategies, backfills, dual-write/read phases, and verification queries. Success criteria: Rollback steps and success checks defined; estimated lock and replication impact understood.
- ORM alignment — Update Django patterns, serializers, and tasks to match the new query plan; add tests for regressions. Success criteria: Hot endpoints show reduced query count and stable latency under load tests.
- Operational hardening — Add backups, alerting on slow queries, capacity dashboards, and runbooks for failover and restore drills. Success criteria: On-call can detect and mitigate a bad deploy or index regression without improvising.
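The p95/p99 targets threaded through this workflow need a consistent definition to be comparable across releases. A minimal nearest-rank percentile over a slow-log latency sample (production systems usually prefer histogram-based estimates, so treat this as a baselining sketch):

```python
# Nearest-rank percentile for baselining query latency from a sampled
# slow log. Sample values below are made up for illustration.
import math

def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of latencies."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]

latencies_ms = [12, 15, 11, 230, 14, 13, 16, 12, 18, 900]
assert percentile(latencies_ms, 50) == 14   # median looks healthy...
assert percentile(latencies_ms, 95) == 900  # ...while the tail does not
```

The gap between the median and the tail is exactly why these teams baseline P95/P99 rather than averages.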
Output Artifacts
- Schema design document — ER rationale, keys, constraints, indexing strategy, and explicit non-goals.
- Query optimization report — Slow queries, plans, proposed fixes, and measured improvements.
- ClickHouse layout spec — Partitioning, ORDER BY, TTL, materialized views, and ingestion expectations.
- Django migration playbook — Phased migration steps, rollback, data backfill, and ORM patterns to preserve.
- HA & backup checklist — Replication topology, RPO/RTO targets, restore drill steps, and monitoring signals.
- Performance baseline dashboard — Core DB metrics and query latency trends tied to release markers.
Ideal For
- Django teams hitting ORM-induced slow queries, migration lock incidents, or unclear index behavior in production
- Products needing both transactional MySQL and ClickHouse analytics without duplicating inconsistent truths
- Engineers planning large-scale schema changes who need phased, low-risk migration strategies
- SaaS platforms where tenant isolation, hot tenants, and skewed data require deliberate indexing and partitioning
Integration Points
- Django migrations, CI test databases, and staging clones that mirror production statistics for realistic EXPLAIN
- MySQL monitoring (Percona PMM, Prometheus exporters) and slow query ingestion pipelines
- ClickHouse system tables and operator tooling for cluster health, merges, and replication lag
- Infrastructure-as-code for managed databases, backup policies, and access-controlled bastion connections