Database Migration Strategy
BellSystems CP v2 — Firestore + SQLite → Postgres
This is the living plan. Update it as phases complete. Never start a phase without reading the notes from the previous one.
Database Split — Target State
| Data | Target | Source | Flutter uses? |
|---|---|---|---|
| Devices | Firestore | Firestore | YES — keep |
| App users (device owners) | Firestore | Firestore | YES — keep |
| Published melodies | Firestore | Firestore | YES — keep |
| Draft melodies | Postgres | SQLite | No |
| Built melodies | Postgres | SQLite | No |
| CRM customers | Postgres | Firestore | No |
| CRM products | Postgres | Firestore | No |
| CRM orders | Postgres | Firestore (subcollection) | No |
| Console settings | Postgres | Firestore | No |
| Public features settings | Postgres | Firestore | No |
| Staff / admin users | Postgres | Firestore | No |
| Firmware versions | Postgres | Firestore | No |
| Notes / Issues | Postgres | New (done) | No |
| Support tickets | Postgres | New (done) | No |
| CRM comms log | Postgres | SQLite | No |
| CRM media references | Postgres | SQLite | No |
| CRM sync state | Postgres | SQLite | No |
| CRM quotations + items | Postgres | SQLite | No |
| Mfg audit log | Postgres | SQLite | No |
| Device alerts | Postgres | SQLite | No |
| MQTT commands | Postgres | SQLite | No |
| MQTT heartbeats | Postgres | SQLite | No |
| Device logs | Postgres (partitioned) | SQLite | No |
| Staff audit log | Postgres | New | No |
Rule: Everything that FlutterFlow touches directly stays in Firestore forever. The Console backend continues to write to those Firestore collections exactly as today. We only stop reading from Firestore in the Console — never stop writing to it.
Deployment Context — Critical
This project runs in two environments:
| Environment | SQLite data | Firestore data | Where migrations run |
|---|---|---|---|
| Local (Windows + Docker for Desktop) | Empty / stale test data | Live (correct) | Development & testing only |
| VPS (production Docker) | Live correct data | Live (correct) | All Phase 1 migrations run here |
What this means for each phase:
- Phase 0 (schema): Alembic migrations can be developed and tested locally, then the same migrations are run on the VPS via docker compose exec backend alembic upgrade head. The VPS is authoritative.
- Phase 1 (SQLite → Postgres): Migration scripts must be run on the VPS only. The local SQLite is not a valid source. Do not run Phase 1 migration scripts locally and assume they reflect real data.
- Phase 2 (Firestore → Postgres): Can be run on either environment (Firestore is the same), but the VPS run is the one that matters. Run locally first to verify the scripts work, then run on the VPS.
- Phase 3–5: All service cutover and testing happens on the VPS.
The deployment workflow:
- Develop and test code locally
- Push code to VPS (git pull or equivalent)
- Run docker compose exec backend alembic upgrade head on the VPS to apply schema changes
- Run migration scripts on the VPS when Phase 1 begins
- Verify everything on the VPS before marking a phase complete
Non-negotiable Safety Rules
- Never modify a Firestore collection during migration: scripts only read from it. Never delete, update, or rename documents until you have personally verified the Postgres data is complete and correct.
- Every migration script runs in a transaction — if any row fails, the entire script rolls back cleanly.
- Idempotent scripts: every script uses ON CONFLICT DO NOTHING or equivalent. Safe to run twice.
- Count verification before commit: each script prints Source: N docs/rows → Postgres: N rows ✓ and aborts if counts don't match.
- Migration run log: a _migration_runs table in Postgres records what ran, when, how many rows, and success/failure. Check it after each script.
- One domain at a time: complete and verify a full domain (schema + migration script + service cutover + smoke test) before starting the next.
- No data loss = no rushing — downtime during migration is acceptable. Data loss is not.
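The transaction, idempotency, and count-check rules above can be illustrated with the standard library alone. This is only a sketch under stated assumptions: it uses an in-memory SQLite database so it runs anywhere, whereas the real scripts target Postgres; the table name target and the function migrate_rows are hypothetical.

```python
import sqlite3


def migrate_rows(conn: sqlite3.Connection, rows: list[tuple[str, str]]) -> tuple[int, int]:
    """Insert rows idempotently in one transaction; roll back on a count mismatch."""
    source_count = len(rows)
    with conn:  # commits on success, rolls back if an exception escapes the block
        conn.executemany(
            "INSERT INTO target (id, payload) VALUES (?, ?) "
            "ON CONFLICT (id) DO NOTHING",
            rows,
        )
        dest_count = conn.execute("SELECT COUNT(*) FROM target").fetchone()[0]
        if dest_count < source_count:
            raise RuntimeError(
                f"Count mismatch: source={source_count} dest={dest_count}"
            )
    return source_count, dest_count


# Hypothetical stand-in for the destination table
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE target (id TEXT PRIMARY KEY, payload TEXT)")
rows = [("a", "first"), ("b", "second")]
print(migrate_rows(conn, rows))  # first run inserts both rows
print(migrate_rows(conn, rows))  # second run changes nothing
```

Because the count check sits inside the transaction, a mismatch leaves the destination table exactly as it was before the run.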
Phase 0 — Schema Foundation
Status: COMPLETE. Alembic revision b1c2d3e4f5a6 was applied locally first, then on the VPS (2026-04-17) with docker compose exec backend alembic upgrade head.
What exists already in Postgres
- entries + entry_links (notes/issues module)
- support_tickets + ticket_messages (tickets module)
- Alembic version history in alembic_version
What Phase 0 adds
Add the _migration_runs tracking table and all new table definitions via Alembic before any data moves.
New tables to create in this phase (schema only, no data yet):
- _migration_runs: tracks which migration scripts have run
- crm_products: flat columns, no JSONB needed
- crm_customers: core columns + JSONB for contacts, notes, owned_items, location, technical_issues, install_support, transaction_history, crm_summary; tags and linked_user_ids are TEXT[]
- crm_orders: core columns + JSONB for items, discount, shipping, payment_status, timeline
- staff: replaces admin_users Firestore collection
- console_settings: key/value or typed columns, replaces Firestore settings doc
- public_features: typed columns, replaces Firestore public_features doc
- crm_comms_log: mirrors current SQLite schema, adds proper TIMESTAMPTZ columns
- crm_media: mirrors current SQLite schema
- crm_sync_state: key/value
- crm_quotations + crm_quotation_items: mirrors current SQLite schema
- mfg_audit_log: mirrors current SQLite schema
- device_alerts: mirrors current SQLite schema
- commands: mirrors current SQLite schema
- heartbeats: mirrors current SQLite schema
- melody_drafts: mirrors current SQLite schema
- built_melodies: mirrors current SQLite schema
- device_logs: partitioned by month on received_at
- audit_log: new staff action audit system (see schema below)
Key schema decisions
device_logs — monthly partitioning
CREATE TABLE device_logs (
id BIGSERIAL,
device_serial TEXT NOT NULL,
level TEXT NOT NULL,
message TEXT NOT NULL,
device_timestamp BIGINT,
received_at TIMESTAMPTZ NOT NULL DEFAULT now(),
PRIMARY KEY (id, received_at)
) PARTITION BY RANGE (received_at);
-- Partitions created monthly by a background job or manually:
CREATE TABLE device_logs_2025_01 PARTITION OF device_logs
FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
-- etc.
CREATE INDEX idx_device_logs_serial_time ON device_logs(device_serial, received_at DESC);
CREATE INDEX idx_device_logs_level ON device_logs(level, received_at DESC);
Dropping a partition to purge old data: DROP TABLE device_logs_2024_06; — instant, no DELETE scan.
crm_customers — JSONB for flexible arrays
CREATE TABLE crm_customers (
id TEXT PRIMARY KEY, -- keep Firestore UUID as-is
firestore_id TEXT UNIQUE, -- same value during transition, null-able later
title TEXT,
name TEXT NOT NULL,
surname TEXT,
organization TEXT,
religion TEXT,
language TEXT NOT NULL DEFAULT 'el',
folder_id TEXT UNIQUE NOT NULL,
relationship_status TEXT NOT NULL DEFAULT 'lead',
nextcloud_folder TEXT,
contacts JSONB NOT NULL DEFAULT '[]',
notes JSONB NOT NULL DEFAULT '[]',
location JSONB,
tags TEXT[] NOT NULL DEFAULT '{}',
owned_items JSONB NOT NULL DEFAULT '[]',
linked_user_ids TEXT[] NOT NULL DEFAULT '{}',
technical_issues JSONB NOT NULL DEFAULT '[]',
install_support JSONB NOT NULL DEFAULT '[]',
transaction_history JSONB NOT NULL DEFAULT '[]',
crm_summary JSONB,
created_at TIMESTAMPTZ NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_crm_customers_rel_status ON crm_customers(relationship_status);
CREATE INDEX idx_crm_customers_tags ON crm_customers USING GIN(tags);
CREATE INDEX idx_crm_customers_name ON crm_customers(name, surname);
crm_orders — separate table (was Firestore subcollection)
CREATE TABLE crm_orders (
id TEXT PRIMARY KEY,
customer_id TEXT NOT NULL REFERENCES crm_customers(id) ON DELETE CASCADE,
order_number TEXT UNIQUE NOT NULL,
title TEXT,
created_by TEXT,
status TEXT NOT NULL DEFAULT 'negotiating',
status_updated_date TIMESTAMPTZ,
status_updated_by TEXT,
items JSONB NOT NULL DEFAULT '[]',
subtotal NUMERIC(12,2) NOT NULL DEFAULT 0,
discount JSONB,
total_price NUMERIC(12,2) NOT NULL DEFAULT 0,
currency TEXT NOT NULL DEFAULT 'EUR',
shipping JSONB,
payment_status JSONB NOT NULL DEFAULT '{}',
invoice_path TEXT,
notes TEXT,
timeline JSONB NOT NULL DEFAULT '[]',
created_at TIMESTAMPTZ NOT NULL,
updated_at TIMESTAMPTZ NOT NULL
);
CREATE INDEX idx_crm_orders_customer ON crm_orders(customer_id);
CREATE INDEX idx_crm_orders_status ON crm_orders(status);
staff — replaces Firestore admin_users
CREATE TABLE staff (
id TEXT PRIMARY KEY, -- keep Firestore doc ID as-is during transition
firestore_id TEXT UNIQUE, -- same as id during transition
email TEXT UNIQUE NOT NULL,
name TEXT NOT NULL,
role TEXT NOT NULL DEFAULT 'staff',
permissions JSONB NOT NULL DEFAULT '{}',
hashed_password TEXT NOT NULL,
is_active BOOLEAN NOT NULL DEFAULT TRUE,
created_at TIMESTAMPTZ NOT NULL DEFAULT now(),
updated_at TIMESTAMPTZ NOT NULL DEFAULT now()
);
audit_log — new system, no migration source
CREATE TABLE audit_log (
id BIGSERIAL PRIMARY KEY,
occurred_at TIMESTAMPTZ NOT NULL DEFAULT now(),
actor_id TEXT NOT NULL,
actor_name TEXT NOT NULL,
action TEXT NOT NULL, -- CREATE | UPDATE | DELETE | COMMAND | LOGIN | LOGOUT | etc.
entity_type TEXT NOT NULL, -- customer | order | device | melody | product | staff | ticket | note | quotation | etc.
entity_id TEXT NOT NULL,
entity_label TEXT, -- denormalized human name: "Church of St. George", "SN-0042", etc.
changes JSONB, -- {"field": {"old": x, "new": y}, ...} — null for CREATE/DELETE/COMMAND
meta JSONB -- extra context: ip_address, command_name, etc.
);
-- Indexes covering the exact filter combos we need:
CREATE INDEX idx_audit_actor ON audit_log(actor_id, occurred_at DESC);
CREATE INDEX idx_audit_entity ON audit_log(entity_type, entity_id, occurred_at DESC);
CREATE INDEX idx_audit_action ON audit_log(action, occurred_at DESC);
CREATE INDEX idx_audit_occurred ON audit_log(occurred_at DESC);
Phase 1 — SQLite → Postgres (Data Migration)
Status: COMPLETE (all 12 scripts ran successfully on the VPS, 2026-04-17)
Prerequisite: Phase 0 complete (all tables exist in Postgres)
No downtime is required: the SQLite file is local and can be read while the app is running. Once the migration is verified, services are switched to read from Postgres.
Migration order (least dependencies first)
| Step | Table | Script |
|---|---|---|
| 1.1 | melody_drafts | migration/migrate_melody_drafts.py |
| 1.2 | built_melodies | migration/migrate_built_melodies.py |
| 1.3 | mfg_audit_log | migration/migrate_mfg_audit_log.py |
| 1.4 | device_alerts | migration/migrate_device_alerts.py |
| 1.5 | crm_sync_state | migration/migrate_crm_sync_state.py |
| 1.6 | crm_quotations | migration/migrate_crm_quotations.py |
| 1.7 | crm_quotation_items | migration/migrate_crm_quotation_items.py |
| 1.8 | crm_media | migration/migrate_crm_media.py |
| 1.9 | crm_comms_log | migration/migrate_crm_comms_log.py |
| 1.10 | commands | migration/migrate_commands.py |
| 1.11 | heartbeats | migration/migrate_heartbeats.py |
| 1.12 | device_logs | migration/migrate_device_logs.py (largest, batched) |
Per-script pattern
# Every script follows this structure.
# read_all_from_sqlite, pg_session, PgModel and log_migration_run are project helpers.
from sqlalchemy import func, select
from sqlalchemy.dialects.postgresql import insert  # provides on_conflict_do_nothing()

async def run():
    sqlite_rows = await read_all_from_sqlite("table_name")
    source_count = len(sqlite_rows)
    print(f"Source: {source_count} rows")
    async with pg_session() as session:
        async with session.begin():  # rolls back if anything below raises
            await session.execute(
                insert(PgModel).values(sqlite_rows).on_conflict_do_nothing()
            )
            # Verify before the transaction commits
            pg_count = await session.scalar(select(func.count()).select_from(PgModel))
            if pg_count < source_count:
                raise RuntimeError(f"Count mismatch: source={source_count} pg={pg_count}")
    print(f"Postgres: {pg_count} rows ✓")
    await log_migration_run("table_name", source_count, pg_count)
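Step 1.12 (device_logs) is described as batched. A minimal sketch of a chunking helper that such a script might use, assuming the source rows can be consumed from any iterable; the name iter_batches and the default batch size are illustrative and not taken from the actual script.

```python
from itertools import islice
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def iter_batches(rows: Iterable[T], batch_size: int = 5000) -> Iterator[list[T]]:
    """Yield successive lists of at most batch_size items from rows."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    iterator = iter(rows)
    while True:
        batch = list(islice(iterator, batch_size))
        if not batch:
            return
        yield batch


# Progress reporting over a small stand-in data set
total = 0
for number, batch in enumerate(iter_batches(range(12), batch_size=5), start=1):
    total += len(batch)
    print(f"batch {number}: {len(batch)} rows (running total {total})")
```

Each batch would presumably be inserted with on_conflict_do_nothing() inside the same transaction, so the count check at the end still covers the whole table.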
Service cutover per domain
After each group is migrated and verified:
- Update service to import from database.postgres instead of database.core
- Replace aiosqlite queries with SQLAlchemy async queries
- Smoke test via the Console UI: verify the page loads correctly
- Leave SQLite file untouched for 48h as a fallback
Phase 2 — Firestore → Postgres (Data Migration)
Status: COMPLETE (all 5 scripts ran successfully on the VPS, 2026-04-17)
Prerequisite: Phase 1 complete
Requires shared.firebase.get_db() to read from Firestore.
Scripts run with Firebase Admin SDK — same SDK already initialized in the backend.
Migration order
| Step | Collection | Script | Notes |
|---|---|---|---|
| 2.1 | settings (doc) | migration/migrate_settings.py | Single document |
| 2.2 | public_features (doc) | migration/migrate_public_features.py | Single document |
| 2.3 | crm_products | migration/migrate_crm_products.py | No dependencies |
| 2.4 | crm_customers | migration/migrate_crm_customers.py | Strip legacy negotiating/has_problem fields |
| 2.5 | orders (subcollection) | migration/migrate_crm_orders.py | Uses collection_group("orders") |
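Step 2.4 strips the legacy negotiating and has_problem fields. A rough sketch of how a Firestore customer document might be shaped into a crm_customers row, assuming the document keys already match the column names (an assumption; the real field names are not shown in this plan). The function name customer_doc_to_row is hypothetical.

```python
from typing import Any

LEGACY_FIELDS = {"negotiating", "has_problem"}  # dropped per step 2.4

# JSONB columns that default to an empty array in the crm_customers schema
ARRAY_JSONB_COLUMNS = (
    "contacts",
    "notes",
    "owned_items",
    "technical_issues",
    "install_support",
    "transaction_history",
)


def customer_doc_to_row(doc_id: str, doc: dict[str, Any]) -> dict[str, Any]:
    """Return a dict of column values for crm_customers built from one document."""
    row = {key: value for key, value in doc.items() if key not in LEGACY_FIELDS}
    row["id"] = doc_id
    row["firestore_id"] = doc_id  # same value during the transition
    for column in ARRAY_JSONB_COLUMNS:
        # Never store null where the schema expects an array
        row[column] = row.get(column) or []
    return row


doc = {"name": "Maria", "negotiating": True, "contacts": None, "tags": ["church"]}
row = customer_doc_to_row("abc123", doc)
print(row)
```

Defaulting missing arrays here, rather than relying on the column defaults, keeps the "no null where arrays expected" check in the verification checklist meaningful for documents that store an explicit null.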
Converting Firestore types
Use the existing _convert_firestore_value helpers in devices/service.py — copy into a shared migration/utils.py. Key conversions:
- DatetimeWithNanoseconds → .isoformat() string
- GeoPoint → {"lat": x, "lng": y} dict
- DocumentReference → .id string (just the doc ID, no path)
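The three conversions above could be sketched as a single recursive helper. This is not the existing _convert_firestore_value code, which is not shown here; it is a duck-typed approximation so that it runs without the Firebase SDK. The classes FakeGeoPoint and FakeDocRef are stand-ins invented for the demo.

```python
from datetime import datetime, timezone
from typing import Any


def convert_firestore_value(value: Any) -> Any:
    """Recursively convert Firestore-specific values into plain JSON-friendly ones."""
    if isinstance(value, datetime):  # DatetimeWithNanoseconds subclasses datetime
        return value.isoformat()
    if hasattr(value, "latitude") and hasattr(value, "longitude"):  # GeoPoint-like
        return {"lat": value.latitude, "lng": value.longitude}
    if hasattr(value, "id") and hasattr(value, "path"):  # DocumentReference-like
        return value.id
    if isinstance(value, dict):
        return {key: convert_firestore_value(item) for key, item in value.items()}
    if isinstance(value, (list, tuple)):
        return [convert_firestore_value(item) for item in value]
    return value


class FakeGeoPoint:  # stand-in for a Firestore GeoPoint
    def __init__(self, latitude: float, longitude: float) -> None:
        self.latitude = latitude
        self.longitude = longitude


class FakeDocRef:  # stand-in for a Firestore DocumentReference
    def __init__(self, doc_id: str, path: str) -> None:
        self.id = doc_id
        self.path = path


converted = convert_firestore_value({
    "created": datetime(2025, 1, 2, 3, 4, 5, tzinfo=timezone.utc),
    "location": FakeGeoPoint(37.98, 23.72),
    "owner": FakeDocRef("u1", "users/u1"),
    "history": [{"at": datetime(2025, 1, 3, tzinfo=timezone.utc)}],
})
print(converted)
```

The string form suits timestamps nested inside JSONB values. For top-level TIMESTAMPTZ columns it is probably better to pass the datetime object through unchanged, so that the column never receives text.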
Cutover
After each Firestore collection is migrated and verified:
- Switch service to read/write Postgres
- Keep all Firestore write calls — continue writing to Firestore on every mutation so the data stays current there for any emergency rollback
- After 48h of stable operation, remove the redundant Firestore writes (one service at a time)
Phase 3 — Staff Auth Cutover
Status: COMPLETE (Postgres auth live 2026-04-17)
Prerequisite: Phase 2 step 2.5 complete, staff table verified
This is the highest-risk phase because auth affects every request.
Steps
- Migrate admin_users Firestore collection → staff Postgres table (script: migration/migrate_staff.py)
- Verify: compare email list, role list, permission maps between Firestore and Postgres
- Update auth/dependencies.py to query Postgres staff table instead of Firestore
- Update staff/service.py to read/write Postgres
- Update seed_admin.py to write to Postgres (keep old Firestore version as seed_admin_firestore_legacy.py)
- Test: log in as each role, verify permissions work
- Only after 24h stable: remove Firestore reads from auth
Rollback plan
The JWT token payload doesn't change — it still contains sub (staff ID) and permissions.
Rolling back means reverting the auth and staff modules (auth/router.py, auth/dependencies.py, staff/service.py, staff/router.py) to their Firestore versions.
Phase 4 — Audit Log System
Status: NOT STARTED
Prerequisite: Phase 0 (audit_log table created)
The audit log system can be built and wired in incrementally — it doesn't block other phases. Wire it into each service as that service is cut over to Postgres.
The logging utility
backend/shared/audit.py — a single async function all services call:
from sqlalchemy.ext.asyncio import AsyncSession

async def log_action(
    db: AsyncSession,
    actor_id: str,
    actor_name: str,
    action: str,              # "CREATE" | "UPDATE" | "DELETE" | "COMMAND" | ...
    entity_type: str,         # "customer" | "order" | "device" | ...
    entity_id: str,
    entity_label: str | None = None,
    changes: dict | None = None,  # {"field": {"old": x, "new": y}}
    meta: dict | None = None,     # {"ip": ..., "command_name": ...}
) -> None:
    ...  # signature only: the body inserts one row into audit_log
How to capture diffs
In service update functions:
old_data = existing_record.to_dict()  # snapshot before the update
await db.execute(update_stmt)
new_data = updated_record.to_dict()   # snapshot after the update
changes = {
    k: {"old": old_data.get(k), "new": new_data[k]}  # .get avoids KeyError on new fields
    for k in new_data
    if old_data.get(k) != new_data[k]
}
await log_action(db, actor_id, actor_name, "UPDATE", "customer", id, label, changes)
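The diff step can be pulled out into a small pure function, which makes it easy to unit test. A minimal sketch, assuming both snapshots are plain dicts; the name compute_changes is hypothetical, and unlike the snippet above it also reports keys that were removed.

```python
from typing import Any


def compute_changes(old: dict[str, Any], new: dict[str, Any]) -> dict[str, dict[str, Any]]:
    """Return {"field": {"old": x, "new": y}} for every field whose value differs."""
    changes: dict[str, dict[str, Any]] = {}
    for key in old.keys() | new.keys():
        before, after = old.get(key), new.get(key)
        if before != after:
            changes[key] = {"old": before, "new": after}
    return changes


before = {"name": "Maria", "relationship_status": "lead"}
after = {"name": "Maria", "relationship_status": "active", "organization": "Parish"}
print(compute_changes(before, after))
```

An empty result means nothing changed, in which case a caller could reasonably skip writing an UPDATE entry at all.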
Action types
| Action | When |
|---|---|
| CREATE | Any new record created |
| UPDATE | Any field changed |
| DELETE | Any record deleted |
| COMMAND | MQTT command sent to device |
| PUBLISH | Melody published to Firestore |
| UNPUBLISH | Melody unpublished |
| LOGIN | Staff login |
| LOGOUT | Staff logout |
| PERMISSION_CHANGE | Staff permissions updated |
| STATUS_CHANGE | Order/customer/ticket status changed (convenience; also captured as UPDATE) |
API endpoint
GET /api/audit-log with query params:
- actor_id: filter by staff member
- entity_type + entity_id: filter by a specific record
- action: filter by action type
- from_date / to_date: date range
- limit / offset: pagination (default limit: 50, max: 200)
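One way the query parameters above could translate into a parameterized query against audit_log is sketched below. The function build_audit_query is hypothetical, the inclusive upper bound on to_date is an assumption, and the named-parameter style (:name) is the one SQLAlchemy's text() accepts.

```python
from typing import Any, Optional

DEFAULT_LIMIT = 50
MAX_LIMIT = 200


def build_audit_query(
    actor_id: Optional[str] = None,
    entity_type: Optional[str] = None,
    entity_id: Optional[str] = None,
    action: Optional[str] = None,
    from_date: Optional[str] = None,
    to_date: Optional[str] = None,
    limit: int = DEFAULT_LIMIT,
    offset: int = 0,
) -> tuple[str, dict[str, Any]]:
    """Return (sql, params) for the filters that were actually supplied."""
    candidates = [
        ("actor_id = :actor_id", "actor_id", actor_id),
        ("entity_type = :entity_type", "entity_type", entity_type),
        ("entity_id = :entity_id", "entity_id", entity_id),
        ("action = :action", "action", action),
        ("occurred_at >= :from_date", "from_date", from_date),
        ("occurred_at <= :to_date", "to_date", to_date),  # inclusive: an assumption
    ]
    clauses = [clause for clause, _, value in candidates if value is not None]
    params: dict[str, Any] = {
        name: value for _, name, value in candidates if value is not None
    }
    # Clamp pagination to the documented bounds
    params["limit"] = max(1, min(limit, MAX_LIMIT))
    params["offset"] = max(0, offset)
    where = f" WHERE {' AND '.join(clauses)}" if clauses else ""
    sql = (
        f"SELECT * FROM audit_log{where} "
        "ORDER BY occurred_at DESC LIMIT :limit OFFSET :offset"
    )
    return sql, params


sql, params = build_audit_query(entity_type="customer", entity_id="abc123", limit=1000)
print(sql)
print(params)
```

Filtering on entity_type + entity_id and ordering by occurred_at DESC lines up with the idx_audit_entity index defined in Phase 0.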
Phase 5 — MQTT Live Data Cutover
Status: NOT STARTED
Prerequisite: Phase 1 complete (device_logs in Postgres)
This phase switches the live MQTT ingestion from SQLite to Postgres.
Steps
- Update database/core.py insert_log, insert_heartbeat, insert_command to write to Postgres
- Update read functions (get_logs, get_heartbeats, etc.) similarly
- The partition management background job: each month, at startup or via a cron, ensure next month's partition exists:
from datetime import date

from dateutil.relativedelta import relativedelta
from sqlalchemy import text
from sqlalchemy.ext.asyncio import AsyncSession

async def ensure_current_partitions(db: AsyncSession):
    for month_offset in [0, 1]:  # current + next month
        d = date.today().replace(day=1) + relativedelta(months=month_offset)
        partition_name = f"device_logs_{d.strftime('%Y_%m')}"
        start = d.isoformat()
        end = (d + relativedelta(months=1)).isoformat()
        # Name and bounds are derived from dates only, never from user input
        await db.execute(text(f"""
            CREATE TABLE IF NOT EXISTS {partition_name}
            PARTITION OF device_logs
            FOR VALUES FROM ('{start}') TO ('{end}')
        """))
Log retention
- Keep last 6 months of partitions
- Cron job runs monthly: checks for partitions older than 6 months and drops them
- Dropping a partition = DROP TABLE device_logs_2024_09; (instantaneous, no row-by-row delete)
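The retention rule can be expressed as a pure function that decides which partitions are due for dropping. A sketch under two assumptions: partition names follow the device_logs_YYYY_MM pattern used above, and the current month counts as one of the six kept months. The name partitions_to_drop is hypothetical.

```python
import re
from datetime import date

PARTITION_RE = re.compile(r"^device_logs_(\d{4})_(\d{2})$")


def partitions_to_drop(names: list[str], today: date, keep_months: int = 6) -> list[str]:
    """Return partition names older than the most recent keep_months months."""
    # Index months as year * 12 + month so arithmetic works across year boundaries
    current = today.year * 12 + (today.month - 1)
    oldest_kept = current - (keep_months - 1)
    expired = []
    for name in names:
        match = PARTITION_RE.match(name)
        if match is None:
            continue  # not a monthly device_logs partition, leave it alone
        index = int(match.group(1)) * 12 + (int(match.group(2)) - 1)
        if index < oldest_kept:
            expired.append(name)
    return sorted(expired)


existing = [
    "device_logs_2024_08",
    "device_logs_2024_09",
    "device_logs_2024_10",
    "device_logs_2025_03",
    "device_logs_2025_04",
    "heartbeats",
]
expired = partitions_to_drop(existing, today=date(2025, 3, 15))
print(expired)
```

The monthly job would then issue one DROP TABLE per returned name; keeping the selection logic separate from the SQL makes it testable without a database.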
Verification Checklist (run after each phase)
- SELECT COUNT(*) in Postgres matches source count for every migrated table
- Sample 10 random records and compare field by field against source
- Timestamps are stored as TIMESTAMPTZ, not TEXT strings
- All JSONB columns parse correctly (no null where arrays expected)
- Relevant Console pages load without errors
- API endpoints return correct data
- _migration_runs table shows success for all scripts
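The sampling item in the checklist could be automated roughly as follows. This is a sketch, assuming source and destination records have already been loaded into dicts keyed by record ID; the names diff_record and spot_check are hypothetical.

```python
import random
from typing import Any, Optional


def diff_record(source: dict[str, Any], dest: dict[str, Any]) -> dict[str, tuple[Any, Any]]:
    """Return {field: (source_value, dest_value)} for every field that differs."""
    return {
        key: (source.get(key), dest.get(key))
        for key in source.keys() | dest.keys()
        if source.get(key) != dest.get(key)
    }


def spot_check(
    source_by_id: dict[str, dict[str, Any]],
    dest_by_id: dict[str, dict[str, Any]],
    sample_size: int = 10,
    seed: Optional[int] = None,
) -> dict[str, dict[str, tuple[Any, Any]]]:
    """Compare a random sample of source records against their migrated copies."""
    rng = random.Random(seed)
    ids = sorted(source_by_id)
    problems: dict[str, dict[str, tuple[Any, Any]]] = {}
    for record_id in rng.sample(ids, min(sample_size, len(ids))):
        dest = dest_by_id.get(record_id)
        if dest is None:
            problems[record_id] = {"<missing>": (source_by_id[record_id], None)}
            continue
        diff = diff_record(source_by_id[record_id], dest)
        if diff:
            problems[record_id] = diff
    return problems


source = {"1": {"name": "A", "tags": []}, "2": {"name": "B", "tags": ["x"]}}
dest = {"1": {"name": "A", "tags": []}, "2": {"name": "B", "tags": None}}
print(spot_check(source, dest))
```

In this demo the second record is flagged because tags came back as null instead of an array, which is the failure mode the JSONB item in the checklist is meant to catch.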
Files & Locations
backend/
├── migration/ ← all migration scripts live here
│ ├── utils.py ← shared helpers (Firestore type converters, PG connection, etc.)
│ ├── migrate_melody_drafts.py
│ ├── migrate_crm_customers.py
│ ├── migrate_crm_orders.py
│ └── ... (one file per table)
├── shared/
│ └── audit.py ← audit log utility (Phase 4)
└── alembic/versions/ ← never edit by hand
Current Status Summary
| Phase | Description | Status |
|---|---|---|
| 0 | Schema foundation (all tables in Postgres) | COMPLETE — applied on VPS 2026-04-17 |
| 1 | SQLite → Postgres (data migration) | COMPLETE — all 12 scripts ran successfully on VPS 2026-04-17 |
| 2 | Firestore → Postgres (data migration) | COMPLETE — all 5 scripts ran successfully on VPS 2026-04-17 |
| 3 | Staff auth cutover | COMPLETE — Postgres auth live 2026-04-17 |
| 4 | Audit log system | NOT STARTED |
| 5 | MQTT live data cutover | NOT STARTED |
Update this table as each phase completes.
Phase 1 — Run Order & Commands
Run each command on the VPS in order. Verify the output of each before proceeding.
# 1.1
docker compose exec backend python -m migration.migrate_melody_drafts
# 1.2
docker compose exec backend python -m migration.migrate_built_melodies
# 1.3
docker compose exec backend python -m migration.migrate_mfg_audit_log
# 1.4
docker compose exec backend python -m migration.migrate_device_alerts
# 1.5
docker compose exec backend python -m migration.migrate_crm_sync_state
# 1.6 (FK enforcement suppressed — crm_customers not in PG yet)
docker compose exec backend python -m migration.migrate_crm_quotations
# 1.7
docker compose exec backend python -m migration.migrate_crm_quotation_items
# 1.8
docker compose exec backend python -m migration.migrate_crm_media
# 1.9
docker compose exec backend python -m migration.migrate_crm_comms_log
# 1.10
docker compose exec backend python -m migration.migrate_commands
# 1.11
docker compose exec backend python -m migration.migrate_heartbeats
# 1.12 (largest — batched, shows progress)
docker compose exec backend python -m migration.migrate_device_logs
After all scripts complete, verify the run log:
docker compose exec postgres psql -U bellsystems_user -d bellsystems_db \
-c "SELECT script_name, ran_at, source_rows, dest_rows, success FROM _migration_runs ORDER BY ran_at;"
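Reading the run log by eye works for twelve scripts, but the check could also be scripted. A small sketch, assuming the rows from the query above have been fetched as dicts with the same column names and that ran_at sorts chronologically; the function failed_runs is hypothetical.

```python
from typing import Any


def failed_runs(rows: list[dict[str, Any]]) -> list[str]:
    """Return script names whose latest run failed or wrote fewer rows than the source."""
    latest: dict[str, dict[str, Any]] = {}
    for row in sorted(rows, key=lambda r: r["ran_at"]):
        latest[row["script_name"]] = row  # later runs overwrite earlier ones
    return sorted(
        name
        for name, row in latest.items()
        if not row["success"] or row["dest_rows"] < row["source_rows"]
    )


# Made-up rows purely for illustration
log = [
    {"script_name": "heartbeats", "ran_at": "2026-01-01T10:00:00", "source_rows": 10, "dest_rows": 0, "success": False},
    {"script_name": "heartbeats", "ran_at": "2026-01-01T10:05:00", "source_rows": 10, "dest_rows": 10, "success": True},
    {"script_name": "commands", "ran_at": "2026-01-01T10:10:00", "source_rows": 4, "dest_rows": 3, "success": True},
]
print(failed_runs(log))
```

Only the most recent run per script is considered, so a script that failed once and then succeeded on a rerun is not reported.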
Phase 2 — Run Order & Commands
crm_customers MUST run before crm_orders (FK dependency).
# 2.1
docker compose exec backend python -m migration.migrate_settings
# 2.2
docker compose exec backend python -m migration.migrate_public_features
# 2.3
docker compose exec backend python -m migration.migrate_crm_products
# 2.4 (required before 2.5)
docker compose exec backend python -m migration.migrate_crm_customers
# 2.5 (depends on 2.4)
docker compose exec backend python -m migration.migrate_crm_orders
Phase 3 — Run Order & Commands
Apply the new Alembic revision first (adds ui_prefs column + makes permissions nullable):
# Apply schema change
docker compose exec backend alembic upgrade head
# 3.1 — migrate Firestore admin_users → Postgres staff table
docker compose exec backend python -m migration.migrate_staff
# Verify
docker compose exec postgres psql -U bellsystems_user -d bellsystems_db \
-c "SELECT id, email, role, is_active FROM staff ORDER BY role, name;"
After verifying the staff table is populated correctly:
# Restart the backend so it picks up the new auth/staff code
docker compose restart backend
Then test: log in as each role in the Console UI and verify permissions work.
After 24h stable operation, Firestore reads from auth are fully removed (already done in code).
Rollback: revert auth/router.py, auth/dependencies.py, staff/service.py, staff/router.py
to the Firestore versions — the JWT payload is unchanged so tokens remain valid during rollback.