uslap-query / Code_files /CONSOLIDATION_ISSUES.md
uslap's picture
Upload folder using huggingface_hub
7cc8e29 verified
|
Raw
History Blame Contribute Delete
3.4 kB

A newer version of the Gradio SDK is available: 6.19.0

Upgrade

Schema Consolidation — Issues Found (2026-03-27)

Status: BLOCKED — needs fixes before re-run

Backup: backups/v3_pre_consolidation_20260327_1643.db (85MB) DB restored to pre-consolidation state. All other session work intact.

Blocker 1: PK Conflicts on RU Mirror Tables

RU tables use the SAME primary keys as EN tables (they're translations, not new data):

  • a2_имена_аллаха: allah_id 1-99 = same as names_of_allah allah_id 1-99
  • a4_производные: deriv_id = same as a4_derivatives deriv_id
  • a5_перекрёстные_ссылки: xref_id = same as a5_cross_refs xref_id

Fix options:

  1. UPDATE existing rows — add RU content to the EN row (e.g., add ru_meaning column to names_of_allah). Preserves PK.
  2. Offset PKs — insert RU rows with PK + 100000 offset. Avoids conflict but breaks ID meaning.
  3. Separate lang column — generate NEW integer PKs for RU rows, add lang='RU' column. Original RU PK stored in orig_ru_id column.

Recommended: Option 1 for Names of Allah (same 99 names, just add RU fields). Option 3 for derivatives/cross-refs (genuinely different data rows).

Blocker 2: Orphaned Views

Several views reference tables that don't exist or have been renamed:

  • m1_phonetic_shifts → references phonetic_shifts (doesn't exist — data is in shift_lookup)
  • a3_quran_refs → is a VIEW, not a table
  • a6_country_names → is a VIEW, not a table
  • a1_записи → is a VIEW (data already in entries)
  • a1_entries → is a VIEW

Fix: Drop orphaned views BEFORE dropping triggers. Current script drops triggers first, which causes ALTER TABLE to fail when it touches a table referenced by a view.

Correct order:

  1. Save all triggers + views (SQL)
  2. Drop ALL views
  3. Drop ALL triggers
  4. Run migration
  5. Recreate views (new definitions)
  6. Recreate triggers (only for surviving tables)

Blocker 3: UNIQUE Constraints from Hardening

harden_v4_schema.py added UNIQUE indexes:

  • uq_entries_en_root on entries(en_term, root_id)
  • uq_bitig_orig2 on bitig_a1_entries(orig2_term, root_letters)
  • uq_eu_lang_term on european_a1_entries(lang, term)
  • uq_lat_term on latin_a1_entries(lat_term)
  • uq_roots_letters on roots(root_letters)

These may block RU data insertion if values collide. Need to check each before INSERT.

Migration Script

consolidate_v5_clean.py — handles Phases 1-3 but needs the above fixes. consolidate_schema_v5.py — original version, same issues.

What Was Completed This Session

  1. Domain-specific QUF (12 lattice layers) — 97% pass, 102K rows, 27 tables
  2. Extended QUF to 130 remaining tables — 40% pass
  3. 4 new AMR AI modules (jism, hisab, tarikh, istakhbarat) — all with domain QUF colours
  4. Schema hardening (indexes, views, health check)
  5. amr_lawh.py QUF filtering wired
  6. Automated backup script created
  7. Banned term "theological" removed from all code
  8. 12-layer lattice architecture defined (replaces 8 academic categories)

Next Session: Consolidation

  1. Fix Blocker 1: per-table PK strategy (UPDATE for names, new PKs for derivatives)
  2. Fix Blocker 2: drop views BEFORE triggers
  3. Fix Blocker 3: handle UNIQUE constraints
  4. Re-run consolidation
  5. Update code references (amr_jism.py, uslap_quf.py, uslap_handler.py, etc.)
  6. Re-run domain QUF on consolidated structure