Magento 2 Module Generator (LoRA Adapter)
LoRA adapter for Qwen2.5-Coder-7B-Instruct that converts natural-language table descriptions into structured MagentoSchemaSpec JSON, which is then deterministically compiled into a complete Magento 2 module (40+ files).
Base model: Qwen/Qwen2.5-Coder-7B-Instruct
Training data: fchis/magento2-schema-training
Code: github.com/florinel-chis/laravel-ai-gen
What It Does
Describe a database table in plain English. Get a complete Magento 2 module with:
- db_schema.xml + db_schema_whitelist.json
- Models, Resource Models, Collections
- Repository interfaces + implementations
- Data interfaces (getters/setters from columns)
- Admin controllers (Index, Edit, Save, Delete, New)
- UI Component grid + form XML
- Layout XML, Routes, Menu, ACL
- REST API endpoints (webapi.xml)
- registration.php, module.xml, composer.json, di.xml
The LLM does only one thing: convert NL to a table specification. Everything else (40+ files) is generated deterministically. Because no model writes the module code, that code cannot contain hallucinations; any model error is confined to the spec, which the compiler validates.
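To illustrate what "generated deterministically" can mean for the data interfaces, here is a minimal sketch that derives getter and setter names from column names. The helper names `to_camel` and `accessor_names` are hypothetical and are not taken from the project's generator; the naming rule (snake_case to UpperCamelCase) is an assumption based on common Magento conventions.

```python
def to_camel(column_name: str) -> str:
    """Convert a snake_case column name to UpperCamelCase."""
    return "".join(part.capitalize() for part in column_name.split("_") if part)


def accessor_names(columns: list[str]) -> list[tuple[str, str]]:
    """Return (getter, setter) method names for each column, in input order."""
    return [(f"get{to_camel(c)}", f"set{to_camel(c)}") for c in columns]


if __name__ == "__main__":
    for getter, setter in accessor_names(["warranty_id", "serial_number", "status"]):
        print(getter, setter)
```

The same column list always yields the same method names, which is the property the deterministic stage relies on.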
How It Works
NL: "Create a warranty table with serial number, dates, status, FK to products and customers"
↓ LLM (this adapter, per-entity)
Table spec JSON (columns, types, FKs, indexes, constraints)
↓ Spec Compiler (validates, <1ms)
↓ Module Generator (deterministic, no LLM)
40 files: 23/23 PHP valid, 0 hallucinations
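The validation step in the pipeline above could look roughly like the following sketch. The rules shown (referenced columns must exist, varchar needs a length, SET NULL requires a nullable column) are assumptions about what a spec compiler might check; the project's actual compiler and its rule set are not documented here, and `validate_spec` is a hypothetical name.

```python
def validate_spec(spec: dict) -> list[str]:
    """Return a list of problems; an empty list means the spec passed these checks."""
    errors: list[str] = []
    columns = {col["name"]: col for col in spec.get("columns", [])}
    if len(columns) != len(spec.get("columns", [])):
        errors.append("duplicate column names")

    def require(label: str, names: list[str]) -> None:
        # Every column named by a key, index, or constraint must be declared.
        for name in names:
            if name not in columns:
                errors.append(f"{label} references unknown column '{name}'")

    for col in columns.values():
        if col.get("type") == "varchar" and "length" not in col:
            errors.append(f"varchar column '{col['name']}' has no length")

    require("primary_key", spec.get("primary_key", []))
    for key in spec.get("unique_keys", []):
        require(f"unique key {key['name']}", key["columns"])
    for index in spec.get("indexes", []):
        require(f"index {index['name']}", index["columns"])
    for fk in spec.get("foreign_keys", []):
        require("foreign key", [fk["column"]])
        col = columns.get(fk["column"])
        # ON DELETE SET NULL cannot work on a NOT NULL column.
        if col and fk.get("on_delete") == "SET NULL" and col.get("nullable", True) is False:
            errors.append(
                f"foreign key on '{fk['column']}' uses SET NULL but the column is NOT NULL"
            )
    return errors
```

Checks like these are simple dictionary lookups, so they run quickly on a single-table spec.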
Example
Input (to the LLM):
Create a Magento 2 table 'acme_warranty_warranty' for Product Warranty Registration.
Fields: product_id (unsigned int, required) references catalog_product_entity.entity_id
with CASCADE delete; customer_id (unsigned int, nullable) references customer_entity.entity_id
with SET NULL delete; purchase_date (date, required); warranty_start (date, required);
warranty_end (date, required); serial_number (varchar 64, required, unique);
status (varchar 32, default 'active'). Track creation and update timestamps.
Index on product_id, customer_id, and status.
Output (from the LLM, as a JSON spec):
{
"name": "acme_warranty_warranty",
"comment": "Product Warranty Registration",
"columns": [
{"name": "warranty_id", "type": "int", "unsigned": true, "identity": true, "nullable": false, "comment": "Warranty ID"},
{"name": "product_id", "type": "int", "unsigned": true, "nullable": false, "comment": "Product ID"},
{"name": "customer_id", "type": "int", "unsigned": true, "nullable": true, "comment": "Customer ID"},
{"name": "purchase_date", "type": "date", "nullable": false, "comment": "Purchase Date"},
{"name": "warranty_start", "type": "date", "nullable": false, "comment": "Warranty Start Date"},
{"name": "warranty_end", "type": "date", "nullable": false, "comment": "Warranty End Date"},
{"name": "serial_number", "type": "varchar", "length": 64, "nullable": false, "comment": "Serial Number"},
{"name": "status", "type": "varchar", "length": 32, "default": "active", "comment": "Status"},
{"name": "created_at", "type": "timestamp", "nullable": false, "comment": "Created At"},
{"name": "updated_at", "type": "timestamp", "nullable": false, "on_update": true, "comment": "Updated At"}
],
"primary_key": ["warranty_id"],
"unique_keys": [{"name": "ACME_WARRANTY_SERIAL_NUMBER", "columns": ["serial_number"]}],
"foreign_keys": [
{"column": "product_id", "reference_table": "catalog_product_entity", "reference_column": "entity_id", "on_delete": "CASCADE"},
{"column": "customer_id", "reference_table": "customer_entity", "reference_column": "entity_id", "on_delete": "SET NULL"}
],
"indexes": [
{"name": "ACME_WARRANTY_WARRANTY_PRODUCT_ID", "columns": ["product_id"], "type": "btree"},
{"name": "ACME_WARRANTY_WARRANTY_STATUS", "columns": ["status"], "type": "btree"}
]
}
Then the deterministic generator produces 40 files; no AI is needed for this step.
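As a rough illustration of the deterministic step, the sketch below turns a spec like the one above into a `db_schema.xml` document using only the Python standard library. It is not the project's generator: `spec_to_db_schema` is a hypothetical name, the foreign-key `referenceId` naming scheme is an assumption (the spec carries no FK names), and `resource="default"` and `engine="innodb"` are assumed defaults.

```python
import xml.etree.ElementTree as ET

XSI = "http://www.w3.org/2001/XMLSchema-instance"
ET.register_namespace("xsi", XSI)  # serialize the namespace with the usual xsi: prefix
XSI_TYPE = f"{{{XSI}}}type"


def _flag(value: bool) -> str:
    return "true" if value else "false"


def spec_to_db_schema(spec: dict) -> str:
    """Render one table spec as declarative-schema XML (sketch, not exhaustive)."""
    schema = ET.Element("schema", {
        f"{{{XSI}}}noNamespaceSchemaLocation":
            "urn:magento:framework:Setup/Declaration/Schema/etc/schema.xsd",
    })
    table = ET.SubElement(schema, "table", {
        "name": spec["name"], "resource": "default", "engine": "innodb",
        "comment": spec.get("comment", ""),
    })
    for col in spec["columns"]:
        attrs = {XSI_TYPE: col["type"], "name": col["name"]}
        for key in ("unsigned", "nullable", "identity", "on_update"):
            if key in col:
                attrs[key] = _flag(col[key])
        for key in ("length", "default", "comment"):
            if key in col:
                attrs[key] = str(col[key])
        ET.SubElement(table, "column", attrs)

    pk = ET.SubElement(table, "constraint", {XSI_TYPE: "primary", "referenceId": "PRIMARY"})
    for name in spec["primary_key"]:
        ET.SubElement(pk, "column", {"name": name})
    for uk in spec.get("unique_keys", []):
        node = ET.SubElement(table, "constraint", {XSI_TYPE: "unique", "referenceId": uk["name"]})
        for name in uk["columns"]:
            ET.SubElement(node, "column", {"name": name})
    for fk in spec.get("foreign_keys", []):
        # Assumed naming scheme: TABLE_COLUMN_REFTABLE_REFCOLUMN, upper-cased.
        ref_id = "_".join(
            [spec["name"], fk["column"], fk["reference_table"], fk["reference_column"]]
        ).upper()
        ET.SubElement(table, "constraint", {
            XSI_TYPE: "foreign", "referenceId": ref_id,
            "table": spec["name"], "column": fk["column"],
            "referenceTable": fk["reference_table"],
            "referenceColumn": fk["reference_column"],
            "onDelete": fk["on_delete"],
        })
    for idx in spec.get("indexes", []):
        node = ET.SubElement(table, "index", {
            "referenceId": idx["name"], "indexType": idx.get("type", "btree"),
        })
        for name in idx["columns"]:
            ET.SubElement(node, "column", {"name": name})

    ET.indent(schema)  # requires Python 3.9+
    return ET.tostring(schema, encoding="unicode")
```

Passing the warranty spec to `spec_to_db_schema` yields one `<table>` element with a `<column>` per spec column, followed by the constraints and indexes.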
Training Details
- Method: LoRA (rank 8, 8 trainable layers)
- Framework: MLX on Apple M2 Pro 16GB
- Training data: 164 per-entity examples reverse-engineered from 43 real Magento 2.4.8 core modules + 10 hand-crafted custom modules
- Approach: Per-entity training (one table per example) instead of per-module, which enables all tables to fit within token limits
- Iterations: 200
- val_loss: 0.145
- Peak memory: 9.1 GB
Training Data Sources
| Source | Tables | Method |
|---|---|---|
| Magento 2.4.8 core modules | 142 | Reverse-engineered from real db_schema.xml |
| Hand-crafted custom modules | 35 | Blog, FAQ, Q&A, Events, Loyalty, Store Locator, etc. |
| Total | 177 (164 after token filtering) | |
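The per-entity split and the token filtering mentioned in the table can be sketched as follows. This is only an illustration: the example record layout, the function names, and the four-characters-per-token estimate are all assumptions, and a real pipeline would count tokens with the base model's tokenizer and also include the natural-language prompt in the budget.

```python
import json
from typing import Callable


def rough_token_count(text: str) -> int:
    """Crude stand-in for a real tokenizer: roughly four characters per token."""
    return max(1, len(text) // 4)


def per_entity_examples(module_name: str, tables: list[dict]) -> list[dict]:
    """Emit one training example per table instead of one per module."""
    return [
        {"module": module_name, "table": t["name"], "completion": json.dumps(t)}
        for t in tables
    ]


def within_budget(
    examples: list[dict],
    max_tokens: int,
    count_tokens: Callable[[str], int] = rough_token_count,
) -> list[dict]:
    """Keep only the examples whose completion fits the token budget."""
    return [ex for ex in examples if count_tokens(ex["completion"]) <= max_tokens]
```

Splitting first and filtering second means an oversized table drops only its own example rather than the whole module.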
Round-Trip Verification
All reverse-engineered modules were verified with a round trip: parse original XML → build spec → compile → regenerate XML → compare.
Result: 43/43 modules match (100%). 160 tables, 1,053 columns verified.
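A minimal sketch of such a round-trip check is shown below. It covers only tables and their columns, and it compares parsed structure rather than raw text so that whitespace and attribute order do not matter. The function names are hypothetical and the project's actual verifier may work differently.

```python
import xml.etree.ElementTree as ET

XSI_TYPE = "{http://www.w3.org/2001/XMLSchema-instance}type"


def xml_to_spec(xml_text: str) -> list[dict]:
    """Parse tables and columns from db_schema.xml text into plain dictionaries."""
    tables = []
    for table in ET.fromstring(xml_text).findall("table"):
        columns = []
        for col in table.findall("column"):
            spec_col = {k: v for k, v in col.attrib.items() if k != XSI_TYPE}
            spec_col["type"] = col.get(XSI_TYPE)
            columns.append(spec_col)
        tables.append({"attrs": dict(table.attrib), "columns": columns})
    return tables


def spec_to_xml(tables: list[dict]) -> str:
    """Regenerate schema XML from the dictionaries produced by xml_to_spec."""
    root = ET.Element("schema")
    for t in tables:
        table = ET.SubElement(root, "table", t["attrs"])
        for col in t["columns"]:
            attrs = {k: v for k, v in col.items() if k != "type"}
            attrs[XSI_TYPE] = col["type"]
            ET.SubElement(table, "column", attrs)
    return ET.tostring(root, encoding="unicode")


def canonical(xml_text: str) -> list:
    """Order-insensitive view of each table's attributes and its columns' attributes."""
    return [
        (
            tuple(sorted(table.attrib.items())),
            tuple(tuple(sorted(c.attrib.items())) for c in table.findall("column")),
        )
        for table in ET.fromstring(xml_text).findall("table")
    ]


def round_trips(original_xml: str) -> bool:
    """True when regenerating from the parsed spec reproduces the original structure."""
    return canonical(spec_to_xml(xml_to_spec(original_xml))) == canonical(original_xml)
```

A full verifier would extend `canonical` to constraints and indexes in the same way.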
Results
Tested on a novel "Product Warranty" module (not in training data):
| Metric | Result |
|---|---|
| Tables generated | 2 (warranty + claims) |
| Columns | 17 (all requested fields present) |
| Foreign keys | 3 (product, customer, warranty→claim) |
| Unique constraints | 1 (serial_number) |
| Indexes | 5 |
| Full module files | 40 |
| PHP syntax valid | 23/23 (100%) |
| Manual fixes | 0 |
Module Generation Output (from spec)
For a 2-entity module, the deterministic generator produces:
Acme_Warranty/
├── registration.php
├── composer.json
├── etc/
│   ├── module.xml
│   ├── db_schema.xml
│   ├── db_schema_whitelist.json
│   ├── di.xml
│   ├── webapi.xml
│   ├── acl.xml
│   └── adminhtml/
│       ├── routes.xml
│       └── menu.xml
├── Api/
│   ├── Data/
│   │   ├── WarrantyInterface.php
│   │   └── ClaimInterface.php
│   ├── WarrantyRepositoryInterface.php
│   └── ClaimRepositoryInterface.php
├── Model/
│   ├── Warranty.php
│   ├── WarrantyRepository.php
│   ├── Claim.php
│   ├── ClaimRepository.php
│   └── ResourceModel/
│       ├── Warranty.php
│       ├── Warranty/Collection.php
│       ├── Claim.php
│       └── Claim/Collection.php
├── Controller/Adminhtml/
│   ├── Warranty/ (Index, Edit, Save, Delete, NewAction)
│   └── Claim/ (Index, Edit, Save, Delete, NewAction)
└── view/adminhtml/
    ├── layout/ (4 files)
    └── ui_component/ (4 files: grid + form per entity)
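The file counts follow directly from the tree: 10 module-level files plus 15 files per entity. The sketch below enumerates the paths for a list of entities to make that arithmetic explicit. The layout and UI component file names are hypothetical (the tree only gives their counts), as are the function name and the `prefix` parameter.

```python
ACTIONS = ("Index", "Edit", "Save", "Delete", "NewAction")

MODULE_FILES = [
    "registration.php",
    "composer.json",
    "etc/module.xml",
    "etc/db_schema.xml",
    "etc/db_schema_whitelist.json",
    "etc/di.xml",
    "etc/webapi.xml",
    "etc/acl.xml",
    "etc/adminhtml/routes.xml",
    "etc/adminhtml/menu.xml",
]


def module_files(entities: list[str], prefix: str) -> list[str]:
    """List the relative paths generated for a module with the given entities."""
    files = list(MODULE_FILES)
    for entity in entities:
        low = entity.lower()
        files += [
            f"Api/Data/{entity}Interface.php",
            f"Api/{entity}RepositoryInterface.php",
            f"Model/{entity}.php",
            f"Model/{entity}Repository.php",
            f"Model/ResourceModel/{entity}.php",
            f"Model/ResourceModel/{entity}/Collection.php",
        ]
        files += [f"Controller/Adminhtml/{entity}/{action}.php" for action in ACTIONS]
        # Assumed file names; the source only states two layouts and two UI components per entity.
        files += [
            f"view/adminhtml/layout/{prefix}_{low}_index.xml",
            f"view/adminhtml/layout/{prefix}_{low}_edit.xml",
            f"view/adminhtml/ui_component/{prefix}_{low}_listing.xml",
            f"view/adminhtml/ui_component/{prefix}_{low}_form.xml",
        ]
    return files


if __name__ == "__main__":
    paths = module_files(["Warranty", "Claim"], prefix="acme_warranty")
    php = [p for p in paths if p.endswith(".php")]
    print(len(paths), "files,", len(php), "PHP")
```

For the two entities above this gives 10 + 2 × 15 = 40 files, of which 1 + 2 × 11 = 23 are PHP, matching the figures reported in Results.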
Key Insight
For declarative output formats like Magento's XML configs and boilerplate PHP, AI is only needed for intent capture (NL → structured spec). The 40-file module generation is 100% deterministic: no model is involved, so no hallucinations are possible.
This validates the "domain ontology" approach: make developer intent explicit in a structured format, validate it with a compiler, then generate code mechanically.
Related
- Laravel version: fchis/Laravel-13x-Qwen2.5-Coder-7B-Instruct-LoRA-Spec
- Laravel training data: fchis/laravel-buildspec-training
- Blog post: Validating BuildSpec on Magento 2