---
license: cc-by-4.0
language:
- pt
tags:
- monotropic-model
- small-language-model
- structural-engineering
- timoshenko-beam-theory
- curriculum-learning
- validated-synthetic-data
- physics-informed-ai
- mlx
- apple-silicon
pipeline_tag: text-generation
library_name: mlx
---

# Mini-Enedina: A Domain-Specialized Small Language Model for Structural Shaft Analysis

**Mini-Enedina** is a monotropic language model (deliberately small and intensively specialized) with 37.57 million parameters, designed exclusively for structural shaft analysis according to Timoshenko beam theory.

## Model Details

| Parameter | Value |
|-----------|-------|
| **Parameters** | 37.57M |
| **Layers** | 7 |
| **Attention Heads** | 8 |
| **Model Dimension** | 512 |
| **Feed-Forward Dimension** | 2048 |
| **Vocabulary Size** | 8,012 (8,000 BPE + 12 Harmony tokens) |
| **Max Sequence Length** | 14,336 tokens |
| **Positional Encoding** | RoPE |
| **Normalization** | RMSNorm (pre-norm) |
| **Activation** | SiLU (SwiGLU) |
| **Framework** | MLX (Apple Silicon) |
| **Precision** | BFloat16 |
| **Model Size** | 143 MB |

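
The parameter count in the table can be approximated from the listed dimensions. The sketch below is a back-of-the-envelope estimate, not the authors' code. It assumes bias-free linear layers, standard multi-head attention with four `d_model x d_model` projections, three SwiGLU projection matrices per layer, two RMSNorm weight vectors per layer plus one final norm, and untied input and output embeddings. None of these details are stated in the table, and the function name `count_parameters` is hypothetical.

```python
def count_parameters(vocab=8012, d_model=512, n_layers=7, d_ff=2048,
                     tied_embeddings=False):
    """Estimate the parameter count of a pre-norm decoder-only transformer.

    Assumptions (not confirmed by the model card): no bias terms,
    full multi-head attention, SwiGLU with gate/up/down matrices,
    RoPE contributing no learned parameters.
    """
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    feed_forward = 3 * d_model * d_ff      # SwiGLU gate, up and down matrices
    norms = 2 * d_model                    # two RMSNorm weight vectors per layer
    per_layer = attention + feed_forward + norms

    embedding = vocab * d_model
    output_head = 0 if tied_embeddings else vocab * d_model
    final_norm = d_model
    return n_layers * per_layer + embedding + output_head + final_norm


total = count_parameters()
print(f"{total:,} parameters ({total / 1e6:.2f}M)")
```

Under these assumptions the estimate is 37,572,096, which agrees with the 37.57M listed above. For the same count, 4 bytes per parameter corresponds to roughly 143 MiB and 2 bytes per parameter to roughly 72 MiB.
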
## Training

- **Dataset:** 60,000 physically validated samples (621M tokens) of Timoshenko shaft analysis problems
- **Training Strategy:** Multidimensional curriculum learning with 4 phases (Foundation, Intermediate, Advanced, Full)
- **Three Analysis Levels:**
  - **Bachelor:** Deflection analysis (shear force V, bending moment M, deflection w, rotation theta)
  - **Master:** Bachelor-level analysis plus von Mises stress analysis
  - **Doctor:** Master-level analysis plus fatigue evaluation (Marin factors, Goodman criterion)
- **Hardware:** Apple M4 Pro, 48 GB unified memory
- **Training Time:** ~23 hours (14,920 steps)
- **Optimizer:** AdamW (lr=3e-4, cosine schedule with warmup)

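
For readers unfamiliar with a cosine schedule with warmup, the following is a minimal sketch of one common form: linear warmup to the peak learning rate followed by cosine decay. The peak rate (3e-4) and total step count (14,920) come from the list above. The warmup length (`warmup_steps=500`) and final rate (`min_lr=0.0`) are placeholder assumptions, since the card does not state them, and the function name `lr_at_step` is hypothetical.

```python
import math


def lr_at_step(step, total_steps=14_920, peak_lr=3e-4,
               warmup_steps=500, min_lr=0.0):
    """Learning rate at a given optimizer step.

    warmup_steps and min_lr are illustrative assumptions,
    not values reported for Mini-Enedina.
    """
    if step < warmup_steps:
        # Linear warmup: reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    # Cosine decay from peak_lr down to min_lr over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for s in (0, 499, 500, 7_710, 14_920):
    print(s, f"{lr_at_step(s):.2e}")
```
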
## Evaluation Results (6,000 held-out test samples)

| Metric | Overall | Bachelor | Master | Doctor |
|--------|---------|----------|--------|--------|
| **Loss** | 0.0787 | 0.0733 | 0.0804 | 0.0825 |
| **Perplexity** | 1.08 | 1.08 | 1.08 | 1.09 |
| **Correct Stop Token** | 94% | 97% | 100% | 85% |
| **Valid Harmony Structure** | 100% | 100% | 100% | 100% |

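
The perplexity row is consistent with the loss row under the usual definition, perplexity = exp(loss), assuming the reported loss is the mean per-token cross-entropy in nats (the card does not state the unit). A short check using only the values from the table:

```python
import math

# Loss values copied from the evaluation table above.
losses = {"Overall": 0.0787, "Bachelor": 0.0733, "Master": 0.0804, "Doctor": 0.0825}

# Perplexity is the exponential of the mean per-token cross-entropy (nats assumed).
perplexities = {name: math.exp(loss) for name, loss in losses.items()}

for name, ppl in perplexities.items():
    print(f"{name}: {ppl:.2f}")
```
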
## Output Format: Harmony-Enedina

The model generates structured responses using the Harmony-Enedina format with two channels:

1. **Analysis Channel:** Chain-of-thought reasoning, problem classification, and qualitative analysis
2. **Final Channel:** Complete Python solver code with numerical grounding, quantitative results, and validation summary

Domain-specific tokens (`<|shaft|>`, `<|python|>`, `<|numerical|>`, `<|latex|>`) demarcate semantic boundaries within the output.

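
As an illustration of how the domain-specific tokens could be used downstream, the sketch below splits decoded text into labeled segments. It assumes the tokens appear as literal strings in the decoded output and that each token labels the text that follows it. Neither assumption is confirmed by this card, the exact Harmony-Enedina layout is not specified here, and `split_by_domain_tokens` is a hypothetical helper.

```python
import re

# The four domain tokens named in this card.
DOMAIN_TOKENS = ("<|shaft|>", "<|python|>", "<|numerical|>", "<|latex|>")
_TOKEN_RE = re.compile("(" + "|".join(re.escape(t) for t in DOMAIN_TOKENS) + ")")


def split_by_domain_tokens(text):
    """Return (token, segment) pairs; text before the first token gets None.

    Assumes each token marks the start of the segment that follows it.
    """
    # With one capturing group, re.split alternates: text, token, text, ...
    parts = _TOKEN_RE.split(text)
    segments = []
    label = None
    for index, part in enumerate(parts):
        if index % 2 == 1:
            label = part            # odd positions hold the matched tokens
        elif part.strip():
            segments.append((label, part.strip()))
    return segments


sample = "<|shaft|>eixo biapoiado<|python|>print(1)"
print(split_by_domain_tokens(sample))
```
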
## Inference Configuration

The model was trained **without** sliding window attention, repetition penalty, or n-gram blocking. These techniques must remain **disabled** during inference:

```python
# Baseline configuration (matches the training setup)
use_sliding_window = False   # no sliding window attention
repetition_penalty = 1.0     # 1.0 means no penalty
no_repeat_ngram_size = 0     # 0 disables n-gram blocking
temperature = 0.0            # greedy decoding
```

Enabling these techniques reduces the correct stop token rate from 94% to 8%.

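
One way to guard against accidentally enabling these techniques is to compare the generation settings with the baseline before running inference. The helper below is a hypothetical sketch: `check_baseline` and the dictionary layout are illustrative and are not part of MLX or of any released Mini-Enedina code. Only the three settings named above are checked.

```python
# Baseline values for the three techniques that must stay disabled.
BASELINE = {
    "use_sliding_window": False,
    "repetition_penalty": 1.0,
    "no_repeat_ngram_size": 0,
}


def check_baseline(config: dict) -> list[str]:
    """List the settings in `config` that deviate from the baseline.

    Keys missing from `config` are treated as left at their baseline value.
    """
    return [
        f"{key}={config[key]!r} (expected {expected!r})"
        for key, expected in BASELINE.items()
        if config.get(key, expected) != expected
    ]


problems = check_baseline({"repetition_penalty": 1.2, "no_repeat_ngram_size": 0})
if problems:
    print("Non-baseline settings:", "; ".join(problems))
```
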
## Intended Use

Mini-Enedina is designed for:

- Structural shaft analysis according to Timoshenko beam theory
- Engineering education and design iteration
- Generating complete, executable Python solver code
- Deployment on consumer hardware (edge, air-gapped environments)

**Important:** Model outputs should always be verified against independent calculations for safety-critical applications.

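
As an example of what an independent check might look like, the sketch below evaluates standard textbook formulas for a solid circular shaft: nominal bending and torsional stresses, the von Mises equivalent stress for combined bending and torsion, and the modified Goodman safety factor. These formulas are not taken from the model or its training data, stress concentration factors and Marin factors are omitted, and all input values are illustrative.

```python
import math


def shaft_stresses(moment, torque, diameter):
    """Nominal surface stresses of a solid circular shaft (consistent units)."""
    sigma = 32.0 * moment / (math.pi * diameter**3)   # bending stress
    tau = 16.0 * torque / (math.pi * diameter**3)     # torsional shear stress
    return sigma, tau


def von_mises(sigma, tau):
    """Equivalent stress for one normal stress and one shear stress."""
    return math.sqrt(sigma**2 + 3.0 * tau**2)


def goodman_safety_factor(sigma_a, sigma_m, s_e, s_ut):
    """Modified Goodman criterion: 1/n = sigma_a/S_e + sigma_m/S_ut."""
    return 1.0 / (sigma_a / s_e + sigma_m / s_ut)


# Illustrative inputs: 1000 N*m bending, 500 N*m torque, 50 mm diameter.
sigma, tau = shaft_stresses(moment=1000.0, torque=500.0, diameter=0.05)
print(f"von Mises stress: {von_mises(sigma, tau) / 1e6:.1f} MPa")
```

Comparing values like these with the numbers reported in the model's Final Channel is one simple way to spot gross errors before relying on a result.
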
## Limitations

- Handles only shaft analysis according to Timoshenko beam theory
- Trained on Brazilian Portuguese text only
- Numerical accuracy is limited by tokenization granularity
- May struggle with support conditions or load combinations not represented in the training data

## Citation

If you use this model, please cite:

```bibtex
@article{leitaofilho2026minienedina,
  title={Mini-Enedina: A Domain-Specialized Small Language Model for Structural Shaft Analysis Using Timoshenko Beam Theory},
  author={Leit{\~a}o Filho, Antonio de Sousa and Barros Filho, Allan Kardec Duailibe and Lima, Fabr{\'i}cio Saul and Santos, Selby Mykael Lima dos and Sousa, Rejani Bandeira Vieira},
  year={2026}
}
```

## Acknowledgments

This work was supported by Aia Context Ltda. and by FINEP (Funding Authority for Studies and Projects), a Brazilian government agency for science, technology, and innovation linked to the Ministry of Science, Technology and Innovation (MCTI), under Contract No. 03.25.0080.00.

## License

CC-BY-4.0