aiacontext/mini-enedina

+---
+license: cc-by-4.0
+language:
+  - pt
+tags:
+  - monotropic-model
+  - small-language-model
+  - structural-engineering
+  - timoshenko-beam-theory
+  - curriculum-learning
+  - validated-synthetic-data
+  - physics-informed-ai
+  - mlx
+  - apple-silicon
+pipeline_tag: text-generation
+library_name: mlx
+---
+# Mini-Enedina: A Domain-Specialized Small Language Model for Structural Shaft Analysis
+**Mini-Enedina** is a monotropic language model (deliberately small and intensively specialized) with 37.57 million parameters, designed exclusively for structural shaft analysis based on Timoshenko beam theory.
+## Model Details
+| Parameter | Value |
+|-----------|-------|
+| **Parameters** | 37.57M |
+| **Layers** | 7 |
+| **Attention Heads** | 8 |
+| **Model Dimension** | 512 |
+| **Feed-Forward Dimension** | 2048 |
+| **Vocabulary Size** | 8,012 (8,000 BPE + 12 Harmony tokens) |
+| **Max Sequence Length** | 14,336 tokens |
+| **Positional Encoding** | RoPE |
+| **Normalization** | RMSNorm (pre-norm) |
+| **Activation** | SiLU (SwiGLU) |
+| **Framework** | MLX (Apple Silicon) |
+| **Precision** | BFloat16 |
+| **Model Size** | 143 MiB (150,295,004 bytes) |
+## Training
+- **Dataset:** 60,000 physically validated samples (621M tokens) of Timoshenko shaft analysis problems
+- **Training Strategy:** Multidimensional curriculum learning with 4 phases (Foundation, Intermediate, Advanced, Full)
+- **Three Analysis Levels:**
+  - **Bachelor:** Deflection analysis (shear force V, bending moment M, deflection w, rotation theta)
+  - **Master:** + Von Mises stress analysis
+  - **Doctor:** + Fatigue evaluation (Marin factors, Goodman criterion)
+- **Hardware:** Apple M4 Pro, 48 GB unified memory
+- **Training Time:** ~23 hours (14,920 steps)
+- **Optimizer:** AdamW (lr=3e-4, cosine schedule with warmup)
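The optimizer entry above names a cosine schedule with warmup but does not give the warmup length or the final learning rate. The following is a minimal sketch of such a schedule, not the authors' training code: `WARMUP_STEPS`, `MIN_LR`, and the linear warmup shape are assumptions chosen for illustration, while the peak rate and step count come from the list above.

```python
import math

PEAK_LR = 3e-4        # from the model card
TOTAL_STEPS = 14_920  # from the model card
WARMUP_STEPS = 500    # assumption: not stated in the model card
MIN_LR = 0.0          # assumption: not stated in the model card


def lr_at(step: int) -> float:
    """Linear warmup to PEAK_LR, then cosine decay to MIN_LR."""
    if step < WARMUP_STEPS:
        return PEAK_LR * (step + 1) / WARMUP_STEPS
    progress = (step - WARMUP_STEPS) / max(1, TOTAL_STEPS - WARMUP_STEPS)
    progress = min(progress, 1.0)
    return MIN_LR + 0.5 * (PEAK_LR - MIN_LR) * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    # Print the learning rate at a few representative steps.
    for step in (0, 499, 500, 7_710, 14_920):
        print(step, f"{lr_at(step):.2e}")
```

With these assumed values the rate peaks at the end of warmup, passes through half the peak at the midpoint of the decay, and reaches the floor at the last step.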
+## Evaluation Results (6,000 held-out test samples)
+| Metric | Overall | Bachelor | Master | Doctor |
+|--------|---------|----------|--------|--------|
+| **Loss** | 0.0787 | 0.0733 | 0.0804 | 0.0825 |
+| **Perplexity** | 1.08 | 1.08 | 1.08 | 1.09 |
+| **Correct Stop Token** | 94% | 97% | 100% | 85% |
+| **Valid Harmony Structure** | 100% | 100% | 100% | 100% |
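The Loss and Perplexity rows in the table above are consistent with the usual relation perplexity = exp(loss). The short check below assumes the reported loss is a mean per-token cross-entropy in nats, which the card does not state explicitly.

```python
import math

# Per-level losses copied from the evaluation table.
losses = {"Overall": 0.0787, "Bachelor": 0.0733, "Master": 0.0804, "Doctor": 0.0825}

# Perplexity is the exponential of the mean cross-entropy (assumed to be in nats).
perplexities = {name: math.exp(loss) for name, loss in losses.items()}

for name, ppl in perplexities.items():
    print(f"{name}: {ppl:.2f}")
```

Rounded to two decimals, the computed values reproduce the table: 1.08 for Overall, Bachelor, and Master, and 1.09 for Doctor.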
+## Output Format: Harmony-Enedina
+The model generates structured responses using the Harmony-Enedina format with two channels:
+1. **Analysis Channel:** Chain-of-thought reasoning, problem classification, and qualitative analysis
+2. **Final Channel:** Complete Python solver code with numerical grounding, quantitative results, and validation summary
+Domain-specific tokens (`<|shaft|>`, `<|python|>`, `<|numerical|>`, `<|latex|>`) demarcate semantic boundaries within the output.
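To post-process a generated response, one option is to split it at the domain-specific tokens. The sketch below is illustrative only: it assumes each token opens a segment that runs until the next token, which the card does not specify, and the `sample` string is made up rather than real model output. `split_segments` is a hypothetical helper name.

```python
import re

# Domain tokens listed in the model card.
DOMAIN_TOKENS = ("<|shaft|>", "<|python|>", "<|numerical|>", "<|latex|>")
_TOKEN_RE = re.compile("(" + "|".join(re.escape(t) for t in DOMAIN_TOKENS) + ")")


def split_segments(text: str) -> list[tuple[str, str]]:
    """Return (token, content) pairs, assuming each token opens a segment."""
    # re.split with a capturing group yields [prefix, token, content, token, content, ...].
    parts = _TOKEN_RE.split(text)
    segments = []
    for i in range(1, len(parts) - 1, 2):
        segments.append((parts[i], parts[i + 1].strip()))
    return segments


# Made-up example text, not actual model output.
sample = "<|shaft|>L = 1.2 m<|python|>print(1 + 1)<|numerical|>w_max = 0.42 mm"

if __name__ == "__main__":
    for token, content in split_segments(sample):
        print(token, "->", content)
```

If the real format uses paired opening and closing markers instead, the splitting rule would need to change accordingly.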
+## Inference Configuration
+The model was trained **without** sliding window attention, repetition penalty, or n-gram blocking. These techniques must remain **disabled** during inference:
+```python
+# CORRECT configuration (BASELINE)
+use_sliding_window = False
+repetition_penalty = 1.0
+no_repeat_ngram_size = 0
+temperature = 0.0  # greedy decoding
+```
+Enabling these techniques reduces the correct-stop-token rate from 94% to 8%.
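For readers unfamiliar with what the baseline settings amount to, the sketch below shows plain greedy decoding in framework-free Python: the next token is always the argmax of the logits, with no sampling, penalty, or blocking applied. It is not the project's inference code; `greedy_decode` and `toy_logits` are hypothetical names, and `toy_logits` is a stand-in for a real model forward pass.

```python
from typing import Callable, Sequence


def greedy_decode(
    next_logits: Callable[[list[int]], Sequence[float]],
    prompt: list[int],
    stop_id: int,
    max_new_tokens: int,
) -> list[int]:
    """Argmax decoding: no sampling, no repetition penalty, no n-gram blocking."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        logits = next_logits(tokens)
        # temperature = 0.0 corresponds to taking the highest-scoring token.
        next_id = max(range(len(logits)), key=logits.__getitem__)
        tokens.append(next_id)
        if next_id == stop_id:
            break
    return tokens


def toy_logits(tokens: list[int]) -> list[float]:
    """Hypothetical stand-in for the model: favors (last token + 1) mod 6."""
    vocab_size = 6
    target = (tokens[-1] + 1) % vocab_size
    return [1.0 if i == target else 0.0 for i in range(vocab_size)]


if __name__ == "__main__":
    print(greedy_decode(toy_logits, [0], stop_id=4, max_new_tokens=10))
```

Generation ends either at the stop token or at `max_new_tokens`, whichever comes first.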
+## Intended Use
+Mini-Enedina is designed for:
+- Structural shaft analysis according to Timoshenko beam theory
+- Engineering education and design iteration
+- Generating complete, executable Python solver code
+- Deployment on consumer hardware (edge, air-gapped environments)
+**Important:** For safety-critical applications, always verify model outputs against independent calculations.
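As one example of an independent check, the closed-form Timoshenko midspan deflection of a simply supported solid circular shaft under a central point load can be computed directly. This is a sketch for a single textbook load case, not the solver the model generates. The shear coefficient uses Cowper's expression for a circular section, and the input values in the usage lines are arbitrary illustrative numbers.

```python
import math


def midspan_deflection(P: float, L: float, d: float, E: float, nu: float) -> tuple[float, float]:
    """Return (bending, shear) midspan deflection in metres for a simply
    supported solid circular shaft with a central point load P.

    P in N, L and d in m, E in Pa, nu dimensionless.
    """
    I = math.pi * d**4 / 64.0                      # second moment of area
    A = math.pi * d**2 / 4.0                       # cross-sectional area
    G = E / (2.0 * (1.0 + nu))                     # shear modulus (isotropic material)
    kappa = 6.0 * (1.0 + nu) / (7.0 + 6.0 * nu)    # Cowper shear coefficient, circular section
    w_bending = P * L**3 / (48.0 * E * I)          # Euler-Bernoulli term
    w_shear = P * L / (4.0 * kappa * G * A)        # additional Timoshenko shear term
    return w_bending, w_shear


if __name__ == "__main__":
    # Arbitrary illustrative inputs: 1 kN load, 1 m span, 50 mm steel shaft.
    wb, ws = midspan_deflection(P=1000.0, L=1.0, d=0.05, E=210e9, nu=0.3)
    print(f"bending: {wb:.3e} m, shear: {ws:.3e} m, shear/bending: {ws / wb:.4f}")
```

For slender shafts the shear term is a small fraction of the bending term; it grows with the square of the diameter-to-span ratio.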
+## Limitations
+- Handles exclusively shaft analysis according to Timoshenko theory
+- Training language is Brazilian Portuguese
+- Numerical accuracy is limited by tokenization granularity
+- May struggle with support conditions or load combinations not represented in training
+## Citation
+If you use this model, please cite:
+```bibtex
+@article{leitaofilho2026minienedina,
+  title={Mini-Enedina: A Domain-Specialized Small Language Model for Structural Shaft Analysis Using Timoshenko Beam Theory},
+  author={Leit{\~a}o Filho, Antonio de Sousa and Barros Filho, Allan Kardec Duailibe and Lima, Fabr{\'i}cio Saul and Santos, Selby Mykael Lima dos and Sousa, Rejani Bandeira Vieira},
+  year={2026}
+}
+```
+## Acknowledgments
+This work was supported by Aia Context Ltda. and by FINEP -- Funding Authority for Studies and Projects, a Brazilian government agency for science, technology, and innovation linked to the Ministry of Science, Technology and Innovation (MCTI), under Contract No. 03.25.0080.00.
+## License
+CC-BY-4.0

config.json ADDED

	@@ -0,0 +1,20 @@

+{
+  "model_type": "mini-enedina",
+  "architectures": ["MiniEnedina"],
+  "dim": 512,
+  "n_layers": 7,
+  "n_heads": 8,
+  "head_dim": 64,
+  "intermediate_size": 2048,
+  "vocab_size": 8012,
+  "max_seq_len": 14336,
+  "norm_eps": 1e-5,
+  "rope_theta": 10000.0,
+  "normalization": "rmsnorm",
+  "activation": "silu_swiglu",
+  "positional_encoding": "rope",
+  "weight_tying": true,
+  "total_parameters": 37570000,
+  "framework": "mlx",
+  "torch_dtype": "bfloat16"
+}
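The configuration values above can be cross-checked against the reported parameter total. The sketch below assumes a Llama-style layout that the card does not spell out: bias-free linear layers, four full `dim x dim` attention projections per layer, a three-matrix SwiGLU feed-forward block, two RMSNorm vectors per layer, and one final RMSNorm.

```python
# Values copied from config.json.
dim, n_layers, intermediate, vocab = 512, 7, 2048, 8012

embedding = vocab * dim
attention = 4 * dim * dim              # Wq, Wk, Wv, Wo (assumes n_heads * head_dim == dim)
feed_forward = 3 * dim * intermediate  # gate, up, and down projections of SwiGLU
norms = 2 * dim                        # two RMSNorm weight vectors per layer
per_layer = attention + feed_forward + norms

tied = embedding + n_layers * per_layer + dim  # shared input/output embedding + final RMSNorm
untied = tied + vocab * dim                    # separate output projection counted as well

print(f"tied:   {tied:,}")
print(f"untied: {untied:,}")
```

Under these assumptions the count is 33,469,952 when the output projection shares the embedding matrix and 37,572,096 when it is counted separately. The second figure rounds to the reported 37.57M, so the reported total may count the output projection on its own, although the actual layer layout could differ from the one assumed here.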

model.safetensors ADDED

	@@ -0,0 +1,3 @@

+version https://git-lfs.github.com/spec/v1
+oid sha256:ef3b9e0cd0821fb69a2a9cb5efe5a9b2b7ab717587258d07941f55f867c5468b
+size 150295004
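The three lines above are a Git LFS pointer, so a downloaded `model.safetensors` can be checked against the recorded size and SHA-256 digest. The sketch below uses only the standard library; `parse_pointer` and `matches_pointer` are hypothetical helper names, and the pointer text is copied from the diff with the leading `+` markers removed.

```python
import hashlib
from pathlib import Path

POINTER_TEXT = """\
version https://git-lfs.github.com/spec/v1
oid sha256:ef3b9e0cd0821fb69a2a9cb5efe5a9b2b7ab717587258d07941f55f867c5468b
size 150295004
"""


def parse_pointer(text: str) -> dict[str, str]:
    """Parse the 'key value' lines of a Git LFS pointer file."""
    return dict(line.split(" ", 1) for line in text.splitlines() if line.strip())


def matches_pointer(path: Path, pointer: dict[str, str]) -> bool:
    """Check a downloaded file's size and SHA-256 against the pointer."""
    if path.stat().st_size != int(pointer["size"]):
        return False
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        # Hash in 1 MiB chunks to avoid loading the whole file into memory.
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    algo, _, expected = pointer["oid"].partition(":")
    return algo == "sha256" and digest.hexdigest() == expected


if __name__ == "__main__":
    pointer = parse_pointer(POINTER_TEXT)
    print(pointer["size"], "bytes =", round(int(pointer["size"]) / 2**20, 1), "MiB")
```

A call such as `matches_pointer(Path("model.safetensors"), parse_pointer(POINTER_TEXT))` would then confirm that a local copy is the file this commit refers to.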

training_state.json ADDED

	@@ -0,0 +1,5 @@

+{
+  "step": 14000,
+  "best_val_loss": 0.07652725413288604,
+  "phase_idx": 3
+}