	---
	license: cc-by-4.0
	language:
	- pt
	tags:
	- monotropic-model
	- small-language-model
	- structural-engineering
	- timoshenko-beam-theory
	- curriculum-learning
	- validated-synthetic-data
	- physics-informed-ai
	- mlx
	- apple-silicon
	pipeline_tag: text-generation
	library_name: mlx
	---

	# Mini-Enedina: A Domain-Specialized Small Language Model for Structural Shaft Analysis

	Mini-Enedina is a monotropic language model (deliberately small and intensively specialized) with 37.57 million parameters, designed exclusively for structural shaft analysis according to Timoshenko beam theory.

	## Model Details

	| Parameter | Value |
	|-----------|-------|
	| Parameters | 37.57M |
	| Layers | 7 |
	| Attention Heads | 8 |
	| Model Dimension | 512 |
	| Feed-Forward Dimension | 2048 |
	| Vocabulary Size | 8,012 (8,000 BPE + 12 Harmony tokens) |
	| Max Sequence Length | 14,336 tokens |
	| Positional Encoding | RoPE |
	| Normalization | RMSNorm (pre-norm) |
	| Activation | SiLU (SwiGLU) |
	| Framework | MLX (Apple Silicon) |
	| Precision | BFloat16 |
	| Model Size | 143 MB |
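
The parameter count can be roughly cross-checked from the table above. The sketch below is an estimate under assumptions the table does not state: no bias terms, separate (untied) input embedding and output head, three SwiGLU projection matrices per layer, and one RMSNorm weight vector per normalization. The function name `count_parameters` is illustrative.

```python
def count_parameters(vocab_size, d_model, n_layers, d_ff, tied_embeddings=False):
    """Estimate the parameter count of a bias-free, pre-norm decoder-only model."""
    embedding = vocab_size * d_model
    attention = 4 * d_model * d_model      # Q, K, V and output projections
    feed_forward = 3 * d_model * d_ff      # SwiGLU: gate, up and down projections
    norms = 2 * d_model                    # two RMSNorm weight vectors per layer
    per_layer = attention + feed_forward + norms
    final_norm = d_model
    # Assumption: the output head is a separate matrix unless embeddings are tied.
    output_head = 0 if tied_embeddings else vocab_size * d_model
    return embedding + n_layers * per_layer + final_norm + output_head


total = count_parameters(vocab_size=8012, d_model=512, n_layers=7, d_ff=2048)
print(f"{total:,} parameters ({total / 1e6:.2f}M)")

# Raw weight size for two common storage widths (arithmetic only).
for bytes_per_param in (2, 4):
    print(f"{bytes_per_param} bytes/param: {total * bytes_per_param / 2**20:.1f} MiB")
```

Under these assumptions the estimate comes to 37,572,096 parameters, which agrees with the 37.57M listed in the table. The size loop is arithmetic only and is not a statement about how the checkpoint is stored.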

	## Training

	- Dataset: 60,000 physically validated samples (621M tokens) of Timoshenko shaft analysis problems
	- Training Strategy: Multidimensional curriculum learning with 4 phases (Foundation, Intermediate, Advanced, Full)
	- Three Analysis Levels:
	  - Bachelor: Deflection analysis (V, M, w, theta)
	  - Master: + Von Mises stress analysis
	  - Doctor: + Fatigue evaluation (Marin factors, Goodman criterion)
	- Hardware: Apple M4 Pro, 48 GB unified memory
	- Training Time: ~23 hours (14,920 steps)
	- Optimizer: AdamW (lr=3e-4, cosine schedule with warmup)
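
The optimizer line above names a cosine schedule with warmup but gives no further details. The sketch below shows one common form of such a schedule. The peak learning rate and total step count come from the list above. The warmup length (`warmup_steps=500`) and the floor (`min_lr=0.0`) are hypothetical placeholders, not values from the training run.

```python
import math


def learning_rate(step, peak_lr=3e-4, total_steps=14_920, warmup_steps=500, min_lr=0.0):
    """Linear warmup to peak_lr, then cosine decay to min_lr.

    warmup_steps and min_lr are illustrative assumptions.
    """
    if step < warmup_steps:
        # Linear ramp that reaches peak_lr on the last warmup step.
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    progress = min(progress, 1.0)  # hold at min_lr past the final step
    return min_lr + 0.5 * (peak_lr - min_lr) * (1.0 + math.cos(math.pi * progress))


for step in (0, 499, 500, 7_710, 14_920):
    print(step, f"{learning_rate(step):.2e}")
```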

	## Evaluation Results (6,000 held-out test samples)

	| Metric | Overall | Bachelor | Master | Doctor |
	|--------|---------|----------|--------|--------|
	| Loss | 0.0787 | 0.0733 | 0.0804 | 0.0825 |
	| Perplexity | 1.08 | 1.08 | 1.08 | 1.09 |
	| Correct Stop Token | 94% | 97% | 100% | 85% |
	| Valid Harmony Structure | 100% | 100% | 100% | 100% |
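
The Loss and Perplexity rows are related by the usual identity, perplexity = exp(loss). The short check below assumes the reported loss is a mean per-token cross-entropy in nats, which the table does not state explicitly.

```python
import math

# Loss values copied from the evaluation table above.
losses = {"Overall": 0.0787, "Bachelor": 0.0733, "Master": 0.0804, "Doctor": 0.0825}

# Perplexity is the exponential of the mean cross-entropy (assumed to be in nats).
perplexities = {name: math.exp(loss) for name, loss in losses.items()}

for name, ppl in perplexities.items():
    print(f"{name}: {ppl:.2f}")
```

Rounded to two decimals, these values reproduce the Perplexity row of the table.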

	## Output Format: Harmony-Enedina

	The model generates structured responses using the Harmony-Enedina format with two channels:

	1. Analysis Channel: Chain-of-thought reasoning, problem classification, and qualitative analysis
	2. Final Channel: Complete Python solver code with numerical grounding, quantitative results, and validation summary

	Domain-specific tokens (`<|shaft|>`, `<|python|>`, `<|numerical|>`, `<|latex|>`) demarcate semantic boundaries within the output.
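
For post-processing, generated text can be split on the domain tokens named above. The sketch below is a guess at one workable approach, not the documented Harmony-Enedina grammar: it assumes each token opens a segment that runs until the next domain token or the end of the text. The helper `split_by_domain_token` and the sample string are invented for illustration.

```python
import re

# The four domain tokens named in the text above.
DOMAIN_TOKENS = ("shaft", "python", "numerical", "latex")
TOKEN_PATTERN = re.compile(r"<\|(" + "|".join(DOMAIN_TOKENS) + r")\|>")


def split_by_domain_token(text):
    """Return (token_name, segment) pairs, one per domain token found in text.

    Assumption: a token opens a segment that ends at the next domain token
    or at the end of the text. The real format may differ.
    """
    matches = list(TOKEN_PATTERN.finditer(text))
    segments = []
    for i, match in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        segments.append((match.group(1), text[match.end():end].strip()))
    return segments


# Hypothetical output fragment, not real model output.
sample = "<|shaft|>eixo biapoiado<|python|>print('ok')<|numerical|>w_max = 0.12 mm"
for name, segment in split_by_domain_token(sample):
    print(name, "->", segment)
```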

	## Inference Configuration

	The model was trained without sliding window attention, repetition penalty, or n-gram blocking. These techniques must remain disabled during inference:

	```python
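	# Baseline configuration: matches the conditions the model was trained under
	use_sliding_window = False   # full attention, no sliding window
	repetition_penalty = 1.0     # 1.0 means no penalty is applied
	no_repeat_ngram_size = 0     # 0 disables n-gram blocking
	temperature = 0.0            # greedy decoding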
	```

	Enabling these techniques lowers the correct-stop-token rate from 94% to 8%.
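
To see why these settings change what the model emits, consider a toy example in plain Python. It uses a common formulation of the repetition penalty (divide positive logits of already-generated tokens by the penalty, multiply negative ones), which is assumed here and may differ from any particular inference library. The logits are made-up values, not model output.

```python
def apply_repetition_penalty(logits, generated_ids, penalty):
    """Common repetition-penalty rule: damp the logits of already-generated tokens."""
    adjusted = list(logits)
    for token_id in set(generated_ids):
        value = adjusted[token_id]
        adjusted[token_id] = value / penalty if value > 0 else value * penalty
    return adjusted


def greedy_next_token(logits):
    """Greedy decoding (temperature 0.0): pick the index of the highest logit."""
    return max(range(len(logits)), key=logits.__getitem__)


logits = [1.2, 3.4, -0.5, 3.1]   # toy values for a 4-token vocabulary
history = [1, 1, 2]              # token 1 has already been generated twice

baseline = apply_repetition_penalty(logits, history, penalty=1.0)
penalized = apply_repetition_penalty(logits, history, penalty=1.3)

print("baseline pick:", greedy_next_token(baseline))
print("penalized pick:", greedy_next_token(penalized))
```

With a penalty of 1.0 the logits pass through unchanged. With a penalty above 1.0 the greedy choice can flip away from a token that the model legitimately needs to repeat, as often happens in structured output such as solver code.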

	## Intended Use

	Mini-Enedina is designed for:

	- Structural shaft analysis according to Timoshenko beam theory
	- Engineering education and design iteration
	- Generating complete, executable Python solver code
	- Deployment on consumer hardware (edge, air-gapped environments)

	Important: For safety-critical applications, always verify model outputs against independent calculations.
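
As one example of an independent check, the closed-form Timoshenko midspan deflection of a simply supported solid circular shaft under a central point load can be computed directly. The sketch below is a minimal hand calculation, not part of the model or its solver. The shear correction factor `kappa=0.9` is a commonly used approximation for solid circular sections and is an assumption here, as are all numerical inputs.

```python
import math


def midspan_deflection(P, L, d, E, nu, kappa=0.9):
    """Midspan deflection of a simply supported solid circular shaft with a
    central point load, split into bending and shear parts (Timoshenko theory).

    P [N], L [m], d [m], E [Pa]; kappa is the shear correction factor (assumed).
    Returns (bending, shear) deflections in metres.
    """
    I = math.pi * d**4 / 64          # second moment of area
    A = math.pi * d**2 / 4           # cross-sectional area
    G = E / (2 * (1 + nu))           # shear modulus of an isotropic material
    bending = P * L**3 / (48 * E * I)
    shear = P * L / (4 * kappa * G * A)
    return bending, shear


# Hypothetical steel shaft; values chosen only for illustration.
bending, shear = midspan_deflection(P=1_000.0, L=0.5, d=0.04, E=210e9, nu=0.3)
print(f"bending: {bending * 1e6:.2f} um")
print(f"shear:   {shear * 1e6:.2f} um")
print(f"total:   {(bending + shear) * 1e6:.2f} um")
```

Comparing such a closed-form value with the deflection reported in a generated solution gives a quick sanity check for the simplest load cases. More complex supports or load combinations need their own reference calculation.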

	## Limitations

	- Handles exclusively shaft analysis according to Timoshenko theory
	- Training language is Brazilian Portuguese
	- Numerical accuracy is limited by tokenization granularity
	- May struggle with support conditions or load combinations not represented in training

	## Citation

	If you use this model, please cite:

	```bibtex
	@article{leitaofilho2026minienedina,
	title={Mini-Enedina: A Domain-Specialized Small Language Model for Structural Shaft Analysis Using Timoshenko Beam Theory},
	author={Leit{\~a}o Filho, Antonio de Sousa and Barros Filho, Allan Kardec Duailibe and Lima, Fabr{\'i}cio Saul and Santos, Selby Mykael Lima dos and Sousa, Rejani Bandeira Vieira},
	year={2026}
	}
	```

	## Acknowledgments

	This work was supported by Aia Context Ltda. and by FINEP -- Funding Authority for Studies and Projects, a Brazilian government agency for science, technology, and innovation linked to the Ministry of Science, Technology and Innovation (MCTI), under Contract No. 03.25.0080.00.

	## License

	CC-BY-4.0