Update model card: v3 training — 4 passes, 6 new engineering domains, 23,850 examples
Browse files
README.md
CHANGED
|
@@ -1,13 +1,101 @@
|
|
| 1 |
---
|
|
|
|
|
|
|
| 2 |
license: gemma
|
|
|
|
| 3 |
library_name: mlx
|
| 4 |
pipeline_tag: text-generation
|
| 5 |
-
extra_gated_heading: Access Gemma on Hugging Face
|
| 6 |
-
extra_gated_prompt: To access Gemma on Hugging Face, you’re required to review and
|
| 7 |
-
agree to Google’s usage license. To do this, please ensure you’re logged in to Hugging
|
| 8 |
-
Face and click below. Requests are processed immediately.
|
| 9 |
-
extra_gated_button_content: Acknowledge license
|
| 10 |
-
base_model: mlx-community/gemma-3-12b-it-4bit
|
| 11 |
tags:
|
| 12 |
- mlx
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 13 |
---
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 1 |
---
|
| 2 |
+
language:
|
| 3 |
+
- en
|
| 4 |
license: gemma
|
| 5 |
+
base_model: google/gemma-3-12b-it
|
| 6 |
library_name: mlx
|
| 7 |
pipeline_tag: text-generation
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
| 8 |
tags:
|
| 9 |
- mlx
|
| 10 |
+
- aerospace
|
| 11 |
+
- engineering
|
| 12 |
+
- thermodynamics
|
| 13 |
+
- mathematics
|
| 14 |
+
- finance
|
| 15 |
+
- coding
|
| 16 |
+
- signals
|
| 17 |
+
- statics
|
| 18 |
+
- dynamics
|
| 19 |
+
- mechanics-of-materials
|
| 20 |
+
- controls
|
| 21 |
+
- manufacturing
|
| 22 |
+
- lora
|
| 23 |
+
- fine-tuned
|
| 24 |
+
- chain-of-thought
|
| 25 |
---
|
| 26 |
+
|
| 27 |
+
# gemma3-12b-engineering
|
| 28 |
+
|
| 29 |
+
A fine-tuned version of [Gemma 3 12B IT](https://huggingface.co/google/gemma-3-12b-it) specialized for **aerospace engineering, thermodynamics, advanced mathematics, coding, finance, and 6 additional engineering disciplines**.
|
| 30 |
+
|
| 31 |
+
## Model Details
|
| 32 |
+
|
| 33 |
+
- **Base model:** google/gemma-3-12b-it (4-bit quantized via MLX)
|
| 34 |
+
- **Fine-tuning method:** QLoRA (MLX/LoRA) — 4 sequential training passes
|
| 35 |
+
- **Format:** MLX 4-bit quantized safetensors (~6.7 GB)
|
| 36 |
+
- **Hardware:** Apple MacBook Air M4 16GB
|
| 37 |
+
|
| 38 |
+
## Training Summary
|
| 39 |
+
|
| 40 |
+
| Pass | Dataset | Examples | Best Val Loss |
|
| 41 |
+
|------|---------|----------|---------------|
|
| 42 |
+
| v2 domain | MetaMathQA, Open-Platypus, OpenHermes STEM, ArXiv QA, SciQ, WikiText, WikiQA, CAMEL Physics/Math, CodeAlpaca, Finance | ~56K | 0.617 |
|
| 43 |
+
| CoT reasoning | nvidia/OpenMathReasoning, Open-Platypus CoT, MetaMathQA CoT, handcrafted aerospace | ~11K | 0.439 |
|
| 44 |
+
| Precision | Handcrafted aerospace/thermo — correct R=8314/M derivation (never R=287 for custom propellants) | ~60 | 0.620 |
|
| 45 |
+
| v3 comprehensive | 6 new engineering domains + NuminaMath-CoT, Magicoder, Finance-Alpaca, OpenHermes STEM, CodeFeedback | 23,850 | 0.689 |
|
| 46 |
+
|
| 47 |
+
**LoRA config:** rank=16, alpha=32, lora_layers=4, keys=[q_proj, v_proj], LR=2e-6
|
| 48 |
+
|
| 49 |
+
## Capabilities
|
| 50 |
+
|
| 51 |
+
- **Aerospace:** Isentropic flow, normal shocks, Brayton/Rankine cycles, rocket nozzles, Hohmann transfers
|
| 52 |
+
- **Thermodynamics:** Carnot, heat exchangers, entropy, propellant property derivation
|
| 53 |
+
- **Signals & Systems:** Laplace transforms, Bode plots, Z-transforms, Fourier series, stability, RC filters, sampling
|
| 54 |
+
- **Statics:** Equilibrium, beam reactions, trusses, centroids, moments of inertia, friction, frames
|
| 55 |
+
- **Dynamics:** Kinematics, Newton's 2nd law, work-energy, impulse-momentum, rotation, vibrations
|
| 56 |
+
- **Mechanics of Materials:** Axial stress, torsion, bending, Mohr's circle, Euler buckling, thermal stress, deflection
|
| 57 |
+
- **Controls:** Routh-Hurwitz, PID design, state-space, root locus, steady-state error, block diagrams, time specs
|
| 58 |
+
- **Manufacturing:** Turning/milling, Taylor tool life, tolerances, Chvorinov's rule, grinding, machining time
|
| 59 |
+
- **Mathematics:** ODEs, linear algebra, RK4, Newton-Raphson, induction proofs, competition-level math
|
| 60 |
+
- **Coding:** Python, C++, Java, JavaScript, numerical solvers
|
| 61 |
+
- **Finance:** Black-Scholes, NPV, DCF, engineering economics, portfolio theory
|
| 62 |
+
|
| 63 |
+
## Chain-of-Thought Reasoning
|
| 64 |
+
|
| 65 |
+
Activate step-by-step reasoning with this system prompt:
|
| 66 |
+
|
| 67 |
+
```
|
| 68 |
+
You are an expert aerospace engineer. Always reason step by step inside <think> tags before giving your final answer.
|
| 69 |
+
```
|
| 70 |
+
|
| 71 |
+
## Usage (MLX on Apple Silicon)
|
| 72 |
+
|
| 73 |
+
```python
|
| 74 |
+
from mlx_lm import load, generate
|
| 75 |
+
|
| 76 |
+
model, tokenizer = load("vininhosts/gemma3-12b-engineering")
|
| 77 |
+
|
| 78 |
+
prompt = "A rocket nozzle has Pc=2MPa, Tc=3000K, exit Mach=3, propellant M=20g/mol, gamma=1.3. Find exit pressure."
|
| 79 |
+
response = generate(model, tokenizer, prompt=prompt, max_tokens=1024)
|
| 80 |
+
print(response)
|
| 81 |
+
```
|
| 82 |
+
|
| 83 |
+
## Key Precision: R = 8314 / M
|
| 84 |
+
|
| 85 |
+
This model correctly computes the specific gas constant from molar mass as **R = 8314 / M**,
|
| 86 |
+
and never defaults to R = 287 J/(kg·K) (air) when a different propellant molar mass is given.
|
| 87 |
+
This was enforced via a dedicated precision fine-tuning pass with 60+ handcrafted examples.
|
| 88 |
+
|
| 89 |
+
## Example Domains Covered
|
| 90 |
+
|
| 91 |
+
- Isentropic nozzle flow and normal shock relations
|
| 92 |
+
- Brayton cycle thermal efficiency and compressor work
|
| 93 |
+
- PID controller tuning and Routh-Hurwitz stability
|
| 94 |
+
- Beam deflection and Mohr's circle stress analysis
|
| 95 |
+
- Z-transform and discrete-time system stability
|
| 96 |
+
- Taylor tool life equation and machining parameters
|
| 97 |
+
- Black-Scholes option pricing and DCF valuation
|
| 98 |
+
|
| 99 |
+
## License
|
| 100 |
+
|
| 101 |
+
Derived from Gemma 3 — subject to [Gemma Terms of Use](https://ai.google.dev/gemma/terms).
|