| base_model: mistralai/Mistral-7B-Instruct-v0.3 | |
| library_name: peft | |
| license: apache-2.0 | |
| pipeline_tag: text-generation | |
| tags: | |
| - lora | |
| - code-generation | |
| - neural-architecture-search | |
| - delta-nas | |
| - pytorch | |
| # Delta-NAS Mistral-7B-Instruct LoRA Adapter | |
| This is a LoRA adapter for [Mistral-7B-Instruct-v0.3](https://huggingface.co/mistralai/Mistral-7B-Instruct-v0.3), fine-tuned for **delta-based Neural Architecture Search (NAS)**: generating novel PyTorch image-classification architectures via unified code diffs. The repository holds adapter weights only, so the adapter must be loaded on top of the base model. | |
| ## Model Description | |
| This adapter is the result of 22 iterative fine-tuning cycles on the delta-NAS pipeline described in **"Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs"**. The model generates unified diffs that modify a baseline neural network architecture to produce new, functional PyTorch models. | |
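For readers unfamiliar with the output format, the sketch below shows what a unified diff between a baseline and a variant looks like. It uses Python's standard `difflib`; the `Net` class, the file names, and the specific change (widening a convolution) are hypothetical and are not taken from the delta-NAS pipeline.

```python
import difflib

# Hypothetical baseline source; the real pipeline's baselines are not shown in this card.
baseline = """import torch.nn as nn

class Net(nn.Module):
    def __init__(self, num_classes=10):
        super().__init__()
        self.conv = nn.Conv2d(3, 32, kernel_size=3, padding=1)
        self.head = nn.Linear(32, num_classes)
"""

# Hypothetical variant: the channel width is doubled in both layers.
variant = baseline.replace("nn.Conv2d(3, 32,", "nn.Conv2d(3, 64,").replace(
    "nn.Linear(32,", "nn.Linear(64,"
)

# unified_diff yields the header lines, hunk markers, and +/- lines.
diff_text = "".join(
    difflib.unified_diff(
        baseline.splitlines(keepends=True),
        variant.splitlines(keepends=True),
        fromfile="a/baseline.py",
        tofile="b/variant.py",
    )
)
print(diff_text)
```

The printed text is the kind of artifact the model is asked to produce: lines prefixed with `-` are removed from the baseline and lines prefixed with `+` are added.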
| ### Training Details | |
| - **Base model**: `mistralai/Mistral-7B-Instruct-v0.3` | |
| - **Fine-tuning method**: LoRA (Low-Rank Adaptation) | |
| - **LoRA rank (r)**: 16 | |
| - **LoRA alpha**: 32 | |
| - **LoRA dropout**: 0.05 | |
| - **Target modules**: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | |
| - **Training cycles**: 22 (iterative self-improvement) | |
| - **Total trained candidates**: 733 | |
| - **Admitted novel architectures**: 68 (passed the MinHash-Jaccard novelty filter and the accuracy threshold τ_acc ≥ 0.40) | |
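As a rough sanity check on the LoRA settings above, the following sketch estimates how many trainable parameters the adapter adds. The layer shapes are assumptions taken from the published Mistral-7B-v0.3 configuration (hidden size 4096, MLP size 14336, 8 key/value heads of dimension 128, 32 layers); they are not stated in this card. It also assumes standard LoRA scaling (`alpha / r`) and no extra trainable modules.

```python
# Assumed shapes from the base model's published config (not stated in this card).
HIDDEN = 4096
INTERMEDIATE = 14336
KV_DIM = 8 * 128  # 8 key/value heads x head dimension 128
NUM_LAYERS = 32

# Values from the Training Details list.
R = 16
ALPHA = 32

# (in_features, out_features) of each targeted projection in one decoder layer.
target_shapes = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, KV_DIM),
    "v_proj": (HIDDEN, KV_DIM),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features: int, out_features: int, rank: int) -> int:
    """LoRA adds two matrices per module: A (rank x in) and B (out x rank)."""
    return rank * (in_features + out_features)


per_layer = sum(lora_params(i, o, R) for i, o in target_shapes.values())
total = per_layer * NUM_LAYERS
scaling = ALPHA / R  # factor applied to the low-rank update under standard LoRA

print(f"per layer: {per_layer:,}")
print(f"total:     {total:,}")
print(f"scaling:   {scaling}")
```

Under these assumptions the adapter adds about 41.9M trainable parameters, well under 1% of the roughly 7B parameters in the base model.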
| ### Evaluation Datasets | |
| Models were evaluated on 6 LEMUR image-classification benchmarks: | |
| - CIFAR-10, CIFAR-100, MNIST, SVHN, ImageNette, CelebA-Gender | |
| ### Key Results | |
| | Metric | Value | | |
| |--------|-------| | |
| | Trained candidates | 733 | | |
| | Valid rate (compiles + trains) | 66.4% | | |
| | Mean 1-epoch accuracy | 50.0% (±8.1% SD across cycles) | | |
| | ≥40% accuracy rate | 58.4% | | |
| | Novel architectures admitted to LEMUR | 68 | | |
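The admission rule listed under Training Details (a MinHash-Jaccard novelty filter plus the accuracy threshold τ_acc ≥ 0.40) can be illustrated with a minimal sketch. Only the 0.40 threshold comes from this card. The shingle width, signature length, similarity cutoff, and the `admit` helper are hypothetical choices made for illustration, and the actual pipeline may differ.

```python
import hashlib

TAU_ACC = 0.40         # accuracy threshold stated in this card
NUM_HASHES = 64        # hypothetical signature length
SHINGLE_SIZE = 5       # hypothetical shingle width, in whitespace tokens
MAX_SIMILARITY = 0.8   # hypothetical cutoff: at or above this counts as a near-duplicate


def shingles(code: str, k: int = SHINGLE_SIZE) -> set:
    """Split code into overlapping k-token shingles."""
    tokens = code.split()
    if len(tokens) < k:
        return {" ".join(tokens)}
    return {" ".join(tokens[i:i + k]) for i in range(len(tokens) - k + 1)}


def minhash(code: str) -> list:
    """One minimum hash value per seeded hash function."""
    items = shingles(code)
    return [
        min(
            int.from_bytes(hashlib.sha1(f"{seed}:{s}".encode()).digest()[:8], "big")
            for s in items
        )
        for seed in range(NUM_HASHES)
    ]


def estimated_jaccard(sig_a: list, sig_b: list) -> float:
    """Fraction of matching signature positions estimates Jaccard similarity."""
    return sum(a == b for a, b in zip(sig_a, sig_b)) / len(sig_a)


def admit(code: str, accuracy: float, admitted: list) -> bool:
    """Admit a candidate only if it is accurate enough and not a near-duplicate."""
    if accuracy < TAU_ACC:
        return False
    sig = minhash(code)
    if any(estimated_jaccard(sig, other) >= MAX_SIMILARITY for other in admitted):
        return False
    admitted.append(sig)
    return True


admitted_signatures = []
model_a = "class Net: conv1 = Conv2d(3, 32) relu = ReLU() pool = MaxPool2d(2) fc = Linear(32, 10)"
model_b = "def build(): return Sequential(Flatten(), Linear(784, 256), GELU(), Linear(256, 10))"
print(admit(model_a, 0.55, admitted_signatures))  # accurate and new
print(admit(model_a, 0.60, admitted_signatures))  # exact duplicate of an admitted model
print(admit(model_b, 0.30, admitted_signatures))  # below the accuracy threshold
```

MinHash keeps the comparison cheap: each candidate is reduced to a fixed-length signature, so novelty checks against the admitted set do not need to re-tokenize earlier models.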
| ## Usage | |
| ```python | |
| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| from peft import PeftModel | |
| # Load base model | |
| base_model = AutoModelForCausalLM.from_pretrained( | |
|     "mistralai/Mistral-7B-Instruct-v0.3", | |
|     torch_dtype="auto", | |
|     device_map="auto", | |
| ) | |
| tokenizer = AutoTokenizer.from_pretrained("mistralai/Mistral-7B-Instruct-v0.3") | |
| # Load LoRA adapter | |
| model = PeftModel.from_pretrained(base_model, "ABrain/Delta-NAS-Mistral-7B") | |
| # Generate a diff to modify a baseline architecture | |
| prompt = """Given the following PyTorch neural network baseline: | |
| [baseline code here] | |
| Generate a unified diff that creates a novel architecture variant.""" | |
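| inputs = tokenizer(prompt, return_tensors="pt").to(model.device) | |
| outputs = model.generate(**inputs, max_new_tokens=512) | |
| # Decode only the newly generated tokens so the prompt is not echoed back | |
| new_tokens = outputs[0][inputs["input_ids"].shape[1]:] | |
| print(tokenizer.decode(new_tokens, skip_special_tokens=True)) | |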
| ``` | |
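The generated text is a diff, not a model, so it still has to be applied to the baseline and checked. The sketch below is one minimal way to do that with the standard library only. `apply_unified_diff` and `is_valid_python` are hypothetical helpers written for this example, not part of the nn-gpt code; the applier assumes a single-file diff with correct hunk headers, and the syntax check covers only the "compiles" half of the valid-rate criterion.

```python
import ast
import re

HUNK_HEADER = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def apply_unified_diff(source: str, diff_text: str) -> str:
    """Apply a single-file unified diff to source; raise ValueError on mismatch."""
    src = source.splitlines()
    out, pos, in_hunk = [], 0, False
    for line in diff_text.splitlines():
        header = HUNK_HEADER.match(line)
        if header:
            start = int(header.group(1))
            old_count = int(header.group(2) or 1)
            # Hunk starts are 1-based; a zero-length old range means "insert after".
            hunk_start = start - 1 if old_count else start
            out.extend(src[pos:hunk_start])  # copy untouched lines before the hunk
            pos, in_hunk = hunk_start, True
        elif not in_hunk or line.startswith("\\"):
            continue  # file headers, or "\ No newline at end of file"
        elif line.startswith("+"):
            out.append(line[1:])
        else:
            # Context (" ") and removal ("-") lines must match the source exactly.
            if pos >= len(src) or src[pos] != line[1:]:
                raise ValueError(f"hunk does not match source at line {pos + 1}")
            if not line.startswith("-"):
                out.append(src[pos])
            pos += 1
    out.extend(src[pos:])  # copy the remainder after the last hunk
    return "\n".join(out) + "\n"


def is_valid_python(code: str) -> bool:
    """Syntax check only; it does not import torch or train the model."""
    try:
        ast.parse(code)
    except SyntaxError:
        return False
    return True


# Hypothetical baseline and model output, for illustration only.
baseline = "\n".join([
    "import torch.nn as nn",
    "",
    "class Net(nn.Module):",
    "    def __init__(self):",
    "        super().__init__()",
    "        self.fc = nn.Linear(784, 10)",
]) + "\n"

generated_diff = "\n".join([
    "--- a/baseline.py",
    "+++ b/variant.py",
    "@@ -3,4 +3,5 @@",
    " class Net(nn.Module):",
    "     def __init__(self):",
    "         super().__init__()",
    "-        self.fc = nn.Linear(784, 10)",
    "+        self.fc1 = nn.Linear(784, 128)",
    "+        self.fc2 = nn.Linear(128, 10)",
])

patched = apply_unified_diff(baseline, generated_diff)
print(patched)
print("syntactically valid:", is_valid_python(patched))
```

Raising on any context mismatch is a deliberately strict choice: a diff that does not line up with the baseline is rejected instead of being applied approximately. For production use, a tested patch tool such as `git apply` may be a safer option.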
| ## Associated Resources | |
| - **Code**: [ABrain-One/nn-gpt](https://github.com/ABrain-One/nn-gpt) | |
| - **Generated models**: [ABrain-One/nn-dataset PR #204](https://github.com/ABrain-One/nn-dataset/pull/204) (197 del-* prefixed architectures) | |
| - **Paper**: "Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs" (submitted to CVPR 2026) | |
| ## Citation | |
| ```bibtex | |
| @article{deltanas2026, | |
| title={Delta-Based Neural Architecture Search: LLM Fine-Tuning via Code Diffs}, | |
| author={Adhikari, Santosh and Ignatov, Dmitry}, | |
| year={2026} | |
| } | |
| ``` | |
| ## License | |
| Apache 2.0 License (same as the base model) | |