---
license: apache-2.0
datasets:
- Allanatrix/Scientific_Research_Tokenized
language:
- en
base_model:
- meta-llama/Llama-2-7b-chat-hf
pipeline_tag: text-generation
library_name: peft
tags:
- lora
- peft
- transformers
- scientific-ml
- fine-tuned
- research-assistant
- hypothesis-generation
- scientific-writing
- scientific-reasoning
---

# Model Card for `nexa-Llama-sci7b`
|
## Model Details

**Model Description**:
`nexa-Llama-sci7b` is a fine-tuned variant of the open-weight `meta-llama/Llama-2-7b-chat-hf` model, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning used LoRA via the PEFT (Parameter-Efficient Fine-Tuning) library, with the base model loaded in 4-bit precision through the `bitsandbytes` backend.

This model is part of the **Nexa Scientific Intelligence** series, developed for scalable, automated scientific reasoning and domain-specific text generation.

- **Developed by**: Allan (Independent Scientific Intelligence Architect)
- **Funded by**: Self-funded
- **Shared by**: Allan ([https://huggingface.co/allan-wandia](https://huggingface.co/allan-wandia))
- **Model type**: Decoder-only transformer (causal language model)
- **Language(s)**: English (scientific domain-specific vocabulary)
- **License**: Apache 2.0, as declared in the card metadata. The base model, Llama 2, is distributed under Meta's Llama 2 Community License rather than Apache 2.0, so the base weights remain subject to that license.
- **Fine-tuned from**: `meta-llama/Llama-2-7b-chat-hf`
- **Repository**: [https://huggingface.co/allan-wandia/nexa-Llama-sci7b](https://huggingface.co/allan-wandia/nexa-Llama-sci7b)
- **Demo**: Coming soon via Hugging Face Spaces or a Lambda inference endpoint
|
## Uses

### Direct Use
- Scientific hypothesis generation
- Abstract and method section synthesis
- Domain-specific research writing
- Semantic completion of structured research prompts

### Downstream Use
- Fine-tuning or distillation into smaller expert models
- Foundation for test-time reasoning agents
- Seed model for bootstrapping larger synthetic scientific corpora

### Out-of-Scope Use
- General conversation or chat use cases
- Non-English scientific text
- Legal, financial, or clinical advice generation
|
## Bias, Risks, and Limitations

Although the model is tuned for structured scientific prompts, it inherits biases from its base model (`meta-llama/Llama-2-7b-chat-hf`) and from the fine-tuning dataset. It may hallucinate plausible but incorrect facts, especially in fields that are sparsely represented in the training data. Outputs should be reviewed by domain experts before use in high-stakes settings.

### Recommendations

Users should:
- Validate critical outputs against trusted scientific literature
- Avoid deploying in clinical or regulatory environments without further evaluation
- Consider additional domain fine-tuning for niche fields
|
## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"

# If the repository stores LoRA adapter weights (the card lists library_name: peft),
# the `peft` package must be installed so this call can resolve the base model.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

# Tokenize the prompt and move the tensors to the model's device.
prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)

# The decoded text contains the prompt followed by the generated continuation.
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
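
For memory-constrained GPUs, the adapter can also be attached explicitly to a 4-bit base model. The following is a minimal sketch, not the author's confirmed loading path: it assumes the repository holds LoRA adapter weights, and the NF4 quantization type and float16 compute dtype are illustrative choices that the card does not specify. The function name `load_with_adapter` is hypothetical.

```python
def load_with_adapter(
    adapter_id="allan-wandia/nexa-Llama-sci7b",
    base_id="meta-llama/Llama-2-7b-chat-hf",
):
    """Load the base model in 4-bit and attach the LoRA adapter (sketch)."""
    # Heavy imports live inside the function so that merely defining it
    # does not require a GPU stack to be installed.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

    # Assumption: NF4 with float16 compute; the card only says "4-bit via bitsandbytes".
    quant_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_compute_dtype=torch.float16,
    )

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(
        base_id,
        quantization_config=quant_config,
        device_map="auto",
    )

    # Assumption: `adapter_id` contains adapter weights in PEFT format.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    model.eval()
    return tokenizer, model
```

Calling `tokenizer, model = load_with_adapter()` downloads both repositories, so access to the gated base model on Hugging Face is required.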
|
## Training Details

### Training Data

- Size: 100 million tokens sampled from a 500M+ token corpus
- Source: Curated scientific literature, abstracts, methodologies, and domain-labeled corpora (Bio, Physics, QST, Astro)
- Labeling: Token-level labels auto-generated via Nexa DataVault tokenizer infrastructure

### Preprocessing

- Tokenization with sequences truncated to 1024 tokens
- Labeling and batching on CPU; inference dispatched to GPU asynchronously
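
The truncation step can be illustrated on plain lists of token ids. This is a sketch only: the function names are hypothetical, and the actual pipeline (Nexa DataVault) is not described in enough detail here to reproduce. With a Hugging Face tokenizer, the equivalent behaviour is usually obtained with `tokenizer(text, truncation=True, max_length=1024)`.

```python
MAX_LENGTH = 1024  # sequence length stated in this card


def truncate(token_ids, max_length=MAX_LENGTH):
    """Keep at most `max_length` leading token ids; shorter inputs pass through."""
    return list(token_ids[:max_length])


def preprocess(batch_of_ids, max_length=MAX_LENGTH):
    """Apply truncation to every tokenized sequence in a batch."""
    return [truncate(ids, max_length) for ids in batch_of_ids]


# Synthetic token ids stand in for real tokenizer output.
example_batch = [list(range(1500)), list(range(10))]
lengths = [len(ids) for ids in preprocess(example_batch)]
print(lengths)  # [1024, 10]
```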
|
### Training Hyperparameters

- Base model: `meta-llama/Llama-2-7b-chat-hf`
- Sequence length: 1024
- Batch size: 1 (per step, before gradient accumulation)
- Gradient accumulation steps: 64
- Effective batch size: 64
- Learning rate: 2e-5
- Epochs: 2
- LoRA: Enabled (PEFT)
- Quantization: 4-bit via `bitsandbytes`
- Optimizer: 8-bit AdamW
- Framework: Transformers + PEFT + Accelerate
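
The relationship between these values can be checked with a short calculation. The sketch below uses a hypothetical `RunConfig` class and rests on two assumptions not stated in the card: every training sequence fills the full 1024-token window, and gradients are accumulated on a single model replica.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class RunConfig:
    """Hyperparameters as listed in this card (class name is illustrative)."""

    sequence_length: int = 1024
    micro_batch_size: int = 1
    gradient_accumulation_steps: int = 64
    learning_rate: float = 2e-5
    epochs: int = 2
    train_tokens: int = 100_000_000

    @property
    def effective_batch_size(self) -> int:
        # Sequences contributing to one optimizer update.
        return self.micro_batch_size * self.gradient_accumulation_steps

    @property
    def tokens_per_optimizer_step(self) -> int:
        return self.effective_batch_size * self.sequence_length

    @property
    def optimizer_steps(self) -> int:
        # Assumption: sequences are packed to the full window.
        sequences = self.train_tokens // self.sequence_length
        steps_per_epoch = math.ceil(sequences / self.effective_batch_size)
        return steps_per_epoch * self.epochs


config = RunConfig()
print(config.effective_batch_size)       # 64
print(config.tokens_per_optimizer_step)  # 65536
print(config.optimizer_steps)            # 3052
```

If sequences are shorter than the window on average, the same token budget yields more sequences and therefore more optimizer steps than this estimate.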
|
## Environmental Impact

| Component | Value |
| --- | --- |
| Hardware Type | 2× NVIDIA T4 GPUs |
| Hours used | ~7.5 |
| Cloud Provider | Kaggle (Google Cloud) |
| Compute Region | US |
| Carbon Emitted | Estimate pending (likely < 1 kg CO2) |
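
A rough bound on the pending carbon figure can be worked out from the hardware and runtime listed here. This is a back-of-the-envelope sketch, not a measurement: it assumes both GPUs draw their full 70 W board power for the whole run, ignores CPU and data-center overhead, and uses an illustrative grid intensity of 0.4 kg CO2 per kWh that does not come from this card.

```python
def estimate_emissions(num_gpus, gpu_power_watts, hours, grid_kg_per_kwh, pue=1.0):
    """Return (energy in kWh, emissions in kg CO2) for a GPU-only estimate."""
    energy_kwh = num_gpus * gpu_power_watts * hours / 1000.0 * pue
    return energy_kwh, energy_kwh * grid_kg_per_kwh


energy_kwh, co2_kg = estimate_emissions(
    num_gpus=2,
    gpu_power_watts=70.0,  # NVIDIA T4 maximum board power
    hours=7.5,
    grid_kg_per_kwh=0.4,   # assumed grid carbon intensity
)
print(f"{energy_kwh:.2f} kWh, {co2_kg:.2f} kg CO2")  # 1.05 kWh, 0.42 kg CO2
```

Under these assumptions the result is consistent with the "likely < 1 kg CO2" note; a different grid intensity or a `pue` above 1.0 scales the figure proportionally.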
|
|
## Technical Specifications

### Model Architecture

- Transformer decoder (Llama-2-7b architecture)
- LoRA adapters applied to attention and FFN layers
- Base weights quantized to 4-bit with `bitsandbytes` for memory efficiency
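
To give a sense of scale, the number of trainable LoRA parameters can be computed from the Llama-2-7b layer shapes (hidden size 4096, feed-forward size 11008, 32 decoder layers). The sketch below is illustrative: the card does not state the LoRA rank, alpha, or dropout, so `rank=16`, `alpha=32`, and `dropout=0.05` are assumptions, as is the choice to adapt all seven projection matrices in each block.

```python
HIDDEN = 4096          # Llama-2-7b hidden size
INTERMEDIATE = 11008   # Llama-2-7b feed-forward size
NUM_LAYERS = 32        # Llama-2-7b decoder layers

# (in_features, out_features) of each adapted linear layer in one decoder block.
TARGET_SHAPES = {
    "q_proj": (HIDDEN, HIDDEN),
    "k_proj": (HIDDEN, HIDDEN),
    "v_proj": (HIDDEN, HIDDEN),
    "o_proj": (HIDDEN, HIDDEN),
    "gate_proj": (HIDDEN, INTERMEDIATE),
    "up_proj": (HIDDEN, INTERMEDIATE),
    "down_proj": (INTERMEDIATE, HIDDEN),
}


def lora_params(in_features, out_features, rank):
    """LoRA adds A (rank x in) and B (out x rank) next to a frozen weight."""
    return rank * (in_features + out_features)


def total_lora_params(rank):
    per_layer = sum(lora_params(i, o, rank) for i, o in TARGET_SHAPES.values())
    return per_layer * NUM_LAYERS


def build_lora_config(rank=16, alpha=32, dropout=0.05):
    """Build a PEFT config for the same targets (hyperparameters are assumed)."""
    from peft import LoraConfig  # imported lazily; requires `peft`

    return LoraConfig(
        r=rank,
        lora_alpha=alpha,
        lora_dropout=dropout,
        target_modules=list(TARGET_SHAPES),
        task_type="CAUSAL_LM",
    )


print(total_lora_params(rank=16))  # 39976960
```

At the assumed rank of 16 this is roughly 40 million trainable parameters, a small fraction of the frozen 7-billion-parameter base.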
|
### Compute Infrastructure

- CPU: Intel i5 8th Gen vPro (batch preprocessing)
- GPU: 2× NVIDIA T4 (CUDA 12.1)
|
### Software Stack

- PEFT 0.12.0
- Transformers 4.51.3
- Accelerate
- TRL
- Torch 2.x
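
When reproducing results, it can help to compare the local environment against this list. The following sketch uses only the Python standard library; the helper names are hypothetical, and packages without a version in the list above are only checked for presence.

```python
from importlib.metadata import PackageNotFoundError, version

# Version prefixes taken from the list above; None means "any version".
EXPECTED_PREFIX = {
    "peft": "0.12.0",
    "transformers": "4.51.3",
    "accelerate": None,
    "trl": None,
    "torch": "2.",
}


def installed_version(package):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def check_stack():
    """Map each package to (installed version, status)."""
    report = {}
    for package, prefix in EXPECTED_PREFIX.items():
        found = installed_version(package)
        if found is None:
            status = "missing"
        elif prefix is not None and not found.startswith(prefix):
            status = f"mismatch (card lists {prefix})"
        else:
            status = "ok"
        report[package] = (found, status)
    return report


for package, (found, status) in check_stack().items():
    print(f"{package:<14}{found or '-':<14}{status}")
```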
|
## Citation

```
@misc{nexa-Llama-sci7b,
  title = {Nexa Llama Sci7b},
  author = {Allan Wandia},
  year = {2025},
  howpublished = {\url{https://huggingface.co/allan-wandia/nexa-Llama-sci7b}},
  note = {Fine-tuned model for scientific generation tasks}
}
```
|
## Model Card Contact

For questions, contact Allan via Hugging Face or by email at allanw.mk@gmail.com.

## Model Card Authors

Allan Wandia (Independent ML Engineer and Systems Architect)
|
## Glossary

- LoRA: Low-Rank Adaptation
- PEFT: Parameter-Efficient Fine-Tuning
- Safetensors: A fast file format for model weights that avoids executing arbitrary code on load
|
## Links

- GitHub repo and notebook: https://github.com/DarkStarStrix/Nexa_Auto