---
license: apache-2.0
datasets:
- Allanatrix/Scientific_Research_Tokenized
language:
- en
base_model:
- meta-llama/Llama-2-7b-chat-hf
pipeline_tag: text-generation
library_name: peft
tags:
- lora
- peft
- transformers
- scientific-ml
- fine-tuned
- research-assistant
- hypothesis-generation
- scientific-writing
- scientific-reasoning
---

# Model Card for `nexa-Llama-sci7b`

## Model Details

**Model Description**: `nexa-Llama-sci7b` is a fine-tuned variant of the open-weight `meta-llama/Llama-2-7b-chat-hf` model, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed with the PEFT (Parameter-Efficient Fine-Tuning) library, applying LoRA adapters to a 4-bit quantized base model via the `bitsandbytes` backend.

This model is part of the **Nexa Scientific Intelligence** series, developed for scalable, automated scientific reasoning and domain-specific text generation.

- **Developed by**: Allan (Independent Scientific Intelligence Architect)
- **Funded by**: Self-funded
- **Shared by**: Allan ([https://huggingface.co/allan-wandia](https://huggingface.co/allan-wandia))
- **Model type**: Decoder-only transformer (causal language model)
- **Language(s)**: English (scientific domain-specific vocabulary)
- **License**: Apache 2.0
- **Fine-tuned from**: `meta-llama/Llama-2-7b-chat-hf`
- **Repository**: [https://huggingface.co/allan-wandia/nexa-Llama-sci7b](https://huggingface.co/allan-wandia/nexa-Llama-sci7b)
- **Demo**: Coming soon via Hugging Face Spaces or Lambda inference endpoint

## Uses

### Direct Use

- Scientific hypothesis generation
- Abstract and method section synthesis
- Domain-specific research writing
- Semantic completion of structured research prompts

### Downstream Use

- Fine-tuning or distillation into smaller expert models
- Foundation for test-time reasoning agents
- Seed model for bootstrapping larger synthetic scientific corpora

### Out-of-Scope Use

- General conversation or chat use cases
- Non-English scientific domains
- Legal, financial, or clinical advice generation

## Bias, Risks, and Limitations

While the model performs well on structured scientific input, it inherits biases from its base model (`meta-llama/Llama-2-7b-chat-hf`) and from its fine-tuning dataset. It may hallucinate plausible but incorrect facts, especially in low-data areas, so outputs should be evaluated by domain experts before use in high-stakes settings.
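Because generations can be fluent yet wrong, one pragmatic pattern is to sample several candidate completions and treat all of them as material for expert review rather than trusting any single output. Below is a minimal sketch of that pattern, reusing the loading code from the quickstart further down; the sampling parameters are illustrative, not tuned values from this project.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample several candidates for side-by-side expert review.
# temperature/top_p values here are illustrative, not project settings.
outputs = model.generate(
    **inputs,
    max_new_tokens=250,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=3,
)
for i, seq in enumerate(outputs):
    print(f"--- Candidate {i + 1} ---")
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Comparing candidates side by side makes contradictions easier to spot before any single output is taken forward.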
## Recommendations

Users should:

- Validate critical outputs against trusted scientific literature
- Avoid deploying in clinical or regulatory environments without further evaluation
- Consider additional domain fine-tuning for niche fields

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"

# This repo hosts LoRA adapter weights; with `peft` installed, recent versions
# of transformers resolve and load the underlying base model automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- **Size**: 100 million tokens sampled from a 500M+ token corpus
- **Source**: Curated scientific literature, abstracts, methodologies, and domain-labeled corpora (Bio, Physics, QST, Astro)
- **Labeling**: Token-level labels auto-generated via the Nexa DataVault tokenizer infrastructure

### Preprocessing

- Tokenization with sequence truncation to 1024 tokens
- Labeling and batching performed on CPU; inference dispatched to GPU asynchronously

### Training Hyperparameters

- **Base model**: `meta-llama/Llama-2-7b-chat-hf`
- **Sequence length**: 1024
- **Batch size**: 1 (with gradient accumulation)
- **Gradient accumulation steps**: 64
- **Effective batch size**: 64
- **Learning rate**: 2e-5
- **Epochs**: 2
- **LoRA**: Enabled (PEFT)
- **Quantization**: 4-bit via `bitsandbytes`
- **Optimizer**: 8-bit AdamW
- **Framework**: Transformers + PEFT + Accelerate

## Environmental Impact

| Component | Value |
| --- | --- |
| Hardware type | 2× NVIDIA T4 GPUs |
| Hours used | ~7.5 |
| Cloud provider | Kaggle (Google Cloud) |
| Compute region | US |
| Carbon emitted | Estimate pending (likely < 1 kg CO2) |

## Technical Specifications

### Model Architecture

- Transformer decoder (Llama-2-7b architecture)
- LoRA adapters applied to attention and FFN layers
- Quantized to 4-bit with `bitsandbytes` for memory efficiency

### Compute Infrastructure

- CPU: Intel i5 8th Gen vPro (batch preprocessing)
- GPU: 2× NVIDIA T4 (CUDA 12.1)

### Software Stack

- PEFT 0.12.0
- Transformers 4.51.3
- Accelerate
- TRL
- Torch 2.x

## Citation

```bibtex
@misc{nexa-Llama-sci7b,
  title = {Nexa Llama Sci7b},
  author = {Allan Wandia},
  year = {2025},
  howpublished = {\url{https://huggingface.co/allan-wandia/nexa-Llama-sci7b}},
  note = {Fine-tuned model for scientific generation tasks}
}
```

## Model Card Contact

For questions, contact Allan via Hugging Face or by email: 📫 allanw.mk@gmail.com

## Model Card Authors

Allan Wandia (Independent ML Engineer and Systems Architect)

## Glossary

- **LoRA**: Low-Rank Adaptation
- **PEFT**: Parameter-Efficient Fine-Tuning
- **Safetensors**: Secure, fast format for model weights

## Links

- GitHub repo and notebook: [https://github.com/DarkStarStrix/Nexa_Auto](https://github.com/DarkStarStrix/Nexa_Auto)
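## Appendix: Fine-Tuning Configuration Sketch

For readers who want to approximate the training setup documented above, the sketch below maps the stated hyperparameters onto a Transformers + PEFT QLoRA configuration. The LoRA rank, alpha, dropout, target module names, and the exact 8-bit AdamW variant are assumptions, as they are not documented in this card; the quantization mode, sequence length, batch/accumulation sizes, learning rate, and epoch count come from the Training Hyperparameters section.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)

base = "meta-llama/Llama-2-7b-chat-hf"

# 4-bit quantization via bitsandbytes, as stated under Training Hyperparameters.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA on attention and FFN projections, per the Model Architecture section;
# r, lora_alpha, and lora_dropout are assumed values, not documented ones.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Values below are taken directly from the Training Hyperparameters section.
args = TrainingArguments(
    output_dir="nexa-llama-sci7b-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,  # effective batch size 64
    learning_rate=2e-5,
    num_train_epochs=2,
    optim="paged_adamw_8bit",        # 8-bit AdamW (exact variant assumed)
    fp16=True,
)

# Training would then run with trl.SFTTrainer (or transformers.Trainer) over the
# tokenized corpus, with sequences truncated to 1024 tokens as described above.
```

This is a sketch of the kind of QLoRA setup the card describes, not a verbatim reproduction of the original training script.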