---
license: apache-2.0
datasets:
- Allanatrix/Scientific_Research_Tokenized
language:
- en
base_model:
- meta-llama/Llama-2-7b-chat-hf
pipeline_tag: text-generation
library_name: peft
tags:
- lora
- peft
- transformers
- scientific-ml
- fine-tuned
- research-assistant
- hypothesis-generation
- scientific-writing
- scientific-reasoning
---
# Model Card for `nexa-Llama-sci7b`
## Model Details
**Model Description**:
`nexa-Llama-sci7b` is a fine-tuned variant of the open-weight `meta-llama/Llama-2-7b-chat-hf` model, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed with the PEFT (Parameter-Efficient Fine-Tuning) library using LoRA, with the base model loaded in 4-bit quantized mode via the `bitsandbytes` backend.
This model is part of the **Nexa Scientific Intelligence** series, developed for scalable, automated scientific reasoning and domain-specific text generation.
- **Developed by**: Allan (Independent Scientific Intelligence Architect)
- **Funded by**: Self-funded
- **Shared by**: Allan ([https://huggingface.co/allan-wandia](https://huggingface.co/allan-wandia))
- **Model type**: Decoder-only transformer (causal language model)
- **Language(s)**: English (scientific domain-specific vocabulary)
- **License**: Apache 2.0 (inherits from base model)
- **Fine-tuned from**: `meta-llama/Llama-2-7b-chat-hf`
- **Repository**: [https://huggingface.co/allan-wandia/nexa-Llama-sci7b](https://huggingface.co/allan-wandia/nexa-Llama-sci7b)
- **Demo**: Coming soon via Hugging Face Spaces or Lambda inference endpoint
## Uses
### Direct Use
- Scientific hypothesis generation
- Abstract and method section synthesis
- Domain-specific research writing
- Semantic completion of structured research prompts
### Downstream Use
- Fine-tuning or distillation into smaller expert models
- Foundation for test-time reasoning agents
- Seed model for bootstrapping larger synthetic scientific corpora
### Out-of-Scope Use
- General conversation or chat use cases
- Non-English scientific domains
- Legal, financial, or clinical advice generation
## Bias, Risks, and Limitations
While the model performs well on structured scientific input, it inherits biases from its base model (`meta-llama/Llama-2-7b-chat-hf`) and its fine-tuning dataset. Outputs should be evaluated by domain experts before use in high-stakes settings: the model may hallucinate plausible but incorrect facts, especially in areas underrepresented in the training data.
## Recommendations
Users should:
- Validate critical outputs against trusted scientific literature
- Avoid deploying in clinical or regulatory environments without further evaluation
- Consider additional domain fine-tuning for niche fields
## How to Get Started with the Model
```python
# This repo ships a PEFT/LoRA adapter, so `peft` must be installed alongside
# `transformers`; recent transformers versions attach the adapter automatically.
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name, device_map="auto", torch_dtype="auto"
)

prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
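Because the repository holds LoRA adapter weights rather than merged full weights (`library_name: peft`), the adapter can also be attached to the base model explicitly. A minimal sketch using standard `peft` calls:

```python
# Explicit adapter loading: attach this repo's LoRA weights to the base model.
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", device_map="auto", torch_dtype="auto"
)
model = PeftModel.from_pretrained(base, "allan-wandia/nexa-Llama-sci7b")
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
```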
## Training Details
### Training Data
- Size: 100 million tokens sampled from a 500M+ token corpus
- Source: Curated scientific literature, abstracts, methodologies, and domain-labeled corpora (Bio, Physics, QST, Astro)
- Labeling: Token-level labels auto-generated via Nexa DataVault tokenizer infrastructure
### Preprocessing
- Tokenization with sequence truncation to 1024 tokens
- Labeling and batching performed on CPU, with GPU work dispatched asynchronously (a tokenization sketch follows below)
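The tokenizer pipeline itself is internal to Nexa DataVault; the following is a minimal hypothetical sketch of the truncation step, where the `"text"` column name and the padding strategy are assumptions:

```python
# Hypothetical sketch of the tokenization step; the actual Nexa DataVault
# pipeline is not public, and the "text" column name is an assumption.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-chat-hf")
tokenizer.pad_token = tokenizer.eos_token  # Llama-2 defines no pad token

def preprocess(examples):
    batch = tokenizer(
        examples["text"],
        truncation=True,
        max_length=1024,       # training sequence length used above
        padding="max_length",
    )
    batch["labels"] = batch["input_ids"].copy()  # causal-LM labels
    return batch
```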
### Training Hyperparameters
- Base model: `meta-llama/Llama-2-7b-chat-hf`
- Sequence length: 1024
- Batch size: 1 (with gradient accumulation)
- Gradient accumulation steps: 64
- Effective batch size: 64
- Learning rate: 2e-5
- Epochs: 2
- LoRA: enabled (PEFT)
- Quantization: 4-bit via `bitsandbytes`
- Optimizer: 8-bit AdamW
- Framework: Transformers + PEFT + Accelerate
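As a concrete reference, the following sketches a configuration matching these hyperparameters with the standard `transformers`, `peft`, and `bitsandbytes` APIs. The LoRA rank, alpha, dropout, and `target_modules` are illustrative assumptions, not confirmed values from the actual run:

```python
# Illustrative training setup matching the hyperparameters above; LoRA
# rank/alpha/dropout and target_modules are assumptions, not confirmed values.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig, TrainingArguments
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                       # 4-bit quantization via bitsandbytes
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf",
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    # Attention and FFN projections (see Model Architecture below).
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)
model = get_peft_model(model, lora_config)

training_args = TrainingArguments(
    output_dir="nexa-llama-sci7b",           # illustrative path
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,          # effective batch size 64
    learning_rate=2e-5,
    num_train_epochs=2,
    optim="adamw_bnb_8bit",                  # 8-bit AdamW
    fp16=True,
)
```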
## Environmental Impact

| Component | Value |
|---|---|
| Hardware type | 2× NVIDIA T4 GPUs |
| Hours used | ~7.5 |
| Cloud provider | Kaggle (Google Cloud) |
| Compute region | US |
| Carbon emitted | Estimate pending (likely < 1 kg CO2) |
## Technical Specifications
### Model Architecture
- Transformer decoder (Llama-2-7b architecture)
- LoRA adapters applied to attention and FFN layers
- Quantized to 4-bit with `bitsandbytes` for memory efficiency
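Because the LoRA adapters are stored separately from the base weights, they can be merged for standalone deployment. A minimal sketch using the standard `merge_and_unload` call from `peft` (the output directory name is illustrative):

```python
# Merge the LoRA adapters into the base weights for standalone deployment;
# load the base model in full/half precision (not 4-bit) before merging.
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-chat-hf", torch_dtype="auto"
)
merged = PeftModel.from_pretrained(base, "allan-wandia/nexa-Llama-sci7b")
merged = merged.merge_and_unload()
merged.save_pretrained("nexa-llama-sci7b-merged")  # illustrative path
```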
### Compute Infrastructure
- CPU: Intel i5 8th Gen vPro (batch preprocessing)
- GPU: 2× NVIDIA T4 (CUDA 12.1)
### Software Stack
- PEFT 0.12.0
- Transformers 4.51.3
- Accelerate
- TRL
- Torch 2.x
## Citation
```bibtex
@misc{nexa-Llama-sci7b,
  title        = {Nexa Llama Sci7b},
  author       = {Allan Wandia},
  year         = {2025},
  howpublished = {\url{https://huggingface.co/allan-wandia/nexa-Llama-sci7b}},
  note         = {Fine-tuned model for scientific generation tasks}
}
```
## Model Card Contact
For questions, contact Allan via Hugging Face or by email: 📫 allanw.mk@gmail.com
## Model Card Authors
Allan Wandia (Independent ML Engineer and Systems Architect)
## Glossary
- **LoRA**: Low-Rank Adaptation
- **PEFT**: Parameter-Efficient Fine-Tuning
- **Safetensors**: a secure, fast serialization format for model weights
## Links
- GitHub repo and notebook: [https://github.com/DarkStarStrix/Nexa_Auto](https://github.com/DarkStarStrix/Nexa_Auto)