---
license: apache-2.0
datasets:
- Allanatrix/Scientific_Research_Tokenized
language:
- en
base_model:
- meta-llama/Llama-2-7b-chat-hf
pipeline_tag: text-generation
library_name: peft
tags:
- lora
- peft
- transformers
- scientific-ml
- fine-tuned
- research-assistant
- hypothesis-generation
- scientific-writing
- scientific-reasoning
---

# Model Card for `nexa-Llama-sci7b`

## Model Details

**Model Description**: `nexa-Llama-sci7b` is a fine-tuned variant of the open-weight `meta-llama/Llama-2-7b-chat-hf` model, optimized for scientific research generation tasks such as hypothesis generation, abstract writing, and methodology completion. Fine-tuning was performed with the PEFT (Parameter-Efficient Fine-Tuning) library, applying LoRA adapters to a 4-bit quantized base model via the `bitsandbytes` backend.

This model is part of the **Nexa Scientific Intelligence** series, developed for scalable, automated scientific reasoning and domain-specific text generation.

- **Developed by**: Allan (Independent Scientific Intelligence Architect)
- **Funded by**: Self-funded
- **Shared by**: Allan ([https://huggingface.co/allan-wandia](https://huggingface.co/allan-wandia))
- **Model type**: Decoder-only transformer (causal language model)
- **Language(s)**: English (scientific domain-specific vocabulary)
- **License**: Apache 2.0
- **Fine-tuned from**: `meta-llama/Llama-2-7b-chat-hf`
- **Repository**: [https://huggingface.co/allan-wandia/nexa-Llama-sci7b](https://huggingface.co/allan-wandia/nexa-Llama-sci7b)
- **Demo**: Coming soon via Hugging Face Spaces or Lambda inference endpoint

## Uses

### Direct Use

- Scientific hypothesis generation
- Abstract and method section synthesis
- Domain-specific research writing
- Semantic completion of structured research prompts

### Downstream Use

- Fine-tuning or distillation into smaller expert models
- Foundation for test-time reasoning agents
- Seed model for bootstrapping larger synthetic scientific corpora

### Out-of-Scope Use

- General conversation or chat use cases
- Non-English scientific domains
- Legal, financial, or clinical advice generation

## Bias, Risks, and Limitations

While the model performs well on structured scientific input, it inherits biases from its base model (`meta-llama/Llama-2-7b-chat-hf`) and from its fine-tuning dataset. It may hallucinate plausible but incorrect facts, especially in low-data areas, so outputs should be evaluated by domain experts before use in high-stakes settings.
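Because generations can be fluent yet wrong, one pragmatic pattern is to sample several candidate completions and treat all of them as material for expert review rather than trusting any single output. Below is a minimal sketch of that pattern, reusing the loading code from the quickstart further down; the sampling parameters are illustrative, not tuned values from this project.

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample several candidates for side-by-side expert review.
# temperature/top_p values here are illustrative, not project settings.
outputs = model.generate(
    **inputs,
    max_new_tokens=250,
    do_sample=True,
    temperature=0.7,
    top_p=0.9,
    num_return_sequences=3,
)
for i, seq in enumerate(outputs):
    print(f"--- Candidate {i + 1} ---")
    print(tokenizer.decode(seq, skip_special_tokens=True))
```

Comparing candidates side by side makes contradictions easier to spot before any single output is taken forward.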
## Recommendations

Users should:

- Validate critical outputs against trusted scientific literature
- Avoid deploying in clinical or regulatory environments without further evaluation
- Consider additional domain fine-tuning for niche fields

## How to Get Started with the Model

```python
from transformers import AutoTokenizer, AutoModelForCausalLM

model_name = "allan-wandia/nexa-Llama-sci7b"

# This repo hosts LoRA adapter weights; with `peft` installed, recent versions
# of transformers resolve and load the underlying base model automatically.
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name, device_map="auto", torch_dtype="auto")

prompt = "Generate a novel hypothesis in quantum materials research:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_new_tokens=250)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```

## Training Details

### Training Data

- **Size**: 100 million tokens sampled from a 500M+ token corpus
- **Source**: Curated scientific literature, abstracts, methodologies, and domain-labeled corpora (Bio, Physics, QST, Astro)
- **Labeling**: Token-level labels auto-generated via the Nexa DataVault tokenizer infrastructure

### Preprocessing

- Tokenization with sequence truncation to 1024 tokens
- Labeling and batching performed on CPU; inference dispatched to GPU asynchronously

### Training Hyperparameters

- **Base model**: `meta-llama/Llama-2-7b-chat-hf`
- **Sequence length**: 1024
- **Batch size**: 1 (with gradient accumulation)
- **Gradient accumulation steps**: 64
- **Effective batch size**: 64
- **Learning rate**: 2e-5
- **Epochs**: 2
- **LoRA**: Enabled (PEFT)
- **Quantization**: 4-bit via `bitsandbytes`
- **Optimizer**: 8-bit AdamW
- **Framework**: Transformers + PEFT + Accelerate

## Environmental Impact

| Component | Value |
| --- | --- |
| Hardware type | 2× NVIDIA T4 GPUs |
| Hours used | ~7.5 |
| Cloud provider | Kaggle (Google Cloud) |
| Compute region | US |
| Carbon emitted | Estimate pending (likely < 1 kg CO2) |

## Technical Specifications

### Model Architecture

- Transformer decoder (Llama-2-7b architecture)
- LoRA adapters applied to attention and FFN layers
- Quantized to 4-bit with `bitsandbytes` for memory efficiency

### Compute Infrastructure

- CPU: Intel i5 8th Gen vPro (batch preprocessing)
- GPU: 2× NVIDIA T4 (CUDA 12.1)

### Software Stack

- PEFT 0.12.0
- Transformers 4.51.3
- Accelerate
- TRL
- Torch 2.x

## Citation

```bibtex
@misc{nexa-Llama-sci7b,
  title = {Nexa Llama Sci7b},
  author = {Allan Wandia},
  year = {2025},
  howpublished = {\url{https://huggingface.co/allan-wandia/nexa-Llama-sci7b}},
  note = {Fine-tuned model for scientific generation tasks}
}
```

## Model Card Contact

For questions, contact Allan via Hugging Face or by email: 📫 allanw.mk@gmail.com

## Model Card Authors

Allan Wandia (Independent ML Engineer and Systems Architect)

## Glossary

- **LoRA**: Low-Rank Adaptation
- **PEFT**: Parameter-Efficient Fine-Tuning
- **Safetensors**: Secure, fast format for model weights

## Links

- GitHub repo and notebook: [https://github.com/DarkStarStrix/Nexa_Auto](https://github.com/DarkStarStrix/Nexa_Auto)
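## Appendix: Fine-Tuning Configuration Sketch

For readers who want to approximate the training setup documented above, the sketch below maps the stated hyperparameters onto a Transformers + PEFT QLoRA configuration. The LoRA rank, alpha, dropout, target module names, and the exact 8-bit AdamW variant are assumptions, as they are not documented in this card; the quantization mode, sequence length, batch/accumulation sizes, learning rate, and epoch count come from the Training Hyperparameters section.

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)

base = "meta-llama/Llama-2-7b-chat-hf"

# 4-bit quantization via bitsandbytes, as stated under Training Hyperparameters.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.float16,
)

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, quantization_config=bnb_config, device_map="auto"
)
model = prepare_model_for_kbit_training(model)

# LoRA on attention and FFN projections, per the Model Architecture section;
# r, lora_alpha, and lora_dropout are assumed values, not documented ones.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)

# Values below are taken directly from the Training Hyperparameters section.
args = TrainingArguments(
    output_dir="nexa-llama-sci7b-lora",
    per_device_train_batch_size=1,
    gradient_accumulation_steps=64,  # effective batch size 64
    learning_rate=2e-5,
    num_train_epochs=2,
    optim="paged_adamw_8bit",        # 8-bit AdamW (exact variant assumed)
    fp16=True,
)

# Training would then run with trl.SFTTrainer (or transformers.Trainer) over the
# tokenized corpus, with sequences truncated to 1024 tokens as described above.
```

This is a sketch of the kind of QLoRA setup the card describes, not a verbatim reproduction of the original training script.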