---
license: apache-2.0
language:
- en
library_name: transformers
tags:
- text-generation
- causal-lm
- transformer
- argonne
- pretrained
pipeline_tag: text-generation
---

# Argonne 2.0

A **4.9 billion parameter** decoder-only transformer language model trained from scratch.

## Model Architecture

| Component | Specification |
|-----------|---------------|
| **Parameters** | ~4.9B |
| **Layers** | 24 transformer blocks |
| **Hidden Size** | 4,080 |
| **Attention Heads** | 24 query / 8 key-value (GQA) |
| **Context Length** | 4,096 tokens |
| **Vocabulary Size** | 151,665 |
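
The values above can be cross-checked against the configuration shipped with the checkpoint. The sketch below is a minimal example assuming the config exposes these fields under the usual `transformers` attribute names; a custom architecture loaded with `trust_remote_code` may use different names.

```python
from transformers import AutoConfig

# Download and parse only the configuration file (no model weights).
config = AutoConfig.from_pretrained(
    "PursuitOfDataScience/Argonne-2.0",
    trust_remote_code=True,
)

# Attribute names below follow common transformers conventions and are assumptions;
# a custom config class may expose them differently.
print(getattr(config, "num_hidden_layers", None))        # expected: 24
print(getattr(config, "hidden_size", None))              # expected: 4080
print(getattr(config, "num_attention_heads", None))      # expected: 24
print(getattr(config, "num_key_value_heads", None))      # expected: 8 (GQA)
print(getattr(config, "max_position_embeddings", None))  # expected: 4096
print(getattr(config, "vocab_size", None))               # expected: 151665
```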

## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

# Load the model in bfloat16 and let device_map place it across available devices.
model = AutoModelForCausalLM.from_pretrained(
    "PursuitOfDataScience/Argonne-2.0",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(
    "PursuitOfDataScience/Argonne-2.0",
    trust_remote_code=True,
)

prompt = "The future of AI is"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

# Sample up to 256 new tokens with moderate temperature.
outputs = model.generate(**inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
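
For quick experiments, the same generation can also be run through the high-level `pipeline` API. The snippet below is a minimal sketch assuming the checkpoint works with the standard text-generation pipeline when loaded with `trust_remote_code`.

```python
from transformers import pipeline
import torch

# Build a text-generation pipeline around the checkpoint.
generator = pipeline(
    "text-generation",
    model="PursuitOfDataScience/Argonne-2.0",
    torch_dtype=torch.bfloat16,
    device_map="auto",
    trust_remote_code=True,
)

result = generator(
    "The future of AI is",
    max_new_tokens=128,
    do_sample=True,
    temperature=0.7,
)
print(result[0]["generated_text"])
```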

## License

Apache 2.0

## Citation

```bibtex
@misc{argonne2,
  author = {PursuitOfDataScience},
  title = {Argonne 2.0: A 4.9B Parameter Language Model},
  year = {2026},
  publisher = {Hugging Face},
  url = {https://huggingface.co/PursuitOfDataScience/Argonne-2.0}
}
```

## Links

- GitHub: [PursuitOfDataScience](https://github.com/PursuitOfDataScience)
- Hugging Face: [PursuitOfDataScience](https://huggingface.co/PursuitOfDataScience)