---
datasets:
- tiiuae/falcon-refinedweb
language:
- en
inference: false
license: apache-2.0
base_model: tiiuae/falcon-7b
model_creator: Technology Innovation Institute
model_type: causal-lm
pipeline_tag: text-generation
---
|
|
|
|
|
# 🦅 Falcon-7B Model Card (MarkAI Hosted Version) |
|
|
|
|
|
<img src="https://mg-zon.vercel.app/_next/image?url=icons%2Fmarkai.png&w=48&q=75" width="300" alt="MarkAI Logo">
|
|
|
|
|
## Model Overview |
|
|
**Falcon-7B** is a 7-billion-parameter causal decoder-only model developed by the [Technology Innovation Institute (TII)](https://www.tii.ae). This repository hosts the original model weights as part of MarkAI's model collection.
|
|
|
|
|
## Technical Specifications |
|
|
### Architecture |
|
|
| Component             | Specification                          |
|-----------------------|----------------------------------------|
| Model Type            | Causal decoder-only                    |
| Attention Mechanism   | Multi-query attention + FlashAttention |
| Positional Embeddings | Rotary positional embeddings (RoPE)    |
| Normalization         | Single LayerNorm per block             |
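
The rotary positional embeddings listed above encode each token's position by rotating pairs of query/key dimensions through position-dependent angles, so relative position falls out of the dot product. A minimal NumPy sketch of the idea (not Falcon's actual batched torch implementation):

```python
import numpy as np

def rotary_embed(x, base=10000.0):
    """Apply rotary positional embeddings to a (seq_len, head_dim) array.

    Illustrative sketch only: the real model rotates query/key tensors
    per attention head, in torch, with cached cos/sin tables.
    """
    seq_len, dim = x.shape
    half = dim // 2
    # One rotation frequency per pair of dimensions
    inv_freq = 1.0 / (base ** (np.arange(half) / half))
    angles = np.outer(np.arange(seq_len), inv_freq)  # (seq_len, half)
    cos, sin = np.cos(angles), np.sin(angles)
    x1, x2 = x[:, :half], x[:, half:]
    # Rotate each (x1, x2) pair by its position-dependent angle
    return np.concatenate([x1 * cos - x2 * sin, x1 * sin + x2 * cos], axis=-1)
```

Because each pair is only rotated, vector norms are preserved and position 0 is left unchanged.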
|
|
|
|
|
### Training Details |
|
|
| Parameter         | Value                                    |
|-------------------|------------------------------------------|
| Training Tokens   | 1,500B (1.5 trillion)                    |
| Training Hardware | 384 × A100 40GB GPUs (AWS P4d instances) |
| Training Time     | ~2 weeks                                 |
| Precision         | bfloat16                                 |
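
As a back-of-the-envelope check (our arithmetic, not a figure from the original card), the data budget above works out to roughly 214 training tokens per parameter:

```python
# Data budget implied by the table above
training_tokens = 1_500e9  # 1,500B tokens
parameters = 7e9           # 7B parameters

tokens_per_param = training_tokens / parameters
print(round(tokens_per_param))  # 214
```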
|
|
|
|
|
## Usage Examples |
|
|
|
|
|
### Text Generation |
|
|
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model_id = "ibrahimlasfar/MarkAI"

# Load weights in bfloat16; device_map="auto" places layers on available GPUs.
# trust_remote_code is only required on older transformers releases that lack
# native Falcon support.
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

inputs = tokenizer(
    "The future of artificial intelligence",
    return_tensors="pt",
).to(model.device)  # follow the model's device instead of hard-coding "cuda"

outputs = model.generate(
    **inputs,
    max_new_tokens=100,  # bounds the generated tokens, not the total length
    do_sample=True,
    top_k=10,
    pad_token_id=tokenizer.eos_token_id,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```