Commit 11feb53 · Parent: 1a53e98
Update README.md

README.md — CHANGED
@@ -10,20 +10,6 @@ license: apache-2.0
 
 **Falcon-7B is a 7B parameters causal decoder-only model built by [TII](https://www.tii.ae) and trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. It is made available under the Apache 2.0 license.**
 
-*Paper coming soon* 😊.
-
-🤗 To get started with Falcon (inference, finetuning, quantization, etc.), we recommend reading [this great blogpost from HF](https://huggingface.co/blog/falcon)!
-
-
-## Why use Falcon-7B?
-
-* **It outperforms comparable open-source models** (e.g., [MPT-7B](https://huggingface.co/mosaicml/mpt-7b), [StableLM](https://github.com/Stability-AI/StableLM), [RedPajama](https://huggingface.co/togethercomputer/RedPajama-INCITE-Base-7B-v0.1) etc.), thanks to being trained on 1,500B tokens of [RefinedWeb](https://huggingface.co/datasets/tiiuae/falcon-refinedweb) enhanced with curated corpora. See the [OpenLLM Leaderboard](https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard).
-* **It features an architecture optimized for inference**, with FlashAttention ([Dao et al., 2022](https://arxiv.org/abs/2205.14135)) and multiquery ([Shazeer et al., 2019](https://arxiv.org/abs/1911.02150)).
-* **It is made available under a permissive Apache 2.0 license allowing for commercial use**, without any royalties or restrictions.
-
-⚠️ **This is a raw, pretrained model, which should be further finetuned for most use cases.** If you are looking for a version better suited to taking generic instructions in a chat format, we recommend taking a look at [Falcon-7B-Instruct](https://huggingface.co/tiiuae/falcon-7b-instruct).
-
-🔥 **Looking for an even more powerful model?** [Falcon-40B](https://huggingface.co/tiiuae/falcon-40b) is Falcon-7B's big brother!
 
 ```python
 from transformers import AutoTokenizer, AutoModelForCausalLM
@@ -68,9 +54,6 @@ You will need **at least 16GB of memory** to swiftly run inference with Falcon-7
 - **Language(s) (NLP):** English and French;
 - **License:** Apache 2.0.
 
-### Model Source
-
-- **Paper:** *coming soon*.
 
 ## Uses
 
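The diff truncates the README's usage snippet right after its import line. As a point of reference, a minimal inference sketch for this checkpoint might look like the following. This is an assumption-laden reconstruction, not the README's exact code: the generation parameters (`max_new_tokens`, `top_k`, `torch_dtype`, `device_map`) are illustrative choices, and only the model id `tiiuae/falcon-7b` comes from the commit's repository.

```python
# Hypothetical inference sketch for Falcon-7B (not the README's exact snippet).
# Assumes `transformers` and `torch` are installed and, per the README,
# at least 16GB of memory to load the checkpoint.
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

MODEL_ID = "tiiuae/falcon-7b"


def build_prompt(text: str) -> str:
    """Falcon-7B is a raw pretrained model, so prompts are plain continuations,
    not chat/instruction templates (use Falcon-7B-Instruct for those)."""
    return text.strip() + "\n"


def generate(prompt: str, max_new_tokens: int = 50) -> str:
    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.bfloat16,  # halves memory vs. float32
        device_map="auto",           # spread layers over available devices
    )
    inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
    output_ids = model.generate(
        **inputs,
        max_new_tokens=max_new_tokens,
        do_sample=True,
        top_k=10,
    )
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Usage (downloads ~14GB of weights on first call):
#   completion = generate(build_prompt("Falcon-7B is a language model that"))
```

Since this is a raw pretrained model, sampling continues the prompt rather than following instructions; the `do_sample`/`top_k` settings above are one common choice for open-ended continuation.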