jordimas/bloom-ctranslate2


jordimas committed on Jun 30, 2023

Commit 920004e · 1 Parent(s): f2572d6

Update README

Files changed (1)

README.md +54 -0


@@ -1,3 +1,57 @@
 ---
 license: bigscience-bloom-rail-1.0
 ---

+# Bloom CTranslate2 models
+This is a collection of some of the [BigScience Bloom](https://huggingface.co/bigscience/bloom) models exported to the
+[CTranslate2](https://github.com/OpenNMT/CTranslate2) model format. This allows these models to be loaded and used
+efficiently on CPU or GPU.
+## Models
+The models have been converted to *float16* and can be loaded with any other quantization type (e.g. *int8*).
+| Model name | Description |
+| --- | --- |
+| [bloom-560m](https://huggingface.co/bigscience/bloom-560m) | 560M parameter model pretrained on ROOTS |
+| [bloom-3b](https://huggingface.co/bigscience/bloom-3b) | 3B parameter model pretrained on ROOTS |
+| [bloomz-7b1](https://huggingface.co/bigscience/bloomz-7b1) | 7.1B parameter model finetuned on xP3 |
+| [bloomz-7b1-mt](https://huggingface.co/bigscience/bloomz-7b1-mt) | 7.1B parameter model finetuned on xP3mt |
+| [mt0-xxl-mt](https://huggingface.co/bigscience/mt0-xxl-mt) | 13B parameter model finetuned on xP3mt |
+## Simple code to use them
+Install dependencies:
+```shell
+pip install huggingface_hub ctranslate2 transformers torch
+```
+Usage:
+```python
+import ctranslate2
+import huggingface_hub
+import transformers
+
+model_name = "bloomz-7b1"
+prompt = "Hello, I am Joan and I am from Barcelona and"
+repo_id = "jordimas/bloom-ctranslate2"
+output_dir = "output/"
+
+# Download only the files of the selected model into output_dir.
+kwargs = {
+    "local_dir": output_dir,
+    "local_dir_use_symlinks": False,
+}
+huggingface_hub.snapshot_download(repo_id=repo_id, allow_patterns=f"*{model_name}*", **kwargs)
+
+model = f"{output_dir}{model_name}"
+print(f"model: {model}")
+
+# Load the float16 weights and quantize them to int8 at load time.
+generator = ctranslate2.Generator(model, compute_type="int8")
+tokenizer = transformers.AutoTokenizer.from_pretrained(model)
+
+# CTranslate2 expects token strings rather than token ids as input.
+start_tokens = tokenizer.convert_ids_to_tokens(tokenizer.encode(prompt))
+results = generator.generate_batch([start_tokens], max_length=90)
+result = tokenizer.decode(results[0].sequences_ids[0])
+print(f"Result: {result}")
+```
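
The usage snippet in this diff hard-codes `compute_type="int8"`, while the README says the float16 weights can be loaded with other quantization types. As a rough sketch of how one might choose a type per machine, the helper below picks the first entry of a preference list that a device reports as supported. The names `pick_compute_type` and `PREFERRED` are hypothetical and not part of the README or of CTranslate2, and the preference order (smallest memory footprint first) is an assumption made for illustration.

```python
# Hypothetical helper, not part of the README or of CTranslate2.
# Assumed preference order: smallest memory footprint first.
PREFERRED = ("int8_float16", "int8", "float16", "float32")


def pick_compute_type(supported, preferred=PREFERRED):
    """Return the first preferred compute type found in `supported`.

    `supported` is any collection of compute type names, for example
    the set a device reports as available.
    """
    for name in preferred:
        if name in supported:
            return name
    # "default" leaves the choice to CTranslate2.
    return "default"
```

With CTranslate2 installed, the set of supported types can be obtained from `ctranslate2.get_supported_compute_types("cpu")` (or `"cuda"`), and the result passed as `compute_type` when constructing `ctranslate2.Generator`.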