EdBerg committed
Commit e8ea779 · verified · 1 parent: 7b13213

EdBerg/cohere_Baha_1A_cupper_in_gold
README.md ADDED
@@ -0,0 +1,57 @@
+ ---
+ base_model: CohereForAI/c4ai-command-r7b-arabic-02-2025
+ library_name: transformers
+ model_name: cohere_Baha_1A_cupper_in_gold
+ tags:
+ - generated_from_trainer
+ - trl
+ - sft
+ license: license
+ ---
+
+ # Model Card for cohere_Baha_1A_cupper_in_gold
+
+ This model is a fine-tuned version of [CohereForAI/c4ai-command-r7b-arabic-02-2025](https://huggingface.co/CohereForAI/c4ai-command-r7b-arabic-02-2025).
+ It has been trained using [TRL](https://github.com/huggingface/trl).
+
+ ## Quick start
+
+ ```python
+ from transformers import pipeline
+
+ question = "If you had a time machine, but could only go to the past or the future once and never return, which would you choose and why?"
+ generator = pipeline("text-generation", model="EdBerg/cohere_Baha_1A_cupper_in_gold", device="cuda")
+ output = generator([{"role": "user", "content": question}], max_new_tokens=128, return_full_text=False)[0]
+ print(output["generated_text"])
+ ```
+
+ ## Training procedure
+
+ [<img src="https://raw.githubusercontent.com/wandb/assets/main/wandb-github-badge-28.svg" alt="Visualize in Weights & Biases" width="150" height="24"/>](https://wandb.ai/harpermia882/huggingface/runs/ha3bi4pf)
+
+ This model was trained with SFT.
+
+ ### Framework versions
+
+ - TRL: 0.12.0
+ - Transformers: 4.48.3
+ - PyTorch: 2.6.0+cu124
+ - Datasets: 3.4.1
+ - Tokenizers: 0.21.1
+
+ ## Citations
+
+
+
+ Cite TRL as:
+
+ ```bibtex
+ @misc{vonwerra2022trl,
+     title = {{TRL: Transformer Reinforcement Learning}},
+     author = {Leandro von Werra and Younes Belkada and Lewis Tunstall and Edward Beeching and Tristan Thrush and Nathan Lambert and Shengyi Huang and Kashif Rasul and Quentin Gallouédec},
+     year = 2020,
+     journal = {GitHub repository},
+     publisher = {GitHub},
+     howpublished = {\url{https://github.com/huggingface/trl}}
+ }
+ ```
adapter_config.json CHANGED
@@ -24,13 +24,13 @@
   "rank_pattern": {},
   "revision": null,
   "target_modules": [
-  "down_proj",
+  "q_proj",
   "o_proj",
-  "gate_proj",
+  "k_proj",
   "v_proj",
   "up_proj",
-  "q_proj",
-  "k_proj"
+  "down_proj",
+  "gate_proj"
   ],
   "task_type": "CAUSAL_LM",
   "trainable_token_indices": null,
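The `target_modules` change in this hunk only reorders the list; the set of projections the LoRA adapter attaches to is unchanged, as a quick check over the two lists from the diff shows:

```python
# target_modules before and after this commit, copied from the diff above.
old = ["down_proj", "o_proj", "gate_proj", "v_proj", "up_proj", "q_proj", "k_proj"]
new = ["q_proj", "o_proj", "k_proj", "v_proj", "up_proj", "down_proj", "gate_proj"]

# Same seven projection layers; only the serialization order differs.
assert set(old) == set(new)
assert len(set(new)) == 7
```

Since PEFT matches `target_modules` as a set of module-name suffixes, a pure reordering like this does not change which weights are adapted, which is consistent with the adapter file below keeping the exact same size (335604696 bytes).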
adapter_model.safetensors CHANGED
@@ -1,3 +1,3 @@
   version https://git-lfs.github.com/spec/v1
-  oid sha256:d139a8ac3ffa27a001f55ec80e9d58cf010a4d33d3d8ebab82687360ca84ae2c
+  oid sha256:14e7c66d00e1a28e8092591ea4fc71f79c8eb080484dc6ca99841a97a39a7848
   size 335604696
runs/Mar17_18-34-43_f2553125dbc1/events.out.tfevents.1742236485.f2553125dbc1.252.0 ADDED
@@ -0,0 +1,3 @@
+  version https://git-lfs.github.com/spec/v1
+  oid sha256:91b96c2e6bf7aa43ad03a5edb0fa4268fbd1d36b5bd1371f52ae94776d716b54
+  size 111671
special_tokens_map.json CHANGED
@@ -17,7 +17,13 @@
   "rstrip": false,
   "single_word": false
   },
-  "pad_token": "<|END_OF_TURN_TOKEN|>",
+  "pad_token": {
+    "content": "<PAD>",
+    "lstrip": false,
+    "normalized": false,
+    "rstrip": false,
+    "single_word": false
+  },
   "unk_token": {
   "content": "<UNK>",
   "lstrip": false,
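The substance of this hunk is switching the pad token from `<|END_OF_TURN_TOKEN|>` to a dedicated `<PAD>`. This matters for SFT: label masking typically ignores pad positions, so reusing the end-of-turn token as padding would also mask the real end-of-turn targets. A minimal sketch, using made-up token ids (not the real vocabulary):

```python
# Hypothetical token ids for illustration only (not the real vocabulary).
EOT_ID, PAD_ID = 255001, 0

def mask_labels(ids, pad_id):
    # Typical SFT label masking: pad positions become -100 so the loss
    # ignores them.
    return [-100 if t == pad_id else t for t in ids]

turn = [12, 34, EOT_ID]

# Dedicated <PAD>: the end-of-turn token stays a training target.
assert mask_labels(turn + [PAD_ID, PAD_ID], PAD_ID) == [12, 34, EOT_ID, -100, -100]

# Reusing <|END_OF_TURN_TOKEN|> as pad: the real end-of-turn token is
# masked out too, so the model never learns to emit it and stop.
assert mask_labels(turn + [EOT_ID, EOT_ID], EOT_ID) == [12, 34, -100, -100, -100]
```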
tokenizer.json CHANGED
@@ -1,3 +1,3 @@
   version https://git-lfs.github.com/spec/v1
-  oid sha256:29d3034c1c9469f68e3b0863dc1debd8b1b29dccec11d51ce0ac29c94592e771
-  size 20125020
+  oid sha256:953b2730d23ca19e7dca96f75f3e10b497bb679290b06d8981190bff2039fc72
+  size 20124922
tokenizer_config.json CHANGED
@@ -357,7 +357,7 @@
   "legacy": true,
   "merges_file": null,
   "model_max_length": 1000000000000000019884624838656,
-  "pad_token": "<|END_OF_TURN_TOKEN|>",
+  "pad_token": "<PAD>",
   "sp_model_kwargs": {},
   "spaces_between_special_tokens": false,
   "tokenizer_class": "CohereTokenizer",
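Note that `special_tokens_map.json` and `tokenizer_config.json` must agree on the pad token, even though one stores it as an object and the other as a plain string. A quick consistency check over the two fragments as they stand after this commit:

```python
import json

# Pad-token values after this commit, taken from the diffs above.
special_tokens_map = json.loads('{"pad_token": {"content": "<PAD>"}}')
tokenizer_config = json.loads('{"pad_token": "<PAD>"}')

# Object form vs. string form; both must name the same token.
assert special_tokens_map["pad_token"]["content"] == tokenizer_config["pad_token"] == "<PAD>"
```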
training_args.bin CHANGED
@@ -1,3 +1,3 @@
   version https://git-lfs.github.com/spec/v1
-  oid sha256:43e64376ec65d51c1187dbd6d76917a8ad13b8f31ab2858b76fe7db8e36d7846
+  oid sha256:d378855a0e4e706bc73841d34742c84bef5e0491cd6b6570b3f89d01c761c0c3
   size 5624