saucam committed on
Commit 8a3d3b2 · verified · 1 Parent(s): 00c0741

Update Readme

  1. README.md +46 -0
README.md CHANGED
@@ -17,6 +17,52 @@ base_model: google/gemma-7b
17
  - **License:** apache-2.0
18
  - **Finetuned from model :** google/gemma-7b
19
 
20
  This Gemma model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
21
 
22
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
 
17
  - **License:** apache-2.0
18
  - **Finetuned from model :** google/gemma-7b
19
 
20
+ This is a version of gemma-7b fine-tuned on the sarvamai/samvaad-hi-v1 Hindi dataset using the ChatML format.
21
+
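For readers unfamiliar with ChatML, the following is a minimal sketch of how a conversation is laid out in that format. The helper `render_chatml` is hypothetical and exists only to illustrate the layout; in practice the tokenizer's chat template builds the prompt, and it may add tokens (such as a leading `<bos>`) that this sketch omits.

```python
def render_chatml(messages, add_generation_prompt=True):
    """Render a list of {"role", "content"} dicts as a ChatML-style string.

    Illustrative only: the real prompt is produced by the tokenizer's
    chat template, not by this hypothetical helper.
    """
    parts = []
    for message in messages:
        # Each turn is wrapped in <|im_start|>role ... <|im_end|> markers.
        parts.append(
            f"<|im_start|>{message['role']}\n{message['content']}<|im_end|>\n"
        )
    if add_generation_prompt:
        # An open assistant turn tells the model it should answer next.
        parts.append("<|im_start|>assistant\n")
    return "".join(parts)


if __name__ == "__main__":
    print(render_chatml([{"role": "user", "content": "2+2?"}]))
```

The open `<|im_start|>assistant` turn at the end plays the same role as `add_generation_prompt = True` in the inference example: it marks where the model's reply should begin.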
22
+ ## Inference
23
+
24
+ We can use Unsloth for fast inference:
25
+ ```python
+ from unsloth import FastLanguageModel
+ from unsloth.chat_templates import get_chat_template
+
+ # Load the fine-tuned model and its tokenizer
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name = "saucam/gemma-samvaad-7b", # The fine-tuned model from this repository
+     max_seq_length = 2048,
+     dtype = None, # None lets Unsloth pick the dtype automatically
+     load_in_4bit = False,
+ )
+
+ # Use the ChatML template that the model was fine-tuned with
+ tokenizer = get_chat_template(
+     tokenizer,
+     chat_template = "chatml",
+     map_eos_token = True, # Maps <|im_end|> to the tokenizer's existing EOS token instead
+ )
+
+ FastLanguageModel.for_inference(model) # Enable native 2x faster inference
+
+ messages = [
+     # Hindi prompt: "(9+1)+(5+0). Solve this in 3 steps."
+     {"role": "user", "content": "(9+1)+(5+0). इसे 3 चरणों में हल करें."},
+ ]
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     tokenize = True,
+     add_generation_prompt = True, # Must add for generation
+     return_tensors = "pt",
+ ).to("cuda")
+
+ outputs = model.generate(input_ids = inputs, max_new_tokens = 512, use_cache = True)
+ out = tokenizer.batch_decode(outputs) # Decoded text includes the prompt and special tokens
+ print(out)
+ ```
60
+
61
+ ```
62
+ ['<bos><|im_start|>user\n(9+1)+(5+0). इसे 3 चरणों में हल करें.<|im_end|>\n
63
+ <|im_start|>assistant\n(9+1)+(5+0) को 3 चरणों में हल करने के लिए, हम इसे छोटे भागों में विभाजित कर सकते हैं। पहले चरण में, हम 9 को 1 से जोड़ते हैं, जो 10 देता है। दूसरे चरण में, हम 5 को 0 से जोड़ते हैं, जो 5 देता है। तीसरे चरण में, हम 10 को 5 से जोड़ते हैं, जो 15 देता है। इसलिए, (9+1)+(5+0) का परिणाम 15 है।<|im_end|>
64
+ ```
65
+
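The decoded string above still contains the prompt and the ChatML markers. As a rough sketch, the assistant's reply can be pulled out with a regular expression; the helper name `extract_assistant_reply` is hypothetical, and the pattern assumes the `<|im_start|>` / `<|im_end|>` markers appear exactly as in the sample output.

```python
import re

# Matches one assistant turn; the reply ends at <|im_end|> or at the end of
# the string if generation stopped before the closing marker was produced.
_ASSISTANT_TURN = re.compile(
    r"<\|im_start\|>assistant\n(.*?)(?:<\|im_end\|>|$)", flags=re.DOTALL
)


def extract_assistant_reply(decoded):
    """Return the text of the last assistant turn, or None if there is none."""
    replies = _ASSISTANT_TURN.findall(decoded)
    return replies[-1].strip() if replies else None


if __name__ == "__main__":
    sample = (
        "<bos><|im_start|>user\n2+2?<|im_end|>\n"
        "<|im_start|>assistant\n4<|im_end|>"
    )
    print(extract_assistant_reply(sample))
```

With the inference snippet above, this would be called as `extract_assistant_reply(out[0])`.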
66
  This Gemma model was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth) and Hugging Face's TRL library.
67
 
68
  [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)