Arthur LAGACHERIE commited on
Commit
f62fbc0
·
verified ·
1 Parent(s): 0e06c01

Update README.md

Browse files
Files changed (1) hide show
  1. README.md +43 -0
README.md CHANGED
@@ -20,6 +20,49 @@ This model uses the 4-bits quantization. So you need to install bitsandbytes to
20
  ```python
21
  pip install bitsandbytes
22
  ```
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
23
 
24
 
25
  # Model Trained Using AutoTrain
 
20
  ```python
21
  pip install bitsandbytes
22
  ```
23
+ For inference (streaming):
24
+ ```python
25
+ from transformers import AutoModelForCausalLM, AutoTokenizer, pipeline
26
+ import torch
27
+ from transformers import TextIteratorStreamer
28
+ from threading import Thread
29
+ device = 'cuda' if torch.cuda.is_available() else 'cpu'
30
+
31
+ model_id = "Arthur-LAGACHERIE/Reflection-Gemma-2-2b"
32
+
33
+ tokenizer = AutoTokenizer.from_pretrained(model_id)
34
+ model = AutoModelForCausalLM.from_pretrained(model_id)
35
+
36
+ prompt = """
37
+ ### System
38
+ You are a world-class AI system, capable of complex reasoning and reflection.
39
+ Reason through the query inside <thinking> tags, and then provide your final response inside <output> tags.
40
+ If you detect that you made a mistake in your reasoning at any point, correct yourself inside <reflection> tags.
41
+ Try an answer and see if it's correct before generate the ouput.
42
+ But don't forget to think very carefully.
43
+
44
+ ### Question
45
+ The question here.
46
+ """
47
+
48
+ chat = [
49
+ { "role": "user", "content": prompt},
50
+ ]
51
+ question = tokenizer.apply_chat_template(chat, tokenize=False, add_generation_prompt=True)
52
+ question = tokenizer(question, return_tensors="pt").to(device)
53
+ streamer = TextIteratorStreamer(tokenizer, skip_prompt=True)
54
+ generation_kwargs = dict(question, streamer=streamer, max_new_tokens=4000)
55
+ thread = Thread(target=model.generate, kwargs=generation_kwargs)
56
+
57
+ # generate
58
+ thread.start()
59
+ for new_text in streamer:
60
+ print(new_text, end="")
61
+ ```
62
+
63
+ # Some info
64
+ If you want to know how I fine tune it, what datasets I used and the training code. [See here]()
65
+
66
 
67
 
68
  # Model Trained Using AutoTrain