Bturtel committed (verified) · commit 18a0b27 · parent: f061ea6

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +7 -22
README.md CHANGED
@@ -96,9 +96,7 @@ python merge.py --output ./trump-forecaster-merged
 
 This downloads the base model, dequantizes to bf16, applies the LoRA adapter, and saves the merged model.
 
-### Inference with the merged model
-
-With [SGLang](https://github.com/sgl-project/sglang) (recommended for MoE):
+### Inference
 
 ```python
 import sglang as sgl
@@ -111,34 +109,21 @@ engine = sgl.Engine(
     tp_size=2,
 )
 
-prompt = """You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".
+news_context = "... relevant news articles ..."
+
+prompt = f"""You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".
 
 Question: Will Trump impose 25% tariffs on all goods from Canada by February 1, 2025?
 
+Context:
+{news_context}
+
 Respond with your reasoning, then give your final answer as a probability between 0 and 1 inside <answer></answer> tags."""
 
 output = engine.generate(prompt, sampling_params={"max_new_tokens": 4096, "stop": ["</answer>"]})
 print(output["text"])
 ```
 
-Or with transformers:
-
-```python
-from transformers import AutoModelForCausalLM, AutoTokenizer
-
-model = AutoModelForCausalLM.from_pretrained(
-    "./trump-forecaster-merged",
-    torch_dtype="auto",
-    device_map="auto",
-    trust_remote_code=True,
-)
-tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-120b", trust_remote_code=True)
-
-inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
-outputs = model.generate(**inputs, max_new_tokens=4096, do_sample=True, temperature=0.7)
-print(tokenizer.decode(outputs[0], skip_special_tokens=True))
-```
-
 ---
 
 ## Links