Bturtel committed · verified
Commit f061ea6 · 1 Parent(s): 7a10263

Upload README.md with huggingface_hub

Files changed (1): README.md (+2 −2)
README.md CHANGED
@@ -94,7 +94,7 @@ pip install torch transformers safetensors tqdm huggingface-hub
 python merge.py --output ./trump-forecaster-merged
 ```
 
-This downloads the base model (MXFP4, ~30 GB), dequantizes to bf16, applies the LoRA adapter, and saves the merged model (~300 GB bf16). Requires ~300 GB RAM, no GPU needed.
+This downloads the base model, dequantizes to bf16, applies the LoRA adapter, and saves the merged model.
 
 ### Inference with the merged model
 
@@ -108,7 +108,7 @@ engine = sgl.Engine(
     tokenizer_path="openai/gpt-oss-120b",
     trust_remote_code=True,
     dtype="bfloat16",
-    tp_size=2,  # needs 2x 80GB GPUs for bf16
+    tp_size=2,
 )
 
 prompt = """You are a forecasting expert. Given the question and context below, predict the probability that the answer is "Yes".