Phaedrus33 committed
Commit 4678202 · verified · 1 parent: 92faf76

Upload README.md with huggingface_hub

Files changed (1): README.md (+21 -7)
@@ -10,20 +10,34 @@ train_grpo_final.py # GRPO reinforcement learning (3 reward functions)
  inference_grpo_final.py # vLLM inference with majority/plurality voting
  run_final.sh # End-to-end pipeline orchestration
  requirements.txt # Python dependencies
- REPORT_final.md # Detailed solution report
+ report.pdf # Detailed solution report
  submission/ # Pre-generated submission
  Final_submission_plurality_pp.csv # Phase 2 submission (score: 0.9582)
  ```

- **Quick start:** `bash run_final.sh --all` (requires `HF_TOKEN` environment variable and GPU).
-
  **Inference on a new test CSV:**
  ```bash
+ # Clone the repo (includes model weights + inference scripts)
+ git lfs install
+ git clone https://huggingface.co/Phaedrus33/GRPO_final_submission
+ cd GRPO_final_submission
+
+ # Install dependencies
  pip install -r requirements.txt
- export HF_TOKEN=your_token
- bash run_final.sh --infer --test-csv path/to/new_test.csv
+
+ # Run inference (model is local, no download needed)
+ python inference_grpo_final.py \
+ --model . \
+ --test-csv /path/to/test.csv \
+ --output-dir ./outputs
  ```
- The test CSV requires two columns: `ID` (unique question identifier) and `question` (full question text including data tables). Output submission CSVs are written to `outputs/inference/`.
+ The test CSV requires two columns: `ID` (unique question identifier) and `question` (full question text including data tables). Output submission is written to `./outputs/submission_plurality.csv`.
+
+ **Requirements:** GPU with 80GB+ VRAM (A100-80GB, H100) or 2x 40GB GPUs with `--num-gpus 2`. Python 3.10+, CUDA 12.x.
+
+ **Options:**
+ - Reduce compute: `--num-generations 1` for single prediction per ID (no voting)
+ - Out-of-memory: `--batch-size 8` (or lower)

  **Pre-generated submission:** `submission/Final_submission_plurality_pp.csv` scores **0.9582** on the Phase 2 test dataset.

@@ -138,4 +152,4 @@ The training pipeline is fully reproducible: `run_final.sh --all` regenerates tr
  - **GRPO**: 1x NVIDIA H200 NVL 141GB, vLLM 0.12.0 for generation
  - **Inference**: 1x NVIDIA H200 NVL 141GB, vLLM 0.12.0, bfloat16, CUDA graphs enabled, batch size 32

- Full pipeline: `bash run_final.sh --all` (requires HF_TOKEN environment variable).
+ Full training pipeline (trace generation + SFT + GRPO): `bash run_final.sh --all` (requires HF_TOKEN environment variable and 80GB+ GPU).
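
The updated README states that the test CSV needs an `ID` and a `question` column and that predictions are combined by plurality voting. As a rough illustration only, the sketch below shows one way to check a CSV against that schema and to take a plurality vote over several generations for one ID. It is not taken from `inference_grpo_final.py`: the function names `check_test_csv` and `plurality_vote` are hypothetical, and the tie-breaking rule (first answer seen wins) is an assumption, since the README does not say how ties are resolved.

```python
import csv
import io
from collections import Counter

# Column names as stated in the README; everything else here is illustrative.
REQUIRED_COLUMNS = ("ID", "question")


def check_test_csv(handle):
    """Read a test CSV from a text handle and return its rows as dicts.

    Raises ValueError if a required column is missing or an ID repeats.
    """
    reader = csv.DictReader(handle)
    header = reader.fieldnames or []
    missing = [name for name in REQUIRED_COLUMNS if name not in header]
    if missing:
        raise ValueError(f"missing required column(s): {missing}")
    rows = list(reader)
    id_counts = Counter(row["ID"] for row in rows)
    duplicates = sorted(key for key, n in id_counts.items() if n > 1)
    if duplicates:
        raise ValueError(f"duplicate ID value(s): {duplicates}")
    return rows


def plurality_vote(answers):
    """Return the most frequent answer in a non-empty list.

    Assumption: ties are broken in favour of the answer that appears first.
    """
    if not answers:
        raise ValueError("no answers to vote on")
    counts = Counter(answers)
    best = max(counts.values())
    for answer in answers:
        if counts[answer] == best:
            return answer


if __name__ == "__main__":
    sample = io.StringIO("ID,question\nq1,What is 2+2?\nq2,What is 3*3?\n")
    rows = check_test_csv(sample)
    print(f"{len(rows)} questions loaded")
    print(plurality_vote(["4", "5", "4"]))
```

Running a check like this before launching inference can catch a malformed CSV cheaply, before any GPU time is spent loading the model.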