Upload benchmarks/benchmark_code_real_llm.py
benchmarks/benchmark_code_real_llm.py
CHANGED
@@ -1 +1,9 @@
-
+"""
+Placeholder: real LLM code benchmark scripts are in the `jobs/` directory.
+
+- `jobs/run_real_llm_standalone.py` — v1 baseline with Qwen2.5-Coder-0.5B
+- `jobs/run_real_llm_standalone_v2.py` — v2 with chat templating fix
+
+These are self-contained GPU job scripts that inline the OCC components
+to avoid import issues in sandbox environments.
+"""