| license: apache-2.0 | |
| language: | |
| - en | |
| tags: | |
| - reinforcement-learning | |
| - geometry | |
| - gclc | |
| - code-generation | |
| # GCLC Code Generation - RL Fine-tuned Model | |
| This model was fine-tuned with reinforcement learning to generate code for GCLC (Geometry Constructions -> LaTeX Converter), a language for describing geometric constructions. | |
| ## Model Details | |
| - **Base Model**: [Add your base model] | |
| - **Training Method**: Reinforcement Learning with reward-based optimization | |
| - **Task**: Generate GCLC code from geometric problem descriptions | |
| ## Training Stats | |
| See `training_outputs/` for detailed episode logs and `training_curves.png` for a plot of training progress. | |
| ## Usage | |
| ```python | |
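| from transformers import AutoModelForCausalLM, AutoTokenizer | |
| # Load the fine-tuned model and its tokenizer from the Hugging Face Hub | |
| model_id = "Gabriel2502/gclc-rl-model-deepseek" | |
| model = AutoModelForCausalLM.from_pretrained(model_id) | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| # Describe the construction in natural language | |
| prompt = "Generate GCLC code for: triangle ABC with AB=5, AC=7, angle A=60 degrees" | |
| inputs = tokenizer(prompt, return_tensors="pt") | |
| # Generate up to 512 new tokens of GCLC code | |
| outputs = model.generate(**inputs, max_new_tokens=512) | |
| # Decode the full sequence (prompt plus completion), dropping special tokens | |
| print(tokenizer.decode(outputs[0], skip_special_tokens=True)) | |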
| ``` | |
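The decoded output of a causal language model starts with the prompt itself, so the GCLC source usually has to be separated from the surrounding text before it can be saved or rendered. The following is a minimal sketch of that post-processing step, not part of the released code. The helper name `extract_gclc` is hypothetical, and the handling of Markdown code fences is an assumption about how the model might format its answer.

```python
import re

# Build the Markdown fence marker programmatically so this snippet
# stays safe to embed inside a fenced block.
FENCE = "`" * 3
FENCE_RE = re.compile(FENCE + r"[a-zA-Z]*\n(.*?)" + FENCE, re.DOTALL)


def extract_gclc(generated: str, prompt: str = "") -> str:
    """Return only the GCLC source from raw decoded model output.

    Hypothetical helper: assumes the model either emits plain GCLC code
    or wraps it in a single Markdown code fence.
    """
    text = generated
    # The decoded sequence begins with the prompt; drop it if present.
    if prompt and text.startswith(prompt):
        text = text[len(prompt):]
    # Assumption: if a fenced block is present, the first one holds the code.
    match = FENCE_RE.search(text)
    if match:
        text = match.group(1)
    # Normalize surrounding whitespace and end with a single newline.
    return text.strip() + "\n"


if __name__ == "__main__":
    demo_prompt = "Generate GCLC code for: a labelled point"
    demo_output = demo_prompt + "\n" + FENCE + "\npoint A 10 10\ncmark_b A\n" + FENCE + "\n"
    print(extract_gclc(demo_output, demo_prompt), end="")
```

With the usage snippet above, this could be called as `extract_gclc(tokenizer.decode(outputs[0], skip_special_tokens=True), prompt)`, and the returned string written to a file for GCLC to process.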
| ## Files | |
| - `checkpoint/`: Model weights and config | |
| - `training_outputs/`: Detailed episode logs | |
| - `training_curves.png`: Training progress visualization | |