Seriguela - Symbolic Regression with Large Language Models
A research project exploring the application of GPT-2 and other LLMs to symbolic regression through fine-tuning and reinforcement learning.
Key Results (Model Scaling Study - Feb 2025)
| Model | Parameters | Valid Rate (Quality) | Valid Rate (Nguyen) | Avg R² | Max R² |
|---|---|---|---|---|---|
| Base | 124M | 99.4% | 62.5% | 0.9190 | 0.9994 |
| Medium | 355M | 99.2% | 75.2% | 0.9812 | 0.9999 |
| Large | 774M | 100% | 89.0% | 0.9852 | 1.0000 |
Breakthrough: The Large model (774M) is the first to achieve a 100% valid rate (Quality set) and a perfect fit (R² = 1.0) on Nguyen-8.
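The table reports two kinds of metric: the share of generated expressions that are valid, and the R² of the valid ones. How the repository computes them is not shown here, so the following is only a minimal sketch under stated assumptions: an expression counts as valid if it parses and evaluates to finite numbers on every sample point, and R² is the usual coefficient of determination. The names `SAFE_NAMES`, `evaluate`, `r_squared`, and `score` are hypothetical and do not come from the project. The target used is the standard Nguyen-8 function, sqrt(x).

```python
import math

# Hypothetical whitelist of functions a candidate expression may call.
SAFE_NAMES = {
    "sin": math.sin, "cos": math.cos, "exp": math.exp,
    "log": math.log, "sqrt": math.sqrt,
}

def evaluate(expr, xs):
    """Return predictions of `expr` at each x, or None if the expression is invalid."""
    try:
        code = compile(expr, "<expr>", "eval")
        # Restricted eval is NOT a sandbox; use only on trusted model output.
        preds = [float(eval(code, {"__builtins__": {}}, {**SAFE_NAMES, "x": x}))
                 for x in xs]
    except Exception:
        # Syntax errors, unknown names, domain errors, complex results, overflow.
        return None
    return preds if all(math.isfinite(p) for p in preds) else None

def r_squared(y_true, y_pred):
    """Coefficient of determination; assumes y_true is not constant."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def score(candidates, xs, y_true):
    """Return (valid rate, list of R² values for the valid candidates)."""
    preds = (evaluate(c, xs) for c in candidates)
    scores = [r_squared(y_true, p) for p in preds if p is not None]
    return len(scores) / len(candidates), scores

# Nguyen-8 target, sampled on a few illustrative points in [0, 4].
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
y_true = [math.sqrt(x) for x in xs]

# Illustrative candidates: two valid, one unparsable, one outside log's domain.
candidates = ["sqrt(x)", "x / 2", "sqrt(x", "log(x - 5)"]
valid_rate, scores = score(candidates, xs, y_true)
print(f"valid rate: {valid_rate:.1%}, max R2: {max(scores):.4f}")
```

Under this definition an invalid expression lowers the valid rate but contributes no R² value, so average R² is taken over valid expressions only. Whether the project averages the same way is an assumption.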
Project Structure
Organized by research phases for systematic experimentation:
seriguela/
├── 1_data/        # PHASE 1: Data preparation
├── 2_training/    # PHASE 2: Training and fine-tuning
├── 3_evaluation/  # PHASE 3: Evaluation
├── 4_analysis/    # PHASE 4: Analysis and visualization
├── models/        # Trained models
├── results/       # Experiment results
├── docs/          # Complete documentation
├── src/           # Source code (package)
└── scripts/       # Helper scripts
Each directory contains a README.md with detailed documentation.
Quick Start
# 1. Clone and setup
git clone https://github.com/augustocsc/seriguela.git
cd seriguela
python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt
# 2. Train model
cd 2_training/supervised
python train_with_json.py --model_size gpt2-medium
# 3. Evaluate
cd ../../3_evaluation/benchmarks
python run_all_nguyen_benchmarks.py --model_path ../../models/gpt2/medium_700k_json
# 4. Analyze
cd ../../4_analysis/visualization
python create_visualizations.py
Complete Documentation
- Scientific Report: docs/reports/SCIENTIFIC_REPORT_MODEL_SCALING.md
- Developer Guide: docs/guides/CLAUDE.md
- Model Cards: docs/model_cards/
Citation
@misc{seriguela2025,
title={Scaling Laws for Symbolic Regression with LLMs},
author={Augusto Cesar},
year={2025},
note={First 100% valid rate + R²=1.0 achieved}
}
Status: Production-ready | Publication-ready | Last Updated: 2026-02-04