Agentic-Data-1 / README.md
Shen-Pandi's picture
Upload README.md with huggingface_hub
011262d verified
|
raw
history blame
1.3 kB
metadata
language:
  - en
license: llama3
base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
tags:
  - data-management
  - sql
  - migration
  - grpo
  - reinforcement-learning

Agentic Data 1 — GRPO-Trained

A specialized 8B parameter model for data management, migration, and SQL tasks.

Training Pipeline

  1. Base: DeepSeek-R1-Distill-Llama-8B
  2. SFT: Fine-tuned on 1000+ data management examples (Oracle→Postgres, DB2→Snowflake, ETL, data quality)
  3. GRPO: 500 steps of Group Relative Policy Optimization on H100, with reward functions for:
    • Code parsability (SQL validation)
    • Reasoning quality (step-by-step thinking)
    • Answer accuracy

Training Metrics (GRPO)

Metric Start End
Reward 0.43 0.49
Code Parsability 0.15 0.21
KL Divergence 0.0005 0.0014
Grad Norm 0.295 0.210

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained("DataManagement-AI/Agentic-Data-1")
tokenizer = AutoTokenizer.from_pretrained("DataManagement-AI/Agentic-Data-1")

Capabilities

  • Oracle → PostgreSQL migration
  • DB2 → Snowflake conversion
  • SQL generation and validation
  • ETL pipeline design
  • Data quality assessment
  • Schema analysis and optimization