Shen-Pandi committed on
Commit 011262d · verified · 1 Parent(s): acc422e

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md (+30 −60)
README.md CHANGED
@@ -1,78 +1,48 @@
  ---
- language: en
- pipeline_tag: text-generation
- library_name: transformers
  tags:
- - llama
  - data-management
- - data-engineering
- - migration
  - sql
- - reasoning
  - grpo
- - rlhf
- license: other
- base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  ---

- # Agentic Data 1

- A specialized 8B reasoning model fine-tuned for Data Management, Data Engineering, and Migration tasks.
-
- ## Model Details
-
- - **Base**: DeepSeek-R1-Distill-Llama-8B
- - **Training**: 3-stage pipeline (SFT QLoRA → Doc-Grounded SFT → GRPO Reinforcement Learning)
- - **Format**: BF16 SafeTensors (PyTorch / HuggingFace Transformers compatible)
- - **Parameters**: 8B

  ## Training Pipeline
-
- | Stage | Method | Data | Hardware |
- |-------|--------|------|----------|
- | Stage 1 | QLoRA SFT (3 versions) | 14,666 synthetic pairs + 7,558 doc-grounded chunks | Apple Silicon M-Series |
- | Stage 2 | GRPO Reinforcement Learning | 100 reasoning prompts with reward functions | NVIDIA H100 80GB |
-
- ## Capabilities
-
- - **SQL Dialect Conversion**: Oracle ↔ PostgreSQL ↔ T-SQL ↔ Snowflake ↔ BigQuery ↔ Databricks
- - **ETL Pipeline Migration**: Informatica → dbt, DataStage → Spark, BODS → Airflow
- - **Legacy System Modernization**: COBOL, JCL, SAS, ABAP → modern stacks
- - **Data Quality & Governance**: Assessment, validation, and compliance
- - **Migration Lifecycle**: Discovery → Risk Planning → Conversion → Verification
- - **Step-by-Step Reasoning**: Uses `<think>...</think>` tags for chain-of-thought reasoning

  ## Usage
-
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
- import torch

- model = AutoModelForCausalLM.from_pretrained(
-     "DataManagement-AI/Agentic-Data-1",
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
- )
  tokenizer = AutoTokenizer.from_pretrained("DataManagement-AI/Agentic-Data-1")
-
- messages = [
-     {"role": "system", "content": "You are Agentic Data 1, an expert data management and migration reasoning model. Think step-by-step before answering."},
-     {"role": "user", "content": "Convert this Oracle PL/SQL stored procedure to PostgreSQL PL/pgSQL."}
- ]
- inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
- outputs = model.generate(inputs, max_new_tokens=1500)
- print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
  ```

- ## Benchmarks (SFT V3)
-
- | Metric | Base Model | Agentic Data 1 | Improvement |
- |--------|------------|----------------|-------------|
- | Overall Score | 0.554 | **0.636** | +14.8% |
- | Implementation Quality | 0.584 | **0.761** | +30.3% |
- | Think-Tag Rate | 0% | **100%** | ∞ |
- | Reasoning Quality | 0.534 | **0.622** | +16.5% |
-
- ## License
-
- For research and educational purposes.
 
  ---
+ language:
+ - en
+ license: llama3
+ base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  tags:
  - data-management
  - sql
+ - migration
  - grpo
+ - reinforcement-learning
  ---

+ # Agentic Data 1 — GRPO-Trained

+ A specialized 8B-parameter model for data management, migration, and SQL tasks.

  ## Training Pipeline
+ 1. **Base**: DeepSeek-R1-Distill-Llama-8B
+ 2. **SFT**: fine-tuned on 1,000+ data management examples (Oracle → Postgres, DB2 → Snowflake, ETL, data quality)
+ 3. **GRPO**: 500 steps of Group Relative Policy Optimization on an H100, with reward functions for:
+    - Code parsability (SQL validation)
+    - Reasoning quality (step-by-step thinking)
+    - Answer accuracy
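As an illustration of the reward shaping listed above, a "code parsability" reward could check whether a completion contains a well-formed fenced SQL block. The sketch below is hypothetical (the actual reward functions used in training are not published) and uses only naive keyword checks rather than a real SQL validator:

```python
import re

# Hypothetical sketch of a "code parsability" reward; the actual GRPO
# reward functions used to train this model are not published.
FENCE = "`" * 3  # markdown code fence, built programmatically to avoid nesting
SQL_BLOCK = re.compile(FENCE + r"sql\s+(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)
KEYWORDS = ("SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "MERGE")

def code_parsability_reward(completion: str) -> float:
    """Return 1.0 if every fenced SQL block in the completion starts with a
    recognized statement keyword, 0.0 if there are no blocks or one is malformed."""
    blocks = SQL_BLOCK.findall(completion)
    if not blocks:
        return 0.0
    ok = all(b.strip().upper().startswith(KEYWORDS) for b in blocks)
    return 1.0 if ok else 0.0

good = "Converted query:\n" + FENCE + "sql\nSELECT id FROM users;\n" + FENCE
print(code_parsability_reward(good))  # 1.0
```

A production reward would use a real parser (dialect-aware) instead of keyword sniffing; this only shows the shape of the signal GRPO optimizes.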
+
+ ## Training Metrics (GRPO)
+
+ | Metric | Start | End |
+ |---|---|---|
+ | Reward | 0.43 | 0.49 |
+ | Code Parsability | 0.15 | 0.21 |
+ | KL Divergence | 0.0005 | 0.0014 |
+ | Grad Norm | 0.295 | 0.210 |
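For context on the Reward row above: GRPO scores a group of sampled completions per prompt and normalizes each completion's reward against the group mean, with no learned value baseline. A minimal sketch of that standard normalization (not this repository's training code):

```python
def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantage: each completion's reward, normalized by the
    mean and std of its sampling group (sketch of the standard formulation)."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions sampled for one prompt, scored by the reward functions:
advantages = group_relative_advantages([0.43, 0.49, 0.40, 0.52])
print(advantages)  # positive for above-group-mean completions, negative below
```

The KL Divergence row then reflects the penalty that keeps the policy close to the SFT reference model while these advantages push it toward higher-reward completions.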

  ## Usage

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch

+ model = AutoModelForCausalLM.from_pretrained(
+     "DataManagement-AI/Agentic-Data-1",
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
  tokenizer = AutoTokenizer.from_pretrained("DataManagement-AI/Agentic-Data-1")
  ```
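The previous revision of this card notes that the model wraps its chain-of-thought in `<think>...</think>` tags before the final answer. When post-processing generations, a small helper to separate the two can be useful (a sketch; only the tag format is taken from the card):

```python
import re

def split_think(text: str) -> tuple[str, str]:
    """Split a generation into (reasoning, answer) on <think>...</think> tags.
    Returns empty reasoning if no tags are present."""
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    return m.group(1).strip(), text[m.end():].strip()

# Example decoded output (hypothetical):
out = "<think>Oracle's NVL maps to COALESCE.</think>Use COALESCE(col, 0)."
reasoning, answer = split_think(out)
print(answer)  # Use COALESCE(col, 0).
```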

+ ## Capabilities
+ - Oracle → PostgreSQL migration
+ - DB2 → Snowflake conversion
+ - SQL generation and validation
+ - ETL pipeline design
+ - Data quality assessment
+ - Schema analysis and optimization
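To give the first capability some flavor: Oracle → PostgreSQL migration involves standard built-in equivalences such as `NVL` → `COALESCE` and `SYSDATE` → `NOW()`. The toy rewriter below is purely illustrative (it is not how the model works, and a real migration needs a dialect-aware parser):

```python
import re

# Common Oracle -> PostgreSQL equivalences (illustrative, not exhaustive).
# Note: NOW() is timezone-aware while Oracle's SYSDATE is not; approximate.
ORACLE_TO_PG = {
    "NVL": "COALESCE",   # NVL(a, b) -> COALESCE(a, b)
    "SYSDATE": "NOW()",  # current timestamp
}

def naive_rewrite(sql: str) -> str:
    """Token-level rewrite of a few Oracle built-ins; a sketch only,
    since real conversions require parsing, not string substitution."""
    for ora, pg in ORACLE_TO_PG.items():
        sql = re.sub(rf"\b{ora}\b", pg, sql, flags=re.IGNORECASE)
    return sql

print(naive_rewrite("SELECT NVL(salary, 0), SYSDATE FROM emp"))
# SELECT COALESCE(salary, 0), NOW() FROM emp
```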