Shen-Pandi committed on
Commit 011262d · verified · 1 Parent(s): acc422e

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md (+30 −60)
README.md CHANGED
@@ -1,78 +1,48 @@
  ---
- language: en
- pipeline_tag: text-generation
- library_name: transformers
  tags:
- - llama
  - data-management
- - data-engineering
- - migration
  - sql
- - reasoning
  - grpo
- - rlhf
- license: other
- base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  ---

- # Agentic Data 1

- A specialized 8B reasoning model fine-tuned for Data Management, Data Engineering, and Migration tasks.
-
- ## Model Details
-
- - **Base**: DeepSeek-R1-Distill-Llama-8B
- - **Training**: 3-stage pipeline (SFT QLoRA → Doc-Grounded SFT → GRPO Reinforcement Learning)
- - **Format**: BF16 SafeTensors (PyTorch / HuggingFace Transformers compatible)
- - **Parameters**: 8B

  ## Training Pipeline
-
- | Stage | Method | Data | Hardware |
- |-------|--------|------|----------|
- | Stage 1 | QLoRA SFT (3 versions) | 14,666 synthetic pairs + 7,558 doc-grounded chunks | Apple Silicon M-Series |
- | Stage 2 | GRPO Reinforcement Learning | 100 reasoning prompts with reward functions | NVIDIA H100 80GB |
-
- ## Capabilities
-
- - **SQL Dialect Conversion**: Oracle ↔ PostgreSQL ↔ T-SQL ↔ Snowflake ↔ BigQuery ↔ Databricks
- - **ETL Pipeline Migration**: Informatica → dbt, DataStage → Spark, BODS → Airflow
- - **Legacy System Modernization**: COBOL, JCL, SAS, ABAP → modern stacks
- - **Data Quality & Governance**: Assessment, validation, and compliance
- - **Migration Lifecycle**: Discovery → Risk Planning → Conversion → Verification
- - **Step-by-Step Reasoning**: Uses `<think>...</think>` tags for chain-of-thought reasoning

  ## Usage
-
  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
- import torch

- model = AutoModelForCausalLM.from_pretrained(
-     "DataManagement-AI/Agentic-Data-1",
-     torch_dtype=torch.bfloat16,
-     device_map="auto",
- )
  tokenizer = AutoTokenizer.from_pretrained("DataManagement-AI/Agentic-Data-1")
-
- messages = [
-     {"role": "system", "content": "You are Agentic Data 1, an expert data management and migration reasoning model. Think step-by-step before answering."},
-     {"role": "user", "content": "Convert this Oracle PL/SQL stored procedure to PostgreSQL PL/pgSQL."}
- ]
- inputs = tokenizer.apply_chat_template(messages, return_tensors="pt", add_generation_prompt=True).to(model.device)
- outputs = model.generate(inputs, max_new_tokens=1500)
- print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))
  ```

- ## Benchmarks (SFT V3)
-
- | Metric | Base Model | Agentic Data 1 | Improvement |
- |--------|------------|----------------|-------------|
- | Overall Score | 0.554 | **0.636** | +14.8% |
- | Implementation Quality | 0.584 | **0.761** | +30.3% |
- | Think-Tag Rate | 0% | **100%** | ∞ |
- | Reasoning Quality | 0.534 | **0.622** | +16.5% |
-
- ## License
-
- For research and educational purposes.
 
  ---
+ language:
+ - en
+ license: llama3
+ base_model: deepseek-ai/DeepSeek-R1-Distill-Llama-8B
  tags:
  - data-management
  - sql
+ - migration
  - grpo
+ - reinforcement-learning
  ---

+ # Agentic Data 1 — GRPO-Trained

+ A specialized 8B-parameter model for data management, migration, and SQL tasks.

  ## Training Pipeline
+ 1. **Base**: DeepSeek-R1-Distill-Llama-8B
+ 2. **SFT**: fine-tuned on 1,000+ data management examples (Oracle → Postgres, DB2 → Snowflake, ETL, data quality)
+ 3. **GRPO**: 500 steps of Group Relative Policy Optimization on an H100, with reward functions for:
+    - Code parsability (SQL validation)
+    - Reasoning quality (step-by-step thinking)
+    - Answer accuracy
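As an illustration of the reward shaping listed above, a "code parsability" reward could check whether a completion contains a well-formed fenced SQL block. The sketch below is hypothetical (the actual reward functions used in training are not published) and uses only naive keyword checks rather than a real SQL validator:

```python
import re

# Hypothetical sketch of a "code parsability" reward; the actual GRPO
# reward functions used to train this model are not published.
FENCE = "`" * 3  # markdown code fence, built programmatically to avoid nesting
SQL_BLOCK = re.compile(FENCE + r"sql\s+(.*?)" + FENCE, re.DOTALL | re.IGNORECASE)
KEYWORDS = ("SELECT", "INSERT", "UPDATE", "DELETE", "CREATE", "ALTER", "MERGE")

def code_parsability_reward(completion: str) -> float:
    """Return 1.0 if every fenced SQL block in the completion starts with a
    recognized statement keyword, 0.0 if there are no blocks or one is malformed."""
    blocks = SQL_BLOCK.findall(completion)
    if not blocks:
        return 0.0
    ok = all(b.strip().upper().startswith(KEYWORDS) for b in blocks)
    return 1.0 if ok else 0.0

good = "Converted query:\n" + FENCE + "sql\nSELECT id FROM users;\n" + FENCE
print(code_parsability_reward(good))  # 1.0
```

A production reward would use a real parser (dialect-aware) instead of keyword sniffing; this only shows the shape of the signal GRPO optimizes.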
+
+ ## Training Metrics (GRPO)
+
+ | Metric | Start | End |
+ |---|---|---|
+ | Reward | 0.43 | 0.49 |
+ | Code Parsability | 0.15 | 0.21 |
+ | KL Divergence | 0.0005 | 0.0014 |
+ | Grad Norm | 0.295 | 0.210 |
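For context on the Reward row above: GRPO scores a group of sampled completions per prompt and normalizes each completion's reward against the group mean, with no learned value baseline. A minimal sketch of that standard normalization (not this repository's training code):

```python
def group_relative_advantages(rewards: list[float], eps: float = 1e-8) -> list[float]:
    """GRPO-style advantage: each completion's reward, normalized by the
    mean and std of its sampling group (sketch of the standard formulation)."""
    n = len(rewards)
    mean = sum(rewards) / n
    std = (sum((r - mean) ** 2 for r in rewards) / n) ** 0.5
    return [(r - mean) / (std + eps) for r in rewards]

# Four completions sampled for one prompt, scored by the reward functions:
advantages = group_relative_advantages([0.43, 0.49, 0.40, 0.52])
print(advantages)  # positive for above-group-mean completions, negative below
```

The KL Divergence row then reflects the penalty that keeps the policy close to the SFT reference model while these advantages push it toward higher-reward completions.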

  ## Usage

  ```python
  from transformers import AutoModelForCausalLM, AutoTokenizer
+ import torch

+ model = AutoModelForCausalLM.from_pretrained(
+     "DataManagement-AI/Agentic-Data-1",
+     torch_dtype=torch.bfloat16,
+     device_map="auto",
+ )
  tokenizer = AutoTokenizer.from_pretrained("DataManagement-AI/Agentic-Data-1")
  ```
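The previous revision of this card notes that the model wraps its chain-of-thought in `<think>...</think>` tags before the final answer. When post-processing generations, a small helper to separate the two can be useful (a sketch; only the tag format is taken from the card):

```python
import re

def split_think(text: str) -> tuple[str, str]:
    """Split a generation into (reasoning, answer) on <think>...</think> tags.
    Returns empty reasoning if no tags are present."""
    m = re.search(r"<think>(.*?)</think>", text, flags=re.DOTALL)
    if not m:
        return "", text.strip()
    return m.group(1).strip(), text[m.end():].strip()

# Example decoded output (hypothetical):
out = "<think>Oracle's NVL maps to COALESCE.</think>Use COALESCE(col, 0)."
reasoning, answer = split_think(out)
print(answer)  # Use COALESCE(col, 0).
```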

+ ## Capabilities
+ - Oracle → PostgreSQL migration
+ - DB2 → Snowflake conversion
+ - SQL generation and validation
+ - ETL pipeline design
+ - Data quality assessment
+ - Schema analysis and optimization
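To give the first capability some flavor: Oracle → PostgreSQL migration involves standard built-in equivalences such as `NVL` → `COALESCE` and `SYSDATE` → `NOW()`. The toy rewriter below is purely illustrative (it is not how the model works, and a real migration needs a dialect-aware parser):

```python
import re

# Common Oracle -> PostgreSQL equivalences (illustrative, not exhaustive).
# Note: NOW() is timezone-aware while Oracle's SYSDATE is not; approximate.
ORACLE_TO_PG = {
    "NVL": "COALESCE",   # NVL(a, b) -> COALESCE(a, b)
    "SYSDATE": "NOW()",  # current timestamp
}

def naive_rewrite(sql: str) -> str:
    """Token-level rewrite of a few Oracle built-ins; a sketch only,
    since real conversions require parsing, not string substitution."""
    for ora, pg in ORACLE_TO_PG.items():
        sql = re.sub(rf"\b{ora}\b", pg, sql, flags=re.IGNORECASE)
    return sql

print(naive_rewrite("SELECT NVL(salary, 0), SYSDATE FROM emp"))
# SELECT COALESCE(salary, 0), NOW() FROM emp
```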