# Model Card: Memory Routing Agent (Llama-8B + LoRA)
## Model Details
- **Model Name**: memory-routing-llama-8b-lora
- **Base Model**: meta-llama/Llama-3.1-8B
- **Architecture**: LoRA (Low-Rank Adaptation), rank 32
- **Training Platform**: Tinker (Thinking Machines)
- **Training Method**: SFT (Supervised Fine-Tuning) + RL (Reinforcement Learning)
- **Parameters**: ~8B base + ~100M LoRA adapters
- **License**: Apache 2.0
## Intended Use
This model classifies marketing conversations into memory categories for AI assistant systems. It determines which pieces of information from a conversation should be stored in long-term memory and how they should be categorized.
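To make the task concrete, here is an illustrative input/output shape for the routing task; the exact prompt template and output format are assumptions for this sketch, not taken from the model card.

```python
# Hypothetical example of the routing task's input and expected output.
# The model reads a conversation snippet and emits memory category labels.
conversation = (
    "User: Our brand voice should always be playful but professional.\n"
    "Assistant: Noted -- I'll keep that tone in mind for future copy."
)

# For this snippet, a correct routing decision would be the brand-voice
# category from the taxonomy below (single-label case):
expected_categories = {"company.brand_core"}
```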
### Primary Use Cases
- Marketing AI assistants that need to remember user preferences
- CRM systems that extract structured data from conversations
- Knowledge management systems for marketing teams
### Out-of-Scope Uses
- General-purpose chatbots
- Non-marketing domains (healthcare, legal, finance)
- Real-time conversation generation
## Training Data
### Synthetic Dataset
- **Size**: 2,001 conversations
- **Generation**: Cohere Command-R-Plus (104B) as teacher model
- **Format**: Multi-turn marketing conversations with category labels
### Category Taxonomy (13 categories)
| Category | Description | Persistence |
|----------|-------------|-------------|
| company.brand_core | Voice, values, positioning | Long (>1y) |
| company.strategic_signatures | Decision frameworks | Long (>1y) |
| company.knowledge_artifacts | Docs, style guides | Long (>1y) |
| company.business_priorities | Quarterly goals | Short (<3m) |
| company.tools_config | Integrations, APIs | Medium (~6m) |
| company.performance_context | Campaign metrics | Rolling (~6m) |
| user.communication_style | Tone, format preferences | Long (>1y) |
| user.strategic_approach | Personal priorities | Long (>1y) |
| user.role_context | Title, scope | Medium (~1y) |
| user.workflow_patterns | Review cadence | Medium (~1y) |
| user.session_history | Immediate context | Short (<2w) |
| user.interaction_preferences | Coaching style | Evolving |
| none | Irrelevant content | N/A |
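The taxonomy above can be encoded directly as a lookup table; this is a minimal sketch, and the dict name and persistence strings are illustrative conventions, not an API from the repository.

```python
# Illustrative encoding of the 13-category taxonomy with its persistence
# windows, mirroring the table in this model card.
CATEGORY_PERSISTENCE = {
    "company.brand_core": ">1y",
    "company.strategic_signatures": ">1y",
    "company.knowledge_artifacts": ">1y",
    "company.business_priorities": "<3m",
    "company.tools_config": "~6m",
    "company.performance_context": "~6m",
    "user.communication_style": ">1y",
    "user.strategic_approach": ">1y",
    "user.role_context": "~1y",
    "user.workflow_patterns": "~1y",
    "user.session_history": "<2w",
    "user.interaction_preferences": "evolving",
    "none": None,  # irrelevant content; nothing is stored
}
```

A router can validate its predictions against this table before writing anything to long-term memory.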
## Training Procedure
### Phase 1: Supervised Fine-Tuning (SFT)
- **Steps**: 100
- **Batch Size**: 128
- **Learning Rate**: 2.86e-4 (Tinker default for Llama-8B)
- **Optimizer**: Adam (β1=0.9, β2=0.95)
- **Loss Function**: Cross-entropy
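The SFT hyperparameters above can be summarized as a single configuration object; this is a hedged sketch whose key names are illustrative, not the actual Tinker API.

```python
# Hypothetical SFT configuration mirroring the hyperparameters listed
# in Phase 1; field names are assumptions for this sketch.
sft_config = {
    "base_model": "meta-llama/Llama-3.1-8B",
    "adapter": "lora",
    "lora_rank": 32,
    "steps": 100,
    "batch_size": 128,
    "learning_rate": 2.86e-4,  # Tinker default for Llama-8B
    "adam_betas": (0.9, 0.95),
    "loss": "cross_entropy",
}
```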
### Phase 2: Reinforcement Learning (RL)
- **Iterations**: 12
- **Groups per Batch**: 64
- **Group Size**: 32
- **Learning Rate**: 2e-5
- **Loss Function**: Importance sampling policy gradient
- **Reward Function**:
- R_F1 (60%): F1 score vs gold labels
- R_temp (20%): Temporal alignment
- R_parity (10%): Company/user scope
- R_eff (10%): Storage efficiency
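The weighted reward above can be sketched as follows; the F1 term is computed against the gold label set, while the temporal, parity, and efficiency components are stubbed as inputs since their exact definitions are not given in this card.

```python
# Sketch of the composite RL reward with the 60/20/10/10 weighting
# from this model card. R_temp, R_parity, and R_eff are passed in as
# precomputed scores in [0, 1]; their internals are assumptions.

def set_f1(predicted: set, gold: set) -> float:
    """Set-based F1 between predicted and gold category labels."""
    if not predicted and not gold:
        return 1.0  # both empty: perfect agreement
    if not predicted or not gold:
        return 0.0
    tp = len(predicted & gold)
    precision = tp / len(predicted)
    recall = tp / len(gold)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def composite_reward(predicted, gold, r_temp, r_parity, r_eff):
    """Weighted sum matching the reward breakdown above."""
    return (0.6 * set_f1(predicted, gold)
            + 0.2 * r_temp
            + 0.1 * r_parity
            + 0.1 * r_eff)
```

With a perfect prediction and all auxiliary scores at 1.0, the reward is 1.0; partial label overlap degrades the dominant F1 term first.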
## Evaluation Results
### Marketing Routing Benchmark (50 scenarios)
| Model | Any Match | Exact Match | Avg F1 |
|-------|-----------|-------------|--------|
| **Ours (8B + LoRA)** | 72% | **60%** | **0.68** |
| Cohere Command-R-Plus (104B) | 82% | 26% | 0.61 |
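For clarity on how the two accuracy columns differ, here is a minimal sketch of the metrics as set comparisons; the function names are illustrative, not taken from the benchmark code.

```python
# "Any match": at least one predicted category appears in the gold set.
# "Exact match": the predicted set equals the gold set.

def any_match(predicted: set, gold: set) -> bool:
    if not predicted and not gold:
        return True  # correctly predicting "nothing to store"
    return bool(predicted & gold)

def exact_match(predicted: set, gold: set) -> bool:
    return predicted == gold
```

This distinction explains the table: the teacher over-predicts categories, inflating any-match while hurting exact-match.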
### Key Findings
- **11.1% higher F1** than the 104B teacher model
- **2.3x better exact match** accuracy
- **13x smaller** than the teacher model
- Excels at single-category classification (86% exact on easy cases)
- Struggles with multi-label scenarios (10% exact on hard cases)
### Performance by Difficulty
| Difficulty | Our Model (F1) | Cohere (F1) | Delta |
|------------|----------------|-------------|-------|
| Easy | 0.86 | 0.48 | +79% |
| Medium | 0.65 | 0.64 | +2% |
| Hard | 0.50 | 0.72 | -31% |
## Limitations
1. **Multi-label Detection**: Under-predicts when multiple categories apply
2. **Company vs User Confusion**: Sometimes confuses `company.strategic_signatures` with `user.strategic_approach`
3. **Hard Cases**: Performance drops on complex overlapping categories
4. **Domain Specificity**: Trained only on marketing scenarios
## Ethical Considerations
- Model trained on synthetic data; may not capture all real-world edge cases
- Should be used with human oversight for critical decisions
- Privacy: Does not store or transmit conversation data
## Citation
```bibtex
@misc{memory-routing-agent-2025,
  title={Memory Routing Agent: Prompt Distillation for Marketing AI},
  author={Muratcan Koylan},
  year={2025},
  howpublished={\url{https://github.com/muratcankoylan/memory-routing-agent}},
}
```
## Model Files
- `training/checkpoints/rl_iter_012/` - Final RL checkpoint
- `training/benchmarks/marketing_routing_benchmark.json` - Benchmark dataset
- `synthetic_data/merged_training_dataset_2001.jsonl` - Training data
## Contact
For questions or issues, please open a GitHub issue.