# Model Card: Memory Routing Agent (Llama-8B + LoRA) ## Model Details - **Model Name**: memory-routing-llama-8b-lora - **Base Model**: meta-llama/Llama-3.1-8B - **Architecture**: LoRA (Low-Rank Adaptation), rank 32 - **Training Platform**: Tinker (Thinking Machines) - **Training Method**: SFT (Supervised Fine-Tuning) + RL (Reinforcement Learning) - **Parameters**: ~8B base + ~100M LoRA adapters - **License**: Apache 2.0 ## Intended Use This model classifies marketing conversations into memory categories for AI assistant systems. It determines which pieces of information from a conversation should be stored in long-term memory and how they should be categorized. ### Primary Use Cases - Marketing AI assistants that need to remember user preferences - CRM systems that extract structured data from conversations - Knowledge management systems for marketing teams ### Out-of-Scope Uses - General-purpose chatbots - Non-marketing domains (healthcare, legal, finance) - Real-time conversation generation ## Training Data ### Synthetic Dataset - **Size**: 2,001 conversations - **Generation**: Cohere Command-R-Plus (104B) as teacher model - **Format**: Multi-turn marketing conversations with category labels ### Category Taxonomy (13 categories) | Category | Description | Persistence | |----------|-------------|-------------| | company.brand_core | Voice, values, positioning | Long (>1y) | | company.strategic_signatures | Decision frameworks | Long (>1y) | | company.knowledge_artifacts | Docs, style guides | Long (>1y) | | company.business_priorities | Quarterly goals | Short (<3m) | | company.tools_config | Integrations, APIs | Medium (~6m) | | company.performance_context | Campaign metrics | Rolling (~6m) | | user.communication_style | Tone, format preferences | Long (>1y) | | user.strategic_approach | Personal priorities | Long (>1y) | | user.role_context | Title, scope | Medium (~1y) | | user.workflow_patterns | Review cadence | Medium (~1y) | | user.session_history | Immediate context | Short (<2w) | | user.interaction_preferences | Coaching style | Evolving | | none | Irrelevant content | N/A | ## Training Procedure ### Phase 1: Supervised Fine-Tuning (SFT) - **Steps**: 100 - **Batch Size**: 128 - **Learning Rate**: 2.86e-4 (Tinker default for Llama-8B) - **Optimizer**: Adam (β1=0.9, β2=0.95) - **Loss Function**: Cross-entropy ### Phase 2: Reinforcement Learning (RL) - **Iterations**: 12 - **Groups per Batch**: 64 - **Group Size**: 32 - **Learning Rate**: 2e-5 - **Loss Function**: Importance sampling policy gradient - **Reward Function**: - R_F1 (60%): F1 score vs gold labels - R_temp (20%): Temporal alignment - R_parity (10%): Company/user scope - R_eff (10%): Storage efficiency ## Evaluation Results ### Marketing Routing Benchmark (50 scenarios) | Model | Any Match | Exact Match | Avg F1 | |-------|-----------|-------------|--------| | **Ours (8B + LoRA)** | 72% | **60%** | **0.68** | | Cohere Command-R-Plus (104B) | 82% | 26% | 0.61 | ### Key Findings - **11.1% higher F1** than the 104B teacher model - **2.3x better exact match** accuracy - **13x smaller** than the teacher model - Excels at single-category classification (86% exact on easy cases) - Struggles with multi-label scenarios (10% exact on hard cases) ### Performance by Difficulty | Difficulty | Our Model (F1) | Cohere (F1) | Delta | |------------|----------------|-------------|-------| | Easy | 0.86 | 0.48 | +79% | | Medium | 0.65 | 0.64 | +2% | | Hard | 0.50 | 0.72 | -31% | ## Limitations 1. **Multi-label Detection**: Under-predicts when multiple categories apply 2. **Company vs User Confusion**: Sometimes confuses `company.strategic_signatures` with `user.strategic_approach` 3. **Hard Cases**: Performance drops on complex overlapping categories 4. **Domain Specificity**: Trained only on marketing scenarios ## Ethical Considerations - Model trained on synthetic data; may not capture all real-world edge cases - Should be used with human oversight for critical decisions - Privacy: Does not store or transmit conversation data ## Citation ```bibtex @misc{memory-routing-agent-2025, title={Memory Routing Agent: Prompt Distillation for Marketing AI}, author={Muratcan Koylan}, year={2025}, howpublished={\url{https://github.com/muratcankoylan/memory-routing-agent}}, } ``` ## Model Files - `training/checkpoints/rl_iter_012/` - Final RL checkpoint - `training/benchmarks/marketing_routing_benchmark.json` - Benchmark dataset - `synthetic_data/merged_training_dataset_2001.jsonl` - Training data ## Contact For questions or issues, please open a GitHub issue.