Model Card: Memory Routing Agent (Llama-8B + LoRA)

Model Details

  • Model Name: memory-routing-llama-8b-lora
  • Base Model: meta-llama/Llama-3.1-8B
  • Architecture: LoRA (Low-Rank Adaptation), rank 32
  • Training Platform: Tinker (Thinking Machines)
  • Training Method: SFT (Supervised Fine-Tuning) + RL (Reinforcement Learning)
  • Parameters: ~8B base + ~100M LoRA adapters
  • License: Apache 2.0

Intended Use

This model classifies marketing conversations into memory categories for AI assistant systems. It determines which pieces of information from a conversation should be stored in long-term memory and how they should be categorized.
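
As a usage illustration, here is a minimal inference sketch with transformers and peft. The adapter id, prompt format, and expected output are assumptions for illustration, not this card's documented interface; adapt them to the actual checkpoint.

```python
# Minimal inference sketch. ADAPTER id and the prompt/output format are
# illustrative assumptions, not documented by this card.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

BASE = "meta-llama/Llama-3.1-8B"
ADAPTER = "memory-routing-llama-8b-lora"  # hypothetical adapter path/repo id

tokenizer = AutoTokenizer.from_pretrained(BASE)
model = AutoModelForCausalLM.from_pretrained(BASE, torch_dtype=torch.bfloat16, device_map="auto")
model = PeftModel.from_pretrained(model, ADAPTER)  # attach the LoRA adapter
model.eval()

conversation = "User: From now on, keep all campaign copy under 50 words and in a playful tone."
prompt = f"Classify the memory categories for this conversation:\n{conversation}\nCategories:"

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=32, do_sample=False)
print(tokenizer.decode(out[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True))
# Expected style of output: "user.communication_style" (one or more category labels)
```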

Primary Use Cases

  • Marketing AI assistants that need to remember user preferences
  • CRM systems that extract structured data from conversations
  • Knowledge management systems for marketing teams

Out-of-Scope Uses

  • General-purpose chatbots
  • Non-marketing domains (healthcare, legal, finance)
  • Real-time conversation generation

Training Data

Synthetic Dataset

  • Size: 2,001 conversations
  • Generation: Cohere Command-R-Plus (104B) as teacher model
  • Format: Multi-turn marketing conversations with category labels
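
The record schema is not spelled out on this card; a quick way to inspect it, with a hypothetical example of the expected shape in the comments:

```python
# Peek at the first record of the training data. The field names shown in
# the comment below are illustrative assumptions; inspect the file for the
# real schema.
import json

with open("synthetic_data/merged_training_dataset_2001.jsonl") as f:
    example = json.loads(f.readline())

# Expected shape (hypothetical keys): a multi-turn conversation plus gold labels, e.g.
# {"conversation": [{"role": "user", "content": "Our Q3 goal is to double webinar signups."},
#                   {"role": "assistant", "content": "Got it, I'll prioritize webinar promotion."}],
#  "labels": ["company.business_priorities"]}
print(sorted(example.keys()))
```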

Category Taxonomy (13 categories)

| Category | Description | Persistence |
|---|---|---|
| company.brand_core | Voice, values, positioning | Long (>1y) |
| company.strategic_signatures | Decision frameworks | Long (>1y) |
| company.knowledge_artifacts | Docs, style guides | Long (>1y) |
| company.business_priorities | Quarterly goals | Short (<3m) |
| company.tools_config | Integrations, APIs | Medium (~6m) |
| company.performance_context | Campaign metrics | Rolling (~6m) |
| user.communication_style | Tone, format preferences | Long (>1y) |
| user.strategic_approach | Personal priorities | Long (>1y) |
| user.role_context | Title, scope | Medium (~1y) |
| user.workflow_patterns | Review cadence | Medium (~1y) |
| user.session_history | Immediate context | Short (<2w) |
| user.interaction_preferences | Coaching style | Evolving |
| none | Irrelevant content | N/A |
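
Downstream, the taxonomy maps naturally onto a retention lookup. A small sketch; the day counts are rough translations of the table's horizons, not values from this card:

```python
# Retention horizons per category, transcribed from the taxonomy table above.
# Day counts are approximations of the table's horizons; "Evolving" and "N/A"
# have no fixed horizon and are marked None.
RETENTION_DAYS = {
    "company.brand_core": 365,
    "company.strategic_signatures": 365,
    "company.knowledge_artifacts": 365,
    "company.business_priorities": 90,
    "company.tools_config": 180,
    "company.performance_context": 180,   # rolling window
    "user.communication_style": 365,
    "user.strategic_approach": 365,
    "user.role_context": 365,
    "user.workflow_patterns": 365,
    "user.session_history": 14,
    "user.interaction_preferences": None,  # evolving, continuously re-learned
    "none": None,                          # not stored
}

def should_store(category: str) -> bool:
    """A memory is persisted unless it routes to 'none'."""
    return category != "none"
```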

Training Procedure

Phase 1: Supervised Fine-Tuning (SFT)

  • Steps: 100
  • Batch Size: 128
  • Learning Rate: 2.86e-4 (Tinker default for Llama-8B)
  • Optimizer: Adam (β1=0.9, β2=0.95)
  • Loss Function: Cross-entropy
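
To make the loss and optimizer settings concrete, here is an illustrative PyTorch-style SFT step under the hyperparameters above. This is a sketch, not the actual Tinker training code:

```python
# Illustrative SFT step matching the hyperparameters above (not the actual
# Tinker training loop): Adam with betas=(0.9, 0.95), lr=2.86e-4,
# next-token cross-entropy loss.
import torch
import torch.nn.functional as F

def sft_step(model, optimizer, input_ids, labels):
    """One supervised step on a causal LM.

    `labels` should be -100 on prompt positions so that only the target
    category tokens contribute to the loss.
    """
    logits = model(input_ids).logits
    loss = F.cross_entropy(
        logits[:, :-1].reshape(-1, logits.size(-1)),  # predictions for tokens 1..T
        labels[:, 1:].reshape(-1),                    # shifted targets
        ignore_index=-100,
    )
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()

# optimizer = torch.optim.Adam(model.parameters(), lr=2.86e-4, betas=(0.9, 0.95))
```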

Phase 2: Reinforcement Learning (RL)

  • Iterations: 12
  • Groups per Batch: 64
  • Group Size: 32
  • Learning Rate: 2e-5
  • Loss Function: Importance sampling policy gradient
  • Reward Function (weighted sum; see the sketch after this list):
    • R_F1 (60%): F1 score vs gold labels
    • R_temp (20%): Temporal alignment
    • R_parity (10%): Company/user scope
    • R_eff (10%): Storage efficiency
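
A sketch of the composite reward with the weights listed above. Only R_F1 is fully specified here; the other three components are passed in as scores, since their exact definitions are not given on this card:

```python
# Composite reward: R = 0.6*R_F1 + 0.2*R_temp + 0.1*R_parity + 0.1*R_eff.
# r_temp, r_parity, r_eff are assumed to be scores in [0, 1] computed
# elsewhere; their definitions are not detailed on this card.
def set_f1(predicted: set[str], gold: set[str]) -> float:
    """Set-based F1 between predicted and gold category labels."""
    if not predicted and not gold:
        return 1.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    p, r = tp / len(predicted), tp / len(gold)
    return 2 * p * r / (p + r)

def reward(predicted: set[str], gold: set[str],
           r_temp: float, r_parity: float, r_eff: float) -> float:
    return 0.6 * set_f1(predicted, gold) + 0.2 * r_temp + 0.1 * r_parity + 0.1 * r_eff
```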

Evaluation Results

Marketing Routing Benchmark (50 scenarios)

| Model | Any Match | Exact Match | Avg F1 |
|---|---|---|---|
| Ours (8B + LoRA) | 72% | 60% | 0.68 |
| Cohere Command-R-Plus (104B) | 82% | 26% | 0.61 |
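
For clarity, here is how the three metrics are assumed to be defined for multi-label routing; the benchmark's actual scoring script may differ:

```python
# Assumed scoring definitions (the actual benchmark script may differ).
# Each scenario pairs a predicted category set with a gold set.
def set_f1(predicted: set[str], gold: set[str]) -> float:
    """Set-based F1 (same definition as in the reward sketch above)."""
    if not predicted and not gold:
        return 1.0
    tp = len(predicted & gold)
    if tp == 0:
        return 0.0
    p, r = tp / len(predicted), tp / len(gold)
    return 2 * p * r / (p + r)

def score(pairs: list[tuple[set[str], set[str]]]) -> dict[str, float]:
    """Any match: prediction and gold overlap; exact match: sets identical."""
    n = len(pairs)
    return {
        "any_match": sum(bool(p & g) for p, g in pairs) / n,
        "exact_match": sum(p == g for p, g in pairs) / n,
        "avg_f1": sum(set_f1(p, g) for p, g in pairs) / n,
    }
```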

Key Findings

  • 11.1% higher average F1 than the 104B teacher model (0.68 vs 0.61)
  • 2.3x higher exact-match accuracy than the teacher (60% vs 26%)
  • 13x smaller than the teacher model (8B vs 104B base parameters)
  • Excels at single-category classification (86% exact match on easy cases)
  • Struggles with multi-label scenarios (10% exact match on hard cases)

Performance by Difficulty

| Difficulty | Our Model (F1) | Cohere (F1) | Relative Delta |
|---|---|---|---|
| Easy | 0.86 | 0.48 | +79% |
| Medium | 0.65 | 0.64 | +2% |
| Hard | 0.50 | 0.72 | -31% |

Limitations

  1. Multi-label Detection: Under-predicts when multiple categories apply
  2. Company vs User Confusion: Sometimes confuses company.strategic_signatures with user.strategic_approach
  3. Hard Cases: Performance drops on complex overlapping categories
  4. Domain Specificity: Trained only on marketing scenarios

Ethical Considerations

  • Model trained on synthetic data; may not capture all real-world edge cases
  • Should be used with human oversight for critical decisions
  • Privacy: Does not store or transmit conversation data

Citation

```bibtex
@misc{memory-routing-agent-2025,
  title={Memory Routing Agent: Prompt Distillation for Marketing AI},
  author={Muratcan Koylan},
  year={2025},
  howpublished={\url{https://github.com/muratcankoylan/memory-routing-agent}},
}
```

Model Files

  • training/checkpoints/rl_iter_012/ - Final RL checkpoint
  • training/benchmarks/marketing_routing_benchmark.json - Benchmark dataset
  • synthetic_data/merged_training_dataset_2001.jsonl - Training data

Contact

For questions or issues, please open a GitHub issue.