agentic-intent-classifier / known_limitations.md
manikumargouni's picture
Upload folder using huggingface_hub
0584798 verified

Known Limitations

Model Limits

  • intent_type v0.1 now uses a manually tuned fallback threshold, but some ambiguous prompts still clear as non-safe intent labels.
  • intent_subtype v0.1 is useful for commercial patterns, but still weak on sparse support and price-seeking cases.
  • decision_phase v0.1 is behaviorally useful but still weak in aggregate metrics.
  • iab_content now uses local embedding retrieval over taxonomy nodes, but exact-path quality still depends on how well the embedding model separates close sibling categories.
  • the composed output is still gated by the weaker calibrated components, so action-style queries can over-fallback.

Boundary Confusion

  • support and post_purchase are cleaner than before, but still fragile on phrasing that mixes setup with a broken flow.
  • consideration and decision can still collapse when a buying query is phrased like a general question.
  • awareness and research remain close for “help me understand” style prompts.
  • deal_seeking and education still blur on price questions phrased as general explanations.
  • some broad software topics still need nearest-equivalent IAB mappings because the taxonomy does not have exact SaaS product nodes.
  • CRM-style prompts are still a weak spot because retrieval can drift toward adjacent business software categories without a reranker.

Intent-Type Gaps

  • the expanded intent_type taxonomy now includes support, exploratory, creative_generation, chit_chat, and prohibited, but these new classes are still data-light until the next full retraining pass.
  • short contextual follow-ups are still sensitive to wording and may be over-fallbacked.
  • price-seeking prompts can still flip between commercial, informational, and fallback depending on phrasing.

System-Layer Caveats

  • the current policy and opportunity logic is still rule-based even though it now uses subtype.
  • commercial_score is heuristic, not learned.
  • the current intent_type threshold is manually sweep-tuned rather than learned jointly with the system layer.
  • fallback is still intentionally conservative, which is useful for demos but suppresses some strong action signals.
  • subtype is now part of the system decision, but the system still trusts decision_phase more than subtype on support safety.
  • IAB routing uses the local 3.0 taxonomy TSV plus a local embedding index; exact path quality still depends on the retrieval model and the candidate text stored for each taxonomy node.

Regression Tracking

Product Limits

  • no multi-turn memory is used in the combined classifier.
  • no real-traffic labeled IAB dataset is produced yet.
  • no audit store, trace viewer, or UI is included beyond the local JSON demo endpoint.