Armour AI - Hinglish Financial NER Model

A multilingual Named Entity Recognition (NER) model fine-tuned specifically for financial conversations in Hinglish (mixture of Hindi and English).

🎯 Model Summary

  • Framework: Transformers (HuggingFace)
  • Base Model: bert-base-multilingual-cased
  • Task: Named Entity Recognition (Token Classification)
  • Language: Hinglish (Hindi-English mix)
  • Domain: Financial Services & Insurance
  • Training Data: Armour AI financial conversation dataset
  • Performance: F1 Score ~0.88

📦 Installation

pip install transformers torch

🚀 Quick Start

Using the Pipeline API (Easiest)

from transformers import pipeline

# Load the model
ner = pipeline(
    "token-classification", 
    model="rohin30n/armour-ai-ner",
    aggregation_strategy="simple"
)

# Inference
text = "kya aap 20 lakh ka term insurance lena chahiye?"
results = ner(text)

# Print results
for result in results:
    # With aggregation_strategy="simple", the label key is "entity_group"
    print(f"{result['word']:20} | {result['entity_group']:10} | {result['score']:.4f}")

Output:

20                   | AMOUNT     | 0.9985
lakh                 | AMOUNT     | 0.9992
term insurance       | INSTRUMENT | 0.9981
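
In practice you may want to drop low-confidence predictions before acting on them. A minimal post-processing sketch, assuming the dict keys (`word`, `entity_group`, `score`) produced by the pipeline's "simple" aggregation; the hardcoded `results` list (including the low-confidence `chahiye` entry) is illustrative, not real model output:

```python
# Hypothetical helper: keep only entities above a confidence threshold.
def filter_entities(results, threshold=0.90):
    """Drop predictions whose score falls below the threshold."""
    return [r for r in results if r["score"] >= threshold]

# Hardcoded sample in the pipeline's aggregated-output format:
results = [
    {"word": "20", "entity_group": "AMOUNT", "score": 0.9985},
    {"word": "lakh", "entity_group": "AMOUNT", "score": 0.9992},
    {"word": "term insurance", "entity_group": "INSTRUMENT", "score": 0.9981},
    {"word": "chahiye", "entity_group": "DECISION", "score": 0.41},
]
print(filter_entities(results))  # the low-confidence "chahiye" entry is dropped
```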

Using Raw Model & Tokenizer

from transformers import AutoModelForTokenClassification, AutoTokenizer
import torch

# Load model and tokenizer
model = AutoModelForTokenClassification.from_pretrained("rohin30n/armour-ai-ner")
tokenizer = AutoTokenizer.from_pretrained("rohin30n/armour-ai-ner")

# Prepare input
text = "kya aap 20 lakh ka term insurance lena chahiye?"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)

# Inference
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.argmax(outputs.logits, dim=2)

# Decode predictions
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
labels = predictions[0].cpu().numpy()

for token, label_id in zip(tokens, labels):
    # id2label keys are ints; cast the numpy scalar before lookup
    label = model.config.id2label[int(label_id)]
    print(f"{token:15} | {label}")
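
The raw loop above prints one BIO label per subword token. To recover whole entities you need to merge `B-`/`I-` runs and re-attach `##` subword pieces; a minimal sketch (the token/label lists are hardcoded to stand in for the loop's output):

```python
# Merge per-token BIO labels into (entity_text, entity_type) spans.
# Assumes B-/I- prefixed labels and WordPiece "##" continuation tokens.
def merge_bio(tokens, labels):
    spans = []
    for token, label in zip(tokens, labels):
        if token.startswith("##") and spans and label != "O":
            # Glue subword pieces to the previous token without a space
            spans[-1] = (spans[-1][0] + token[2:], spans[-1][1])
        elif label.startswith("B-"):
            spans.append((token, label[2:]))
        elif label.startswith("I-") and spans and spans[-1][1] == label[2:]:
            spans[-1] = (spans[-1][0] + " " + token, spans[-1][1])
    return spans

tokens = ["kya", "aap", "20", "lakh", "ka", "term", "insurance"]
labels = ["O", "O", "B-AMOUNT", "I-AMOUNT", "O", "B-INSTRUMENT", "I-INSTRUMENT"]
print(merge_bio(tokens, labels))
# [('20 lakh', 'AMOUNT'), ('term insurance', 'INSTRUMENT')]
```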

🏷️ Entity Types

This model recognizes 5 entity types:

Entity     | Description                    | Example
-----------|--------------------------------|---------------------------------------------------
AMOUNT     | Financial amounts and values   | "20 lakh", "₹50,000", "10 percent"
INSTRUMENT | Financial products/instruments | "term insurance", "mutual fund", "savings account"
DURATION   | Time periods                   | "1 saal", "2 years", "3 mahine"
DECISION   | Business decisions/actions     | "approved", "rejected", "pending"
PERSON     | Person names                   | "Raj Kumar", "Priya Singh"

📊 Training Details

Dataset

  • Corpus: Hinglish financial conversation corpus
  • Domain: Insurance, investments, banking advice
  • Annotation: BIO (Begin-Inside-Outside) tagging scheme
  • Split: 80% training, 20% evaluation
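
As an illustration of the BIO scheme, here is a hand-annotated version of a sample sentence (the tags are written for illustration, not taken from the training data):

```python
# BIO tagging: B- opens an entity, I- continues it, O marks non-entity tokens.
tokens = ["Mujhe", "25", "lakh", "ka", "jeevan", "bima", "chahiye"]
tags   = ["O", "B-AMOUNT", "I-AMOUNT", "O", "B-INSTRUMENT", "I-INSTRUMENT", "O"]

assert len(tokens) == len(tags)  # exactly one tag per token
for token, tag in zip(tokens, tags):
    print(f"{token:12} {tag}")
```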

Training Configuration

{
    "num_epochs": 3,
    "train_batch_size": 16,
    "eval_batch_size": 16,
    "learning_rate": 2e-5,
    "max_seq_length": 512,
    "optimizer": "adam"
}

Performance Metrics

  • Precision: ~0.89
  • Recall: ~0.87
  • F1 Score: ~0.88
  • Training Time: ~45 minutes (GPU)

💡 Use Cases

  1. Financial Chatbot: Extract entities from customer queries

    Input: "Mujhe 25 lakh ka jeevan bima chahiye"
    Entities: AMOUNT=25 lakh, INSTRUMENT=jeevan bima
    
  2. Intent Recognition: Route conversations based on extracted entities

    If AMOUNT + INSTRUMENT → Product recommendation
    
  3. Information Extraction: Build structured databases from conversations

    {
      "customer_intent": "insurance_inquiry",
      "amount_interested": "20 lakh",
      "product": "term insurance"
    }
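
A sketch of how use case 3 could be wired up: mapping aggregated pipeline output onto the flat record above. The entity-to-field mapping and the hardcoded `customer_intent` are assumptions for illustration (intent detection is a separate task, not something this NER model provides):

```python
# Hypothetical helper: turn aggregated NER output into a structured record.
def to_record(results):
    field_for = {"AMOUNT": "amount_interested", "INSTRUMENT": "product"}
    record = {"customer_intent": "insurance_inquiry"}  # assumed, not from NER
    for r in results:
        field = field_for.get(r["entity_group"])
        if field:
            # Concatenate multiple spans of the same type, e.g. "20" + "lakh"
            record[field] = (record.get(field, "") + " " + r["word"]).strip()
    return record

results = [
    {"word": "20 lakh", "entity_group": "AMOUNT"},
    {"word": "term insurance", "entity_group": "INSTRUMENT"},
]
print(to_record(results))
# {'customer_intent': 'insurance_inquiry', 'amount_interested': '20 lakh', 'product': 'term insurance'}
```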
    

βš™οΈ Model Architecture

Input Text (Hinglish)
    ↓
[Tokenizer: bert-base-multilingual-cased]
    ↓
[BERT Encoder Layers]
    ↓
[Token Classification Head]
    ↓
[BIO Entity Labels]
    ↓
Output: Named Entities with Scores

🔧 Advanced Usage

Batch Processing

from transformers import pipeline

ner = pipeline("token-classification", model="rohin30n/armour-ai-ner")

texts = [
    "kya aap 20 lakh ka term insurance lena chahiye?",
    "Mujhe 50 lakh ka investment plan chahiye"
]

results = ner(texts)

Fine-tuning on Custom Data

from transformers import Trainer, TrainingArguments

# Your custom dataset
train_dataset = ...
eval_dataset = ...

training_args = TrainingArguments(
    output_dir="./fine_tuned_ner",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    eval_strategy="epoch",  # named evaluation_strategy in transformers < 4.41
    save_strategy="epoch",
    logging_steps=100,
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)

trainer.train()
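
One step the snippet above glosses over: your custom dataset's BIO labels are per word, but BERT tokenizes into subwords, so labels must be realigned before training. A common recipe labels each word's first subword and masks the rest with -100 (ignored by the loss). The `word_ids` list here is hardcoded to stand in for what a fast tokenizer returns from `encoding.word_ids()`:

```python
# Expand word-level label ids to token-level, masking special tokens
# and subword continuations with -100.
def align_labels(word_ids, word_labels):
    aligned, previous = [], None
    for word_id in word_ids:
        if word_id is None:           # special tokens like [CLS] / [SEP]
            aligned.append(-100)
        elif word_id != previous:     # first subword of a new word
            aligned.append(word_labels[word_id])
        else:                         # continuation subword
            aligned.append(-100)
        previous = word_id
    return aligned

# 3 words; the third word splits into two subwords (word id 2 repeated)
word_ids = [None, 0, 1, 2, 2, None]
word_labels = [0, 1, 0]  # example label ids, one per word
print(align_labels(word_ids, word_labels))  # [-100, 0, 1, 0, -100, -100]
```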

πŸ“ Limitations

  • Language: Optimized for Hinglish; may not work well with pure Hindi or pure English
  • Domain: Fine-tuned on financial conversations; performance may vary on other domains
  • Out-of-vocabulary: May struggle with very new financial products/terms
  • Code-mixing: Works best with natural Hindi-English mixing patterns

⚡ Performance Notes

  • Inference Speed: ~100-200ms per sentence (CPU), ~20-50ms (GPU)
  • Memory: ~500MB RAM minimum, ~2GB with batch processing
  • GPU: Optional but recommended for production use

πŸ‘¨β€πŸ’Ό Project: Armour AI

This model is part of Armour AI, an intelligent financial advisory platform designed for mobile-first interactions with voice, text, and multilingual support.

Features:

  • 🎤 Voice-based financial queries
  • 🔤 Text-based conversations
  • 📱 Mobile-optimized API
  • 🌍 Multilingual support (Hinglish)
  • 💬 Real-time entity extraction
  • 🧠 Intelligent routing & recommendations

📄 Citation

If you find this model helpful, please cite it:

@misc{rohin30n_armour_ai_ner_2026,
  author = {Armour AI Team},
  title = {Armour AI - Hinglish Financial NER Model},
  year = {2026},
  url = {https://huggingface.co/rohin30n/armour-ai-ner},
  note = {Based on bert-base-multilingual-cased}
}

📞 Support & Questions

For issues, questions, or suggestions:

  • Open an issue on the model repository
  • Check existing discussions in the Community tab

Status: ✅ Production Ready | Last Updated: April 2026 | Version: 1.0
