HAI Indexer 7B

HAI Indexer is a fine-tuned Mistral-7B-Instruct model specialized for retrieval-augmented generation (RAG), company knowledge-base QA, entity classification, and safety-aware responses.

Model Details

  • Base model: mistralai/Mistral-7B-Instruct-v0.2
  • Training: Supervised fine-tuning (SFT) via LoRA, merged into base
  • Architecture: MistralForCausalLM, 7B parameters
  • Max context: 32,768 tokens
  • License: Apache 2.0

Training Data

The model was trained on multiple datasets including:

  • RAG / retrieval – answering from provided context
  • Business integration – domain-specific knowledge
  • Company knowledge base – internal KB QA
  • Entity classification – entity recognition
  • Anti-hallucination – staying grounded in context
  • Safety guardrails – safe responses
  • Introduction / operator – assistant identity and behavior

Intended Use

  • RAG pipelines with retrieved context
  • Company or internal knowledge base Q&A
  • Instruction-following assistant with grounding in provided documents
  • General chat when used with appropriate system prompts

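The RAG use case assumes some retrieval step supplies the context. As a minimal, self-contained sketch, here is a toy keyword-overlap retriever standing in for a real vector store (`retrieve` is a hypothetical helper, not part of this model's tooling):

```python
def retrieve(query: str, docs: list[str], k: int = 2) -> list[str]:
    # Toy scorer: rank documents by word overlap with the query.
    # A production RAG pipeline would use embeddings and a vector store.
    q_words = set(query.lower().split())
    scored = sorted(
        docs,
        key=lambda d: len(q_words & set(d.lower().split())),
        reverse=True,
    )
    return scored[:k]

docs = [
    "HAI Indexer is a fine-tuned Mistral-7B model for grounded QA.",
    "The cafeteria menu changes every Tuesday.",
]
context = "\n".join(retrieve("What is HAI Indexer?", docs, k=1))
```

The retrieved `context` string then slots into the RAG-style prompt shown under How to Use below.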
How to Use

With Transformers

from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    "Haiintel/hai-indexer-7B",
    torch_dtype="auto",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("Haiintel/hai-indexer-7B")

messages = [{"role": "user", "content": "What is HAI Indexer?"}]
# Render the chat template into a single prompt string
text = tokenizer.apply_chat_template(
    messages,
    tokenize=False,
    add_generation_prompt=True,
)
inputs = tokenizer(text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
# Decode only the newly generated tokens, skipping the prompt
response = tokenizer.decode(
    outputs[0][inputs["input_ids"].shape[1]:],
    skip_special_tokens=True,
)
print(response)

RAG-style (with context)

context = "Your retrieved documents here..."
query = "Your question here"

messages = [
    {"role": "system", "content": "Answer based on the context provided."},
    {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
]
# Then apply_chat_template + generate as above
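The pattern above can be wrapped in a small helper that assembles the message list (a sketch; `build_rag_messages` is a hypothetical name, not part of the model repo):

```python
def build_rag_messages(context: str, query: str) -> list[dict]:
    # Package retrieved context and the user question into the
    # chat format expected by apply_chat_template.
    return [
        {"role": "system", "content": "Answer based on the context provided."},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {query}"},
    ]
```

Keeping prompt assembly in one place makes it easy to swap in a different grounding instruction without touching the generation code.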

Limitations

  • Performance depends on retrieval quality in RAG setups
  • May reflect biases or errors in training data
  • Not designed for medical, legal, or high-stakes decisions without review
