# ABIRHINv1 — Hindi-First Transformer Language Model
ABIRHINv1 is a custom transformer-based causal language model designed specifically for Hindi and Hinglish text generation. It is trained from scratch using a custom tokenizer and architecture optimized for efficient Hindi language understanding and generation.
This model focuses on providing lightweight, efficient Hindi NLP capabilities while maintaining strong contextual understanding.
# Model Details
## Model Description
ABIRHINv1 is a decoder-only transformer model trained entirely from scratch without using pretrained weights. It uses a custom tokenizer trained on Hindi-focused data and is optimized for efficient inference and fine-tuning.
- Developed by: Abir Maheshwari
- Funded by: Independent Research
- Shared by: Abir Maheshwari
- Model type: Causal Language Model (Decoder-only Transformer)
- Language(s): Hindi, Hinglish, English
- License: MIT
- Finetuned from model: None (trained from scratch)
## Model Sources
- Repository: https://huggingface.co/abirmaheshwari/abirhinv1
- Architecture: Custom Transformer
- Framework: PyTorch + HuggingFace Transformers
# Uses
## Direct Use
ABIRHINv1 is suitable for:
- Hindi text generation
- Hinglish conversational AI
- Hindi chatbots
- Text completion
- NLP research
- Educational purposes
Example applications:
- Hindi AI assistants
- Content generation
- Research on small language models
## Downstream Use
This model can be fine-tuned for:
- Hindi instruction models
- Question answering
- Domain-specific NLP tasks
- Hindi conversational agents
## Out-of-Scope Use
Not recommended for:
- Critical decision systems
- Medical diagnosis
- Legal advice
- Safety-critical applications
This is an early-version model.
# Bias, Risks, and Limitations
ABIRHINv1 may:
- Produce incorrect information
- Reflect biases present in training data
- Generate incomplete or nonsensical outputs
This is expected for models trained on limited datasets.
## Recommendations
Use this model:
- For research
- For experimentation
- For fine-tuning
Not recommended for production without further training.
# How to Get Started
```python
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("abirmaheshwari/abirhinv1")
model = AutoModelForCausalLM.from_pretrained("abirmaheshwari/abirhinv1")

input_text = "भारत एक महान देश है क्योंकि"  # "India is a great country because"
inputs = tokenizer(input_text, return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=100,   # cap generated tokens rather than total length
    do_sample=True,       # sampling must be enabled for temperature to apply
    temperature=0.7,
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
---
# Training Details
## Training Data
ABIRHINv1 was trained on a custom Hindi-focused dataset designed to capture linguistic patterns, conversational structures, and general language usage.
The dataset includes:
- Hindi natural text corpus
- Hinglish conversational text
- General-purpose text data
Dataset specifications:
- Total samples: ~161,000
- Tokenizer: Custom-trained BPE tokenizer
- Tokenizer trained entirely from scratch
---
## Training Procedure
### Preprocessing
The dataset was processed using a custom Byte Pair Encoding (BPE) tokenizer with the following specifications:
- Vocabulary size: 32,000 tokens
- Maximum sequence length: 512 tokens
- Tokenizer trained on the same dataset as the model
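A BPE tokenizer with these specifications can be trained with the HuggingFace `tokenizers` library. The snippet below is a sketch, not the exact training script: the two-line corpus and the special-token names are placeholders, while the 32,000-token vocabulary size matches the specification above.

```python
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.pre_tokenizers import Whitespace
from tokenizers.trainers import BpeTrainer

# Tiny placeholder corpus; the real tokenizer was trained on the full dataset.
corpus = [
    "भारत एक महान देश है",
    "Hindi aur Hinglish dono text",
]

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
tokenizer.pre_tokenizer = Whitespace()
trainer = BpeTrainer(
    vocab_size=32_000,  # matches the 32,000-token vocabulary above
    special_tokens=["[UNK]", "[PAD]", "[BOS]", "[EOS]"],  # assumed names
)
tokenizer.train_from_iterator(corpus, trainer)

encoding = tokenizer.encode("भारत महान है")
print(encoding.tokens)
```

With a corpus this small the learned vocabulary stays well under 32,000 entries; on the full dataset the trainer keeps merging byte pairs until the target size is reached.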
---
### Training Hyperparameters
The model was trained using the following configuration:
- Optimizer: AdamW
- Learning rate: 5 × 10⁻⁵
- Training precision: FP16 mixed precision
- Epochs: 3
- Batch size: 4
- Training objective: Causal Language Modeling
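In plain PyTorch, the configuration above maps onto a loop like the following sketch. A one-layer linear model and random tensors stand in for the real network and data so the snippet runs anywhere; the AdamW optimizer, 5e-5 learning rate, 3 epochs, batch size 4, and cross-entropy causal-LM objective come from the card, while the FP16 autocast used in the real run is noted but omitted so the sketch also works on CPU.

```python
import torch
import torch.nn.functional as F
from torch.optim import AdamW

# Toy stand-in for the model: the real run trains the full ~96M-parameter
# ABIRHINv1 under FP16 autocast on a CUDA GPU.
vocab_size = 32_000
model = torch.nn.Linear(64, vocab_size)
optimizer = AdamW(model.parameters(), lr=5e-5)    # learning rate from the card

for epoch in range(3):                            # 3 epochs
    hidden = torch.randn(4, 64)                   # batch size 4
    targets = torch.randint(0, vocab_size, (4,))  # next-token labels
    logits = model(hidden)
    loss = F.cross_entropy(logits, targets)       # causal LM objective
    loss.backward()
    optimizer.step()
    optimizer.zero_grad()
```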
---
### Training Hardware
Training was performed using GPU acceleration.
- GPU: NVIDIA GPU (CUDA-enabled)
- Framework: PyTorch
- Library: HuggingFace Transformers
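Before reproducing the training setup, you can confirm that PyTorch actually sees a CUDA device:

```python
import torch

# Reports whether a CUDA-enabled GPU is visible to PyTorch.
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("Device:", torch.cuda.get_device_name(0))
```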
---
# Evaluation
## Testing Data
Evaluation was performed using custom Hindi text samples representative of real-world usage.
---
## Metrics
Primary evaluation signal:
- Training loss monitoring (no standardized benchmark scores are reported for this release)
---
## Results
The model successfully learns:
- Hindi sentence structure
- Token relationships
- Language continuity
- Contextual text generation
The model demonstrates functional Hindi generation capability suitable for research and fine-tuning.
---
# Technical Specifications
## Architecture
ABIRHINv1 uses a decoder-only Transformer architecture consisting of:
- Token embedding layer
- Learned positional embeddings
- Multi-head self-attention layers
- Feedforward neural network layers
- GELU activation function
- Weight tying between embedding and output layers
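The components listed above can be sketched as a minimal decoder-only module in PyTorch. All sizes below (hidden width, head count, layer count) are illustrative, not the real ABIRHINv1 configuration; only the 32,000-token vocabulary and 512-token context match the card.

```python
import torch
import torch.nn as nn

class TinyDecoder(nn.Module):
    """Minimal sketch of the listed components; sizes are illustrative."""

    def __init__(self, vocab=32_000, d_model=128, n_heads=4, n_layers=2, ctx=512):
        super().__init__()
        self.tok_emb = nn.Embedding(vocab, d_model)      # token embedding layer
        self.pos_emb = nn.Embedding(ctx, d_model)        # learned positional embeddings
        layer = nn.TransformerEncoderLayer(
            d_model, n_heads, dim_feedforward=4 * d_model,
            activation="gelu", batch_first=True,         # GELU feedforward blocks
        )
        self.blocks = nn.TransformerEncoder(layer, n_layers)
        self.lm_head = nn.Linear(d_model, vocab, bias=False)
        self.lm_head.weight = self.tok_emb.weight        # weight tying

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        x = self.tok_emb(ids) + self.pos_emb(pos)
        # Causal mask makes the encoder stack behave as a decoder.
        mask = nn.Transformer.generate_square_subsequent_mask(ids.size(1))
        x = self.blocks(x, mask=mask)
        return self.lm_head(x)

logits = TinyDecoder()(torch.randint(0, 32_000, (1, 16)))
print(logits.shape)  # torch.Size([1, 16, 32000])
```

Weight tying shares one matrix between the input embedding and the output projection, which for a 32,000 × d_model table is a substantial share of a small model's parameter budget.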
---
## Model Size
- Total parameters: ~96 Million
- Context length: 512 tokens
- Vocabulary size: 32,000 tokens
---
# Compute Infrastructure
## Hardware
- NVIDIA GPU
---
## Software
- Python
- PyTorch
- HuggingFace Transformers
- SafeTensors
---
# Environmental Impact
Training specifications:
- Hardware type: NVIDIA GPU
- Training duration: ~3–4 hours
- Framework: PyTorch
---
# Author
Abir Maheshwari
Independent AI Researcher
HuggingFace Profile:
https://huggingface.co/abirmaheshwari
---
# Version
ABIRHINv1
Initial release version.
---
# Contact
For questions, collaboration, or research inquiries:
HuggingFace:
https://huggingface.co/abirmaheshwari
---