
Manthan-T1

A from-scratch scaffold for a custom Transformers vision-language architecture named Manthan-T1.

What you get (today)

  • A clean project layout under manthan_t1/
  • A full HF custom architecture:
    • ManthanConfig in manthan_t1/configuration_manthan.py
    • ManthanForCausalLM in manthan_t1/modeling_manthan.py
  • A no-download HF forward smoke test:
    • python -m manthan_t1.hf_smoke
  • An MLX smoke test (kept for Apple Silicon readiness):
    • python -m manthan_t1.smoke_test
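
For readers unfamiliar with custom Transformers architectures, the sketch below shows one common way a config class can be tied to a model type string. It is not the contents of manthan_t1/configuration_manthan.py: the class name SketchManthanConfig and the fields vocab_size, hidden_size, and num_hidden_layers are hypothetical placeholders, and it assumes the transformers package is installed.

```python
from transformers import AutoConfig, PretrainedConfig


class SketchManthanConfig(PretrainedConfig):
    """Hypothetical stand-in for ManthanConfig; the field names are assumptions."""

    # The string that AutoConfig uses to look up this class.
    model_type = "manthan_t1"

    def __init__(self, vocab_size=32000, hidden_size=256, num_hidden_layers=2, **kwargs):
        self.vocab_size = vocab_size
        self.hidden_size = hidden_size
        self.num_hidden_layers = num_hidden_layers
        super().__init__(**kwargs)


# Register the class so the Auto* machinery can resolve "manthan_t1" locally.
AutoConfig.register("manthan_t1", SketchManthanConfig)

# Build a config from the model type alone; nothing is downloaded.
config = AutoConfig.for_model("manthan_t1", hidden_size=64)
```

Registering the config this way lets the model type be resolved by name without fetching anything from the Hub, which fits the no-download smoke tests listed above.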

What we’ll add next

  • Qwen3-0.6B backbone wiring + weight loading (keeping the model type = manthan_t1)
  • SigLIP2 vision tower wiring + projector alignment
  • LoRA fine-tuning recipes for M4 16GB (MLX and/or PyTorch)
  • Multilingual “reply in user language” policy (Indian languages)

Quick smoke test

After installing dependencies, you should be able to run:

python -m manthan_t1.smoke_test
python -m manthan_t1.hf_smoke

Neither command downloads any external models yet.
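
If you would rather run both checks from a single script, a minimal sketch along these lines may help. The helper name run_smoke_tests is hypothetical and is not part of the repository.

```python
import subprocess
import sys


def run_smoke_tests(modules):
    """Run each module via `python -m <module>` and return {module: exit_code}."""
    results = {}
    for module in modules:
        # Use the current interpreter so the active virtual environment is respected.
        completed = subprocess.run([sys.executable, "-m", module])
        results[module] = completed.returncode
    return results
```

Calling run_smoke_tests(["manthan_t1.smoke_test", "manthan_t1.hf_smoke"]) returns one exit code per module. Treating 0 as a pass assumes the smoke test modules exit non-zero on failure, which this README does not state.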
