Llama for Kathakali

A fine-tuned Llama-3.1-8B-Instruct model specialized for Kathakali ontology creation, narrative modelling, metadata generation, and domain-specific cultural reasoning.

This model assists in generating structured descriptions, story annotations, performer metadata, act interpretations, and domain-aligned outputs for classical Indian dance (Kathakali) research and digital humanities applications.

Model Details

Model Description

llama-for-kathakali is a causal language model fine-tuned on curated Kathakali-specific datasets, including:

  • character ontology descriptions
  • gestures and mudras metadata
  • performance sequences
  • story summaries (Aattakatha)
  • scene-level video annotations
  • performer attributes
  • domain-specific semantic structures

It supports applications such as ontology building, dataset generation, metadata expansion, and structured textual outputs for Kathakali-centered ML pipelines.

  • Developed by: Ashiq Firoz
  • Model type: Causal LM (Instruction-tuned)
  • Language: English (with limited Malayalam terminology support)
  • License: Follows upstream Meta Llama 3.1 Community License
  • Finetuned from: meta-llama/Llama-3.1-8B-Instruct
  • Access Requirement: You must have access to the base model (meta-llama/Llama-3.1-8B-Instruct) to fully use this model.

Uses

Direct Use

The model is designed for:

  • Automatic generation of Kathakali ontology entities
  • Metadata generation for cultural datasets
  • Semantic structuring of dance movements, mudras, and characters
  • Generating interpretive or descriptive text for research
  • Knowledge graph population (see the sketch after this list)
  • LLM-assisted dataset creation for downstream ML pipelines
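
Where knowledge graph population is the goal, model outputs can be parsed into triples and loaded into a graph store. Below is a minimal, hypothetical sketch using rdflib; the namespace, the "subject | predicate | object" output format, and the example triples are assumptions for illustration and are not part of the released model.

# Hypothetical illustration: converting model-generated "subject | predicate | object"
# lines into RDF triples with rdflib. Namespace and triples are invented for this example.
from rdflib import Graph, Namespace, Literal

KTK = Namespace("http://example.org/kathakali/")

generated_text = """Pacha | portrays | noble heroic characters
Pacha | uses_primary_colour | green"""

g = Graph()
for line in generated_text.strip().splitlines():
    subject, predicate, obj = (part.strip() for part in line.split("|"))
    g.add((KTK[subject], KTK[predicate], Literal(obj)))

print(g.serialize(format="turtle"))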

Downstream Use

  • Video-to-text pipelines where generated text becomes training data
  • Dataset augmentation for gesture recognition or performance analysis
  • Text classification, extraction, and structured knowledge modeling
  • Fine-tuned modules for cultural heritage preservation systems

Out-of-Scope Use

The model is not intended for:

  • High-fidelity translation or Malayalam language generation
  • Medical, legal, or safety-critical decision-making
  • Cultural misrepresentation or the spread of misinformation
  • Generating factual claims about historical events without verification

Bias, Risks, and Limitations

  • The training data consists of curated domain-specific texts, so the model may exhibit:

    • cultural bias towards traditional interpretations
    • limited general reasoning outside Kathakali
    • incomplete understanding of regional linguistic nuances
    • hallucination in historical or artistic contexts
  • The model may over-generalize gestures, scenes, or characters if prompts are vague.

Recommendations

  • Use precise prompts to reduce hallucination
  • Prefer structured, schema-based prompts for ontology generation (see the example after this list)
  • Avoid using outputs as factual without expert review
  • Do not deploy in cultural or social decision-making contexts without human oversight
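
As an example of a schema-based prompt, constraining the output to a fixed set of fields makes responses easier to validate and discourages free-form hallucination. The field names and the character below are illustrative, not a schema shipped with the model.

# Illustrative schema-based prompt; the JSON keys are hypothetical.
prompt = (
    "Describe the Kathakali character type 'Kathi' as a JSON object with exactly "
    "these keys: name, character_class, typical_roles, facial_makeup, associated_mudras. "
    "Return only the JSON object."
)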

How to Get Started with the Model

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

HF_MODEL_NAME = "Edith08/llama-for-kathakali"

# Load the tokenizer and model (requires access to the gated Llama 3.1 base model)
tokenizer = AutoTokenizer.from_pretrained(HF_MODEL_NAME)

model = AutoModelForCausalLM.from_pretrained(
    HF_MODEL_NAME,
    torch_dtype=torch.float16,  # half-precision weights
    device_map="auto"           # place layers on available devices automatically
)

prompt = "How is the word 'sun' represented in Kathakali?"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

output = model.generate(**inputs, max_new_tokens=300)
print(tokenizer.decode(output[0], skip_special_tokens=True))

Note: You must have been granted access to meta-llama/Llama-3.1-8B-Instruct on the Hugging Face Hub, since this model is derived from that gated base model.
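
Since the base model is instruction-tuned, formatting the request with the tokenizer's chat template may give better results than a raw prompt, depending on how the fine-tuning data was formatted. A minimal sketch, reusing the tokenizer and model loaded above; the system message wording is an assumption.

# Optional: prompt via the chat template of the instruction-tuned base model.
messages = [
    {"role": "system", "content": "You assist with Kathakali ontology and metadata generation."},
    {"role": "user", "content": "How is the word 'sun' represented in Kathakali?"},
]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

output = model.generate(inputs, max_new_tokens=300)
print(tokenizer.decode(output[0], skip_special_tokens=True))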

Training Details

Training Data

Training data included:

  • Manually curated Kathakali ontology texts
  • Public-domain Aattakatha literature
  • Annotated descriptions of mudras, characters, costumes
  • Semi-structured metadata designed for ontology frameworks
  • Additional synthetic domain-aligned text

The dataset focuses on structured and semi-structured cultural knowledge.

Training Procedure

  • Base Model: Llama-3.1-8B-Instruct
  • Fine-Tuning Method: Supervised fine-tuning (SFT)
  • Precision: fp16 mixed precision
  • Sequence length: 4096
  • Optimizer: AdamW
  • Learning rate: 2e-5
  • Epochs: 3–5 depending on dataset splits
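
For orientation only, the listed hyperparameters could be expressed with the trl SFTTrainer roughly as below. The training script is not published with this card, the dataset path and batch-size settings are placeholders, and exact argument names may differ across trl versions.

# Illustrative SFT setup mirroring the listed hyperparameters (not the actual script).
from datasets import load_dataset
from trl import SFTConfig, SFTTrainer

# Placeholder: a JSONL file with a "text" field per training example.
dataset = load_dataset("json", data_files="kathakali_sft.jsonl", split="train")

config = SFTConfig(
    output_dir="llama-for-kathakali-sft",
    num_train_epochs=3,             # 3-5 epochs were used depending on the split
    learning_rate=2e-5,
    fp16=True,                      # fp16 mixed precision
    per_device_train_batch_size=1,  # placeholder value
    gradient_accumulation_steps=8,  # placeholder value
    # a sequence length of 4096 was used; set the corresponding max-length
    # option for your trl version
)

trainer = SFTTrainer(
    model="meta-llama/Llama-3.1-8B-Instruct",
    args=config,
    train_dataset=dataset,
)
trainer.train()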

Speeds, Sizes, Times

  • Parameter count: ~8B (inherited from the base model)
  • Checkpoint size: ~16 GB (fp16)
  • Training hardware: AMD MI300X
  • Training duration: Approximately 6–8 hours for SFT

Evaluation

Testing Data, Factors & Metrics

Testing Data

Evaluation was conducted on held-out domain-specific texts, covering:

  • Ontology concept completion
  • Story-element summarization
  • Gesture-movement consistency checks
  • Metadata structuring tasks

Factors

Evaluations considered:

  • Cultural accuracy
  • Ontology structure correctness
  • Sequence consistency
  • Reduction of hallucinations
  • Token-level coherence

Metrics

  • Manual qualitative evaluation
  • BLEU / ROUGE-L for descriptive tasks
  • Schema adherence score (custom heuristic)
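
The schema adherence score is a custom heuristic. As an illustration of the general idea only (not the exact metric used here), it can be approximated as the fraction of required schema keys present in a parsed JSON output:

# Hypothetical schema-adherence heuristic; the required keys are illustrative.
import json

REQUIRED_KEYS = {"name", "character_class", "typical_roles", "facial_makeup"}

def schema_adherence(generated: str) -> float:
    try:
        obj = json.loads(generated)
    except json.JSONDecodeError:
        return 0.0  # unparseable output scores zero
    if not isinstance(obj, dict):
        return 0.0
    return len(REQUIRED_KEYS & obj.keys()) / len(REQUIRED_KEYS)

print(schema_adherence('{"name": "Kathi", "character_class": "anti-hero"}'))  # 0.5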

Results

  • Strong performance on structured ontology generation
  • High consistency with Kathakali terminology
  • Occasional hallucinations in abstract narrative tasks
  • Good alignment with research workflows
  • Limited general conversational ability

Environmental Impact

  • Hardware Type: GPUs (AMD MI300X)
  • Hours used: ~8 hours of fine-tuning
  • Cloud Provider: AMD Developer Cloud
  • Compute Region: US-Central
  • Estimated CO₂ Emitted: ~15–20 kg CO₂eq

Technical Specifications

Model Architecture and Objective

  • Transformer-based causal language model
  • 32-layer architecture (inherits from Llama-3.1-8B)
  • Objective: next-token prediction under instruction tuning
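
For reference, next-token prediction corresponds to minimizing the standard causal language-modeling loss (in SFT it is typically computed over the response tokens):

$$\mathcal{L}(\theta) = -\sum_{t=1}^{T} \log p_\theta\left(x_t \mid x_{<t}\right)$$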

Compute Infrastructure

  • Hardware: MI300X GPUs

  • Software:

    • PyTorch
    • Hugging Face Transformers
    • BitsAndBytes (optional)
    • Accelerate
    • Python 3.10

More Information

For more information about the project or dataset creation, you may contact the author.


Model Card Authors

  • Ashiq Firoz

Model Card Contact

For questions, issues, or collaboration inquiries:

  • Email: ashiqfiroz08@gmail.com
  • HF Profile: https://huggingface.co/Edith08
