nayan90k's picture
Upload README.md with huggingface_hub
3fdd102 verified
metadata
language: en
tags:
  - question-generation
  - education
  - SLM
  - causal-lm
license: mit

SLM Question Generator - SFT (114M)

A 114-million-parameter Small Language Model fine-tuned to generate educational questions from a given passage of text.

Model Details

  • Architecture: Decoder-only Transformer
  • Parameters: 114.1 M
  • Layers: 12
  • d_model: 768
  • Attention: GQA (12 query heads / 4 KV heads)
  • Vocabulary: tiktoken r50k_base + 3 special tokens (50,260 total)
  • Context window: 4096 tokens

Evaluation Metrics

Evaluated on SQuAD v2 validation set (50 samples):

Metric Score
BERTScore Precision 0.8972
BERTScore Recall 0.8920
BERTScore F1 0.8945

Dataset

  • Training samples: 83,861
  • Validation samples: 4,414
  • Sources: SQuAD v2, HotpotQA

Usage

from src.inference.generate import load_model, generate_from_text

model, enc, device = load_model()
questions = generate_from_text(
    "Photosynthesis is the process by which green plants prepare food using sunlight.",
    question_type="short_answer",
    model=model,
    enc=enc,
    device=device,
    temperature=0.0,
)
print(questions[0])

Links

Tokenizer Note

This model uses the tiktoken library with the r50k_base encoding, plus <|im_start|>, <|im_end|>, and <|pad|> tokens.