contexto-api / models /README.md
Dev-ks04
feat: Contexto FastAPI backend - intent-aware summarization engine
39028c9

Models Directory

This directory contains pre-trained and fine-tuned models for the Intent-Aware Context-Preserving Summarization project.

Directory Structure

models/
β”œβ”€β”€ README.md                 # This file
β”œβ”€β”€ download_models.py        # Script to download pre-trained models
β”œβ”€β”€ model_configs.json        # Model configurations and metadata
β”œβ”€β”€ checkpoints/              # Fine-tuned model checkpoints
└── tokenizers/               # Tokenizer files

Available Models

Pre-trained Models from Hugging Face

Model Name Model ID Size Best For
T5-Small google-t5/t5-small ~77MB Quick testing, prototyping
T5-Base google-t5/t5-base ~220MB General use, production
T5-Large google-t5/t5-large ~738MB High-quality summaries
BART-Base facebook/bart-base ~558MB General summarization
BART-Large-CNN facebook/bart-large-cnn ~1.6GB News/article summarization
PEGASUS-arXiv google/pegasus-arxiv ~568MB Scientific papers
PEGASUS-PubMed google/pegasus-pubmed ~562MB Medical/biomedical documents
LED-Base allenai/led-base-16384 ~660MB Long documents (4096 tokens)

Downloading Models

Automatic Download

python models/download_models.py

Manual Download

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Download T5-base
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# Save locally
tokenizer.save_pretrained("models/t5-base-tokenizer")
model.save_pretrained("models/t5-base-model")

Fine-tuned Models

Fine-tuned models will be stored in models/checkpoints/ after training:

  • intent-classifier-v1/ - Intent detection model
  • summarizer-technical-v1/ - Fine-tuned summarization model
  • summarizer-intent-aware-v1/ - Intent-aware summarization model

Model Configuration

Edit model_configs.json to configure:

  • Model selection
  • Tokenization parameters
  • Generation settings
  • Evaluation metrics

Using Models

Load Pre-trained Model

from src.models import SummarizationModelLoader

loader = SummarizationModelLoader(model_name='t5-base')
model, tokenizer = loader.load_model()

Load Fine-tuned Checkpoint

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_path = "models/checkpoints/summarizer-technical-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

Storage Requirements

  • Small Models: ~500MB total
  • Large Models: ~3GB total
  • With Fine-tuning Data: ~5-10GB

GPU Memory Requirements

Model GPU Memory
T5-Small 2GB
T5-Base 6GB
T5-Large 12GB+
BART-Base 6GB
BART-Large-CNN 12GB+
LED-Base 8GB

Best Practices

  1. Start with small models for testing and development
  2. Use cached models to avoid repeated downloads
  3. Monitor GPU memory when loading large models
  4. Save fine-tuned models with meaningful version numbers
  5. Document model changes in model metadata

Troubleshooting

Out of Memory Error

import torch
torch.cuda.empty_cache()  # Clear GPU cache
# Or use CPU: device='cpu'

Model Download Issues

  • Check internet connection
  • Verify Hugging Face API is accessible
  • Try downloading specific model versions
  • Use cache_dir parameter to specify custom location

Tokenizer Mismatch

Ensure tokenizer version matches model version:

tokenizer = AutoTokenizer.from_pretrained(model_path)  # Load matching tokenizer

References

Contributing

If you fine-tune new models:

  1. Save with meaningful names and versions
  2. Document performance metrics
  3. Include training parameters
  4. Add model cards with descriptions
  5. Update this README

Last Updated: January 15, 2024