Spaces:

Dev-ks04
/

contexto-api

Running

App Files Files Community

contexto-api / models /README.md

Dev-ks04

feat: Contexto FastAPI backend - intent-aware summarization engine

39028c9 2 days ago

preview code

raw

history blame contribute delete

4.18 kB

Models Directory

This directory contains pre-trained and fine-tuned models for the Intent-Aware Context-Preserving Summarization project.

Directory Structure

models/
├── README.md                 # This file
├── download_models.py        # Script to download pre-trained models
├── model_configs.json        # Model configurations and metadata
├── checkpoints/              # Fine-tuned model checkpoints
└── tokenizers/               # Tokenizer files

Available Models

Pre-trained Models from Hugging Face

Model Name	Model ID	Size	Best For
T5-Small	google-t5/t5-small	~77MB	Quick testing, prototyping
T5-Base	google-t5/t5-base	~220MB	General use, production
T5-Large	google-t5/t5-large	~738MB	High-quality summaries
BART-Base	facebook/bart-base	~558MB	General summarization
BART-Large-CNN	facebook/bart-large-cnn	~1.6GB	News/article summarization
PEGASUS-arXiv	google/pegasus-arxiv	~568MB	Scientific papers
PEGASUS-PubMed	google/pegasus-pubmed	~562MB	Medical/biomedical documents
LED-Base	allenai/led-base-16384	~660MB	Long documents (4096 tokens)

Downloading Models

Automatic Download

python models/download_models.py

Manual Download

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Download T5-base
tokenizer = AutoTokenizer.from_pretrained("google-t5/t5-base")
model = AutoModelForSeq2SeqLM.from_pretrained("google-t5/t5-base")

# Save locally
tokenizer.save_pretrained("models/t5-base-tokenizer")
model.save_pretrained("models/t5-base-model")

Fine-tuned Models

Fine-tuned models will be stored in models/checkpoints/ after training:

intent-classifier-v1/ - Intent detection model
summarizer-technical-v1/ - Fine-tuned summarization model
summarizer-intent-aware-v1/ - Intent-aware summarization model

Model Configuration

Edit model_configs.json to configure:

Model selection
Tokenization parameters
Generation settings
Evaluation metrics

Using Models

Load Pre-trained Model

from src.models import SummarizationModelLoader

loader = SummarizationModelLoader(model_name='t5-base')
model, tokenizer = loader.load_model()

Load Fine-tuned Checkpoint

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_path = "models/checkpoints/summarizer-technical-v1"
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForSeq2SeqLM.from_pretrained(model_path)

Storage Requirements

Small Models: ~500MB total
Large Models: ~3GB total
With Fine-tuning Data: ~5-10GB

GPU Memory Requirements

Model	GPU Memory
T5-Small	2GB
T5-Base	6GB
T5-Large	12GB+
BART-Base	6GB
BART-Large-CNN	12GB+
LED-Base	8GB

Best Practices

Start with small models for testing and development
Use cached models to avoid repeated downloads
Monitor GPU memory when loading large models
Save fine-tuned models with meaningful version numbers
Document model changes in model metadata

Troubleshooting

Out of Memory Error

import torch
torch.cuda.empty_cache()  # Clear GPU cache
# Or use CPU: device='cpu'

Model Download Issues

Check internet connection
Verify Hugging Face API is accessible
Try downloading specific model versions
Use cache_dir parameter to specify custom location

Tokenizer Mismatch

Ensure tokenizer version matches model version:

tokenizer = AutoTokenizer.from_pretrained(model_path)  # Load matching tokenizer

References

Hugging Face Models: https://huggingface.co/models
Transformers Documentation: https://huggingface.co/docs/transformers
Model Cards: https://huggingface.co/docs/hub/models-cards

Contributing

If you fine-tune new models:

Save with meaningful names and versions
Document performance metrics
Include training parameters
Add model cards with descriptions
Update this README

Last Updated: January 15, 2024