Sentiment Analysis with DistilBERT

Complete ML project with training, advanced inference, interpretability and production deployment

74% Accuracy
66.9M Parameters
~100ms Inference Time

Interactive Demo

Individual Text Analysis

Batch Analysis

Model Configuration

Model Interpretability

Explore how the model makes decisions through attention visualizations and SHAP analysis

Interpretability Analysis

Attention Visualization

Analyze a text to see how the model's attention mechanism focuses on different words and phrases.

The visualization will show:

  • Attention patterns across all layers
  • Heatmap of token relationships
  • Interactive layer and head exploration
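
For reference, each cell in such a heatmap is one attention weight: a softmax over scaled dot products between a query token and every key token. The sketch below computes the weights of a single head in plain Python for a toy three-token input. The vectors are made up for illustration and are not taken from the model. With Hugging Face Transformers, the real per-layer weights are returned when the model is called with output_attentions=True.

```python
import math

def attention_weights(queries, keys):
    """Return softmax(Q K^T / sqrt(d)) as a list of rows, one row per query token."""
    d = len(keys[0])
    weights = []
    for q in queries:
        # Scaled dot product between this query and every key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in keys]
        # Subtract the max before exponentiating for numerical stability.
        top = max(scores)
        exps = [math.exp(s - top) for s in scores]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights

# Toy 2-dimensional vectors (illustrative values, not real embeddings).
tokens = ["great", "movie", "!"]
vectors = [[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]]

heatmap = attention_weights(vectors, vectors)
for token, row in zip(tokens, heatmap):
    print(f"{token:>6}", " ".join(f"{w:.2f}" for w in row))
```

Each printed row sums to 1, which is why a heatmap row can be read as how one token distributes its attention over the others.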

SHAP Explanation

SHAP (SHapley Additive exPlanations) provides detailed feature importance analysis.

Understanding SHAP values:

  • Shows positive and negative contributions
  • Highlights impactful words in red/blue
  • Based on game theory principles
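
To make the game-theory idea concrete, the sketch below computes exact Shapley values for a three-word input by enumerating every subset of the other words. The scoring function is a hypothetical stand-in, not the DistilBERT model, and exact enumeration is only feasible for very short inputs; the SHAP library relies on approximations for real models.

```python
from itertools import combinations
from math import factorial

# Hypothetical word scores standing in for a sentiment model.
WORD_SCORES = {"good": 2.0, "boring": -1.5}

def toy_score(words):
    """Toy sentiment score: sum of word scores, with 'not' flipping 'good'."""
    present = set(words)
    score = sum(WORD_SCORES.get(w, 0.0) for w in present)
    if "not" in present and "good" in present:
        score -= 4.0  # "not good" reads as negative
    return score

def shapley_values(tokens, value_fn):
    """Exact Shapley value per token (tokens must be a tuple of unique words)."""
    n = len(tokens)
    values = {}
    for i, token in enumerate(tokens):
        others = tokens[:i] + tokens[i + 1:]
        total = 0.0
        for size in range(n):
            # Probability weight of a coalition of this size preceding the token.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                marginal = value_fn(subset + (token,)) - value_fn(subset)
                total += weight * marginal
        values[token] = total
    return values

tokens = ("not", "good", "boring")
phi = shapley_values(tokens, toy_score)
for token, value in phi.items():
    print(f"{token:>7}: {value:+.2f}")
```

The values add up to the score of the full input, which is the additive property that lets positive and negative contributions be drawn side by side.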

Token Importance

See which words contribute most to the model's decision.

This visualization shows:

  • Relative importance of each token
  • Attention weight distribution
  • Key words influencing the prediction

Model Metrics

Training Metrics

Epochs: 3
Learning Rate: 2e-05
Batch Size: 16
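
As a rough sense of scale, the sketch below turns these hyperparameters into a number of optimizer steps and a learning-rate schedule. Two things are assumptions rather than facts from this page: that training used the standard 25,000-review IMDB training split (the page only mentions 50K reviews in total), and that the learning rate decays linearly to zero without warmup.

```python
import math

EPOCHS = 3
LEARNING_RATE = 2e-5
BATCH_SIZE = 16
TRAIN_EXAMPLES = 25_000  # assumption: standard IMDB training split

steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * EPOCHS

def lr_at(step):
    """Linearly decayed learning rate (assumed schedule, no warmup)."""
    remaining = max(0.0, 1.0 - step / total_steps)
    return LEARNING_RATE * remaining

print(f"steps per epoch: {steps_per_epoch}")
print(f"total steps:     {total_steps}")
print(f"lr at midpoint:  {lr_at(total_steps // 2):.2e}")
```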

Model Performance

74%
73%
0.59

Model Architecture

DistilBERT-base-uncased
6 Transformer Layers
12 Attention Heads
768 Hidden Size
30,522 Vocabulary
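
These figures are enough to reproduce the parameter count quoted at the top of the page. The sketch below adds up the weights and biases layer by layer. Three values are assumptions taken from the usual distilbert-base-uncased configuration rather than from this page: 512 position embeddings, a 3,072-wide feed-forward layer, and a two-class head with a 768-wide pre-classifier layer.

```python
VOCAB, HIDDEN, LAYERS = 30_522, 768, 6             # listed above
MAX_POSITIONS, FFN_DIM, NUM_LABELS = 512, 3072, 2  # assumed, not listed above

def linear(n_in, n_out):
    """Parameters of a dense layer: weight matrix plus bias."""
    return n_in * n_out + n_out

layer_norm = 2 * HIDDEN  # scale and shift vectors

# Word embeddings + position embeddings + embedding layer norm.
embeddings = VOCAB * HIDDEN + MAX_POSITIONS * HIDDEN + layer_norm

# Query, key, value and output projections. The head count does not change
# the total because the heads split the same 768-wide projections.
attention = 4 * linear(HIDDEN, HIDDEN)
feed_forward = linear(HIDDEN, FFN_DIM) + linear(FFN_DIM, HIDDEN)
per_layer = attention + feed_forward + 2 * layer_norm

encoder = embeddings + LAYERS * per_layer
head = linear(HIDDEN, HIDDEN) + linear(HIDDEN, NUM_LABELS)
total = encoder + head

print(f"encoder: {encoder:,}")
print(f"head:    {head:,}")
print(f"total:   {total:,}")
```

Under these assumptions the total lands just under 67 million, in line with the 66.9M figure above.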

System Architecture

Data

IMDB dataset
50K reviews

Preprocessing

Tokenization
DistilBERT

Model

DistilBERT
Fine-tuning

API

FastAPI
Inference

Frontend

React/JS
Interactive UI
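
The preprocessing stage in this pipeline relies on DistilBERT's WordPiece tokenizer. The sketch below shows the core idea, greedy longest-match-first splitting, on a tiny made-up vocabulary. It is a simplification: the real tokenizer also splits punctuation, normalizes text, and uses the full 30,522-entry vocabulary.

```python
def wordpiece(word, vocab, unk="[UNK]"):
    """Split one lowercased word into the longest vocabulary pieces, left to right."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # marks a continuation piece
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece fits, so the whole word is unknown
        pieces.append(piece)
        start = end
    return pieces

def tokenize(text, vocab):
    """Lowercase, split on whitespace, and wrap the pieces in [CLS] ... [SEP]."""
    tokens = ["[CLS]"]
    for word in text.lower().split():
        tokens.extend(wordpiece(word, vocab))
    tokens.append("[SEP]")
    return tokens

# Hypothetical toy vocabulary for illustration only.
TOY_VOCAB = {"un", "##watch", "##able", "movie", "watch"}
print(tokenize("Unwatchable movie", TOY_VOCAB))
```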

Technology Stack

Python
PyTorch
Transformers
FastAPI
Docker
JavaScript

About the Project

This project demonstrates a complete implementation of sentiment analysis with Transformers, from training through production deployment.

Key Features:

  • DistilBERT fine-tuned on the IMDB dataset
  • Production API built with FastAPI
  • Optimized batch processing
  • Real-time metrics visualization
  • Interpretability via attention weights
  • Deployment with Docker
  • Comprehensive testing
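
The page does not describe what the batch optimization consists of. One widely used technique, assumed here for illustration, is to group texts of similar length into the same batch so that less padding is needed, then restore the original order. The predict function below is a hypothetical stand-in for the model call.

```python
def length_sorted_batches(texts, batch_size=16):
    """Yield (indices, batch) pairs with similarly sized texts grouped together."""
    order = sorted(range(len(texts)), key=lambda i: len(texts[i]))
    for start in range(0, len(order), batch_size):
        indices = order[start:start + batch_size]
        yield indices, [texts[i] for i in indices]

def predict_all(texts, predict_batch, batch_size=16):
    """Run predict_batch over length-sorted batches and return results in input order."""
    results = [None] * len(texts)
    for indices, batch in length_sorted_batches(texts, batch_size):
        for i, prediction in zip(indices, predict_batch(batch)):
            results[i] = prediction
    return results

def fake_predict(batch):
    """Hypothetical stand-in for the real model: keyword match only."""
    return ["positive" if "good" in text else "negative" for text in batch]

reviews = ["good", "a long and boring film", "bad", "really good fun"]
labels = predict_all(reviews, fake_predict, batch_size=2)
print(labels)
```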

Performance

Accuracy: 74%
Latency: ~100ms
Throughput: 1000+ req/s

Scalability
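
The latency and throughput figures can be related through Little's law: the average number of requests in flight equals throughput multiplied by latency. The sketch below works that out and estimates how many workers would be needed. The per-worker concurrency values are assumptions, since the page does not state how the figures were measured.

```python
import math

def requests_in_flight(throughput_rps, latency_ms):
    """Little's law: average concurrent requests = arrival rate * time in system."""
    return throughput_rps * latency_ms / 1000

def required_workers(throughput_rps, latency_ms, concurrency_per_worker=1):
    """Workers needed if each one handles a fixed number of requests at once."""
    in_flight = requests_in_flight(throughput_rps, latency_ms)
    return math.ceil(in_flight / concurrency_per_worker)

# Figures quoted above; per-worker concurrency is an assumed parameter.
print(requests_in_flight(1000, 100))
print(required_workers(1000, 100))
print(required_workers(1000, 100, concurrency_per_worker=16))
```

At 1000 req/s and about 100 ms per request, roughly 100 requests are in flight at any moment, so this throughput implies either batching several requests per model call or running multiple replicas.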

Horizontal scaling
Load balancing
Auto-restart