---
title: Infy
emoji: 🐢
colorFrom: gray
colorTo: purple
sdk: gradio
sdk_version: 5.23.1
python_version: 3.11
app_file: app.py
pinned: false
---

# 🤗 HuggingFace Enabling Sessions
**Interactive Demo Platform for Transformers, Hub APIs, and NLP Pipelines**

## 📋 Overview

This is an interactive Gradio application designed for the **HuggingFace Enabling Sessions** workshop. It provides hands-on demonstrations of:

- **Session 1 (45 min):** Introduction to the HuggingFace ecosystem, Transformers architecture, and best practices
- **Session 2 (90 min):** Hands-on developer workshop with tokenization deep dives and inference playground across 5+ NLP tasks

## 🚀 Quick Start

The app is hosted on HuggingFace Spaces and requires **no local installation**. Simply:

1. Open the Spaces URL
2. Explore the 3 main tabs:
   - **Session 1: Introduction** — Embedded slides + live NLP demos
   - **Session 2: Hands-On Developer** — Tokenizer explorer + inference playground
   - **Resources & Next Steps** — Documentation links and learning resources

### 🎯 Pre-Session Setup (For Presenters)

**Want instant, offline demos with zero network dependencies?**

If you're presenting and need models pre-cached (e.g., company network restrictions), follow these guides:

- **[QUICK_SETUP.md](QUICK_SETUP.md)** — 10-minute setup (recommended for demos)
  - Download models locally
  - Test everything works
  - Push to Spaces for instant loading
  
- **[scripts/USING_LOCAL_MODELS.md](scripts/USING_LOCAL_MODELS.md)** — Deep dive guide
  - How local model caching works
  - Git LFS for large files
  - Troubleshooting

**TL;DR:** `python3 scripts/download_lightweight_models.py && git add models/ && git commit -m "Add cached models" && git push origin main` ✅

This ensures models are available **without any external downloads during your session**.
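
As a rough illustration of how an app could prefer a pre-cached copy over a Hub download, the sketch below checks for a local directory first. The helper name `resolve_model_source` and the `models/<model-id>` layout are assumptions made for this example; the layout produced by `scripts/download_lightweight_models.py` may differ.

```python
from pathlib import Path


def resolve_model_source(model_id: str, models_dir: str = "models") -> str:
    """Return a local directory for model_id if one exists, else the Hub ID.

    Assumption: cached models live at <models_dir>/<model_id>; the actual
    layout used by the download script may differ.
    """
    local_dir = Path(models_dir) / model_id
    # Treat the copy as usable only if it contains a config.json file.
    if (local_dir / "config.json").is_file():
        return str(local_dir)
    return model_id
```

The returned string can be passed as the `model=` argument of `transformers.pipeline`, which accepts either a Hub ID or a local directory. Setting the `HF_HUB_OFFLINE=1` environment variable additionally tells the Hub client not to attempt network calls.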

## 📚 Session Contents

### Session 1: Introduction to HuggingFace (45 minutes)
**Topics Covered:**
- HuggingFace Platform overview (Hub, Transformers, Datasets, Spaces)
- Core abstractions: Pipelines, Models, Tokenizers
- Architecture patterns: Encoders (BERT), Decoders (GPT), Encoder-Decoders (T5/BART)
- Enterprise NLP landscape (licensing, open-source vs. commercial)

**Live Demos:**
- Sentiment Analysis using DistilBERT
- Named Entity Recognition (NER) with BERT
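
The sentiment demo can be approximated outside the app with the Transformers `pipeline` API. This is a minimal sketch, not the code in `app.py`: `format_sentiment` and `run_sentiment` are illustrative names, and the first call to `run_sentiment` needs `transformers` and `torch` installed plus a model download (or a cached copy).

```python
def format_sentiment(results: list[dict]) -> str:
    """Turn text-classification pipeline output into a one-line summary."""
    top = max(results, key=lambda r: r["score"])
    return f"{top['label']} ({top['score']:.1%})"


def run_sentiment(text: str) -> str:
    # Imported lazily so the formatter above works without transformers installed.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    # The pipeline returns a list of dicts, each with "label" and "score" keys.
    return format_sentiment(classifier(text))
```

In a long-running app the pipeline would normally be created once and reused rather than rebuilt on every call.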

**Materials:** [SESSION1_SLIDES.md](slides/SESSION1_SLIDES.md)

---

### Session 2: Hands-On Developer Workshop (90 minutes)
**Topics Covered:**
- Tokenization mechanics and strategies
- Inference across 5+ NLP tasks
- Understanding model outputs and confidence scores
- Production considerations and optimization
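
For single-label classification, the confidence scores mentioned above typically come from applying softmax to the model's raw logits. The sketch below uses plain Python and made-up logits to show the arithmetic; `top_label` is an illustrative helper, not part of this repo.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw logits into probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing on large logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_label(logits: list[float], labels: list[str]) -> tuple[str, float]:
    """Return the most likely label and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

A high score only means the model prefers one label over the others; it is not a calibrated guarantee of correctness.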

**Interactive Tasks:**
- 🔤 **Tokenization Explorer** — Visualize how text becomes token IDs
- 📊 **Sentiment Analysis** — Classify text as positive or negative
- 🏷️ **Named Entity Recognition** — Extract persons, organizations, locations
- ❓ **Question Answering** — Answer questions from context
- 📝 **Text Summarization** — Generate concise summaries
- 🔗 **Semantic Similarity** — Compare text meaning
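
To make the Tokenization Explorer less of a black box, here is a toy version of the greedy longest-match-first splitting that WordPiece tokenizers (as used by BERT) apply to each word. The vocabulary and the `encode` helper are hypothetical and tiny; real tokenizers also handle punctuation, normalization, and special tokens.

```python
def wordpiece_tokenize(word: str, vocab: dict[str, int], unk: str = "[UNK]") -> list[str]:
    """Split one word into sub-word pieces by greedy longest-match-first."""
    pieces, start = [], 0
    while start < len(word):
        end = len(word)
        match = None
        # Shrink the candidate from the right until it appears in the vocab.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk]  # no piece fits, so the whole word maps to the unknown token
        pieces.append(match)
        start = end
    return pieces


# Hypothetical toy vocabulary; real BERT vocabularies hold roughly 30,000 entries.
VOCAB = {"[UNK]": 0, "token": 1, "##ization": 2, "##s": 3, "hug": 4, "##ging": 5}


def encode(text: str) -> list[int]:
    """Lowercase, split on whitespace, and map each piece to its ID."""
    tokens = [p for w in text.lower().split() for p in wordpiece_tokenize(w, VOCAB)]
    return [VOCAB[t] for t in tokens]
```

With this vocabulary, `wordpiece_tokenize("tokenization", VOCAB)` yields `["token", "##ization"]`, and a word with no matching pieces collapses to `["[UNK]"]`.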

**Materials:** [SESSION2_SLIDES.md](slides/SESSION2_SLIDES.md)

---

## 🛠️ Project Structure

```
infy/
├── app.py                          # Main Gradio application
├── config.py                       # Configuration (model IDs, task definitions)
├── utils.py                        # Utility functions for inference
├── requirements.txt                # Python dependencies
├── README.md                       # This file
├── SPEAKER_NOTES.md               # Presenter guide with timing
├── QUICK_SETUP.md                 # 10-minute presenter setup guide
├── scripts/
│   ├── download_lightweight_models.py  # Downloads models for offline use
│   └── USING_LOCAL_MODELS.md     # Deep dive on local model caching
├── slides/
│   ├── SESSION1_SLIDES.md        # Session 1 presentation content
│   └── SESSION2_SLIDES.md        # Session 2 presentation content
└── data/
    ├── sample_texts.csv           # Sample texts for demos
    └── demo_samples/
        ├── sentiment.txt
        ├── ner.txt
        ├── qa.txt
        ├── summarization.txt
        └── embeddings.txt
```

## 🤖 Models Used

| Task | Model | Type | License |
|------|-------|------|---------|
| Sentiment Analysis | distilbert-base-uncased-finetuned-sst-2-english | Encoder | Apache 2.0 |
| Named Entity Recognition | dslim/bert-base-NER | Encoder | MIT |
| Question Answering | deepset/roberta-base-squad2 | Encoder | CC BY 4.0 |
| Summarization | facebook/bart-large-cnn | Encoder-Decoder | MIT |
| Semantic Similarity | sentence-transformers/all-MiniLM-L6-v2 | Encoder | Apache 2.0 |
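
The Semantic Similarity row relies on a sentence-embedding model: `all-MiniLM-L6-v2` maps each text to a 384-dimensional vector, and two texts are then compared by comparing their vectors. Cosine similarity is a common choice for that comparison. The sketch below shows the computation on plain Python lists; whether `utils.py` computes it this way is an assumption.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here: a zero vector is similar to nothing
    return dot / (norm_a * norm_b)
```

Values near 1 indicate vectors pointing in the same direction, values near 0 indicate unrelated directions, and negative values indicate opposing directions.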

## 📖 How to Use

### During Sessions

1. **Access the Spaces URL** — Attendees join via shared link
2. **Session 1 (45 min)**
   - Presenter shares their screen and walks through the slides
   - Live demos showcase "click-to-run" NLP tasks
   - Q&A after each major section

3. **Session 2 (90 min)**
   - Presenter guides attendees through tokenization and inference
   - Attendees observe interactive widgets
   - Exercise checkpoints for hands-on exploration
   - Discussion on production considerations

### After Sessions

1. **Clone the repository:**
   ```bash
   git clone https://huggingface.co/spaces/[your-username]/infy
   ```

2. **Install dependencies:**
   ```bash
   pip install -r requirements.txt
   ```

3. **Run locally:**
   ```bash
   python app.py
   ```

4. **Explore further:**
   - Modify sample data in `data/sample_texts.csv`
   - Add more models to `config.py`
   - Create custom tasks in `app.py`

## 🎓 Learning Resources

### Official Documentation
- [Transformers Library Docs](https://huggingface.co/docs/transformers/)
- [Datasets Library Docs](https://huggingface.co/docs/datasets/)
- [HuggingFace Course (Free)](https://huggingface.co/course/)
- [Hub Documentation](https://huggingface.co/docs/hub/)

### Model Hub
- Browse 100K+ models: https://huggingface.co/models
- Search by task, language, or architecture

### Community
- [HuggingFace Forums](https://discuss.huggingface.co/)
- [GitHub Issues](https://github.com/huggingface/transformers/issues)
- Twitter: [@huggingface](https://twitter.com/huggingface)

### Next Steps
- **Fine-tune on your data** — Adapt pre-trained models for domain-specific tasks
- **Deploy to Spaces** — Create interactive demos like this
- **Publish to the Hub** — Share models and datasets with the community
- **Explore advanced techniques** — Quantization, distillation, multi-model pipelines

## 🔧 Customization

### Add a New Task

1. **Add model to `config.py`:**
   ```python
   "new_task": {
       "name": "Task Name",
       "model": "model-id-from-hub",
       "example": "example text",
   }
   ```

2. **Add function to `utils.py`:**
   ```python
   def run_new_task(text):
       pipe = load_pipeline("new_task")
       # gr.Markdown expects a string, so convert the raw pipeline output
       return str(pipe(text))
   ```

3. **Add widget to `app.py`:**
   ```python
   with gr.Tab("New Task"):
       input_box = gr.Textbox(label="Input")
       output_box = gr.Markdown()
       btn = gr.Button("Run")  # the button must exist before its click handler is wired
       btn.click(run_new_task, inputs=[input_box], outputs=[output_box])
   ```
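
The steps above call `load_pipeline` without showing it. One plausible shape for such a helper is a cached lookup, so each model is built once and then reused. This is a sketch under assumptions: the `TASKS` registry, its extra `"task"` key, and the example text are hypothetical, and the real `utils.py` may be organized differently.

```python
from functools import lru_cache

# Hypothetical registry mirroring the config.py entry format shown above,
# with an added "task" key naming the pipeline type.
TASKS = {
    "sentiment": {
        "name": "Sentiment Analysis",
        "task": "text-classification",
        "model": "distilbert-base-uncased-finetuned-sst-2-english",
        "example": "I love this workshop!",
    },
}


@lru_cache(maxsize=None)
def load_pipeline(task_key: str):
    """Build the pipeline for task_key once and reuse it on later calls."""
    cfg = TASKS[task_key]  # raises KeyError for unregistered tasks
    from transformers import pipeline  # lazy import keeps module import cheap

    return pipeline(cfg["task"], model=cfg["model"])
```

Caching matters in a Gradio app because every button click would otherwise reload the model weights.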

### Modify Sample Data

Edit `data/sample_texts.csv` or add `.txt` files to `data/demo_samples/`

## 📝 Environment

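- **Python:** 3.11 (as pinned by `python_version` in the Space metadata)
- **Framework:** Gradio 5.23.1 (as pinned by `sdk_version` in the Space metadata)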
- **ML:** Transformers, Torch
- **Hosting:** HuggingFace Spaces

## 📄 License

This project is open-source and available for educational and commercial use. Model licenses vary—see individual model cards for details.

## 👨‍🏫 Presenter Notes

See [SPEAKER_NOTES.md](SPEAKER_NOTES.md) for:
- Session timing breakdowns
- Demo sequences and talking points
- Troubleshooting common issues
- Tips for live presentations

## 📧 Questions & Feedback

- Ask during the sessions
- Post on HuggingFace Forums
- Follow up on company Slack/Teams

---

**Ready to dive into NLP? Start with Session 1: Introduction! 🚀**