---
title: Infy
emoji: 🐢
colorFrom: gray
colorTo: purple
sdk: gradio
sdk_version: 5.23.1
python_version: 3.11
app_file: app.py
pinned: false
---

🤗 HuggingFace Enabling Sessions

Interactive Demo Platform for Transformers, Hub APIs, and NLP Pipelines

📋 Overview

This is an interactive Gradio application designed for the HuggingFace Enabling Sessions workshop. It provides hands-on demonstrations of:

Session 1 (45 min): Introduction to the HuggingFace ecosystem, Transformers architecture, and best practices
Session 2 (90 min): Hands-on developer workshop with tokenization deep dives and inference playground across 5+ NLP tasks

🚀 Quick Start

The app is hosted on HuggingFace Spaces and requires no local installation. Simply:

Open the Spaces URL
Explore the 3 main tabs:
- Session 1: Introduction — Embedded slides + live NLP demos
- Session 2: Hands-On Developer — Tokenizer explorer + inference playground
- Resources & Next Steps — Documentation links and learning resources

🎯 Pre-Session Setup (For Presenters)

Want instant, offline demos with zero network dependencies?

If you're presenting and need models pre-cached (e.g., company network restrictions), follow these guides:

QUICK_SETUP.md — 10-minute setup (recommended for demos)
- Download models locally
- Test everything works
- Push to Spaces for instant loading
scripts/USING_LOCAL_MODELS.md — Deep dive guide
- How local model caching works
- Git LFS for large files
- Troubleshooting

TL;DR: python3 scripts/download_lightweight_models.py && git add models/ && git push origin main ✅

This ensures models are available without any external downloads during your session.
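
The contents of scripts/download_lightweight_models.py are not shown in this README. As a rough sketch of what such a pre-caching step could look like, assuming the huggingface_hub package and a hypothetical `models/<org>__<name>` folder layout (both are assumptions, not the script's confirmed behavior):

```python
from pathlib import Path

# Model IDs copied from the "Models Used" table in this README.
MODEL_IDS = [
    "distilbert-base-uncased-finetuned-sst-2-english",
    "dslim/bert-base-NER",
    "deepset/roberta-base-squad2",
    "facebook/bart-large-cnn",
    "sentence-transformers/all-MiniLM-L6-v2",
]


def local_dir_for(model_id: str, root: str = "models") -> Path:
    """Map a Hub model ID to a local folder (hypothetical layout)."""
    # "org/name" becomes "org__name" so every model sits directly under root.
    return Path(root) / model_id.replace("/", "__")


def download_all(root: str = "models") -> None:
    """Download a snapshot of every listed model into its local folder."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for model_id in MODEL_IDS:
        snapshot_download(repo_id=model_id, local_dir=local_dir_for(model_id, root))
```

Calling `download_all()` once before the session would fill the `models/` folder. The weight files are large, which is why the deep dive guide covers Git LFS.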

📚 Session Contents

Session 1: Introduction to HuggingFace (45 minutes)

Topics Covered:

HuggingFace Platform overview (Hub, Transformers, Datasets, Spaces)
Core abstractions: Pipelines, Models, Tokenizers
Architecture patterns: Encoders (BERT), Decoders (GPT), Encoder-Decoders (T5/BART)
Enterprise NLP landscape (licensing, open-source vs. commercial)

Live Demos:

Sentiment Analysis using DistilBERT
Named Entity Recognition (NER) with BERT
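
Outside the app, the two live demos can be approximated with the Transformers `pipeline` API. This is a minimal sketch, not the code in app.py: the model IDs come from the "Models Used" table, while the helper names are made up for illustration.

```python
def load_demo_pipelines():
    """Build the sentiment and NER pipelines (downloads models on first use)."""
    # Imported lazily so the formatting helpers below work without transformers.
    from transformers import pipeline

    sentiment = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    # aggregation_strategy="simple" merges word pieces into whole entities.
    ner = pipeline("ner", model="dslim/bert-base-NER", aggregation_strategy="simple")
    return sentiment, ner


def format_sentiment(result: list[dict]) -> str:
    """Turn [{'label': ..., 'score': ...}] into a short display string."""
    top = result[0]
    return f"{top['label']} ({top['score']:.1%})"


def format_entities(entities: list[dict]) -> list[tuple[str, str]]:
    """Reduce aggregated NER output to (text, entity type) pairs."""
    return [(e["word"], e["entity_group"]) for e in entities]
```

A typical call would be `sentiment, ner = load_demo_pipelines()` followed by `format_sentiment(sentiment("I love this workshop"))`.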

Materials: SESSION1_SLIDES.md

Session 2: Hands-On Developer Workshop (90 minutes)

Topics Covered:

Tokenization mechanics and strategies
Inference across 5+ NLP tasks
Understanding model outputs and confidence scores
Production considerations and optimization
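
On confidence scores: for single-label classification models, the reported score is typically a softmax over the model's raw logits. A small self-contained sketch of that step, with made-up logit values:

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw logits into probabilities that sum to 1."""
    # Subtracting the max keeps exp() from overflowing on large logits.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits for the labels (NEGATIVE, POSITIVE).
labels = ["NEGATIVE", "POSITIVE"]
probs = softmax([-2.0, 3.0])
best = max(range(len(probs)), key=probs.__getitem__)
print(labels[best], round(probs[best], 4))  # POSITIVE 0.9933
```

A score near 1.0 therefore means the winning logit is far ahead of the others, not that the prediction is guaranteed to be correct.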

Interactive Tasks:

🔤 Tokenization Explorer: Visualize how text becomes token IDs
📊 Sentiment Analysis: Classify text as positive or negative
🏷️ Named Entity Recognition: Extract persons, organizations, locations
❓ Question Answering: Answer questions from context
📝 Text Summarization: Generate concise summaries
🔗 Semantic Similarity: Compare text meaning
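
To make the "text becomes token IDs" step concrete, here is a toy version of the greedy longest-match-first WordPiece scheme used by BERT-style tokenizers. The vocabulary is invented for the example; real tokenizers also normalize text and split punctuation, which this sketch skips.

```python
# Invented toy vocabulary; "##" marks a piece that continues a word.
TOY_VOCAB = {
    "[UNK]": 0, "token": 1, "##ization": 2, "##s": 3,
    "hug": 4, "##ging": 5, "face": 6,
}


def wordpiece_tokenize(word: str, vocab: dict, unk: str = "[UNK]") -> list[str]:
    """Split one word into the longest vocabulary pieces, left to right."""
    tokens, start = [], 0
    while start < len(word):
        end, piece = len(word), None
        # Shrink the candidate from the right until it is in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece fits, so the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens


def encode(text: str, vocab: dict = TOY_VOCAB) -> list[int]:
    """Lowercase, split on whitespace, tokenize each word, map pieces to IDs."""
    ids = []
    for word in text.lower().split():
        ids.extend(vocab[t] for t in wordpiece_tokenize(word, vocab))
    return ids


print(wordpiece_tokenize("tokenization", TOY_VOCAB))  # ['token', '##ization']
print(encode("Hugging tokens"))  # [4, 5, 1, 3]
```

With a real checkpoint, `AutoTokenizer.from_pretrained(...)` followed by `tokenizer.tokenize(text)` and `tokenizer.convert_tokens_to_ids(tokens)` exposes the same two stages.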

Materials: SESSION2_SLIDES.md

🛠️ Project Structure

infy/
├── app.py                          # Main Gradio application
├── config.py                       # Configuration (model IDs, task definitions)
├── utils.py                        # Utility functions for inference
├── requirements.txt                # Python dependencies
├── README.md                       # This file
├── SPEAKER_NOTES.md               # Presenter guide with timing
├── slides/
│   ├── SESSION1_SLIDES.md        # Session 1 presentation content
│   └── SESSION2_SLIDES.md        # Session 2 presentation content
└── data/
    ├── sample_texts.csv           # Sample texts for demos
    └── demo_samples/
        ├── sentiment.txt
        ├── ner.txt
        ├── qa.txt
        ├── summarization.txt
        └── embeddings.txt

🤖 Models Used

| Task | Model | Type | License |
|---|---|---|---|
| Sentiment Analysis | distilbert-base-uncased-finetuned-sst-2-english | Encoder | Apache 2.0 |
| Named Entity Recognition | dslim/bert-base-NER | Encoder | MIT |
| Question Answering | deepset/roberta-base-squad2 | Encoder | CC BY 4.0 |
| Summarization | facebook/bart-large-cnn | Encoder-Decoder | MIT |
| Semantic Similarity | sentence-transformers/all-MiniLM-L6-v2 | Encoder | Apache 2.0 |
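
For the semantic similarity entry, a sentence-embedding model turns each text into a vector, and comparing two texts typically reduces to the cosine similarity between those vectors. A dependency-free sketch with made-up 3-dimensional vectors (all-MiniLM-L6-v2 produces 384-dimensional ones):

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Made-up embeddings: the first two point in similar directions.
cat = [0.9, 0.1, 0.0]
kitten = [0.8, 0.2, 0.1]
invoice = [0.0, 0.1, 0.9]
print(cosine_similarity(cat, kitten) > cosine_similarity(cat, invoice))  # True
```

With the sentence-transformers package, `SentenceTransformer("sentence-transformers/all-MiniLM-L6-v2").encode([text_a, text_b])` would supply the real vectors.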

📖 How to Use

During Sessions

Access the Spaces URL — Attendees join via shared link
Session 1 (45 min)
- Presenter shares their screen and narrates the slides
- Live demos showcase "click-to-run" NLP tasks
- Q&A after each major section
Session 2 (90 min)
- Presenter guides attendees through tokenization and inference
- Attendees observe interactive widgets
- Exercise checkpoints for hands-on exploration
- Discussion on production considerations

After Sessions

Clone the repository:
```
git clone https://huggingface.co/spaces/[your-username]/infy
```
Install dependencies:
```
pip install -r requirements.txt
```
Run locally:
```
python app.py
```
Explore further:
- Modify sample data in data/sample_texts.csv
- Add more models to config.py
- Create custom tasks in app.py

🎓 Learning Resources

Official Documentation

Model Hub

Browse 100K+ models: https://huggingface.co/models
Search by task, language, or architecture

Community

Next Steps

Fine-tune on your data — Adapt pre-trained models for domain-specific tasks
Deploy to Spaces — Create interactive demos like this
Publish to the Hub — Share models and datasets with the community
Explore advanced techniques — Quantization, distillation, multi-model pipelines

🔧 Customization

Add a New Task

Add model to config.py:

"new_task": {
    "name": "Task Name",
    "model": "model-id-from-hub",
    "example": "example text",
}

Add function to utils.py:

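def run_new_task(text):
    """Run the pipeline configured under "new_task" and return displayable text."""
    pipe = load_pipeline("new_task")
    # Pipelines return structured results (often a list of dicts); gr.Markdown needs a string.
    return str(pipe(text))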
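
`load_pipeline` is referenced above but not defined in this README. One plausible shape, shown purely as an assumption, is a cached lookup so that each model loads only once per process. The `TASKS` mapping and its `pipeline_task` key are hypothetical stand-ins for whatever config.py really contains.

```python
# Hypothetical stand-in for the task definitions in config.py.
TASKS = {
    "sentiment": {
        "pipeline_task": "sentiment-analysis",
        "model": "distilbert-base-uncased-finetuned-sst-2-english",
    },
    "qa": {
        "pipeline_task": "question-answering",
        "model": "deepset/roberta-base-squad2",
    },
}


def build_pipeline(task_key: str):
    """Create a Transformers pipeline for one configured task."""
    spec = TASKS[task_key]  # raises KeyError for unknown tasks
    # Imported lazily so this module loads without transformers installed.
    from transformers import pipeline

    return pipeline(spec["pipeline_task"], model=spec["model"])


class PipelineCache:
    """Build each pipeline on first request and reuse it afterwards."""

    def __init__(self, builder=build_pipeline):
        self._builder = builder
        self._cache = {}

    def get(self, task_key: str):
        if task_key not in self._cache:
            self._cache[task_key] = self._builder(task_key)
        return self._cache[task_key]


_default_cache = PipelineCache()


def load_pipeline(task_key: str):
    """Return the cached pipeline for task_key."""
    return _default_cache.get(task_key)
```

Passing the builder in as a parameter keeps the cache testable with a fake builder, so no model download is needed to check the caching behavior.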

Add widget to app.py:

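# Place inside the existing gr.Blocks() context in app.py.
with gr.Tab("New Task"):
    input_box = gr.Textbox(label="Input text")
    output_box = gr.Markdown()
    btn = gr.Button("Run")  # button that triggers the task
    btn.click(run_new_task, inputs=[input_box], outputs=[output_box])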
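
Because the output widget is a `gr.Markdown` component, pipeline results display best as a formatted string. A small formatter along these lines could do that. The function name and table layout are illustrative, and the sample rows mimic the label/score dictionaries that text-classification pipelines return.

```python
def results_to_markdown(results: list[dict]) -> str:
    """Render a list of {"label": ..., "score": ...} dicts as a Markdown table."""
    if not results:
        return "_No results._"
    lines = ["| Label | Score |", "|---|---|"]
    # Highest-confidence rows first.
    for row in sorted(results, key=lambda r: r["score"], reverse=True):
        lines.append(f"| {row['label']} | {row['score']:.2%} |")
    return "\n".join(lines)


# Made-up sample in the shape of text-classification output.
sample = [
    {"label": "NEGATIVE", "score": 0.0012},
    {"label": "POSITIVE", "score": 0.9988},
]
print(results_to_markdown(sample))
```

A task function could then return `results_to_markdown(pipe(text))` for classification-style tasks; other tasks (question answering, summarization) return different keys and would need their own formatter.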

Modify Sample Data

Edit data/sample_texts.csv or add .txt files to data/demo_samples/

📝 Environment

Python: 3.10+ (the Space pins 3.11 via python_version in its metadata)
Framework: Gradio 5.23.1 (sdk_version in the Space metadata)
ML: Transformers, Torch
Hosting: HuggingFace Spaces

📄 License

This project is open-source and available for educational and commercial use. Model licenses vary—see individual model cards for details.

👨‍🏫 Presenter Notes

See SPEAKER_NOTES.md for:

Session timing breakdowns
Demo sequences and talking points
Troubleshooting common issues
Tips for live presentations

📧 Questions & Feedback

Ask during the sessions
Post on HuggingFace Forums
Follow up on company Slack/Teams

Ready to dive into NLP? Start with Session 1: Introduction! 🚀