	---
	title: Infy
	emoji: 🐢
	colorFrom: gray
	colorTo: purple
	sdk: gradio
	sdk_version: 5.23.1
	python_version: 3.11
	app_file: app.py
	pinned: false
	---

	# 🤗 HuggingFace Enabling Sessions
	Interactive Demo Platform for Transformers, Hub APIs, and NLP Pipelines

	## 📋 Overview

	This is an interactive Gradio application designed for the HuggingFace Enabling Sessions workshop. It provides hands-on demonstrations of:

	- Session 1 (45 min): Introduction to the HuggingFace ecosystem, Transformers architecture, and best practices
	- Session 2 (90 min): Hands-on developer workshop with tokenization deep dives and inference playground across 5+ NLP tasks

	## 🚀 Quick Start

	The app is hosted on HuggingFace Spaces and requires no local installation. Simply:

	1. Open the Spaces URL
	2. Explore the 3 main tabs:
	   - Session 1: Introduction (embedded slides + live NLP demos)
	   - Session 2: Hands-On Developer (tokenizer explorer + inference playground)
	   - Resources & Next Steps (documentation links and learning resources)

	### 🎯 Pre-Session Setup (For Presenters)

	Want instant, offline demos with zero network dependencies?

	If you're presenting and need models pre-cached (e.g., company network restrictions), follow these guides:

	- [QUICK_SETUP.md](QUICK_SETUP.md): 10-minute setup (recommended for demos)
	  - Download models locally
	  - Test everything works
	  - Push to Spaces for instant loading
	- [scripts/USING_LOCAL_MODELS.md](scripts/USING_LOCAL_MODELS.md): Deep dive guide
	  - How local model caching works
	  - Git LFS for large files
	  - Troubleshooting

	TL;DR: `python3 scripts/download_lightweight_models.py && git add models/ && git commit -m "Add pre-cached models" && git push origin main` ✅

	This ensures models are available without any external downloads during your session.
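For readers curious how such pre-caching can work, here is a minimal sketch, assuming the `huggingface_hub` package is installed. It is not the contents of `scripts/download_lightweight_models.py`: the `local_model_dir` helper, the flat folder naming under `models/`, and the choice of model IDs are illustrative assumptions.

```python
from pathlib import Path

# Model IDs taken from this README; the real script may cache a different set.
MODEL_IDS = [
    "distilbert-base-uncased-finetuned-sst-2-english",
    "dslim/bert-base-NER",
]


def local_model_dir(model_id: str, root: str = "models") -> Path:
    """Map a Hub model ID to a local folder (hypothetical naming scheme)."""
    # Replace the org/name slash so each model gets a single flat folder.
    return Path(root) / model_id.replace("/", "__")


def download_all(root: str = "models") -> None:
    """Download a snapshot of each model into its local folder."""
    # Imported lazily so the path helper works without huggingface_hub installed.
    from huggingface_hub import snapshot_download

    for model_id in MODEL_IDS:
        target = local_model_dir(model_id, root)
        target.mkdir(parents=True, exist_ok=True)
        snapshot_download(repo_id=model_id, local_dir=str(target))
```

Calling `download_all()` once before the session fills `models/`. A Transformers pipeline can then be pointed at a local folder by passing its path as the `model` argument instead of a Hub ID.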

	## 📚 Session Contents

	### Session 1: Introduction to HuggingFace (45 minutes)
	Topics Covered:
	- HuggingFace Platform overview (Hub, Transformers, Datasets, Spaces)
	- Core abstractions: Pipelines, Models, Tokenizers
	- Architecture patterns: Encoders (BERT), Decoders (GPT), Encoder-Decoders (T5/BART)
	- Enterprise NLP landscape (licensing, open-source vs. commercial)

	Live Demos:
	- Sentiment Analysis using DistilBERT
	- Named Entity Recognition (NER) with BERT

	Materials: [SESSION1_SLIDES.md](slides/SESSION1_SLIDES.md)
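The sentiment demo can be approximated in a few lines with the Transformers `pipeline` API. This is a sketch under the assumption that `transformers` and a backend such as PyTorch are installed; `format_sentiment` and `run_sentiment_demo` are hypothetical names, and the code in `app.py` may be organized differently.

```python
def format_sentiment(results):
    """Turn pipeline output like [{"label": "POSITIVE", "score": 0.99}] into text."""
    top = results[0]
    return f"{top['label']} ({top['score']:.1%})"


def run_sentiment_demo(text):
    """Classify text with the DistilBERT SST-2 model listed in this README."""
    # Imported lazily: the first call downloads the model unless it is cached.
    from transformers import pipeline

    classifier = pipeline(
        "sentiment-analysis",
        model="distilbert-base-uncased-finetuned-sst-2-english",
    )
    return format_sentiment(classifier(text))
```

The formatting step is kept separate from the model call so it can be checked without loading a model.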

	---

	### Session 2: Hands-On Developer Workshop (90 minutes)
	Topics Covered:
	- Tokenization mechanics and strategies
	- Inference across 5+ NLP tasks
	- Understanding model outputs and confidence scores
	- Production considerations and optimization
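On confidence scores: for single-label classification, the score a pipeline reports is typically the softmax of the model's raw logits. The following pure-Python sketch shows the calculation; the logits are made up for illustration and do not come from any model in this project.

```python
import math


def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    # Subtract the largest logit first for numerical stability.
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for a two-class model, ordered [NEGATIVE, POSITIVE].
probs = softmax([-1.0, 3.0])
print([round(p, 3) for p in probs])  # prints [0.018, 0.982]
```

A gap of 4 between the logits already yields about 98% confidence, which is why scores from fine-tuned classifiers often look very close to 1.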

	Interactive Tasks:
	- 🔤 Tokenization Explorer — Visualize how text becomes token IDs
	- 📊 Sentiment Analysis: Classify text as positive or negative
	- 🏷️ Named Entity Recognition — Extract persons, organizations, locations
	- ❓ Question Answering — Answer questions from context
	- 📝 Text Summarization — Generate concise summaries
	- 🔗 Semantic Similarity — Compare text meaning

	Materials: [SESSION2_SLIDES.md](slides/SESSION2_SLIDES.md)
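To make the step from text to token IDs concrete, here is a toy version of the greedy longest-match-first lookup that WordPiece tokenizers (used by BERT-family models) apply to each word. The vocabulary is made up, and real tokenizers also handle normalization, punctuation, and special tokens; in practice you would load one with `AutoTokenizer.from_pretrained(...)` rather than write this yourself.

```python
def wordpiece_tokenize(word, vocab, unk="[UNK]"):
    """Split one word into subwords by greedy longest-match-first lookup."""
    tokens, start = [], 0
    while start < len(word):
        end = len(word)
        piece = None
        # Shrink the candidate substring until it is found in the vocabulary.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # mark word-internal pieces
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            return [unk]  # no piece matches, so the whole word is unknown
        tokens.append(piece)
        start = end
    return tokens


# Tiny made-up vocabulary mapping tokens to IDs.
VOCAB = {"[UNK]": 0, "token": 1, "##ization": 2, "##s": 3, "hug": 4, "##ging": 5}


def encode(text, vocab=VOCAB):
    """Lowercase, split on whitespace, and map each subword to its ID."""
    tokens = [t for w in text.lower().split() for t in wordpiece_tokenize(w, vocab)]
    return tokens, [vocab[t] for t in tokens]


tokens, ids = encode("Tokenization hugging xyz")
print(tokens)  # prints ['token', '##ization', 'hug', '##ging', '[UNK]']
print(ids)     # prints [1, 2, 4, 5, 0]
```

The `##` prefix marks a piece that continues a word, so the original word boundaries can be recovered from the token sequence.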

	---

	## 🛠️ Project Structure

	```
	infy/
	├── app.py                  # Main Gradio application
	├── config.py               # Configuration (model IDs, task definitions)
	├── utils.py                # Utility functions for inference
	├── requirements.txt        # Python dependencies
	├── README.md               # This file
	├── SPEAKER_NOTES.md        # Presenter guide with timing
	├── slides/
	│   ├── SESSION1_SLIDES.md  # Session 1 presentation content
	│   └── SESSION2_SLIDES.md  # Session 2 presentation content
	└── data/
	    ├── sample_texts.csv    # Sample texts for demos
	    └── demo_samples/
	        ├── sentiment.txt
	        ├── ner.txt
	        ├── qa.txt
	        ├── summarization.txt
	        └── embeddings.txt
	```

	## 🤖 Models Used

	| Task | Model | Type | License |
	|------|-------|------|---------|
	| Sentiment Analysis | distilbert-base-uncased-finetuned-sst-2-english | Encoder | Apache 2.0 |
	| Named Entity Recognition | dslim/bert-base-NER | Encoder | MIT |
	| Question Answering | deepset/roberta-base-squad2 | Encoder | CC BY 4.0 |
	| Summarization | facebook/bart-large-cnn | Encoder-Decoder | MIT |
	| Semantic Similarity | sentence-transformers/all-MiniLM-L6-v2 | Encoder | Apache 2.0 |
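The semantic similarity row works differently from the others: the model turns each text into an embedding vector, and similarity is then computed between vectors, most commonly as cosine similarity. Here is a minimal pure-Python sketch with made-up 3-dimensional vectors (the actual model produces 384-dimensional embeddings); the zero-vector convention is a choice made for this example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


# Made-up "embeddings" for illustration only.
emb_cat = [1.0, 2.0, 0.0]
emb_kitten = [2.0, 4.0, 0.0]  # same direction as emb_cat
emb_car = [0.0, 0.0, 5.0]     # orthogonal to emb_cat

print(cosine_similarity(emb_cat, emb_kitten))
print(cosine_similarity(emb_cat, emb_car))
```

Vectors pointing the same way score close to 1 regardless of length, and orthogonal vectors score 0, which is why cosine similarity is a common choice for comparing sentence embeddings.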

	## 📖 How to Use

	### During Sessions

	1. Access the Spaces URL — Attendees join via shared link
	2. Session 1 (45 min)
	   - Presenter shares their screen and narrates the slides
	   - Live demos showcase "click-to-run" NLP tasks
	   - Q&A after each major section
	3. Session 2 (90 min)
	   - Presenter guides attendees through tokenization and inference
	   - Attendees observe interactive widgets
	   - Exercise checkpoints for hands-on exploration
	   - Discussion on production considerations

	### After Sessions

	1. Clone the repository:
	```bash
	git clone https://huggingface.co/spaces/[your-username]/infy
	```

	2. Install dependencies:
	```bash
	pip install -r requirements.txt
	```

	3. Run locally:
	```bash
	python app.py
	```

	4. Explore further:
	   - Modify sample data in `data/sample_texts.csv`
	   - Add more models to `config.py`
	   - Create custom tasks in `app.py`

	## 🎓 Learning Resources

	### Official Documentation
	- [Transformers Library Docs](https://huggingface.co/docs/transformers/)
	- [Datasets Library Docs](https://huggingface.co/docs/datasets/)
	- [HuggingFace Course (Free)](https://huggingface.co/course/)
	- [Hub Documentation](https://huggingface.co/docs/hub/)

	### Model Hub
	- Browse 1M+ models: https://huggingface.co/models
	- Search by task, language, or architecture

	### Community
	- [HuggingFace Forums](https://discuss.huggingface.co/)
	- [GitHub Issues](https://github.com/huggingface/transformers/issues)
	- Twitter: [@huggingface](https://twitter.com/huggingface)

	### Next Steps
	- Fine-tune on your data — Adapt pre-trained models for domain-specific tasks
	- Deploy to Spaces — Create interactive demos like this
	- Publish to the Hub — Share models and datasets with the community
	- Explore advanced techniques — Quantization, distillation, multi-model pipelines

	## 🔧 Customization

	### Add a New Task

	1. Add model to `config.py`:
	```python
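	# Add an entry to the task dictionary in config.py.
	# TASKS is a placeholder name; use whatever the dictionary is called there.
	TASKS = {
	    # ... existing tasks ...
	    "new_task": {
	        "name": "Task Name",
	        "model": "model-id-from-hub",
	        "example": "example text",
	    },
	}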
	```

	2. Add function to `utils.py`:
	```python
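	def run_new_task(text):
	    """Run the new task's pipeline on the input text."""
	    # load_pipeline is assumed to be the existing helper in utils.py
	    pipe = load_pipeline("new_task")
	    return pipe(text)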
	```

	3. Add widget to `app.py`:
	```python
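	# Place inside the app's gr.Blocks() context in app.py
	with gr.Tab("New Task"):
	    input_box = gr.Textbox(label="Input text")
	    output_box = gr.Markdown()
	    btn = gr.Button("Run")  # button that triggers the task
	    btn.click(run_new_task, inputs=[input_box], outputs=[output_box])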
	```
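One detail worth handling when wiring up a new task: the widget above writes to a `gr.Markdown` component, which displays a string, while pipelines usually return a dictionary or a list of dictionaries. A hypothetical helper such as the one below (not part of the existing `utils.py`) could convert the result before it is returned from `run_new_task`.

```python
def to_markdown(results):
    """Render pipeline result dicts as a Markdown bullet list."""
    if isinstance(results, dict):
        results = [results]  # some tasks return a single dict
    if not results:
        return "_No results._"
    lines = []
    for item in results:
        # Format floats compactly; leave other values as plain text.
        fields = ", ".join(
            f"{key}: {value:.3f}" if isinstance(value, float) else f"{key}: {value}"
            for key, value in item.items()
        )
        lines.append(f"- {fields}")
    return "\n".join(lines)
```

With this in place, `run_new_task` could end with `return to_markdown(pipe(text))` so the output box shows one bullet per result.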

	### Modify Sample Data

	Edit `data/sample_texts.csv` or add `.txt` files to `data/demo_samples/`

	## 📝 Environment

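	- Python: 3.11 (pinned via `python_version` in the Space metadata)
	- Framework: Gradio 5.23.1 (pinned via `sdk_version` in the Space metadata)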
	- ML: Transformers, Torch
	- Hosting: HuggingFace Spaces

	## 📄 License

	This project is open-source and available for educational and commercial use. Model licenses vary—see individual model cards for details.

	## 👨‍🏫 Presenter Notes

	See [SPEAKER_NOTES.md](SPEAKER_NOTES.md) for:
	- Session timing breakdowns
	- Demo sequences and talking points
	- Troubleshooting common issues
	- Tips for live presentations

	## 📧 Questions & Feedback

	- Ask during the sessions
	- Post on HuggingFace Forums
	- Follow up on company Slack/Teams

	---

	Ready to dive into NLP? Start with Session 1: Introduction! 🚀