| --- |
| title: AI Based Data Cleaner |
| emoji: π |
| colorFrom: red |
| colorTo: red |
| sdk: streamlit |
| app_file: src/streamlit_app.py |
| app_port: 8501 |
| tags: |
| - streamlit |
| pinned: false |
| short_description: Comprehensive AI-powered data cleaning and validation web ap |
| license: mit |
| sdk_version: 1.46.1 |
| --- |
| |
| # 🤗 Hugging Face |
|
|
|
|
| Hugging Face is the AI community building the future. Our platform provides tools, libraries, and resources to discover, collaborate on, and build with state-of-the-art machine learning models. |
|
|
| ## Features |
|
|
| ### Model Hub |
| - Access thousands of pre-trained models for NLP, computer vision, audio, and more |
| - Filter models by task, framework, language, and license |
| - Community-contributed models with documentation and examples |
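
Filtering by task can also be done programmatically. The following is a minimal sketch, assuming a recent release of the `huggingface_hub` package is installed and the Hub is reachable; the helper name `list_model_ids` and the example tag are illustrative, not part of any official API.

```python
def list_model_ids(task_tag, limit=5):
    """Return up to `limit` model ids from the Hub that carry `task_tag`.

    Assumes a recent `huggingface_hub` release; the import is deferred so
    that defining this helper does not require the package or a network.
    """
    from huggingface_hub import list_models

    # `filter` accepts a tag string such as a task name.
    return [model.id for model in list_models(filter=task_tag, limit=limit)]


if __name__ == "__main__":
    # "text-classification" is used here only as an example tag.
    for model_id in list_model_ids("text-classification"):
        print(model_id)
```

The results depend on the current state of the Hub, so no fixed output is shown.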
|
|
| ### Transformers Library |
| - Easy-to-use API for state-of-the-art models (BERT, GPT, T5, LLaMA, etc.) |
| - Multi-framework support (PyTorch, TensorFlow, JAX) |
| - Optimized for research and production |
|
|
| ### Datasets |
| - Thousands of ready-to-use datasets for various ML tasks |
| - Standardized access pattern across all datasets |
| - Efficient data loading and preprocessing |
|
|
| ### 🛠️ Spaces |
| - Interactive ML demos and applications |
| - Share your models with the community |
| - Built-in deployment and hosting |
|
|
| ## Installation |
|
|
| ### Basic Installation |
| ```bash |
| pip install transformers |
| ``` |
|
|
| ### With TensorFlow (CPU-only) |
| ```bash |
| pip install 'transformers[tf-cpu]' |
| ``` |
|
|
| ### With Flax |
| ```bash |
| pip install 'transformers[flax]' |
| ``` |
|
|
| ### For Apple Silicon (M1/ARM) |
| ```bash |
| # Install prerequisites |
| brew install cmake |
| brew install pkg-config |
| |
| # Then install Transformers with the TensorFlow (CPU) extra |
| pip install 'transformers[tf-cpu]' |
| ``` |
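
To decide whether the prerequisites above apply to a given machine, the platform can be checked from Python. This is a small sketch using only the standard library; `is_apple_silicon` is a hypothetical helper name, and the check reflects the architecture of the running Python interpreter rather than the hardware itself.

```python
import platform


def is_apple_silicon(system=None, machine=None):
    """Return True when the OS is macOS and the architecture is arm64.

    Both arguments default to the values reported for the running
    interpreter; they can be passed explicitly for testing.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    return system == "Darwin" and machine == "arm64"


if __name__ == "__main__":
    print("Apple silicon detected:", is_apple_silicon())
```

An x86_64 Python build running under Rosetta 2 reports `x86_64`, so this check returns `False` in that case even on Apple silicon hardware.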
|
|
| ## Quick Start |
|
|
| ### Verify Installation |
| ```python |
| from transformers import pipeline |
| print(pipeline('sentiment-analysis')('we love you')) |
| # Output: [{'label': 'POSITIVE', 'score': 0.9998704791069031}] |
| ``` |
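
The check above downloads a default model the first time it runs. For a lighter, offline check, the sketch below only reports which packages are importable. It uses the standard library alone; the package names in `PACKAGES` are an assumption based on the frameworks listed under Features, and `installed_versions` is a hypothetical helper.

```python
import importlib.util
from importlib import metadata

# Assumed import names for the library and the backends it can use.
PACKAGES = ("transformers", "torch", "tensorflow", "jax")


def installed_versions(packages=PACKAGES):
    """Map each package name to its version, or None if it is not importable."""
    versions = {}
    for name in packages:
        if importlib.util.find_spec(name) is None:
            versions[name] = None
            continue
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Importable, but distributed under a different name
            # (for example a CPU-only or platform-specific build).
            versions[name] = "unknown"
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```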
|
|
| ## 🔥 Popular Models |
|
|
| ### LLaMA & LLaVA Models |
| - LLaMA: High-performance foundation models |
| - LLaVA-NeXT: Improved reasoning, OCR, and world knowledge |
| - VipLLaVA: Understanding arbitrary visual prompts |
|
|
| ### Multimodal Models |
| - CLIP: Connect images and text |
| - Stable Diffusion: Generate images from text |
| - Whisper: Speech recognition and translation |
|
|
| ## 🧪 MLX Support |
| - Native support for Apple silicon |
| - Efficient model training and serving |
| - Examples for text generation, fine-tuning, image generation, and speech recognition |
|
|
| ## Example Use Cases |
|
|
| ### Text Classification |
| ```python |
| from transformers import pipeline |
| classifier = pipeline("sentiment-analysis") |
| result = classifier("I love working with Hugging Face!") |
| print(result) |
| ``` |
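
The classifier returns a list of dictionaries with `label` and `score` keys, as in the Quick Start output. One possible way to act on that result is to keep only confident predictions. The sketch below runs on sample data rather than a live model; `confident_labels` and the 0.9 threshold are illustrative assumptions.

```python
def confident_labels(results, threshold=0.9):
    """Return each prediction's label, or None when its score is below threshold."""
    return [r["label"] if r["score"] >= threshold else None for r in results]


# Sample data shaped like the pipeline output shown under Quick Start;
# the second entry is made up for illustration.
sample = [
    {"label": "POSITIVE", "score": 0.9998704791069031},
    {"label": "NEGATIVE", "score": 0.55},
]

print(confident_labels(sample))  # ['POSITIVE', None]
```

Returning `None` instead of dropping the entry keeps the output aligned with the input texts.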
|
|
| ### Image Analysis |
| ```python |
| from transformers import pipeline |
| image_classifier = pipeline("image-classification") |
| result = image_classifier("path/to/image.jpg") |
| print(result) |
| ``` |
|
|
| ### Multimodal Analysis |
| ```python |
| # Load up to 1,000 WikiArt samples from the Hub into FiftyOne, the first |
| # step toward analyzing artistic styles with multimodal embeddings |
| import fiftyone as fo |
| import fiftyone.utils.huggingface as fouh |
| |
| dataset = fouh.load_from_hub( |
|     "huggan/wikiart", |
|     format="parquet", |
|     classification_fields=["artist", "style", "genre"], |
|     max_samples=1000, |
|     name="wikiart", |
| ) |
| ``` |
|
|
| ## Documentation |
| Visit [huggingface.co/docs](https://huggingface.co/docs) for comprehensive documentation. |
|
|
| ## Contributing |
| Join the Hugging Face community to collaborate on models, datasets, and Spaces. |
|
|
| ## License |
| This project is licensed under the Apache 2.0 License - see the [LICENSE](LICENSE) file for details. |
|
|
| --- |
|
|
| **Made with ❤️ by the Hugging Face team and community** |