---
title: EmissionFactor Mapper
emoji: 🌿
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: AI-powered transaction classifier for carbon accounting
---

# 🌿 Emission Factor Mapper

**Intelligent AI-powered classification system for sustainability transaction data**

Automatically map your financial transactions (like *"hotel booking for conference"*, *"electric vehicle charging"*, or *"office furniture purchase"*) to standardized emission factor categories (Cat1, Cat2) used for accurate CO₂ footprint analysis and ESG reporting.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)
[![Python 3.8+](https://img.shields.io/badge/python-3.8+-blue.svg)](https://www.python.org/downloads/)
[![HuggingFace](https://img.shields.io/badge/🤗-HuggingFace-yellow)](https://huggingface.co/yassine123Z/EmissionFactor-mapper2-v2)

---

## 🎯 What Does This Do?

This application solves a critical challenge in **carbon accounting**: automatically categorizing thousands of financial transactions into standardized emission categories. Instead of manually reviewing each purchase, expense, or invoice, the AI model:

- ✅ **Classifies** transactions into 12 primary emission categories
- ✅ **Maps** to 82 detailed subcategories for precise carbon calculations
- ✅ **Provides** confidence scores for quality assurance
- ✅ **Enables** batch processing of CSV files with review capabilities
- ✅ **Tracks** manual corrections for continuous model improvement
- ✅ **Compares** different AI models to optimize accuracy

Perfect for **sustainability teams**, **carbon accountants**, **ESG analysts**, and **finance departments** working on Scope 3 emissions reporting.
---

## 🚀 Demo

🟢 **Try the web UI:** [https://yassine123z-emissionfactor-mapper2-v2-gradio2ui.hf.space/](https://yassine123z-emissionfactor-mapper2-v2-gradio2ui.hf.space/)

### 📱 Four Powerful Modes:

#### 1️⃣ **Single Transaction** - Quick Classification

Enter any transaction description and get instant predictions:

- **Input**: `"Business class flight from London to New York"`
- **Output**:
  - Cat1: `Mobility (passengers)`
  - Cat2: `Air transport`
  - Confidence: `0.94`

#### 2️⃣ **Batch Review** - Process Hundreds at Once

Upload a CSV file with your transactions and:

- ✨ Get automatic classifications for all rows
- 📊 Review results in an interactive table
- ✏️ Edit predictions directly (dropdown menus included)
- 💾 Download corrected dataset
- 📈 Export training data for model retraining

#### 3️⃣ **Corrections History** - Track & Improve

- 📋 View all manual corrections you've made
- 🕐 Timestamp tracking for audit trails
- 📤 Export correction logs for model fine-tuning
- 📊 Analyze patterns in misclassifications

#### 4️⃣ **Model Comparison** - A/B Testing

- 🧪 Compare current model vs. any HuggingFace model
- 📉 Side-by-side predictions with match rates
- 🎯 Evaluate performance before deployment
- 🔬 Test on your own dataset

---

## 🧠 API Usage

### Base URL

```
https://yassine123z-emissionfactor-mapper2-v2-gradio.hf.space/map_categories
```

---

### 🔌 Endpoint 1: Batch Classification

**POST** `/map_categories`

Classify multiple transactions in a single API call.
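A minimal Python sketch of calling this endpoint, using only the standard library. It assumes the request and response shapes shown in the example that follows, and that the URL listed under Base URL is the full endpoint address. The helper names (`build_payload`, `parse_matches`, `classify`) are illustrative and not part of the project.

```python
import json
import urllib.request

# Assumption: the URL listed under "Base URL" is the full endpoint address.
ENDPOINT = "https://yassine123z-emissionfactor-mapper2-v2-gradio.hf.space/map_categories"


def build_payload(transactions):
    """Build the request body {"transactions": [...]}, dropping blank entries."""
    cleaned = [t.strip() for t in transactions if t and t.strip()]
    return {"transactions": cleaned}


def parse_matches(response_body):
    """Turn a response dict into (input_text, Cat1, Cat2, similarity) tuples."""
    return [
        (m["input_text"], m["best_Cat1"], m["best_Cat2"], float(m["similarity"]))
        for m in response_body.get("matches", [])
    ]


def classify(transactions, url=ENDPOINT, timeout=30):
    """POST the transactions as JSON and return the parsed matches.

    Requires network access; raises urllib.error.URLError on failure.
    """
    data = json.dumps(build_payload(transactions)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=data,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_matches(json.loads(response.read().decode("utf-8")))
```

Calling `classify(["Train ticket Paris to Berlin"])` would then return one tuple per transaction, provided the Space is running and reachable.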
**Example JSON:**

```json
{
  "transactions": [
    "Train ticket Paris to Berlin",
    "Office lighting electricity",
    "Laptop purchase for employee"
  ]
}
```

**Response:**

```json
{
  "matches": [
    {
      "input_text": "Train ticket Paris to Berlin",
      "best_Cat1": "Mobility (passengers)",
      "best_Cat2": "Train transport",
      "similarity": 0.96
    },
    {
      "input_text": "Office lighting electricity",
      "best_Cat1": "Use of electricity",
      "best_Cat2": "Standard",
      "similarity": 0.89
    },
    {
      "input_text": "Laptop purchase for employee",
      "best_Cat1": "Purchase of goods",
      "best_Cat2": "Electrical equipment",
      "similarity": 0.92
    }
  ]
}
```

## 🗂️ Emission Categories

### 📋 Complete Category Structure

The model classifies into **12 primary categories** and **82 subcategories**:

#### 1. **Purchase of Goods** (10 subcategories)

Sporting goods, Buildings, Office supplies, Water consumption, Household appliances, Electrical equipment, Machinery and equipment, Furniture, Textiles and clothing, Vehicles

#### 2. **Purchase of Materials** (6 subcategories)

Construction materials, Organic materials, Paper and cardboard, Plastics and rubber, Chemicals, Refrigerants and others

#### 3. **Purchase of Services** (14 subcategories)

Equipment rental, Building rental, Furniture rental, Vehicle rental, Information/cultural services, Catering, Health services, Specialized crafts, Admin/consulting, Cleaning, IT services, Logistics, Marketing, Technical services

#### 4. **Food & Beverages** (10 subcategories)

Alcoholic beverages, Non-alcoholic beverages, Condiments, Desserts, Fruits and vegetables, Fats and oils, Prepared meals, Animal products, Cereal products, Dairy products

#### 5. **Heating and Air Conditioning** (2 subcategories)

Heat and steam, Air conditioning and refrigeration

#### 6. **Fuels** (6 subcategories)

Fossil fuels, Mobile fossil fuels, Organic fuels, Gaseous fossil fuels, Liquid fossil fuels, Solid fossil fuels

#### 7. **Mobility (Freight)** (5 subcategories)

Air transport, Ship transport, Truck transport, Combined transport, Train transport

#### 8. **Mobility (Passengers)** (11 subcategories)

Air transport, Coach/Urban bus, Ship transport, Combined transport, E-Bike, Accommodation/Events, Soft mobility, Motorcycle/Scooter, Train transport, Public transport, Car

#### 9. **Process and Fugitive Emissions** (3 subcategories)

Agriculture, Global warming potential, Industrial processes

#### 10. **Waste Treatment** (12 subcategories)

Commercial/industrial, Wastewater, Electrical equipment, Households, Metal, Organic materials, Paper and cardboard, Batteries, Plastics, Fugitive emissions, Textiles, Glass

#### 11. **Use of Electricity** (3 subcategories)

Electricity for electric vehicles, Renewables, Standard

---

## 📂 CSV File Format

### Required Format

Your CSV must contain a column named **`transaction`** (lowercase):

```csv
transaction
Hotel stay in Berlin for 3 nights
Train ticket from Amsterdam to Brussels
Office supplies - pens and notebooks
Electric vehicle charging
Restaurant lunch for team meeting
```

### Processing Results

After processing, you'll get:

```csv
ID,Transaction,Cat1,Cat2,Confidence,Status
1,Hotel stay in Berlin,Mobility (passengers),Accommodation / Events,0.91,✅ OK
2,Train ticket Amsterdam-Brussels,Mobility (passengers),Train transport,0.96,✅ OK
3,Office supplies,Purchase of goods,Office supplies,0.93,✅ OK
```

### Status Indicators

- **✅ OK**: High confidence (>0.8) - Auto-approved
- **⚠️ Review**: Lower confidence - Needs manual review

---

## 🧠 Model Architecture

### Technical Details

**Model**: `yassine123Z/EmissionFactor-mapper2-v2`

- **Type**: SetFit (Sentence Transformer Fine-tuning)
- **Base**: Optimized sentence transformer architecture
- **Training**: Few-shot learning on emission factor data
- **Embeddings**: 384-dimensional semantic vectors
- **Matching**: Cosine similarity scoring

### Performance Metrics

- ⚡ **Speed**: ~50ms per transaction
- 📊 **Throughput**: 100+ transactions/minute
- 🎯 **Accuracy**: 85%+ on test set
- 💾 **Model Size**: ~400MB
- 🔋 **Average Confidence**: 0.87

---

## 📚 Resources

- 🤗 **Model Card**: [yassine123Z/EmissionFactor-mapper2-v2](https://huggingface.co/yassine123Z/EmissionFactor-mapper2-v2)
- 🌐 **Live Demo**: [Web Interface](https://yassine123z-emissionfactor-mapper2-v2-gradio2ui.hf.space/)
- 📖 **SetFit Documentation**: [GitHub](https://github.com/huggingface/setfit)

---

## 📄 License

MIT License - Feel free to use in commercial and open-source projects.

---

## 👨‍💻 Author

**Yassine**

- 🤗 HuggingFace: [@yassine123Z](https://huggingface.co/yassine123Z)

---
**🌱 Making sustainability data smarter, one transaction at a time**

Built with ❤️ using SetFit, Gradio & FastAPI