---
title: EmissionFactor Mapper
emoji: 🌿
colorFrom: purple
colorTo: pink
sdk: docker
pinned: false
license: mit
short_description: AI-powered transaction classifier for carbon accounting
---

# Emission Factor Mapper

**Intelligent AI-powered classification system for sustainability transaction data**

Automatically map your financial transactions (like *"hotel booking for conference"*, *"electric vehicle charging"*, or *"office furniture purchase"*) to the standardized emission factor categories (Cat1, Cat2) used for CO₂ footprint analysis and ESG reporting.

[License: MIT](https://opensource.org/licenses/MIT) [Python](https://www.python.org/downloads/) [Model on Hugging Face](https://huggingface.co/yassine123Z/EmissionFactor-mapper2-v2)

---

## What Does This Do?

This application solves a critical challenge in **carbon accounting**: automatically categorizing thousands of financial transactions into standardized emission categories. Instead of requiring manual review of each purchase, expense, or invoice, the application:

- **Classifies** transactions into 11 primary emission categories
- **Maps** to 82 detailed subcategories for precise carbon calculations
- **Provides** confidence scores for quality assurance
- **Enables** batch processing of CSV files with review capabilities
- **Tracks** manual corrections for continuous model improvement
- **Compares** different AI models to optimize accuracy

Perfect for **sustainability teams**, **carbon accountants**, **ESG analysts**, and **finance departments** working on Scope 3 emissions reporting.
---

## Demo

**Try the web UI:** [https://yassine123z-emissionfactor-mapper2-v2-gradio2ui.hf.space/](https://yassine123z-emissionfactor-mapper2-v2-gradio2ui.hf.space/)

### Four Powerful Modes:

#### 1. **Single Transaction** - Quick Classification

Enter any transaction description and get instant predictions:

- **Input**: `"Business class flight from London to New York"`
- **Output**:
  - Cat1: `Mobility (passengers)`
  - Cat2: `Air transport`
  - Confidence: `0.94`

#### 2. **Batch Review** - Process Hundreds at Once

Upload a CSV file with your transactions and:

- Get automatic classifications for all rows
- Review results in an interactive table
- Edit predictions directly (dropdown menus included)
- Download the corrected dataset
- Export training data for model retraining

#### 3. **Corrections History** - Track & Improve

- View all manual corrections you've made
- Timestamp tracking for audit trails
- Export correction logs for model fine-tuning
- Analyze patterns in misclassifications

#### 4. **Model Comparison** - A/B Testing

- Compare the current model against any HuggingFace model
- Side-by-side predictions with match rates
- Evaluate performance before deployment
- Test on your own dataset

---

## API Usage

### Base URL

```
https://yassine123z-emissionfactor-mapper2-v2-gradio.hf.space
```

The full endpoint URL is the base URL followed by the endpoint path, i.e. `https://yassine123z-emissionfactor-mapper2-v2-gradio.hf.space/map_categories`.

---

### Endpoint 1: Batch Classification

**POST** `/map_categories`

Classify multiple transactions in a single API call.
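As a minimal sketch of calling this endpoint from Python, the snippet below uses only the standard library. It assumes the request body and response shape shown in the JSON examples in this section; the helper names (`build_request`, `parse_matches`, `classify`) are hypothetical and not part of the project.

```python
import json
import urllib.request

# Assumption: URL and path are copied from this README; adjust if the Space moves.
BASE_URL = "https://yassine123z-emissionfactor-mapper2-v2-gradio.hf.space"
ENDPOINT = "/map_categories"


def build_request(transactions, base_url=BASE_URL):
    """Build a POST request whose JSON body is {"transactions": [...]}."""
    body = json.dumps({"transactions": list(transactions)}).encode("utf-8")
    return urllib.request.Request(
        base_url + ENDPOINT,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def parse_matches(payload):
    """Turn a decoded response into (text, cat1, cat2, similarity) tuples."""
    return [
        (m["input_text"], m["best_Cat1"], m["best_Cat2"], float(m["similarity"]))
        for m in payload.get("matches", [])
    ]


def classify(transactions, timeout=30):
    """Send the transactions to the API and return the parsed matches."""
    request = build_request(transactions)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return parse_matches(json.load(response))
```

Calling `classify(["Train ticket Paris to Berlin"])` performs the network request. `build_request` and `parse_matches` are kept separate so the payload handling can be checked offline.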
**Example JSON:**

```json
{
  "transactions": [
    "Train ticket Paris to Berlin",
    "Office lighting electricity",
    "Laptop purchase for employee"
  ]
}
```

**Response:**

```json
{
  "matches": [
    {
      "input_text": "Train ticket Paris to Berlin",
      "best_Cat1": "Mobility (passengers)",
      "best_Cat2": "Train transport",
      "similarity": 0.96
    },
    {
      "input_text": "Office lighting electricity",
      "best_Cat1": "Use of electricity",
      "best_Cat2": "Standard",
      "similarity": 0.89
    },
    {
      "input_text": "Laptop purchase for employee",
      "best_Cat1": "Purchase of goods",
      "best_Cat2": "Electrical equipment",
      "similarity": 0.92
    }
  ]
}
```

---

## Emission Categories

### Complete Category Structure

The model classifies into **11 primary categories** and **82 subcategories**:

#### 1. **Purchase of Goods** (10 subcategories)

Sporting goods, Buildings, Office supplies, Water consumption, Household appliances, Electrical equipment, Machinery and equipment, Furniture, Textiles and clothing, Vehicles

#### 2. **Purchase of Materials** (6 subcategories)

Construction materials, Organic materials, Paper and cardboard, Plastics and rubber, Chemicals, Refrigerants and others

#### 3. **Purchase of Services** (14 subcategories)

Equipment rental, Building rental, Furniture rental, Vehicle rental, Information/cultural services, Catering, Health services, Specialized crafts, Admin/consulting, Cleaning, IT services, Logistics, Marketing, Technical services

#### 4. **Food & Beverages** (10 subcategories)

Alcoholic beverages, Non-alcoholic beverages, Condiments, Desserts, Fruits and vegetables, Fats and oils, Prepared meals, Animal products, Cereal products, Dairy products

#### 5. **Heating and Air Conditioning** (2 subcategories)

Heat and steam, Air conditioning and refrigeration

#### 6. **Fuels** (6 subcategories)

Fossil fuels, Mobile fossil fuels, Organic fuels, Gaseous fossil fuels, Liquid fossil fuels, Solid fossil fuels

#### 7. **Mobility (Freight)** (5 subcategories)

Air transport, Ship transport, Truck transport, Combined transport, Train transport

#### 8. **Mobility (Passengers)** (11 subcategories)

Air transport, Coach/Urban bus, Ship transport, Combined transport, E-Bike, Accommodation/Events, Soft mobility, Motorcycle/Scooter, Train transport, Public transport, Car

#### 9. **Process and Fugitive Emissions** (3 subcategories)

Agriculture, Global warming potential, Industrial processes

#### 10. **Waste Treatment** (12 subcategories)

Commercial/industrial, Wastewater, Electrical equipment, Households, Metal, Organic materials, Paper and cardboard, Batteries, Plastics, Fugitive emissions, Textiles, Glass

#### 11. **Use of Electricity** (3 subcategories)

Electricity for electric vehicles, Renewables, Standard

---

## CSV File Format

### Required Format

Your CSV must contain a column named **`transaction`** (lowercase):

```csv
transaction
Hotel stay in Berlin for 3 nights
Train ticket from Amsterdam to Brussels
Office supplies - pens and notebooks
Electric vehicle charging
Restaurant lunch for team meeting
```

### Processing Results

After processing, you'll get:

```csv
ID,Transaction,Cat1,Cat2,Confidence,Status
1,Hotel stay in Berlin,Mobility (passengers),Accommodation / Events,0.91,✅ OK
2,Train ticket Amsterdam-Brussels,Mobility (passengers),Train transport,0.96,✅ OK
3,Office supplies,Purchase of goods,Office supplies,0.93,✅ OK
```

### Status Indicators

- **✅ OK**: High confidence (>0.8) - Auto-approved
- **⚠️ Review**: Lower confidence (0.8 or below) - Needs manual review

---

## Model Architecture

### Technical Details

**Model**: `yassine123Z/EmissionFactor-mapper2-v2`

- **Type**: SetFit (Sentence Transformer Fine-tuning)
- **Base**: Optimized sentence transformer architecture
- **Training**: Few-shot learning on emission factor data
- **Embeddings**: 384-dimensional semantic vectors
- **Matching**: Cosine similarity scoring

### Performance Metrics

- **Speed**: ~50ms per transaction
- **Throughput**: 100+ transactions/minute
- **Accuracy**: 85%+ on test set
- **Model Size**: ~400MB
- **Average Confidence**: 0.87

---

## Resources

- **Model Card**: [yassine123Z/EmissionFactor-mapper2-v2](https://huggingface.co/yassine123Z/EmissionFactor-mapper2-v2)
- **Live Demo**: [Web Interface](https://yassine123z-emissionfactor-mapper2-v2-gradio2ui.hf.space/)
- **SetFit Documentation**: [GitHub](https://github.com/huggingface/setfit)

---

## License

MIT License - Feel free to use in commercial and open-source projects.

---

## Author

**Yassine**

- HuggingFace: [@yassine123Z](https://huggingface.co/yassine123Z)