---
title: ModelSmith AI
emoji: 🤖
colorFrom: blue
colorTo: green
sdk: gradio
sdk_version: 6.5.1
app_file: app.py
pinned: false
license: apache-2.0
short_description: 'An intelligent ML platform'
---

# ModelSmith AI

An ML platform that automates tabular classification and regression. It profiles a dataset, recommends a modeling strategy based on that profile, trains a model, and explains the model's predictions.

## Features

- **Dataset Analysis**: Automatic detection of data types, missing values, and potential issues
- **Strategy Reasoning**: Intelligent model selection based on dataset characteristics
- **Automated Training**: End-to-end model training with preprocessing pipelines
- **Explainability**: SHAP-based feature importance explanations
- **FastAPI Backend**: RESTful API for seamless integration

## Supported Scope

- **Task**: Tabular classification and regression
- **Input**: CSV files with ≥1200 rows
- **Target**: Binary or multiclass labels for classification, or a numeric target for regression
- **Features**: At least 2 usable features after preprocessing
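The limits above can be checked before uploading a file. The following is a minimal sketch using only the standard library, not the platform's own validation. `check_scope` is a hypothetical helper, and it counts raw columns, whereas the platform counts features that remain usable after preprocessing, so passing this check is necessary but not sufficient.

```python
import csv
import io

MIN_ROWS = 1200     # from "CSV files with ≥1200 rows"
MIN_FEATURES = 2    # from "At least 2 usable features"


def check_scope(csv_text: str, target: str) -> list[str]:
    """Return a list of scope problems; an empty list means no problem was found.

    Hypothetical helper: it only inspects the raw CSV and does not
    reproduce the platform's preprocessing.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    columns = reader.fieldnames or []
    problems = []

    if target not in columns:
        problems.append(f"target column {target!r} not found")

    # Count data rows (the header is consumed by DictReader).
    n_rows = sum(1 for _ in reader)
    if n_rows < MIN_ROWS:
        problems.append(f"{n_rows} rows, need at least {MIN_ROWS}")

    # Every non-target column is treated as a candidate feature.
    n_features = len([c for c in columns if c != target])
    if n_features < MIN_FEATURES:
        problems.append(f"{n_features} feature columns, need at least {MIN_FEATURES}")

    return problems
```

For a file on disk, pass `open(path, newline="").read()` as `csv_text` and the name of the target column as `target`.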

## API Endpoints

- `POST /analyze`: Analyze dataset and get strategy recommendations
- `POST /train`: Train a model on the dataset
- `POST /explain`: Get model explanations and feature importance
- `POST /predict`: Make predictions with trained model
- `GET /health`: Health check

## Deployment

This project is designed for deployment on Hugging Face Spaces using Docker.

### Files for Deployment

- `Dockerfile`
- `requirements.txt`
- `backend/` (entire directory)

### Running Locally

```bash
pip install -r requirements.txt
uvicorn backend.api.main:app --host 0.0.0.0 --port 7860
```

## Limitations

- NLP functionality is disabled
- Requires datasets with ≥1200 rows
- CPU-only, no GPU support
- Stateless API: trained models are saved only temporarily

## Architecture

- **Orchestrator**: Main workflow coordinator
- **Dataset Analyzer**: Data profiling and preprocessing
- **Strategy Reasoner**: Model selection logic
- **Model Factory**: Training and evaluation
- **Explainability Engine**: SHAP explanations
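One way to picture how these components fit together is the sketch below. It is not the project's code: the function name, the argument names, and the call order (analyze, then choose a strategy, then train, then explain) are assumptions inferred from the component descriptions, and each component is passed in as a plain callable.

```python
from typing import Any, Callable


def orchestrate(
    dataset: Any,
    analyze: Callable[[Any], Any],            # stands in for the Dataset Analyzer
    choose_strategy: Callable[[Any], Any],    # stands in for the Strategy Reasoner
    train: Callable[[Any, Any], Any],         # stands in for the Model Factory
    explain: Callable[[Any, Any], Any],       # stands in for the Explainability Engine
) -> dict:
    """Hypothetical orchestrator: run the four stages in sequence."""
    profile = analyze(dataset)                # profile the data
    strategy = choose_strategy(profile)       # pick a model from the profile
    model = train(dataset, strategy)          # fit and evaluate
    explanation = explain(model, dataset)     # compute feature importance
    return {
        "profile": profile,
        "strategy": strategy,
        "model": model,
        "explanation": explanation,
    }
```

Passing the stages as callables keeps the coordinator independent of any particular model or explanation library, which makes each stage easy to stub out in tests.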

## License

MIT License