---
license: mit
tags:
- sklearn
- tabular-classification
- excel
- spreadsheet
---

# sheet-cell-classifier

A scikit-learn random forest classifier that assigns roles to cells in Excel spreadsheets, used by
[sheet-call-tree](https://github.com/roksechs/sheet-call-tree).

## Model

Predicts whether a spreadsheet cell is a **header** (class 0) or **data** (class 1).
Trained on the CTC (CIUS + SAUS) and ENTRANT datasets.

**Features (23):** gap-proximity (dist_above, dist_left), row/col numeric fractions,
format fields (bold, italic, colors, borders, alignment, data type), value type flags.

## Usage

```python
from huggingface_hub import hf_hub_download
import joblib

model_path = hf_hub_download(repo_id="roksechs/sheet-cell-classifier", filename="cell_classifier.joblib")
clf = joblib.load(model_path)
```
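
Once loaded, the classifier's integer output can be mapped back to the role names from the Model section. The sketch below is illustrative only: `predict_roles`, `ROLE_NAMES`, and `StubClassifier` are hypothetical names that are not part of this repository, and the stub stands in for the downloaded model so the example runs offline. This card does not document the order of the 23 features, so real feature rows are assumed to be built the same way sheet-call-tree builds them.

```python
# Label encoding stated in the Model section: 0 = header, 1 = data.
ROLE_NAMES = {0: "header", 1: "data"}
N_FEATURES = 23  # feature count stated in the Model section


def predict_roles(clf, rows):
    """Return a role name for each feature row.

    `clf` is any object with a scikit-learn style `predict` method;
    `rows` is a sequence of 23-element numeric feature vectors.
    """
    rows = [list(row) for row in rows]
    for i, row in enumerate(rows):
        if len(row) != N_FEATURES:
            raise ValueError(
                f"row {i} has {len(row)} features, expected {N_FEATURES}"
            )
    if not rows:
        return []
    return [ROLE_NAMES[int(label)] for label in clf.predict(rows)]


class StubClassifier:
    """Hypothetical stand-in for the loaded model (not the real classifier)."""

    def predict(self, rows):
        # Toy rule for demonstration only: class 0 when the first feature is 0.
        return [0 if row[0] == 0 else 1 for row in rows]


demo_rows = [[0.0] * N_FEATURES, [1.0] + [0.0] * (N_FEATURES - 1)]
roles = predict_roles(StubClassifier(), demo_rows)
```

With the real model, pass `clf` from the snippet above in place of `StubClassifier()`.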

`sheet-call-tree >= 0.1.2` downloads and uses this model automatically.