---
title: MasterMap Cleaner
emoji: 🧹
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
short_description: Clean MasterMap Excel files with guided review.
---

# MasterMap Cleaner

MasterMap Cleaner is a web tool used to clean and standardize MasterMap Excel files.

The tool takes an uploaded workbook, checks selected columns against approved reference lists, uses AI only when needed, and creates a cleaned version of the workbook. It also generates a review file called a **Blueprint**, where uncertain values can be checked and corrected by a human before the final workbook is downloaded.

## What The Tool Produces

After a cleaning run, the tool can produce:

- a cleaned workbook with a new cleaned sheet
- a Blueprint file for human review
- a final workbook after the reviewed Blueprint has been applied

## Before You Start

You need:

- the Hugging Face Space link for the tool
- the username and password provided by the tool administrator
- the Excel workbook you want to clean
- access to Excel or another spreadsheet editor to review the Blueprint

Accepted workbook formats:

- `.xlsx`
- `.xlsm`
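
As a minimal sketch, a check for these formats before uploading could look like the following. The helper name `is_accepted_workbook` is hypothetical and is not part of the tool; it only mirrors the two extensions listed above.

```python
from pathlib import Path

# The two workbook formats listed in this guide.
ACCEPTED_SUFFIXES = {".xlsx", ".xlsm"}


def is_accepted_workbook(filename: str) -> bool:
    """Return True if the file extension is one of the accepted formats."""
    # Hypothetical helper: the extension is compared case-insensitively.
    return Path(filename).suffix.lower() in ACCEPTED_SUFFIXES
```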

## Recommended Workflow

1. Open the tool.
2. Upload the Excel workbook.
3. Select the source sheet to clean.
4. Run the cleaning process.
5. Download the cleaned workbook and Blueprint.
6. Review the Blueprint in Excel.
7. Upload the reviewed Blueprint back into the tool.
8. Apply the Blueprint.
9. Download the final cleaned workbook.
10. Save manual references if new approved values should be remembered for future runs.

## Step 1: Open The Tool

Open the Hugging Face Space link in your browser.

If prompted, enter the username and password provided by the tool administrator.

## Step 2: Upload The Dataset

In the **Dataset to Clean** section:

1. Drop the Excel file into the upload box, or click the box and select the file.
2. Wait until the file is loaded.
3. Select the source sheet that contains the data to clean.
4. Choose the output sheet name.

The output sheet is the new sheet that will be created inside the workbook for cleaned data.

Do not use the same name as the source sheet.
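
The naming rule can be sketched as a small check. The helper `is_valid_output_name` is hypothetical and is not taken from the tool. The comparison ignores case because Excel does not allow two sheets whose names differ only by case.

```python
def is_valid_output_name(source_sheet: str, output_sheet: str) -> bool:
    """Return True if the output name is non-empty and differs from the source."""
    # Hypothetical helper: trim whitespace and compare without regard to case.
    source = source_sheet.strip().casefold()
    output = output_sheet.strip().casefold()
    return bool(output) and output != source
```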

## Step 3: Run Cleaning

Click **Run Cleaning**.

While the tool is running, it shows progress for each cleaned column. Run time depends on the number of rows and on how many values require AI review, so large files may take longer.

When the run finishes, the tool will show download links for:

- **Blueprint**
- **Cleaned Workbook**

Download both files.

## Step 4: Review The Blueprint

Open the Blueprint file in Excel.

The Blueprint contains values that the tool wants a human to review. Each row represents one value or correction candidate.

Main columns:

- `Row_Index`: the row in the workbook where the value appears
- `Column`: the field being reviewed
- `Original_Raw_Text`: the original value from the uploaded file
- `AI_Suggested_Match`: the tool's suggested cleaned value
- `Human_Override`: the reviewer correction field
- `Confidence`: the tool's confidence in its suggestion (for example, `LOW` or `MEDIUM`)
- `Match_Source`: how the suggestion was produced

How to review each row:

- If the suggested value is correct, leave `Human_Override` empty.
- If the suggested value is wrong, choose the correct value from the dropdown in `Human_Override`.
- If the correct value is not in the dropdown, type it into `Human_Override` manually.
- Focus especially on rows marked `LOW` or `MEDIUM` confidence.

The dropdown is there to help, but manual typing is allowed.
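
For a large Blueprint, the rows that deserve the most attention can be picked out programmatically. The sketch below is an illustration only: it assumes the Blueprint rows have already been loaded as dictionaries keyed by the column names above, the helper `rows_needing_attention` is hypothetical, and the sample data (including the `Country` column and the `HIGH` label) is made up.

```python
# Confidence labels that this guide says deserve extra attention.
REVIEW_LEVELS = {"LOW", "MEDIUM"}


def rows_needing_attention(blueprint_rows):
    """Return the Blueprint rows whose Confidence is LOW or MEDIUM."""
    return [
        row
        for row in blueprint_rows
        if str(row.get("Confidence", "")).strip().upper() in REVIEW_LEVELS
    ]


# Made-up sample rows for illustration only.
sample_rows = [
    {"Row_Index": 2, "Column": "Country", "Original_Raw_Text": "Untied States",
     "AI_Suggested_Match": "United States", "Human_Override": "",
     "Confidence": "HIGH"},
    {"Row_Index": 3, "Column": "Country", "Original_Raw_Text": "UK?",
     "AI_Suggested_Match": "Ukraine", "Human_Override": "",
     "Confidence": "LOW"},
]

to_review = rows_needing_attention(sample_rows)
```

Here `to_review` contains only the row with `Row_Index` 3, which is the one a reviewer would check first.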

After reviewing, save the Blueprint file.

## Step 5: Apply The Reviewed Blueprint

Return to the tool and go to the **Apply Blueprint** section.

1. Upload the workbook that should receive the corrections.
2. Select the sheet to update.
3. Upload the reviewed Blueprint file.
4. Click **Apply Blueprint**.

Use the same cleaned sheet that was created during the cleaning step unless you were instructed otherwise.
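
Conceptually, applying a Blueprint means writing one final value per reviewed row back into the sheet. The sketch below is not the tool's implementation. It assumes that a non-empty `Human_Override` wins over `AI_Suggested_Match`, that `Row_Index` can be used directly as a key into the sheet, and that the sheet is held as a dictionary of rows; the function names and sample data are hypothetical.

```python
import copy


def resolve_value(row):
    """Use Human_Override when the reviewer filled it in, else the suggestion."""
    override = str(row.get("Human_Override") or "").strip()
    return override if override else row["AI_Suggested_Match"]


def apply_blueprint(sheet, blueprint_rows):
    """Return a copy of `sheet` with each reviewed value written into place.

    `sheet` is assumed to map a row index to a dict of column name -> value.
    """
    updated = copy.deepcopy(sheet)  # leave the caller's data untouched
    for row in blueprint_rows:
        updated[row["Row_Index"]][row["Column"]] = resolve_value(row)
    return updated


# Made-up data for illustration only.
sheet = {2: {"Country": "Untied States"}, 3: {"Country": "UK?"}}
blueprint = [
    {"Row_Index": 2, "Column": "Country",
     "AI_Suggested_Match": "United States", "Human_Override": ""},
    {"Row_Index": 3, "Column": "Country",
     "AI_Suggested_Match": "Ukraine", "Human_Override": "United Kingdom"},
]

final_sheet = apply_blueprint(sheet, blueprint)
```

In this example, row 2 keeps the suggested value because the override is empty, while row 3 takes the reviewer's correction.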

When the apply step finishes, download the final cleaned workbook.

## Step 6: Save Manual References

After applying a Blueprint, the tool may learn newly approved values so future runs can recognize them automatically.

If the **Save Manual References** button is available, click it after applying the Blueprint.

Use this button when:

- the Blueprint contained manually approved values
- those values should be remembered in future cleaning runs
- the administrator instructed you to preserve the updated references

If the button is disabled, continue using the tool normally. The administrator may handle reference saving separately.

## Important Usage Notes

- Keep the browser tab open while a cleaning or apply process is running.
- Download your files before closing the page.
- If you refresh the page, you may need to upload the files again.
- Do not share the tool password outside the approved user group.
- Do not upload files unless they are intended to be processed by this tool.

## If The Cleaning Fails

If the cleaning process fails during AI matching, the cause may be an external API issue, such as an exhausted daily rate limit, an AI model that has become unavailable, or a model that has been deprecated.

If this happens, check the details in the technical documentation.