title: MasterMap Cleaner
emoji: 🧹
colorFrom: blue
colorTo: green
sdk: docker
app_port: 7860
pinned: false
short_description: Clean MasterMap Excel files with guided review.
MasterMap Cleaner
MasterMap Cleaner is a web tool used to clean and standardize MasterMap Excel files.
The tool takes an uploaded workbook, checks selected columns against approved reference lists, uses AI only when needed, and creates a cleaned version of the workbook. It also generates a review file called a Blueprint, where uncertain values can be checked and corrected by a human before the final workbook is downloaded.
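The idea of checking a value against an approved reference list first, and deferring to AI only when no match is found, can be sketched as follows. This is a minimal illustration, not the tool's actual matching logic: the function name, the normalization (trimming whitespace and ignoring case), and the sample values are all assumptions.

```python
def match_against_references(raw_value, approved_values):
    """Return (cleaned_value, needs_ai) for a single cell value.

    Hypothetical helper: the real tool's matching rules are not
    described in this guide.
    """
    # Assumed normalization: trim whitespace and ignore letter case.
    lookup = {value.strip().casefold(): value for value in approved_values}
    key = str(raw_value).strip().casefold()
    if key in lookup:
        # Found in the approved list, so no AI call is needed.
        return lookup[key], False
    # No reference match: this value would be handed to the AI step.
    return None, True


# Sample reference list (made up for illustration).
approved = ["Residential", "Commercial", "Industrial"]
print(match_against_references("  commercial ", approved))
print(match_against_references("Comercial", approved))
```

In this sketch the first call resolves from the reference list, while the misspelled second value is flagged for the AI step.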
What The Tool Produces
After a cleaning run, the tool can produce:
- a cleaned workbook with a new cleaned sheet
- a Blueprint file for human review
- a final workbook after the reviewed Blueprint has been applied
Before You Start
You need:
- the Hugging Face Space link for the tool
- the username and password provided by the tool administrator
- the Excel workbook you want to clean
- access to Excel or another spreadsheet editor to review the Blueprint
Accepted workbook formats:
- .xlsx
- .xlsm
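If you want to confirm a file has an accepted format before uploading it, a check along these lines would work. It is a small sketch using only the Python standard library; the function name is hypothetical, and the case-insensitive comparison is an assumption about how the tool treats extensions.

```python
from pathlib import Path

# The two formats listed in this guide.
ACCEPTED_SUFFIXES = {".xlsx", ".xlsm"}


def is_accepted_workbook(filename):
    """Return True when the file extension is .xlsx or .xlsm."""
    # Assumption: extensions are compared without regard to case.
    return Path(filename).suffix.lower() in ACCEPTED_SUFFIXES


print(is_accepted_workbook("MasterMap_2024.xlsx"))
print(is_accepted_workbook("MasterMap_2024.xls"))
```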
Recommended Workflow
- Open the tool.
- Upload the Excel workbook.
- Select the source sheet to clean.
- Run the cleaning process.
- Download the cleaned workbook and Blueprint.
- Review the Blueprint in Excel.
- Upload the reviewed Blueprint back into the tool.
- Apply the Blueprint.
- Download the final cleaned workbook.
- Save manual references if new approved values should be remembered for future runs.
Step 1: Open The Tool
Open the Hugging Face Space link in your browser.
If prompted, enter the username and password provided by the tool administrator.
Step 2: Upload The Dataset
In the Dataset to Clean section:
- Drop the Excel file into the upload box, or click the box and select the file.
- Wait until the file is loaded.
- Select the source sheet that contains the data to clean.
- Choose the output sheet name.
The output sheet is the new sheet that will be created inside the workbook for cleaned data.
Do not use the same name as the source sheet.
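The sketch below shows one way to check an output sheet name before running the tool. The function name is hypothetical and the tool's own validation may differ. The limits used are Excel's general sheet naming rules: at most 31 characters, none of the characters \ / ? * [ ] :, and names compared without regard to case.

```python
INVALID_SHEET_CHARS = set("\\/?*[]:")
MAX_SHEET_NAME_LENGTH = 31


def check_output_sheet_name(output_name, source_sheet):
    """Return a list of problems; an empty list means the name is usable."""
    problems = []
    name = output_name.strip()
    if not name:
        problems.append("name is empty")
    if len(name) > MAX_SHEET_NAME_LENGTH:
        problems.append("name is longer than 31 characters")
    if any(ch in INVALID_SHEET_CHARS for ch in name):
        problems.append("name contains a character Excel does not allow")
    # Excel treats sheet names as case-insensitive, so compare that way.
    if name.casefold() == source_sheet.strip().casefold():
        problems.append("name is the same as the source sheet")
    return problems


print(check_output_sheet_name("Cleaned_Data", "Data"))
print(check_output_sheet_name("data", "Data"))
```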
Step 3: Run Cleaning
Click Run Cleaning.
While the tool is running, it shows progress for each column being cleaned. Processing time depends on the number of rows and the number of values that require AI review, so larger files may take longer.
When the run finishes, the tool will show download links for:
- Blueprint
- Cleaned Workbook
Download both files.
Step 4: Review The Blueprint
Open the Blueprint file in Excel.
The Blueprint contains values that the tool wants a human to review. Each row represents one value or correction candidate.
Main columns:
- Row_Index: the row in the workbook where the value appears
- Column: the field being reviewed
- Original_Raw_Text: the original value from the uploaded file
- AI_Suggested_Match: the tool's suggested cleaned value
- Human_Override: the reviewer correction field
- Confidence: how confident the tool was
- Match_Source: how the suggestion was produced
How to review each row:
- If the suggested value is correct, leave Human_Override empty.
- If the suggested value is wrong, choose the correct value from the dropdown.
- If the correct value is not in the dropdown, type it manually.
- Focus especially on rows marked LOW or MEDIUM confidence.
The dropdown is there to help, but manual typing is allowed.
After reviewing, save the Blueprint file.
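For a large Blueprint, it can help to list the LOW and MEDIUM confidence rows first. The sketch below assumes the Blueprint rows have already been loaded into a list of dictionaries keyed by the column names above (loading the Excel file is not shown). The function name and the sample data, including the HIGH value, are illustrative assumptions.

```python
def rows_to_review_first(blueprint_rows):
    """Return LOW rows, then MEDIUM rows, keeping their original order."""
    priority = {"LOW": 0, "MEDIUM": 1}

    def level(row):
        # Tolerate stray spaces or lower-case text in the Confidence cell.
        return str(row.get("Confidence", "")).strip().upper()

    flagged = [row for row in blueprint_rows if level(row) in priority]
    # sorted() is stable, so rows with equal confidence keep file order.
    return sorted(flagged, key=lambda row: priority[level(row)])


# Made-up sample rows; "HIGH" is an assumed third confidence level.
sample = [
    {"Row_Index": 1, "Column": "Category", "Confidence": "HIGH"},
    {"Row_Index": 2, "Column": "Category", "Confidence": "MEDIUM"},
    {"Row_Index": 3, "Column": "Category", "Confidence": "LOW"},
]
for row in rows_to_review_first(sample):
    print(row["Row_Index"], row["Confidence"])
```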
Step 5: Apply The Reviewed Blueprint
Return to the tool and go to the Apply Blueprint section.
- Upload the workbook that should receive the corrections.
- Select the sheet to update.
- Upload the reviewed Blueprint file.
- Click Apply Blueprint.
Use the same cleaned sheet that was created during the cleaning step unless you were instructed otherwise.
When the apply step finishes, download the final cleaned workbook.
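Conceptually, applying a reviewed Blueprint means writing one final value per reviewed row: the Human_Override when the reviewer filled it in, otherwise the AI_Suggested_Match. The sketch below illustrates that rule on plain Python data. It is not the tool's implementation: the function names and the dictionary layout of the sheet are assumptions, and how Row_Index maps to actual Excel rows is not specified in this guide.

```python
def final_value(blueprint_row):
    """Prefer the reviewer's override; fall back to the AI suggestion."""
    override = str(blueprint_row.get("Human_Override") or "").strip()
    return override if override else blueprint_row["AI_Suggested_Match"]


def apply_blueprint(sheet_rows, blueprint_rows):
    """Return a corrected copy of sheet_rows.

    sheet_rows is assumed to map Row_Index -> {column name: value}.
    """
    updated = {index: dict(cells) for index, cells in sheet_rows.items()}
    for row in blueprint_rows:
        index = row["Row_Index"]
        if index in updated:  # skip Blueprint rows with no matching sheet row
            updated[index][row["Column"]] = final_value(row)
    return updated


# Made-up sample data for illustration.
sheet = {2: {"Category": "comercial"}, 3: {"Category": "resid."}}
blueprint = [
    {"Row_Index": 2, "Column": "Category",
     "AI_Suggested_Match": "Commercial", "Human_Override": ""},
    {"Row_Index": 3, "Column": "Category",
     "AI_Suggested_Match": "Residential", "Human_Override": "Residence"},
]
print(apply_blueprint(sheet, blueprint))
```

Row 2 takes the AI suggestion because its override is empty; row 3 takes the reviewer's value.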
Step 6: Save Manual References
After applying a Blueprint, the tool may learn newly approved values so future runs can recognize them automatically.
If the Save Manual References button is available, click it after applying the Blueprint.
Use this button when:
- the Blueprint contained manually approved values
- those values should be remembered in future cleaning runs
- the administrator instructed you to preserve the updated references
If the button is disabled, continue using the tool normally. The administrator may handle reference saving separately.
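As an illustration of what remembering manually approved values could involve, the sketch below collects every reviewer override from a Blueprint into a lookup from raw text to approved value, grouped by column. How the tool actually stores its references is not described here, so the function name and the data layout are assumptions.

```python
def collect_manual_references(blueprint_rows):
    """Build {column: {original raw text: approved value}} from overrides."""
    references = {}
    for row in blueprint_rows:
        override = str(row.get("Human_Override") or "").strip()
        if override:
            # Only rows the reviewer corrected contribute a new reference.
            column_refs = references.setdefault(row["Column"], {})
            column_refs[row["Original_Raw_Text"]] = override
    return references


# Made-up sample rows for illustration.
reviewed = [
    {"Column": "Category", "Original_Raw_Text": "resid.",
     "Human_Override": "Residential"},
    {"Column": "Category", "Original_Raw_Text": "comercial",
     "Human_Override": ""},
]
print(collect_manual_references(reviewed))
```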
Important Usage Notes
- Keep the browser tab open while a cleaning or apply process is running.
- Download your files before closing the page.
- If you refresh the page, you may need to upload the files again.
- Do not share the tool password outside the approved user group.
- Do not upload files unless they are intended to be processed by this tool.
If The Cleaning Fails
If the cleaning process fails during AI matching, the cause may be an external API issue, such as an exhausted daily rate limit, a selected AI model becoming unavailable, or a model being deprecated.
If this happens, check the details in the technical documentation.