Fredrik Sitje
committed on
Commit
·
6df93c7
1
Parent(s):
24c6160
Update README.md to reflect new Grading Answers App features and usage instructions. Added detailed sections on private repository usage, jurisdiction structure, and configuration for Hugging Face Spaces, enhancing clarity for users on how to set up and utilize the application.
README.md
CHANGED
|
@@ -11,9 +11,90 @@ pinned: false
|
|
| 11 |
short_description: A space for grading generated answers
|
| 12 |
---
|
| 13 |
|
| 14 |
-
#
|
| 15 |
|
| 16 |
-
|
| 17 |
|
| 18 |
-
|
| 19 |
-
|
| 11 |
short_description: A space for grading generated answers
|
| 12 |
---
|
| 13 |
|
| 14 |
+
# Grading Answers App
|
| 15 |
|
| 16 |
+
A Streamlit application for grading AI-generated legal answers across multiple jurisdictions. The app connects to a private Hugging Face dataset repository to store user credentials and grading data.
|
| 17 |
|
| 18 |
+
## Private Repository Usage
|
| 19 |
+
|
| 20 |
+
This app connects to the existing private Hugging Face dataset repository: [TransLegal/grading-answers](https://huggingface.co/datasets/TransLegal/grading-answers/tree/main)
|
| 21 |
+
|
| 22 |
+
### Repository Structure
|
| 23 |
+
|
| 24 |
+
The app expects the following structure (jurisdictions are discovered automatically):
|
| 25 |
+
|
| 26 |
+
```
|
| 27 |
+
TransLegal/grading-answers/
|
| 28 |
+
├── en-us/
|
| 29 |
+
│   ├── grading_template.parquet
|
| 30 |
+
│   └── users/
|
| 31 |
+
├── hr-hr/
|
| 32 |
+
│   ├── grading_template.parquet
|
| 33 |
+
│   └── users/
|
| 34 |
+
└── [jurisdiction-code]/
|
| 35 |
+
    ├── grading_template.parquet
|
| 36 |
+
    └── users/
|
| 37 |
+
```
|
| 38 |
+
|
| 39 |
+
**How It Works:**
|
| 40 |
+
- The app automatically discovers jurisdictions by scanning for subdirectories containing `grading_template.parquet`
|
| 41 |
+
- Each jurisdiction has isolated user accounts and data
|
| 42 |
+
- The `users/` subdirectory is created automatically when the first user registers in that jurisdiction
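
The discovery rule above can be sketched in a few lines. This is a minimal illustration, not the app's actual code: the function name `discover_jurisdictions` is hypothetical, and it works on a plain list of repository file paths. In practice such a list could come from something like `huggingface_hub.HfApi().list_repo_files(repo_id, repo_type="dataset")`, but that wiring is an assumption.

```python
from pathlib import PurePosixPath

TEMPLATE_NAME = "grading_template.parquet"


def discover_jurisdictions(repo_files):
    """Return the sorted top-level directories that directly contain the template.

    Hypothetical helper: `repo_files` is a list of repo-relative paths
    such as "en-us/grading_template.parquet".
    """
    found = set()
    for path in repo_files:
        parts = PurePosixPath(path).parts
        # Only "<jurisdiction>/grading_template.parquet" counts as a match.
        if len(parts) == 2 and parts[1] == TEMPLATE_NAME:
            found.add(parts[0])
    return sorted(found)


files = [
    "README.md",
    "en-us/grading_template.parquet",
    "en-us/users/users.json",
    "hr-hr/grading_template.parquet",
    "drafts/notes.txt",
]
print(discover_jurisdictions(files))  # ['en-us', 'hr-hr']
```

Directories without a template (here `drafts/`) are ignored, which matches the rule that a jurisdiction exists only once its `grading_template.parquet` is uploaded.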
|
| 43 |
+
|
| 44 |
+
### Adding New Jurisdictions
|
| 45 |
+
|
| 46 |
+
To add a new jurisdiction to the repository:
|
| 47 |
+
|
| 48 |
+
1. **Create jurisdiction subdirectory** in the [TransLegal/grading-answers](https://huggingface.co/datasets/TransLegal/grading-answers) repository:
|
| 49 |
+
- Use format: `{language-code}-{country-code}` (e.g., `sv-se`, `fr-fr`, `es-es`)
|
| 50 |
+
- Example: Create `sv-se/` directory
|
| 51 |
+
|
| 52 |
+
2. **Add grading template file:**
|
| 53 |
+
- Upload `grading_template.parquet` to `{jurisdiction}/grading_template.parquet`
|
| 54 |
+
- **Required Structure:** The parquet file must contain the following columns:
|
| 55 |
+
- `term` (string) - The legal term being assessed
|
| 56 |
+
- `category` (string) - Category within the term
|
| 57 |
+
- `subcategory` (string) - Subcategory within the category
|
| 58 |
+
- `question` (string) - The question being asked
|
| 59 |
+
- `answer` (string) - The AI-generated answer to be graded
|
| 60 |
+
- **Special Values:** Answers can be `"Unknown."` or `"Unknown"` to indicate unknown/unavailable information (these are automatically scored as "Irrelevant / NA")
|
| 61 |
+
|
| 62 |
+
3. **Create users directory:**
|
| 63 |
+
- Create `{jurisdiction}/users/` directory with an empty `.gitkeep` file (so the directory is tracked in Git)
|
| 64 |
+
- The `users.json` file will be created automatically on first user registration
|
| 65 |
+
|
| 66 |
+
4. **Verify:**
|
| 67 |
+
- The new jurisdiction will appear automatically in the Space's jurisdiction selector
|
| 68 |
+
- No code changes or redeployment needed - discovery is dynamic
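
Before uploading a template, it may help to check it against the required columns from step 2. The sketch below is an assumption about how such a check could look, not the app's own validation: `missing_columns` and `is_unknown_answer` are hypothetical names, and the exact-match handling of `"Unknown."` / `"Unknown"` is inferred from the description above.

```python
# Column names taken from the "Required Structure" list above.
REQUIRED_COLUMNS = ("term", "category", "subcategory", "question", "answer")

# Special answer values described above; exact string match is an assumption.
UNKNOWN_ANSWERS = {"Unknown.", "Unknown"}


def missing_columns(columns):
    """Return the required column names that are absent from `columns`."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def is_unknown_answer(answer):
    """True if the answer is one of the special 'unknown' markers."""
    return answer in UNKNOWN_ANSWERS


print(missing_columns(["term", "category", "question", "answer"]))  # ['subcategory']
print(is_unknown_answer("Unknown."))  # True
```

With pandas installed, the column check could be applied to a local file as `missing_columns(pd.read_parquet("grading_template.parquet").columns)`; an empty list means all required columns are present.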
|
| 69 |
+
|
| 70 |
+
**File Structure Per Jurisdiction:**
|
| 71 |
+
- `{jurisdiction}/grading_template.parquet` - Required (grading questions/answers template)
|
| 72 |
+
- `{jurisdiction}/users/` - Created automatically (stores user data)
|
| 73 |
+
- `{jurisdiction}/users/users.json` - Created on first registration (user credentials)
|
| 74 |
+
- `{jurisdiction}/users/{username}_answers.parquet` - Created per user (grading data)
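
The per-jurisdiction layout above maps directly onto a few path helpers. The function names here are hypothetical and only illustrate the naming scheme; the path patterns themselves are the ones listed above.

```python
def template_path(jurisdiction):
    """Repo-relative path of the grading template for a jurisdiction."""
    return f"{jurisdiction}/grading_template.parquet"


def users_file_path(jurisdiction):
    """Repo-relative path of the credentials file for a jurisdiction."""
    return f"{jurisdiction}/users/users.json"


def user_answers_path(jurisdiction, username):
    """Repo-relative path of one user's grading data."""
    return f"{jurisdiction}/users/{username}_answers.parquet"


print(template_path("sv-se"))              # sv-se/grading_template.parquet
print(user_answers_path("sv-se", "anna"))  # sv-se/users/anna_answers.parquet
```

Because every path starts with the jurisdiction code, the same username in two jurisdictions resolves to two different files.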
|
| 75 |
+
|
| 76 |
+
## Configuration (Hugging Face Spaces)
|
| 77 |
+
|
| 78 |
+
These values are already configured in the Hugging Face Space settings. If you need to change them, use the names, locations, and permissions listed below:
|
| 79 |
+
|
| 80 |
+
### Variables
|
| 81 |
+
- **`HF_DATASET_REPO`**: The name of your private dataset repository
|
| 82 |
+
- Currently set to: `TransLegal/grading-answers` [LINK to dataset repo](https://huggingface.co/datasets/TransLegal/grading-answers)
|
| 83 |
+
- Location: TransLegal/grading-answers (SPACES) Settings → Variables and secrets → Variables [LINK](https://huggingface.co/spaces/TransLegal/grading-answers/settings)
|
| 84 |
+
- Default: `TransLegal/grading-answers` (if not set)
|
| 85 |
+
|
| 86 |
+
### Secrets
|
| 87 |
+
- **`HF_TOKEN`**: A Hugging Face access token with read/write permissions to the private dataset repository
|
| 88 |
+
- Location: TransLegal/grading-answers (SPACES) Settings → Variables and secrets → Secrets [LINK](https://huggingface.co/spaces/TransLegal/grading-answers/settings)
|
| 89 |
+
- **Required Permission:** Enable "Write access to contents/settings of selected repos" when generating the token
|
| 90 |
+
- Generate at: https://huggingface.co/settings/tokens
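
Spaces expose variables and secrets to the running container as environment variables, so the settings above could be read roughly as follows. This is a sketch under that assumption: `load_settings` is a hypothetical helper, and the fail-fast behavior when `HF_TOKEN` is missing is a design choice for the example, not documented app behavior.

```python
import os

# Default documented above for HF_DATASET_REPO.
DEFAULT_DATASET_REPO = "TransLegal/grading-answers"


def load_settings(env=None):
    """Read the repo name and token from an environment mapping.

    Hypothetical helper: `env` defaults to os.environ and can be replaced
    by a plain dict for testing.
    """
    env = os.environ if env is None else env
    # Fall back to the documented default when the variable is unset or empty.
    repo = env.get("HF_DATASET_REPO") or DEFAULT_DATASET_REPO
    token = env.get("HF_TOKEN")
    if not token:
        raise RuntimeError("HF_TOKEN is not set; add it under the Space's secrets.")
    return {"repo": repo, "token": token}


settings = load_settings({"HF_TOKEN": "hf_example"})
print(settings["repo"])  # TransLegal/grading-answers
```

Failing early on a missing token surfaces a misconfigured Space at startup rather than at the first read or write to the private dataset.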
|
| 91 |
+
|
| 92 |
+
## How It Works
|
| 93 |
+
|
| 94 |
+
1. **Jurisdiction Discovery:** The app automatically discovers available jurisdictions by scanning the repository for subdirectories containing `grading_template.parquet`
|
| 95 |
+
2. **User Accounts:** Each jurisdiction has separate user accounts (same username can exist in different jurisdictions)
|
| 96 |
+
3. **Data Storage:** All user data is stored in the private Hugging Face dataset repository, organized by jurisdiction
|
| 97 |
+
|
| 98 |
+
## Deployment
|
| 99 |
+
|
| 100 |
+
This app is designed to run on Hugging Face Spaces using Docker. After configuring the variables and secrets above, push this repository using `git push` and it will automatically deploy.
|