Fredrik Sitje committed on
Commit
6df93c7
·
1 Parent(s): 24c6160

Update README.md to reflect new Grading Answers App features and usage instructions. Added detailed sections on private repository usage, jurisdiction structure, and configuration for Hugging Face Spaces, enhancing clarity for users on how to set up and utilize the application.

  1. README.md +85 -4
README.md CHANGED
@@ -11,9 +11,90 @@ pinned: false
  short_description: A space for grading generated answers
  ---

- # Welcome to Streamlit!

- Edit `/src/streamlit_app.py` to customize this app to your heart's desire. :heart:

- If you have any questions, checkout our [documentation](https://docs.streamlit.io) and [community
- forums](https://discuss.streamlit.io).
+ # Grading Answers App

+ A Streamlit application for grading AI-generated legal answers across multiple jurisdictions. The app connects to a private Hugging Face dataset repository to store user credentials and grading data.

+ ## Private Repository Usage
+
+ This app connects to the existing private Hugging Face dataset repository: [TransLegal/grading-answers](https://huggingface.co/datasets/TransLegal/grading-answers/tree/main)
+
+ ### Repository Structure
+
+ The app expects the following structure (jurisdictions are discovered automatically):
+
+ ```
+ TransLegal/grading-answers/
+ ├── en-us/
+ │   ├── grading_template.parquet
+ │   └── users/
+ ├── hr-hr/
+ │   ├── grading_template.parquet
+ │   └── users/
+ └── [jurisdiction-code]/
+     ├── grading_template.parquet
+     └── users/
+ ```
+
+ **How It Works:**
+ - The app automatically discovers jurisdictions by scanning for subdirectories containing `grading_template.parquet`
+ - Each jurisdiction has isolated user accounts and data
+ - The `users/` subdirectory is created automatically when the first user registers in that jurisdiction
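
The discovery rule above can be sketched in a few lines of Python. This is a minimal illustration, not the app's actual code: the function names `discover_jurisdictions` and `list_dataset_files` are hypothetical, and the sketch assumes that only top-level subdirectories count as jurisdictions. The file listing is assumed to come from `huggingface_hub.list_repo_files`, which needs network access and a valid token.

```python
from pathlib import PurePosixPath

TEMPLATE_NAME = "grading_template.parquet"


def discover_jurisdictions(repo_files):
    """Return the sorted top-level directories that directly contain the template."""
    found = set()
    for path in repo_files:
        parts = PurePosixPath(path).parts
        # Assumption: a jurisdiction is exactly "<dir>/grading_template.parquet".
        if len(parts) == 2 and parts[1] == TEMPLATE_NAME:
            found.add(parts[0])
    return sorted(found)


def list_dataset_files(repo_id, token):
    """Fetch the repository file listing (hypothetical helper, requires network)."""
    # Imported lazily so the pure function above works without the package.
    from huggingface_hub import list_repo_files

    return list_repo_files(repo_id, repo_type="dataset", token=token)


# Offline example with a hand-written file listing.
example_files = [
    "README.md",
    "en-us/grading_template.parquet",
    "en-us/users/users.json",
    "hr-hr/grading_template.parquet",
    "drafts/notes.txt",
]
print(discover_jurisdictions(example_files))  # ['en-us', 'hr-hr']
```

Keeping the path logic separate from the network call makes the rule easy to check against a plain list of file names.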
+
+ ### Adding New Jurisdictions
+
+ To add a new jurisdiction to the repository:
+
+ 1. **Create jurisdiction subdirectory** in the [TransLegal/grading-answers](https://huggingface.co/datasets/TransLegal/grading-answers) repository:
+    - Use format: `{language-code}-{country-code}` (e.g., `sv-se`, `fr-fr`, `es-es`)
+    - Example: Create `sv-se/` directory
+
+ 2. **Add grading template file:**
+    - Upload `grading_template.parquet` to `{jurisdiction}/grading_template.parquet`
+    - **Required Structure:** The parquet file must contain the following columns:
+      - `term` (string) - The legal term being assessed
+      - `category` (string) - Category within the term
+      - `subcategory` (string) - Subcategory within the category
+      - `question` (string) - The question being asked
+      - `answer` (string) - The AI-generated answer to be graded
+    - **Special Values:** Answers can be `"Unknown."` or `"Unknown"` to indicate unknown/unavailable information (these are automatically scored as "Irrelevant / NA")
+
+ 3. **Create users directory:**
+    - Create `{jurisdiction}/users/` directory with an empty `.gitkeep` file (so the directory is tracked in Git)
+    - The `users.json` file will be created automatically on first user registration
+
+ 4. **Verify:**
+    - The new jurisdiction will appear automatically in the Space's jurisdiction selector
+    - No code changes or redeployment needed - discovery is dynamic
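
Before uploading a new template, the requirements in the steps above can be checked locally. The sketch below is a hedged illustration, not the app's own validation: all function names are hypothetical, the two-letter pattern for jurisdiction codes is inferred from the examples (`sv-se`, `fr-fr`, `es-es`), and the "Unknown" check assumes an exact, case-sensitive match.

```python
import re

REQUIRED_COLUMNS = ("term", "category", "subcategory", "question", "answer")
UNKNOWN_ANSWERS = {"Unknown.", "Unknown"}

# Assumption: two lowercase letters, a hyphen, two lowercase letters.
JURISDICTION_PATTERN = re.compile(r"[a-z]{2}-[a-z]{2}")


def is_valid_jurisdiction_code(code):
    """Check a directory name against the assumed {language}-{country} pattern."""
    return JURISDICTION_PATTERN.fullmatch(code) is not None


def missing_columns(columns):
    """Return the required column names absent from `columns`, in README order."""
    present = set(columns)
    return [name for name in REQUIRED_COLUMNS if name not in present]


def is_unknown_answer(answer):
    """True for the two special values the README lists (exact match assumed)."""
    return answer in UNKNOWN_ANSWERS


print(is_valid_jurisdiction_code("sv-se"))                  # True
print(missing_columns(["term", "category", "question", "answer"]))  # ['subcategory']
print(is_unknown_answer("Unknown."))                        # True
```

With pandas installed, the column check could be fed from `pandas.read_parquet(path).columns`; that step is left out here so the sketch runs without extra dependencies.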
+
+ **File Structure Per Jurisdiction:**
+ - `{jurisdiction}/grading_template.parquet` - Required (grading questions/answers template)
+ - `{jurisdiction}/users/` - Created automatically (stores user data)
+ - `{jurisdiction}/users/users.json` - Created on first registration (user credentials)
+ - `{jurisdiction}/users/{username}_answers.parquet` - Created per user (grading data)
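
The per-jurisdiction layout listed above maps onto a few small path helpers. This is only a sketch of the naming scheme; the helper names are hypothetical and are not taken from the app's source.

```python
from pathlib import PurePosixPath


def template_path(jurisdiction):
    """Path of the required grading template inside the dataset repository."""
    return str(PurePosixPath(jurisdiction) / "grading_template.parquet")


def users_json_path(jurisdiction):
    """Path of the credentials file created on first registration."""
    return str(PurePosixPath(jurisdiction) / "users" / "users.json")


def user_answers_path(jurisdiction, username):
    """Path of one user's grading data file."""
    return str(PurePosixPath(jurisdiction) / "users" / f"{username}_answers.parquet")


print(template_path("en-us"))               # en-us/grading_template.parquet
print(users_json_path("en-us"))             # en-us/users/users.json
print(user_answers_path("en-us", "alice"))  # en-us/users/alice_answers.parquet
```

`PurePosixPath` is used so the paths always use forward slashes, as repository paths do, regardless of the operating system running the code.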
+
+ ## Configuration (Hugging Face Spaces)
+
+ The following is already configured in the Hugging Face Space settings. If you need to change these settings, ensure they are implemented correctly:
+
+ ### Variables
+ - **`HF_DATASET_REPO`**: The name of your private dataset repository
+   - Currently set to: `TransLegal/grading-answers` [LINK to dataset repo](https://huggingface.co/datasets/TransLegal/grading-answers)
+   - Location: TransLegal/grading-answers (SPACES) Settings → Variables and secrets → Variables [LINK](https://huggingface.co/spaces/TransLegal/grading-answers/settings)
+   - Default: `TransLegal/grading-answers` (if not set)
+
+ ### Secrets
+ - **`HF_TOKEN`**: A Hugging Face access token with read/write permissions to the private dataset repository
+   - Location: TransLegal/grading-answers (SPACES) Settings → Variables and secrets → Secrets [LINK](https://huggingface.co/spaces/TransLegal/grading-answers/settings)
+   - **Required Permission:** Enable "Write access to contents/settings of selected repos" when generating the token
+   - Generate at: https://huggingface.co/settings/tokens
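
Spaces expose variables and secrets to the running container as environment variables, so the two settings above could be read roughly as follows. This is a minimal sketch under stated assumptions: `load_settings` is a hypothetical helper, and treating an empty `HF_DATASET_REPO` the same as an unset one is an assumption rather than documented behavior.

```python
import os

DEFAULT_DATASET_REPO = "TransLegal/grading-answers"


def load_settings(env=None):
    """Read the dataset repo name and access token from the environment."""
    env = os.environ if env is None else env
    # Fall back to the documented default when the variable is unset (or empty).
    repo_id = env.get("HF_DATASET_REPO") or DEFAULT_DATASET_REPO
    token = env.get("HF_TOKEN")
    if not token:
        # Without a token the private dataset repository cannot be reached.
        raise RuntimeError("HF_TOKEN is not set; add it under the Space's secrets.")
    return {"repo_id": repo_id, "token": token}


# Example with an explicit mapping instead of the real environment.
print(load_settings({"HF_TOKEN": "hf_example"})["repo_id"])  # TransLegal/grading-answers
```

Failing early when the token is missing gives a clearer error than a later authentication failure from the Hub.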
+
+ ## How It Works
+
+ 1. **Jurisdiction Discovery:** The app automatically discovers available jurisdictions by scanning the repository for subdirectories containing `grading_template.parquet`
+ 2. **User Accounts:** Each jurisdiction has separate user accounts (same username can exist in different jurisdictions)
+ 3. **Data Storage:** All user data is stored in the private Hugging Face dataset repository, organized by jurisdiction
+
+ ## Deployment
+
+ This app is designed to run on Hugging Face Spaces using Docker. After configuring the variables and secrets above, push this repository using `git push` and it will automatically deploy.