AshleyBanksNIHR committed · verified
Commit d470d98 · 1 Parent(s): 4a68831

Update README.md

Files changed (1): README.md (+19 -10)
README.md CHANGED
@@ -86,7 +86,7 @@ Overall RAG Metrics:
 
 For a comprehensive breakdown of the model's performance, including Overall Metrics, Metrics per Category across both validation and test sets and Metrics per Funder across the validation set, please refer to the detailed evaluation spreadsheet included in this repository.
 
- **Download/View the Evaluation Results](https://huggingface.co/NIHRDataInsights/HRCSResearchActivityCodes/resolve/main/evaluation/health_category_rac_evaluation_results.xlsx)** *(Located in the `Files and versions` tab of this repository)*.
+ **[Download/View the Evaluation Results](https://huggingface.co/NIHRDataInsights/HRCSResearchActivityCodes/resolve/main/evaluation/health_category_rac_evaluation_results.xlsx)** *(Located in the `Files and versions` tab of this repository)*.
 
 ## Intended use
 This model is intended for:
@@ -106,18 +106,27 @@ This model is intended for:
 * **Annotation Ambiguity and Niche Categories:** The model's performance reflects the historical consistency of human coding within the training data. Categories that are historically difficult for human coders to classify consistently under HRCS guidelines (such as 7.1, 8.1 and 8.3) are naturally more challenging for the model.
 
 ## Inference / How to use
- A companion script is provided to run this model (and the companion health category model) on new award data.
-
- **The script:**
- 1. Loads the trained model and tokenizer
- 2. Applies sigmoid to obtain probabilities
- 3. Converts probabilities to labels using the per-category thresholds stored in `metadata.json`
- 4. Outputs a CSV containing predicted Health Categories and confidence indicators
-
- **Expected input format:**
- The script expects a CSV containing at minimum: `AwardTitle`, `AwardAbstract`. Optional columns such as `ID` or `FunderAcronym` will be preserved in the output.
-
- See the inference script in this repository for full usage details.
-
+ We have provided a ready-to-use Python script that runs both this model (Research Activity Codes) and a Health Categories model on new award data simultaneously.
+
+ You can download the script and a sample dataset directly from the `inference` subfolder in the **Files and versions** tab of this repository.
+
+ ### Instructions
+
+ **Prerequisites:**
+ 1. Download the script and test data to your computer from the `inference` subfolder.
+ 2. Open your terminal or command prompt and install the required libraries by running:
+ `pip install torch pandas numpy tqdm transformers huggingface_hub`
+ 3. Use the provided `test_data.csv` or prepare a CSV file in the same format containing your grant data. It **must** include two columns named exactly `AwardTitle` and `AwardAbstract`.
+
+ **Running the Code:**
+ 1. Open the script.
+ 2. Under the `# --- USER SETTINGS ---` section, update `DATA_FOLDERS` to point to the folder containing your CSV. *(Leave it as `["./"]` if your CSV is in the same folder as the script.)*
+ 3. Update `TEST_FILENAME` to match the name of your CSV.
+ 4. Run the script.
+
+ The script will automatically download the necessary AI models, process your text, and output a new CSV containing the predicted categories and an "AI Certainty Score" (`SmallestLogitDiff`) to help you identify which borderline grants require human review.
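The decision step this README describes (sigmoid over the model's logits, then a per-category threshold check) can be sketched in a few lines. The category labels and threshold values below are illustrative assumptions made up for the example; the real per-category thresholds are stored in the repository's `metadata.json`.

```python
import math

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

def predict_labels(logits, thresholds, categories):
    """Multi-label decision: assign a category when its sigmoid
    probability clears that category's own threshold."""
    return [c for c, logit, t in zip(categories, logits, thresholds)
            if sigmoid(logit) >= t]

# Illustrative values only; real thresholds come from metadata.json.
categories = ["1.1", "2.1", "7.1"]
logits = [1.8, -2.0, 0.3]
thresholds = [0.5, 0.5, 0.6]
print(predict_labels(logits, thresholds, categories))  # ['1.1']
```

Here sigmoid(0.3) is about 0.57, which falls short of the 0.6 threshold assumed for the third category, so only the first label is assigned.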
 
 ## Selective automation and human-in-the-loop use
 In addition to predicted labels, the inference script reports how close each prediction is to the model’s decision boundary in logit space. This is computed as the smallest absolute difference between any category’s logit and its corresponding decision threshold.
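A minimal sketch of how such a boundary-distance score could be computed; the logit and threshold values below are illustrative, not taken from the model:

```python
def smallest_logit_diff(logits, thresholds):
    """Smallest absolute gap between any category's logit and its
    decision threshold; small values flag borderline predictions."""
    return min(abs(logit - t) for logit, t in zip(logits, thresholds))

# Illustrative values only.
score = smallest_logit_diff([2.1, -0.4, 0.05], [0.0, 0.0, 0.1])
# score is ~0.05: the third category sits closest to its boundary.
```

A human-in-the-loop workflow could then route awards whose score falls below a chosen cut-off to manual review and accept the rest automatically.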