lsy9874205/heal-protocol-embeddings

Sentence Similarity

sentence-transformers

feature-extraction

Generated from Trainer

dataset_size:247936

loss:CosineSimilarityLoss

text-embeddings-inference


lsy9874205 committed on Mar 6, 2025

Commit 07c9e13 · verified · 1 Parent(s): ce4635a

Update README.md

Files changed (1)

README.md +32 -0

@@ -496,7 +496,39 @@ You can finetune this model on your own dataset.
     url = "https://arxiv.org/abs/1908.10084",
 }
 ```
+# HEAL Protocol Embeddings
+This model is fine-tuned from all-MiniLM-L6-v2 on HEAL Initiative clinical protocols.
+## Performance Evaluation
+Comparison with OpenAI embeddings:
+| Metric | OpenAI | Fine-tuned | Change |
+|--------|--------|------------|---------|
+| Faithfulness | 0.667 | 0.833 | ⬆️ +0.166 |
+| Answer Relevancy | 0.986 | 0.986 | = |
+| Context Precision | 1.000 | 1.000 | = |
+| Context Recall | 1.000 | 0.000 | ⬇️ -1.000 |
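+### Key Findings
+- Faithfulness improved from 0.667 to 0.833 (+0.166)
+- Answer relevancy (0.986) and context precision (1.000) were unchanged
+- Context recall fell from 1.000 to 0.000, the trade-off for the faithfulness gain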
+## Future Improvements
+1. Retrieval Strategy
+   - Implement hybrid search combining semantic and keyword matching
+   - Add re-ranking for better result ordering
+2. Model Architecture
+   - Experiment with larger base models
+   - Fine-tune with domain-specific loss functions
+3. Data Processing
+   - Optimize chunking strategy
+   - Increase training data diversity
 <!--
 ## Glossary
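
The "hybrid search" and "re-ranking" items under Future Improvements could be sketched roughly as below. This is a minimal illustration under stated assumptions, not the author's retrieval pipeline: the function names (`cosine`, `keyword_overlap`, `hybrid_rank`) and the `alpha` weight are hypothetical, the embeddings are assumed to be precomputed elsewhere (for example by this model), and the token-overlap score is a simple stand-in for a real keyword scorer such as BM25.

```python
import math
import re


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def keyword_overlap(query, text):
    """Fraction of distinct query tokens that also appear in the text.

    Assumption: a crude keyword signal used only for illustration.
    """
    q = set(re.findall(r"\w+", query.lower()))
    t = set(re.findall(r"\w+", text.lower()))
    return len(q & t) / len(q) if q else 0.0


def hybrid_rank(query, query_vec, chunks, chunk_vecs, alpha=0.5):
    """Re-rank chunks by a weighted blend of semantic and keyword scores.

    alpha=1.0 uses only the embedding similarity; alpha=0.0 uses only keywords.
    Returns (score, chunk) pairs sorted best first.
    """
    scored = []
    for chunk, vec in zip(chunks, chunk_vecs):
        semantic = cosine(query_vec, vec)
        lexical = keyword_overlap(query, chunk)
        scored.append((alpha * semantic + (1 - alpha) * lexical, chunk))
    return sorted(scored, key=lambda pair: pair[0], reverse=True)
```

Calling `hybrid_rank(query, query_vec, chunks, chunk_vecs, alpha=0.5)` returns the chunks ordered by the blended score. Lowering `alpha` gives exact term matches more influence, which may help when a purely semantic ranking misses chunks that contain the query's key terms.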