metadata
license: mit
base_model:
  - microsoft/trocr-base-printed
tags:
  - computer-vision
  - ocr
  - trocr
  - pill
  - medical
  - pytorch

Fine-tuned TrOCR for Pill Imprint OCR

Model Summary

This model is microsoft/trocr-base-printed fine-tuned to read pill imprints (the letters and numbers on a pill) from pill images. It serves as the OCR stage of a pill identification pipeline, where the recognized text is matched against a pill database.

Intended Use

  • Extract imprint text from pill images to support database-backed pill identification.
  • Research and demo usage.

Not Intended Use

  • Not a medical device.
  • Not for clinical decision-making.

Training Data

Fine-tuned on pill images with imprint labels (RxNav-style pill images). Data includes varied lighting, blur, and embossing conditions.

Evaluation

Evaluated primarily by end-to-end retrieval performance (top-k matching in a pill database) and qualitative OCR correctness on benchmark images.
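The card does not say which value of k or which exact metric definition was used. As an illustration only, a top-k hit rate could be computed roughly as below; the function name, the ID format, and the sample rankings are all made up for this sketch.

```python
def top_k_hit_rate(ranked_results, true_ids, k=5):
    """Fraction of queries whose true pill ID appears in the first k ranked IDs."""
    if len(ranked_results) != len(true_ids):
        raise ValueError("ranked_results and true_ids must have the same length")
    if not true_ids:
        return 0.0
    hits = sum(
        1 for ranked, truth in zip(ranked_results, true_ids) if truth in ranked[:k]
    )
    return hits / len(true_ids)


# Made-up rankings for three query images (IDs are placeholders, not real data).
ranked = [["p1", "p7", "p3"], ["p4", "p2", "p9"], ["p8", "p5", "p6"]]
truth = ["p1", "p9", "p0"]

top1 = top_k_hit_rate(ranked, truth, k=1)  # only the first query is a hit
top3 = top_k_hit_rate(ranked, truth, k=3)  # the first and second queries are hits
```

Reporting the rate at several values of k (for example 1, 3, and 5) shows how much the ranking stage recovers from imperfect OCR output.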

Limitations

  • Performance degrades on low-resolution, blurred, or overexposed images.
  • Embossed/low-contrast digits may be dropped or partially recognized.
  • Some imprints are inherently ambiguous; downstream ranking should return top-k candidates.
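One way to return top-k candidates instead of a single match is to score every database imprint against the OCR output and keep the best k. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in similarity measure; it is not necessarily what the actual pipeline uses, and the database layout and entries are hypothetical.

```python
from difflib import SequenceMatcher


def normalize(imprint):
    """Uppercase and keep only letters and digits, so 'ab 12' and 'AB12' compare equal."""
    return "".join(ch for ch in imprint.upper() if ch.isalnum())


def top_k_candidates(ocr_text, database, k=5):
    """Return up to k (score, name, imprint) tuples, most similar imprint first.

    `database` is assumed to be a list of (name, imprint) pairs.
    """
    query = normalize(ocr_text)
    scored = [
        (SequenceMatcher(None, query, normalize(imprint)).ratio(), name, imprint)
        for name, imprint in database
    ]
    # Sort on the score only; the sort is stable, so ties keep database order.
    scored.sort(key=lambda row: row[0], reverse=True)
    return scored[:k]


# Made-up entries for illustration only.
EXAMPLE_DB = [
    ("example pill A", "AB12"),
    ("example pill B", "AB34"),
    ("example pill C", "XY 90"),
    ("example pill D", "C 7"),
]

# "AB1" simulates an OCR result with one dropped digit.
candidates = top_k_candidates("AB1", EXAMPLE_DB, k=2)
```

Because a dropped or misread character still leaves a high similarity score, the correct entry tends to stay in the candidate list even when the OCR output is not an exact match.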

How to Use

# Replace "YOUR_NAME/YOUR_REPO" with this model's repository ID on the Hugging Face Hub.
from transformers import TrOCRProcessor, VisionEncoderDecoderModel

processor = TrOCRProcessor.from_pretrained("YOUR_NAME/YOUR_REPO")
model = VisionEncoderDecoderModel.from_pretrained("YOUR_NAME/YOUR_REPO")
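Loading the processor and model does not yet produce any text. A minimal sketch of the inference step, following the usual TrOCR pattern in `transformers`, might look like the following. The helper name `read_imprint`, the `max_new_tokens` default, and the image path in the comments are assumptions for illustration, not part of this repository.

```python
def read_imprint(image, processor, model, max_new_tokens=16):
    """Run OCR on one PIL image and return the decoded imprint text.

    `processor` and `model` are the TrOCRProcessor and VisionEncoderDecoderModel
    loaded with from_pretrained; max_new_tokens=16 is an assumed cap that suits
    short imprints.
    """
    # TrOCR expects 3-channel input, so convert grayscale or RGBA images first.
    rgb_image = image.convert("RGB")
    pixel_values = processor(images=rgb_image, return_tensors="pt").pixel_values
    generated_ids = model.generate(pixel_values, max_new_tokens=max_new_tokens)
    text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]
    return text.strip()


# Example usage (requires transformers, torch, and Pillow; "pill.jpg" is a placeholder path):
#
#   from PIL import Image
#   from transformers import TrOCRProcessor, VisionEncoderDecoderModel
#
#   processor = TrOCRProcessor.from_pretrained("YOUR_NAME/YOUR_REPO")
#   model = VisionEncoderDecoderModel.from_pretrained("YOUR_NAME/YOUR_REPO")
#   print(read_imprint(Image.open("pill.jpg"), processor, model))
```

Cropping the image tightly around the pill before calling the helper is likely to help, since the base TrOCR model was designed for single-line text images.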