YOLOv11x CODH Character

Model Description

This is a YOLOv11x model fine-tuned to detect and recognize characters in Japanese historical documents (古典籍, kotenseki). It was trained on the 日本古典籍くずし字データセット (CODH Kuzushiji Dataset).

Intended Uses

Character detection in historical Japanese manuscripts
Kuzushiji (くずし字) character localization
Pre-processing for OCR pipelines on classical Japanese texts
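
For the OCR pre-processing use case, detected boxes usually have to be put into reading order before recognition. The following is a minimal sketch of one way to do that. It is not part of this model or of Ultralytics. It assumes vertical text read top to bottom, in columns running right to left; the function name `sort_reading_order` and the `column_gap` heuristic are hypothetical. Pages with interlinear notes or mixed layouts would likely need something more robust.

```python
from typing import List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2) in pixels


def sort_reading_order(boxes: Sequence[Box], column_gap: float = 0.5) -> List[Box]:
    """Order boxes for vertical text: columns right to left, top to bottom.

    Assumption: a box joins the current column when its x-centre lies within
    column_gap * (median box width) of that column's mean x-centre.
    """
    if not boxes:
        return []

    widths = sorted(b[2] - b[0] for b in boxes)
    median_width = widths[len(widths) // 2]
    threshold = column_gap * median_width

    # Visit boxes from the right side of the page to the left.
    by_x = sorted(boxes, key=lambda b: -(b[0] + b[2]) / 2)

    columns: List[List[Box]] = []
    for box in by_x:
        cx = (box[0] + box[2]) / 2
        if columns:
            current = columns[-1]
            mean_cx = sum((b[0] + b[2]) / 2 for b in current) / len(current)
            if abs(cx - mean_cx) <= threshold:
                current.append(box)
                continue
        columns.append([box])

    # Within each column, read from top to bottom.
    ordered: List[Box] = []
    for column in columns:
        ordered.extend(sorted(column, key=lambda b: b[1]))
    return ordered
```

The input is a plain list of `(x1, y1, x2, y2)` tuples, so the helper works with boxes from any detector. The `column_gap` default of 0.5 is an untuned guess and may need adjusting for tightly spaced columns.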

How to Use

from ultralytics import YOLO

# Load model
model = YOLO("nakamura196/yolov11x-codh-char")

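# Run inference (conf: minimum confidence score; iou: IoU threshold for non-maximum suppression)
results = model.predict("your_image.jpg", conf=0.25, iou=0.45)

# Each result corresponds to one input image
for result in results:
    boxes = result.boxes  # detected bounding boxes with scores and class IDs
    print(boxes)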

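Alternatively, download the weights file and load it from the local path:

from huggingface_hub import hf_hub_download
from ultralytics import YOLO

model_path = hf_hub_download(repo_id="nakamura196/yolov11x-codh-char", filename="best.pt")
model = YOLO(model_path)  # load the downloaded weights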
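
Printing `result.boxes` is enough for a quick check, but downstream code usually wants one plain record per detection. The sketch below shows one possible way to flatten parallel lists of coordinates, scores, and class IDs into dictionaries. The helper name `boxes_to_records` and the `min_conf` parameter are hypothetical and not part of Ultralytics.

```python
from typing import Dict, List, Sequence


def boxes_to_records(
    xyxy: Sequence[Sequence[float]],
    conf: Sequence[float],
    cls: Sequence[float],
    min_conf: float = 0.25,
) -> List[Dict[str, float]]:
    """Combine parallel box, score, and class lists into one dict per detection."""
    records = []
    for (x1, y1, x2, y2), score, class_id in zip(xyxy, conf, cls):
        if score < min_conf:
            continue  # drop low-confidence detections
        records.append(
            {
                "x1": x1, "y1": y1, "x2": x2, "y2": y2,
                "width": x2 - x1,
                "height": y2 - y1,
                "confidence": score,
                "class_id": int(class_id),
            }
        )
    return records
```

With Ultralytics, the inputs would typically come from `result.boxes.xyxy.tolist()`, `result.boxes.conf.tolist()`, and `result.boxes.cls.tolist()`. The class ID can then be mapped to a label through `result.names`.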

Training Data

The model was trained on the CODH Kuzushiji Character Shape Dataset, which contains character images from various Japanese historical documents.

Model Architecture

Base Model: YOLOv11x (extra-large variant)
Task: Object Detection
Framework: Ultralytics

Limitations

Optimized for historical Japanese documents; may not perform well on modern printed text
Performance may vary depending on the document quality and writing style

Citation

If you use this model, please cite the original dataset:

@misc{codh_kuzushiji,
  title={CODH Kuzushiji Character Shape Dataset},
  author={Center for Open Data in the Humanities (CODH)},
  url={http://codh.rois.ac.jp/char-shape/}
}

License

MIT License
