**Observed Hallucinations and Arabic OCR Quality Issues in the Fine-Tuned GLM Model**

#1
by JamesGs - opened

In our tests, the model exhibits significant hallucination, most notably looping output with repeated text patterns such as:

B
نصر
M
DZ
B
M
نصر
B
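Looping output like this can be flagged automatically before the response is accepted. Below is a minimal Python sketch (my own, not part of this model's pipeline; the function name and the n-gram approach are illustrative assumptions) that scores how repetitive a generation is:

```python
def repetition_ratio(text: str, n: int = 3) -> float:
    """Fraction of word n-grams that are repeats of an earlier n-gram.

    Values near 0 mean mostly novel text; values well above ~0.3
    suggest the model is stuck in a loop (threshold is a guess,
    tune it on real outputs)."""
    tokens = text.split()
    if len(tokens) < n:
        return 0.0
    ngrams = [tuple(tokens[i : i + n]) for i in range(len(tokens) - n + 1)]
    return 1.0 - len(set(ngrams)) / len(ngrams)


# A looping output similar to the one observed above scores high:
looping = "B M DZ B M B M DZ B M"
print(repetition_ratio(looping))
```

Responses scoring above the chosen threshold could be rejected and regenerated rather than passed downstream.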

We also observed issues in the JSON responses, including invalid JSON structures and duplicated keys, for example:

"arabic last name": "نصرالله",
"arabic first name": "نصرالله",
"arabic last name": "نصرالله"
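Duplicated keys like these are easy to miss, because Python's `json` module silently keeps only the last value for a repeated key. A small sketch (my own validation snippet, not from the model's code; `find_duplicate_keys` is a name I chose) that makes the duplication visible:

```python
import json


def find_duplicate_keys(pairs):
    """object_pairs_hook that raises on duplicate keys instead of
    silently keeping the last value (json.loads's default behavior)."""
    obj = {}
    for key, value in pairs:
        if key in obj:
            raise ValueError(f"duplicate key: {key!r}")
        obj[key] = value
    return obj


raw = '{"arabic last name": "x", "arabic first name": "y", "arabic last name": "x"}'
try:
    json.loads(raw, object_pairs_hook=find_duplicate_keys)
except ValueError as err:
    print(err)  # duplicate key: 'arabic last name'
```

Running model outputs through a strict parser like this makes both failure modes (invalid JSON and duplicated keys) raise instead of passing corrupted records downstream.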

Additionally, there are many spelling errors in the Arabic text.

Could you clarify how the model was fine-tuned and which dataset was used for training?

I'm working on version 2 of the model, which brings substantial improvements over this one in vision, language, and attention settings. I'll release it in a few days, God willing.
