Spaces: CohereLabs/c4ai-command

Factuality Evaluation Failed

#51

Jan 10

I tested the model for hallucination resistance. I asked it about a non-existent case (Harrison v. Telco-Dynamics), and instead of refusing or saying it could not find the case, the model invented a summary of it.

This is a critical risk for RAG (Retrieval-Augmented Generation) systems, where users expect answers to be grounded in real sources. We build 'Counterfactual Knowledge' datasets (questions about fake events and papers) to train models to say 'I don't know' instead of fabricating an answer.
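For anyone who wants to try a similar check, here is a minimal sketch of how such a test could be scored. It is not our actual pipeline: `CounterfactualItem`, `REFUSAL_MARKERS`, `is_refusal`, and `refusal_rate` are hypothetical names, and the keyword match is a crude stand-in for a proper judge (human review or a stronger classifier). It assumes you have already collected the model's responses as plain strings.

```python
from dataclasses import dataclass

# Hypothetical refusal phrases; a keyword list like this is only a rough
# heuristic and will miss paraphrased refusals.
REFUSAL_MARKERS = (
    "i don't know",
    "i do not know",
    "could not find",
    "couldn't find",
    "cannot find",
    "can't find",
    "unable to find",
    "no record",
    "not aware of",
)


@dataclass
class CounterfactualItem:
    """One question about an entity that does not exist (assumed schema)."""
    question: str
    expected: str = "refuse"  # the desired behavior for every item


def is_refusal(response: str) -> bool:
    """Return True if the response contains any known refusal phrase."""
    # Normalize case and curly apostrophes before matching.
    text = response.lower().replace("\u2019", "'")
    return any(marker in text for marker in REFUSAL_MARKERS)


def refusal_rate(items: list[CounterfactualItem], responses: list[str]) -> float:
    """Fraction of counterfactual questions that the model declined to answer."""
    if len(items) != len(responses):
        raise ValueError("items and responses must have the same length")
    if not items:
        return 0.0
    refused = sum(is_refusal(r) for r in responses)
    return refused / len(items)


if __name__ == "__main__":
    items = [CounterfactualItem("Summarize the case Harrison v. Telco-Dynamics.")]
    # Placeholder response written for illustration, not real model output.
    responses = ["I could not find any case by that name."]
    print(refusal_rate(items, responses))
```

A higher refusal rate on questions about fabricated entities is better. The same set should be paired with questions about real entities, so that a model which refuses everything does not score well.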
