ORCA: Llama-3.2-3B-Instruct (Multinomial, seed 99)

ORCA (Open-ended Response Correctness Assessment) scores the correctness of open-ended audio QA responses. Given a question, reference answer, candidate answer, and an LLM-generated rationale, it outputs a correctness score in [0, 1] and an uncertainty estimate.

Paper: ORCA: Open-ended Response Correctness Assessment for Audio Question Answering (accepted to TACL 2026)
Code & usage: github.com/BUTSpeechFIT/ORCA
Training data: BUT-FIT/orca-audio-qa-annotations

Model details

Base model: meta-llama/Llama-3.2-3B-Instruct
LoRA rank / alpha: 128 / 128
Loss function: Multinomial log-likelihood (5-class Likert)
Training seed: 99
Training curriculum: Stage 1 (synthetic) → Stage 2 (LLM-judge) → Stage 3 (human)
Precision: bfloat16
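
To illustrate how a 5-class Likert distribution could relate to a correctness score in [0, 1] with an uncertainty estimate, here is a minimal sketch. It assumes the score is the expected rating rescaled to [0, 1] and the uncertainty is the standard deviation of that rescaled rating. This mapping and the function name `likert_to_score` are illustrative assumptions, not necessarily what ORCA computes; see the paper and repository for the actual definition.

```python
import math


def likert_to_score(probs):
    """Map a 5-class Likert distribution to (score, uncertainty).

    ASSUMPTION: ratings 1..5 are rescaled to 0, 0.25, 0.5, 0.75, 1;
    the score is the mean and the uncertainty is the standard deviation.
    This is an illustrative mapping, not ORCA's confirmed method.
    """
    if len(probs) != 5:
        raise ValueError("expected exactly 5 class probabilities")
    if any(p < 0 for p in probs) or not math.isclose(sum(probs), 1.0, abs_tol=1e-6):
        raise ValueError("probabilities must be non-negative and sum to 1")

    levels = [k / 4 for k in range(5)]  # rescaled Likert levels
    mean = sum(p * v for p, v in zip(probs, levels))
    variance = sum(p * (v - mean) ** 2 for p, v in zip(probs, levels))
    return mean, math.sqrt(variance)


# A confident "fully correct" prediction versus a maximally spread one.
confident = likert_to_score([0.0, 0.0, 0.0, 0.0, 1.0])
uniform = likert_to_score([0.2, 0.2, 0.2, 0.2, 0.2])
```

Under these assumptions, a distribution concentrated on one class yields zero uncertainty, while a spread-out distribution yields a mid-range score with high uncertainty.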

Quick start

pip install git+https://github.com/BUTSpeechFIT/ORCA.git
hf download BUT-FIT/orca-llama-3.2-3b-it-multinomial --local-dir orca-llama-3b
orca-infer --model_path orca-llama-3b/model --data_jsonl your_data.jsonl --output_dir results/

See the repository for full usage, evaluation scripts, and the download_and_infer.py convenience script.
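
As a rough sketch of preparing the input file, the snippet below serializes records to JSON Lines (one JSON object per line). The key names `question`, `reference_answer`, `candidate_answer`, and `rationale` are hypothetical, chosen only to mirror the four inputs described above, and the example record is invented for illustration; check the repository for the schema that orca-infer actually expects.

```python
import json
from pathlib import Path


def to_jsonl(records):
    """Serialize an iterable of dicts to JSON Lines text."""
    return "".join(json.dumps(rec, ensure_ascii=False) + "\n" for rec in records)


# HYPOTHETICAL key names and content: the real schema may differ.
example_records = [
    {
        "question": "What instrument is playing in the recording?",
        "reference_answer": "A piano.",
        "candidate_answer": "It sounds like a piano.",
        "rationale": "The candidate names the same instrument as the reference.",
    }
]

jsonl_text = to_jsonl(example_records)

if __name__ == "__main__":
    # Write the file name used in the quick-start command above.
    Path("your_data.jsonl").write_text(jsonl_text, encoding="utf-8")
```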

Citation

@article{sedlacek-etal-2026-orca,
  title={ORCA: Open-ended Response Correctness Assessment for Audio Question Answering},
  author={Sedl\'{a}\v{c}ek, \v{S}imon and Barahona, Sara and Bola\~{n}os, Cecilia and
          Herrera-Alarc\'{o}n, Laura and Udupa, Sathvik and L\'{o}pez, Fernando and
          Ferner, Allison and Lozano-Diez, Alicia and Yusuf, Bolaji and Kesiraju, Santosh and
          Duraiswami, Ramani and \v{C}ernock\'{y}, Jan},
  howpublished={Accepted to Transactions of the Association for Computational Linguistics},
  year={2026},
  url={https://arxiv.org/abs/2512.09066}
}

License

MIT License. See the repository LICENSE for details.
