Finetuned Model on Iranian Constitution Q&A
This repository hosts a language model finetuned on a dataset of question-answer pairs derived from the text of the Iranian Constitution. The finetuning aims to improve the model's ability to provide accurate and relevant answers grounded in the constitutional text, specifically in Persian.
Model Details
- Base Model: Qwen/Qwen1.5-7B (or specify the exact Qwen 2.5 7B model name used, e.g., Qwen/Qwen2-7B-Instruct)
- Language: Persian (Farsi)
- Finetuning Data: Custom dataset of Q&A pairs extracted from the text of the Iranian Constitution.
- Training: The model was finetuned for 500 epochs on the custom Q&A dataset. (Specify the finetuning method, e.g., LoRA/QLoRA, if applicable and known.)
Training Details
The model was trained on a dataset comprising question-answer pairs covering various articles, principles, rights, duties, and structures defined within the Iranian Constitution. The training process involved feeding the model these pairs to specialize its knowledge base in this specific legal document. The finetuning was performed on the [Base Model Name] for 500 epochs using [mention method if possible, e.g., QLoRA].
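The exact preprocessing used for training is not documented here, but a record with an empty `"input"` field is commonly rendered into a single Alpaca-style prompt. The following is a minimal sketch of that assumption; the `build_prompt` helper and the section headers are illustrative, not taken from the actual training scripts.

```python
# Hypothetical helper: render one Q&A record into a single training prompt.
# The "### Instruction / ### Input / ### Response" headers are an assumed
# Alpaca-style convention, not confirmed details of this repository.
def build_prompt(record):
    prompt = f"### Instruction:\n{record['instruction']}\n\n"
    if record.get("input"):  # empty in this dataset, kept for generality
        prompt += f"### Input:\n{record['input']}\n\n"
    prompt += f"### Response:\n{record['output']}"
    return prompt

record = {
    "instruction": "اعتبار حقوق انسانی افراد غیر مسلمان طبق اصل ۱۴ منوط به چیست؟",
    "input": "",
    "output": "این اصل (اصل ۱۴) در حق کسانی اعتبار دارد که بر ضد اسلام و جمهوری اسلامی ایران توطئه و اقدام نکنند.",
}
print(build_prompt(record))
```

Because the `"input"` field is empty for every record, the rendered prompt contains only the instruction and the expected response.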
Intended Use
This model is intended for applications requiring knowledge recall and question answering strictly based on the text of the Iranian Constitution. Potential uses include:
- Constitutional information retrieval.
- Educational tools for studying the constitution.
- As a component in larger systems requiring specific legal text knowledge.
It is designed to respond to questions about the constitution without external context provided in the input prompt, relying instead on the knowledge acquired during finetuning.
Limitations and Bias
- Source Limitation: The model's knowledge is limited to the text of the Iranian Constitution used for training. It does not incorporate legal interpretations, subsequent legislation, judicial precedents, or real-world application details.
- Potential Inaccuracies: While trained on specific Q&A pairs, the model might still generate answers that are not perfectly aligned with the source text, especially for complex or nuanced queries.
- Bias: The responses will reflect the inherent perspective and content of the source document itself.
- Out-of-Scope Queries: The model is not designed for general conversation or answering questions unrelated to the Iranian Constitution. Querying the model on topics outside its training scope may result in irrelevant or nonsensical output.
- Empty Input Dependence: Because every record in the dataset has an empty "input" field, the model is trained to generate answers from the instruction alone. It may not perform well when given conversational context or questions requiring understanding beyond the direct question.
Users should exercise caution and verify the model's output, especially in critical applications.
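One lightweight way to support such verification is to flag generated answers whose article references do not overlap with those in the question, so a human can review them. This is a hypothetical check, not part of the repository; the regex and the `needs_review` helper are assumptions for illustration.

```python
import re

# Matches article references such as "اصل ۱۴" (Persian or ASCII digits).
ARTICLE_RE = re.compile(r"اصل\s*([۰-۹0-9]+)")

def article_numbers(text):
    """Return the set of article numbers cited in the text."""
    return set(ARTICLE_RE.findall(text))

def needs_review(question, answer):
    """Flag an answer if the question cites articles but the answer cites none of them."""
    q_refs = article_numbers(question)
    a_refs = article_numbers(answer)
    return bool(q_refs) and not (q_refs & a_refs)

question = "اعتبار حقوق انسانی افراد غیر مسلمان طبق اصل ۱۴ منوط به چیست؟"
answer = "این اصل (اصل ۱۴) در حق کسانی اعتبار دارد که بر ضد اسلام و جمهوری اسلامی ایران توطئه و اقدام نکنند."
print(needs_review(question, answer))  # False: the answer cites the same article
```

A check like this only catches missing citations, not factual errors, so it complements rather than replaces human review.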
Evaluation
No held-out evaluation dataset was used, so performance has not been rigorously measured. Performance is expected to be best on questions similar in style and content to the training data.
Example
{
  "instruction": "اعتبار حقوق انسانی افراد غیر مسلمان طبق اصل ۱۴ منوط به چیست؟",
  "input": "",
  "output": "این اصل (اصل ۱۴) در حق کسانی اعتبار دارد که بر ضد اسلام و جمهوری اسلامی ایران توطئه و اقدام نکنند."
}
(Translation — Q: "According to Article 14, what is the recognition of the human rights of non-Muslims contingent upon?" A: "This article (Article 14) applies to those who do not conspire or act against Islam and the Islamic Republic of Iran.")