# QAEvaluater / app.py
import guidance
import json
import streamlit as st
from dotenv import load_dotenv

load_dotenv()
def guideus(question, answer, essay):
    gptThreeFive = guidance.llms.OpenAI('gpt-3.5-turbo')
    mySys = guidance('''
{{#system~}}
You are an interviewee answer grading system with the highest precision in grading answers. Grade the answer given by the interviewee.
Scenario :
1. The interviewee is given 3 minutes to read the question.
2. They are then played a conceptual essay as a listening passage and are allowed to take notes on the comprehension.
3. For the next 20 minutes, they are instructed to compose their answer based on their notes and the question asked.
4. The response is then graded according to the appropriate grading measures and the scoring pattern.
{{~/system}}
{{#user~}}
Your task now is to assess an interviewee's response.
The question provided to the interviewee is as follows:
{{question}}
Now consider the following grading measures and their definitions.
Grading Measures:
Clear Introduction: The answer should have a clear introduction that introduces the topic intended in the question.
Coherent Body Paragraphs: The answer should have well-developed body paragraphs that support the question with relevant examples, details, and explanations.
Logical Organization: The answer should demonstrate a logical progression of ideas and use appropriate transitions between paragraphs and sentences.
Cohesion: The answer should have coherence in terms of ideas, with connections between sentences and paragraphs clearly established.
Grammar and Sentence Structure: The answer should demonstrate a good command of grammar, with accurate sentence structures and appropriate use of verb tenses, pronouns, and modifiers.
Vocabulary: The answer should use a wide range of vocabulary appropriately and effectively to convey meaning and enhance the quality of the writing.
Idiomatic Language: The answer should demonstrate the ability to use idiomatic expressions and phrases in a natural and accurate way.
Word Choice: The answer should show precision and accuracy in word choice, avoiding repetition and using words that convey the intended meaning.
Relevance to the Topic: The answer should address the given topic directly and provide a focused and relevant response.
Content Accuracy: The answer should demonstrate a good understanding of the topic and present accurate information and examples.
Use of Examples and Details: The answer should provide specific examples and details to support ideas and arguments effectively.
Critical Thinking: The answer should show the candidate's ability to analyze and evaluate the topic, presenting a well-reasoned argument or position.
The above are the available grading measures.
"""Now, you need to pick/generate the suitable subset of above grading measures for the given question."""
{{~/user}}
{{#assistant~}}
{{gen 'grading_measures' temperature=0.7 max_tokens=100}}
{{~/assistant}}
{{#user~}}
Consider the following inputs to perform the task.
The answer given to the question is as follows:
"""{{answer}}"""
If the above answer is empty, assign a score of zero for every grading measure.
The candidate's answer is to be evaluated against the following conceptual essay:
{{essay}}
{{~/user}}
{{#user~}}
Now consider the following scoring pattern:
"""A score of 5 is awarded when the answer contains the important information from the conceptual essay.
A score of 4 is awarded when the answer has the important information but with minor omissions, inaccuracy, vagueness, imprecision, and language errors.
A score of 3 is awarded when the answer conveys only global, unclear, or inaccurate information, when key points
of the conceptual essay are missing from the interviewee's answer, or when grammatical errors are frequent.
A score of 2 is awarded when the answer is relevant to the conceptual essay but has major omissions, inaccuracy, and language difficulties.
A score of 1 is awarded if the answer has little to no meaning or relevant information from the conceptual essay, or the language is too unsophisticated to comprehend.
A score of 0 is awarded if the answer is copied from the conceptual essay, has no relevance to the question, is written in a non-English language, or is left blank."""
{{~/user}}
{{#user~}}
Now, based on the given question, answer, and conceptual essay,
perform the grading evaluation on the answer text using the generated grading measures.
The evaluation must depend entirely on the conceptual essay and the scoring pattern provided above.
{{~/user}}
{{#user~}}
Generate the evaluation in JSON format. The JSON must map each grading measure to a dictionary of two key-value pairs:
the first with key 'numerical_score' and the grade out of 5 as its value, and the second with key 'reason' and a string explaining why that grade was given.
The final JSON must also include an 'Overall Score' key whose value is a dictionary with a 'numerical_score' key holding the mean of all the grading measures' numerical scores.
{{~/user}}
{{#assistant~}}
{{gen 'evaluation' temperature=0.5 max_tokens=1000}}
{{~/assistant}}
    ''', llm=gptThreeFive)
    text = mySys(question=question, answer=answer, essay=essay)
    text = text['evaluation']
    # The model may wrap the JSON in prose; keep only the outermost {...} span.
    start_index = text.find('{')
    end_index = text.rfind('}') + 1
    if start_index == -1 or end_index == 0:
        st.error("No JSON object found in the model's evaluation.")
        st.write(text)
        return
    json_part = text[start_index:end_index]
    try:
        data = json.loads(json_part)
    except json.JSONDecodeError:
        st.error("Could not parse the model's evaluation as JSON.")
        st.write(text)
        return
    st.write("The evaluation is:")
    st.write(data)
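
# The find('{')/rfind('}') slicing used above can grab spurious braces when
# the model emits prose containing '{' or '}' around the JSON object. A
# brace-counting extractor is a more robust alternative. This is a
# hypothetical helper sketch (not wired into guideus above); it does not
# account for braces inside JSON string values, which is usually acceptable
# for well-formed model output.
def extract_json(text):
    """Return the first balanced {...} block in `text`, or None."""
    depth = 0
    start = None
    for i, ch in enumerate(text):
        if ch == '{':
            if depth == 0:
                start = i
            depth += 1
        elif ch == '}' and depth > 0:
            depth -= 1
            if depth == 0:
                return text[start:i + 1]
    return None
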
st.title("Evaluation System")
st.write("enter the following :")
# st.write("")
# st.text_input("enter your openai api key",key="key")
st.write()
st.text_input("Question :", key="question")
st.write("")
st.text_area("Essay :",key="essay")
st.write("" )
st.text_area("Your Answer :",key="ans")
if st.button("Evaluate"):
    if st.session_state.question != "" and st.session_state.ans != "" and st.session_state.essay != "":
        guideus(st.session_state.question, st.session_state.ans, st.session_state.essay)
    else:
        st.write("Please enter all details.")