---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation
model-index:
- name: Qwen2-Simple-Arguments
results:
- task:
type: text-generation
dataset:
name: Argument-parsing
type: Argument-parsing
metrics:
- name: Accuracy
      type: accuracy
value: 100
---
# Qwen2 Simple Arguments
![image](assets/qwen_arguments_logo.png)
[![image](assets/hire_me.png)](https://www.freelancer.com/u/cdesivo92)
This model parses simple English arguments: arguments formed of two premises and a conclusion, involving two propositions.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Cristian Desivo
- **Model type:** LLM
- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** Qwen2-0.5B
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** TBD
- **Demo:** TBD
### Quantization
- **Q4_K_M.gguf:** [Qwen2_arguments.Q4_K_M.gguf](https://huggingface.co/cris177/Qwen2-Simple-Arguments/resolve/main/Qwen2_arguments.Q4_K_M.gguf?download=true)
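If you prefer to fetch the quantized file programmatically, the `huggingface_hub` client can download it; a minimal sketch:

```python
# Download the quantized GGUF file from the Hub and print its local path.
from huggingface_hub import hf_hub_download

gguf_path = hf_hub_download(
    repo_id="cris177/Qwen2-Simple-Arguments",
    filename="Qwen2_arguments.Q4_K_M.gguf",
)
print(gguf_path)
```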
## Usage
Below are some code snippets to help you get started running the model.
### llama.cpp server [Recommended]
The recommended way to run the model is with a llama.cpp server serving the quantized [Q4_K_M GGUF](https://huggingface.co/cris177/Qwen2-Simple-Arguments/resolve/main/Qwen2_arguments.Q4_K_M.gguf?download=true).
You can then use the following script to run inference against the server:
```python
import requests


def llmCompletion(prompt, **args):
    """Send a completion request to a local llama.cpp server."""
    url = "http://localhost:8080/completions"
    headers = {"Content-Type": "application/json"}
    data = {"prompt": prompt}
    data.update(args)
    response = requests.post(url, headers=headers, json=data)
    return response.json()


def analyze_argument(argument):
    instruction = 'Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.'
    alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:"""
    prompt = alpaca_prompt.format(instruction, argument)

    # JSON schema for the fields the model should return; the server
    # constrains generation to match it.
    properties = {
        "Premise 1": {"type": "string"},
        "Premise 2": {"type": "string"},
        "Conclusion": {"type": "string"},
        "Proposition 1": {"type": "string"},
        "Proposition 2": {"type": "string"},
        "Type of argument": {"type": "string"},
        "Negation of Proposition 1": {"type": "string"},
        "Negation of Proposition 2": {"type": "string"},
        "Validity": {"type": "boolean"},
    }
    analysis = llmCompletion(
        prompt,
        max_tokens=1000,
        temperature=0,
        json_schema={
            "type": "object",
            "properties": properties,
            "required": list(properties.keys()),
        },
    )
    return analysis['content']


argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
output = analyze_argument(argument)
print(output)
```
Output:
```
{"Premise 1": "If it's wednesday it's cold",
"Premise 2": "It's cold",
"Conclusion": "It is Wednesday",
"Proposition 1": "It is Wednesday",
"Proposition 2": "It is cold",
"Type of argument": "affirming the consequent",
"Negation of Proposition 1": "It is not Wednesday",
"Negation of Proposition 2": "It is not cold",
"Validity": true}
```
### transformers 🤗
First make sure to `pip install -U transformers`, then use the code below, replacing the `argument` variable with the argument you want to parse:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer.
model = AutoModelForCausalLM.from_pretrained(
    "cris177/Qwen2-Simple-Arguments",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("cris177/Qwen2-Simple-Arguments")

argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
instruction = 'Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.'

# Alpaca-style prompt template used during fine-tuning.
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:"""
prompt = alpaca_prompt.format(instruction, argument)

inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_length=1000, num_return_sequences=1)
print(tokenizer.decode(outputs[0]))
```
Output:
```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.
### Input:
If it's wednesday it's cold, and it's cold, therefore it's wednesday.
### Response:
{"Premise 1": "If it's wednesday it's cold",
"Premise 2": "It's cold",
"Conclusion": "It is Wednesday",
"Proposition 1": "It is Wednesday",
"Proposition 2": "It is cold",
"Type of argument": "affirming the consequent",
"Negation of Proposition 1": "It is not Wednesday",
"Negation of Proposition 2": "It is not cold",
"Validity": "false"}<|endoftext|>
```
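Since `generate` returns the prompt followed by the completion, you will typically want to strip the echoed prompt and parse the JSON answer. A minimal sketch, continuing from the variables above (splitting on the `### Response:` marker from the template):

```python
import json

# Decode, drop special tokens such as <|endoftext|>, and keep only the answer.
decoded = tokenizer.decode(outputs[0], skip_special_tokens=True)
response_text = decoded.split("### Response:")[-1].strip()

analysis = json.loads(response_text)
print(analysis["Type of argument"])  # e.g. "affirming the consequent"
```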
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The model was trained on synthetic data based on the following types of arguments:

- Modus Ponens
- Modus Tollens
- Affirming the Consequent
- Disjunctive Syllogism
- Denying the Antecedent
- Invalid Conditional Syllogism

Each argument was constructed by selecting two random propositions (from a pre-generated list of 400 propositions), choosing a type of argument, and combining them with randomly selected connectors (therefore, since, hence, thus, etc.).
50k arguments were created for training and 100 for testing.
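For illustration, here is a minimal, hypothetical sketch of this generation scheme (the proposition list, connectors, templates, and argument types below are stand-ins, not the actual data used):

```python
import random

# Stand-ins: the real training data drew from 400 pre-generated propositions.
propositions = ["it is Wednesday", "it is cold", "it is raining", "it is dark"]
connectors = ["therefore", "hence", "thus", "so"]

def negate(p):
    # Naive negation, for illustration only.
    return p.replace("it is", "it is not", 1)

def make_argument():
    p, q = random.sample(propositions, 2)
    kind = random.choice(["modus ponens", "modus tollens", "affirming the consequent"])
    conn = random.choice(connectors)
    if kind == "modus ponens":            # If p then q; p; therefore q. Valid.
        text, valid = f"If {p} then {q}, and {p}, {conn} {q}.", True
    elif kind == "modus tollens":         # If p then q; not q; therefore not p. Valid.
        text, valid = f"If {p} then {q}, and {negate(q)}, {conn} {negate(p)}.", True
    else:                                 # If p then q; q; therefore p. Invalid.
        text, valid = f"If {p} then {q}, and {q}, {conn} {p}.", False
    return {"argument": text, "type": kind, "validity": valid}

print(make_argument())
```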
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Preprocessing
We converted the data to the Alpaca chat format (the template shown in the usage examples above) before feeding it to the model.
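A minimal sketch of what that conversion might look like (the raw record layout and field names are assumptions):

```python
import json

# Same Alpaca template as in the usage examples above, with the gold answer appended.
ALPACA_TEMPLATE = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:
{}"""

INSTRUCTION = ('Based on the following argument, identify the following elements: '
               'premises, conclusion, propositions, type of argument, '
               'negation of propositions and validity.')

def to_alpaca(record):
    """Turn one synthetic example (argument + gold analysis) into a training string."""
    return ALPACA_TEMPLATE.format(INSTRUCTION, record["argument"], json.dumps(record["analysis"]))
```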
#### Training
We used Unsloth to reduce memory usage and speed up training, and trained for one epoch.
Training used less than 2.5 GB of VRAM and took about 2.5 hours.
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
The model achieves 100% accuracy on both the training and test splits of our synthetic dataset.
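For reference, accuracy here is an exact match between the parsed model output and the gold analysis. A minimal sketch, assuming the `analyze_argument` helper from the llama.cpp example above and a list of gold records (the record layout is an assumption):

```python
import json

def evaluate(examples):
    """Exact-match accuracy of parsed analyses against gold labels."""
    correct = 0
    for ex in examples:
        predicted = json.loads(analyze_argument(ex["argument"]))
        correct += predicted == ex["analysis"]
    return correct / len(examples)
```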