---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation
model-index:
- name: Phi3.1-Simple-Arguments
  results:
  - task:
      type: text-generation
    dataset:
      name: Argument-parsing
      type: Argument-parsing
    metrics:
    - name: Accuracy
      type: Accuracy
      value: 100
---
# Phi3.1 Simple Arguments
![image](assets/phi_simple_arguments.png)
[![image](assets/hire_me.png)](https://www.freelancer.com/u/cdesivo92)
This model parses simple English arguments: arguments made up of two premises and a conclusion, built from two propositions.
## Model Details
### Model Description
<!-- Provide a longer summary of what this model is. -->
- **Developed by:** Cristian Desivo
- **Model type:** LLM
- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** Phi3.1-mini
### Model Sources
<!-- Provide the basic links for the model. -->
- **Repository:** TBD
- **Demo:** TBD
### Quantization
<!-- - **Q4_K_M.gguf** https://huggingface.co/cris177/Qwen2-Simple-Arguments/resolve/main/Qwen2_arguments.Q4_K_M.gguf?download=true -->
## Usage
Below are some code snippets to help you get started quickly with running the model.
### llama.cpp server [Recommended]
The recommended way of running the model is with a llama.cpp server running the quantized model.
You can then query the server with the following script:
```python
import requests


def llmCall(messages, **args):
    """Send a chat completion request to the local llama.cpp server."""
    url = "http://localhost:8080/v1/chat/completions"
    headers = {
        "Content-Type": "application/json"
    }
    # Extra keyword arguments (max_tokens, temperature, ...) are passed through as request fields.
    data = {"messages": messages, **args}
    response = requests.post(url, headers=headers, json=data)
    return response.json()


def analyze_argument(argument):
    """Ask the model to analyze one argument and return its JSON answer as a string."""
    instruction = "Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity."
    inputText = "### Input:\n" + argument
    prompt = f"{instruction}\n{inputText}\n"
    messages = [{"role": "user", "content": prompt}]
    # Fields the server should constrain the JSON output to.
    properties = {
        "Premise 1": {"type": "string"},
        "Premise 2": {"type": "string"},
        "Conclusion": {"type": "string"},
        "Type of argument": {"type": "string"},
        "Proposition 1": {"type": "string"},
        "Proposition 2": {"type": "string"},
        "Negation of Proposition 1": {"type": "string"},
        "Negation of Proposition 2": {"type": "string"},
        "Validity": {"type": "string"},
    }
    analysis = llmCall(
        messages=messages,
        max_tokens=1000,
        temperature=0,
        stop=["<|end|>"],
        response_format={
            "type": "json_object",
            "schema": {
                "type": "object",
                "properties": properties,
                "required": list(properties.keys()),
            },
        },
    )['choices'][0]['message']['content']
    # Strip the end-of-turn token if the server left it in the text.
    end_token = "<|end|>"
    if analysis.endswith(end_token):
        analysis = analysis[:-len(end_token)]
    return analysis


argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
output = analyze_argument(argument)
print(output)
```
Output:
```
{"Premise 1": "If it's wednesday it's cold",
"Premise 2": "It's cold",
"Conclusion": "It is Wednesday",
"Proposition 1": "It is Wednesday",
"Proposition 2": "It is cold",
"Type of argument": "affirming the consequent",
"Negation of Proposition 1": "It is not Wednesday",
"Negation of Proposition 2": "It is not cold",
"Validity": "false"}
```
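The script returns the analysis as a JSON string, so it can be turned into a Python dictionary for further processing. The sketch below is one way to do that; `parse_analysis` is a hypothetical helper (not part of the model or of llama.cpp), and it assumes the `Validity` field may arrive either as a string or as a JSON boolean.

```python
import json


def parse_analysis(raw: str) -> dict:
    """Parse the model's JSON reply and normalise "Validity" to a Python bool."""
    parsed = json.loads(raw)
    # Accept both "true"/"false" strings and JSON booleans (an assumption about the output).
    parsed["Validity"] = str(parsed["Validity"]).strip().lower() == "true"
    return parsed


# Shortened sample reply, using the field names from the schema above.
sample = '{"Type of argument": "affirming the consequent", "Validity": "false"}'
result = parse_analysis(sample)
print(result["Type of argument"], result["Validity"])
```

Normalising the field once keeps downstream checks such as `if result["Validity"]:` from silently treating the string `"false"` as truthy.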
### transformers 🤗
First make sure to run `pip install -U transformers`, then use the code below, replacing the `argument` variable with the argument you want to parse:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer; device_map="auto" places the weights automatically.
model = AutoModelForCausalLM.from_pretrained(
    "cris177/Phi3.1-Simple-Arguments",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("cris177/Phi3.1-Simple-Arguments")

argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
instruction = 'Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.'
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
{}
### Input:
{}
### Response:"""
prompt = alpaca_prompt.format(instruction, argument)
# Move the inputs to the device the model was placed on, rather than assuming "cuda".
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_length=1000, num_return_sequences=1)
print(tokenizer.decode(outputs[0]))
```
Output:
```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.
### Instruction:
Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.
### Input:
If it's wednesday it's cold, and it's cold, therefore it's wednesday.
### Response:
{"Premise 1": "If it's wednesday it's cold",
"Premise 2": "It's cold",
"Conclusion": "It is Wednesday",
"Proposition 1": "It is Wednesday",
"Proposition 2": "It is cold",
"Type of argument": "affirming the consequent",
"Negation of Proposition 1": "It is not Wednesday",
"Negation of Proposition 2": "It is not cold",
"Validity": "false"}<|endoftext|>
```
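With `transformers`, the decoded text contains the full prompt followed by the model's answer and an end token. A minimal sketch for recovering just the JSON answer is shown below; `extract_response` is a hypothetical helper, and it assumes the answer follows the last `### Response:` marker and is valid JSON.

```python
import json


def extract_response(decoded: str) -> dict:
    """Return the JSON object that follows the last "### Response:" marker."""
    answer = decoded.rsplit("### Response:", 1)[-1]
    # Remove end-of-sequence tokens that the tokenizer may leave in the decoded text.
    for token in ("<|endoftext|>", "<|end|>"):
        answer = answer.replace(token, "")
    return json.loads(answer.strip())


# Shortened stand-in for the decoded output shown above.
decoded = (
    "### Instruction:\nAnalyze the argument.\n"
    "### Input:\nIf it's wednesday it's cold, and it's cold, therefore it's wednesday.\n"
    '### Response:\n{"Type of argument": "affirming the consequent",\n'
    '"Validity": "false"}<|endoftext|>'
)
analysis = extract_response(decoded)
print(analysis["Type of argument"])
```

Alternatively, passing `skip_special_tokens=True` to `tokenizer.decode` drops the end token at decode time, leaving only the prompt to strip.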
## Training Details
### Training Data
<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->
The model was trained on synthetic data, based on the following types of arguments:
- Modus Ponens
- Modus Tollens
- Affirming the Consequent
- Disjunctive Syllogism
- Denying the Antecedent
- Invalid Conditional Syllogism
Each argument was constructed by selecting two random propositions (from a list of 400 propositions generated beforehand), choosing a type of argument, and combining them with randomly selected connectors (therefore, since, hence, thus, etc.).
50k arguments were created to train the model, and 100 to test it.
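The construction described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual generation script: the propositions, connectors, and sentence templates are hypothetical stand-ins, and the invalid conditional syllogism is left out because its exact form is not specified here.

```python
import random

# Hypothetical stand-ins; the real list of 400 propositions is not reproduced here.
PROPOSITIONS = ["it is raining", "the street is wet", "the alarm rings", "the dog barks"]
CONNECTORS = ["therefore", "hence", "thus", "so"]

# Assumed templates: (premise 1, premise 2, conclusion, validity) over propositions p and q.
ARGUMENT_TYPES = {
    "modus ponens": ("if {p} then {q}", "{p}", "{q}", True),
    "modus tollens": ("if {p} then {q}", "it is not the case that {q}",
                      "it is not the case that {p}", True),
    "affirming the consequent": ("if {p} then {q}", "{q}", "{p}", False),
    "denying the antecedent": ("if {p} then {q}", "it is not the case that {p}",
                               "it is not the case that {q}", False),
    "disjunctive syllogism": ("either {p} or {q}", "it is not the case that {p}",
                              "{q}", True),
}


def make_argument(rng: random.Random) -> dict:
    """Build one synthetic argument with its argument type and validity label."""
    p, q = rng.sample(PROPOSITIONS, 2)          # two distinct propositions
    name = rng.choice(sorted(ARGUMENT_TYPES))   # argument type
    premise1, premise2, conclusion, valid = ARGUMENT_TYPES[name]
    connector = rng.choice(CONNECTORS)
    parts = [t.format(p=p, q=q) for t in (premise1, premise2, conclusion)]
    text = f"{parts[0]}, and {parts[1]}, {connector} {parts[2]}."
    return {
        "argument": text[0].upper() + text[1:],
        "Type of argument": name,
        "Validity": valid,
    }


rng = random.Random(0)  # fixed seed so the sketch is reproducible
example = make_argument(rng)
print(example["argument"])
```

Because the label is derived from the template rather than from the text, every generated example carries a ground-truth argument type and validity by construction.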
### Training Procedure
<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->
#### Training
We used unsloth for faster training with reduced memory usage.
We trained for one epoch.
Training used less than 3.5 GB of VRAM and took 3 hours.
## Evaluation
<!-- This section describes the evaluation protocols and provides the results. -->
The model obtains 100% train and test accuracy on our synthetic dataset.
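The exact scoring rule is not stated here. One plausible reading, sketched below as an assumption, is exact-match accuracy: a prediction counts as correct only if every field of the parsed analysis equals the reference. The function name and the sample data are hypothetical.

```python
def exact_match_accuracy(predictions: list, references: list) -> float:
    """Percentage of predictions whose fields all match the reference exactly."""
    if len(predictions) != len(references):
        raise ValueError("predictions and references must have the same length")
    if not references:
        return 0.0
    correct = sum(pred == ref for pred, ref in zip(predictions, references))
    return 100.0 * correct / len(references)


# Hypothetical parsed analyses: the second prediction gets the validity wrong.
references = [
    {"Type of argument": "modus ponens", "Validity": "true"},
    {"Type of argument": "affirming the consequent", "Validity": "false"},
]
predictions = [
    {"Type of argument": "modus ponens", "Validity": "true"},
    {"Type of argument": "affirming the consequent", "Validity": "true"},
]
score = exact_match_accuracy(predictions, references)
print(score)
```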