---
license: apache-2.0
language:
- en
metrics:
- accuracy
pipeline_tag: text-generation

model-index:
  - name: Phi3.1-Simple-Arguments
    results:
      - task:
          type: text-generation
        dataset:
          name: Argument-parsing
          type: Argument-parsing
        metrics:
          - name: Accuracy
            type: Accuracy
            value: 100
---
# Phi3.1 Simple Arguments
![image](assets/phi_simple_arguments.png)
[![image](assets/hire_me.png)](https://www.freelancer.com/u/cdesivo92)

This model parses simple English arguments: arguments made up of two premises and a conclusion, built from two propositions.

## Model Details

### Model Description

<!-- Provide a longer summary of what this model is. -->

- **Developed by:** Cristian Desivo
- **Model type:** LLM
- **Language(s) (NLP):** English
- **License:** Apache-2.0
- **Finetuned from model:** Phi3.1-mini

### Model Sources

<!-- Provide the basic links for the model. -->

- **Repository:** TBD
- **Demo:** TBD

### Quantization

<!-- - **Q4_K_M.gguf** https://huggingface.co/cris177/Qwen2-Simple-Arguments/resolve/main/Qwen2_arguments.Q4_K_M.gguf?download=true -->

## Usage

Below we share some code snippets to help you get started running the model quickly.

### llama.cpp server [Recommended]

The recommended way of running the model is with a llama.cpp server serving the quantized GGUF, e.g. `llama-server -m <quantized-model>.gguf` (where `<quantized-model>` stands in for your local file; the server listens on port 8080 by default, which is what the script below assumes).

Then you can use the following script to run inference against the server:

```python
import requests

def llm_call(messages, **kwargs):
    """Send a chat-completion request to the local llama.cpp server."""
    url = "http://localhost:8080/v1/chat/completions"
    headers = {"Content-Type": "application/json"}
    data = {"messages": messages, **kwargs}
    response = requests.post(url, headers=headers, json=data)
    return response.json()

def analyze_argument(argument):
    instruction = ("Based on the following argument, identify the following elements: "
                   "premises, conclusion, propositions, type of argument, "
                   "negation of propositions and validity.")
    prompt = f"{instruction}\n\n### Input:\n{argument}"
    messages = [{"role": "user", "content": prompt}]
    # JSON schema used to constrain the server's output to the expected fields.
    properties = {
        "Premise 1": {"type": "string"},
        "Premise 2": {"type": "string"},
        "Conclusion": {"type": "string"},
        "Type of argument": {"type": "string"},
        "Proposition 1": {"type": "string"},
        "Proposition 2": {"type": "string"},
        "Negation of Proposition 1": {"type": "string"},
        "Negation of Proposition 2": {"type": "string"},
        "Validity": {"type": "string"},
    }
    analysis = llm_call(
        messages=messages,
        max_tokens=1000,
        temperature=0,
        stop=["<|end|>"],
        response_format={
            "type": "json_object",
            "schema": {
                "type": "object",
                "properties": properties,
                "required": list(properties.keys()),
            },
        },
    )["choices"][0]["message"]["content"]
    # Strip the end-of-turn token if it survived the stop sequence.
    if analysis.endswith("<|end|>"):
        analysis = analysis[:-len("<|end|>")]
    return analysis

argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."
print(analyze_argument(argument))
```
Output:
```
{"Premise 1": "If it's wednesday it's cold",
"Premise 2": "It's cold",
"Conclusion": "It is Wednesday",
"Proposition 1": "It is Wednesday",
"Proposition 2": "It is cold",
"Type of argument": "affirming the consequent",
"Negation of Proposition 1": "It is not Wednesday",
"Negation of Proposition 2": "It is not cold",
"Validity": true}
```

### transformers 🤗
First make sure to `pip install -U transformers`, then use the code below, replacing the `argument` variable with the argument you want to parse:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned model and its tokenizer.
model = AutoModelForCausalLM.from_pretrained(
    "cris177/Phi3.1-Simple-Arguments",
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained("cris177/Phi3.1-Simple-Arguments")

argument = "If it's wednesday it's cold, and it's cold, therefore it's wednesday."

# The model was trained on Alpaca-style prompts, so we reproduce that format here.
instruction = 'Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.'
alpaca_prompt = """Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
{}

### Input:
{}

### Response:"""
prompt = alpaca_prompt.format(instruction, argument)

# Tokenize and move the inputs to the same device as the model.
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

outputs = model.generate(**inputs, max_length=1000, num_return_sequences=1)
print(tokenizer.decode(outputs[0]))
```
Output:
```
Below is an instruction that describes a task, paired with an input that provides further context. Write a response that appropriately completes the request.

### Instruction:
Based on the following argument, identify the following elements: premises, conclusion, propositions, type of argument, negation of propositions and validity.

### Input:
If it's wednesday it's cold, and it's cold, therefore it's wednesday.

### Response: 
{"Premise 1": "If it's wednesday it's cold",
"Premise 2": "It's cold",
"Conclusion": "It is Wednesday",
"Proposition 1": "It is Wednesday",
"Proposition 2": "It is cold",
"Type of argument": "affirming the consequent",
"Negation of Proposition 1": "It is not Wednesday",
"Negation of Proposition 2": "It is not cold",
"Validity": "false"}<|endoftext|>
```


## Training Details

### Training Data

<!-- This should link to a Dataset Card, perhaps with a short stub of information on what the training data is all about as well as documentation related to data pre-processing or additional filtering. -->

The model was trained on synthetic data covering the following argument forms:
- Modus Ponens
- Modus Tollens
- Affirming the Consequent
- Disjunctive Syllogism
- Denying the Antecedent
- Invalid Conditional Syllogism

Each argument was constructed by selecting two random propositions (from a list of 400 propositions generated beforehand), choosing a type of argument, and combining it all with randomly selected connectors (therefore, since, hence, thus, etc.); a sketch of this procedure is shown below.

50k arguments were created to train the model, and 100 to test it.
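
The generation code isn't part of this card; the following is a minimal illustrative sketch of the procedure just described, where the proposition list, connector list, and templates are stand-in assumptions rather than the actual data:

```python
import random

# Illustrative stand-ins: the real dataset used a pre-generated list of 400 propositions.
PROPOSITIONS = ["it's wednesday", "it's cold", "the street is wet", "it rained"]
CONNECTORS = ["therefore", "hence", "thus", "so"]

# (template, is_valid) pairs for a few argument forms; {p} and {q} are propositions,
# {c} is a randomly chosen connector.
FORMS = {
    "modus ponens": ("If {p} then {q}. {p}. {c} {q}.", True),
    "modus tollens": ("If {p} then {q}. Not {q}. {c} not {p}.", True),
    "affirming the consequent": ("If {p} then {q}. {q}. {c} {p}.", False),
    "denying the antecedent": ("If {p} then {q}. Not {p}. {c} not {q}.", False),
    "disjunctive syllogism": ("Either {p} or {q}. Not {p}. {c} {q}.", True),
}

def make_example():
    """Build one (argument text, labels) training pair."""
    p, q = random.sample(PROPOSITIONS, 2)
    form, (template, valid) = random.choice(list(FORMS.items()))
    text = template.format(p=p, q=q, c=random.choice(CONNECTORS))
    labels = {"Type of argument": form, "Proposition 1": p,
              "Proposition 2": q, "Validity": str(valid).lower()}
    return text, labels

print(make_example())
```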

### Training Procedure

<!-- This relates heavily to the Technical Specifications. Content here should link to that section when it is relevant to the training procedure. -->


#### Training

We used Unsloth for memory-efficient, accelerated training.

We trained for one epoch.

Training used less than 3.5 GB of VRAM and took 3 hours.
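
The training script itself isn't published in this card; below is a minimal sketch of what a one-epoch Unsloth LoRA fine-tune could look like. The base checkpoint name, the file name `train.jsonl`, and all hyperparameters are assumptions for illustration, not the actual configuration:

```python
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer
from unsloth import FastLanguageModel

# Load the base model in 4-bit to keep VRAM usage low.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Phi-3-mini-4k-instruct",  # assumed base checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters; rank and target modules are illustrative defaults.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Hypothetical dataset file with one Alpaca-formatted "text" field per example.
dataset = load_dataset("json", data_files="train.jsonl", split="train")

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        per_device_train_batch_size=2,
        num_train_epochs=1,  # the card reports a single epoch
        learning_rate=2e-4,
        output_dir="outputs",
    ),
)
trainer.train()
```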

## Evaluation

<!-- This section describes the evaluation protocols and provides the results. -->

The model achieves 100% accuracy on both the train and test splits of our synthetic dataset.
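
Accuracy here means exact agreement between the model's parsed fields and the gold labels. Below is a minimal sketch of such a check, reusing `analyze_argument` from the llama.cpp snippet above (the `test.jsonl` file name and layout are assumptions):

```python
import json

def exact_match(predicted: str, gold: dict) -> bool:
    """True if every gold field is reproduced exactly in the model's JSON output."""
    try:
        parsed = json.loads(predicted)
    except json.JSONDecodeError:
        return False
    return all(parsed.get(key) == value for key, value in gold.items())

# Hypothetical test file: one {"argument": ..., "labels": {...}} object per line.
with open("test.jsonl") as f:
    examples = [json.loads(line) for line in f]

correct = sum(exact_match(analyze_argument(ex["argument"]), ex["labels"])
              for ex in examples)
print(f"Accuracy: {correct / len(examples):.0%}")
```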