---
license: apache-2.0
tags:
  - text-to-json
  - t5
  - seq2seq
  - text-generation
  - json-conversion
  - machine-learning
  - nlp
base_model: t5-small
model_name: MD2JSON-T5-V1
version: V1
author: yahyakhoder
---

# MD2JSON-T5-V1: Text-to-JSON Converter with T5

This model uses the **T5 (Text-to-Text Transfer Transformer)** architecture, with `t5-small` as its base model, to convert structured text strings into JSON objects.

## Description

The **MD2JSON-T5-V1** model is trained to read text in which each line holds a key prefixed with `#` and a value, separated by a colon (e.g., `#firstname: John`), and to convert those lines into a single JSON object. Values can be strings, numbers, booleans, lists, or nested objects. The model fits any task that needs this kind of key-value text turned into JSON.
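
As an illustration of the input format, here is a minimal sketch that serializes a Python dict into `#key: value` lines. The helper name `to_md_input` is hypothetical and not part of this model's code. The rule that plain strings are written bare while strings containing a colon are quoted is an assumption inferred from the examples in this card, not a documented requirement.

```python
import json


def to_md_input(record: dict) -> str:
    """Serialize a dict into '#key: value' lines (hypothetical helper)."""
    lines = []
    for key, value in record.items():
        if isinstance(value, str) and ":" not in value:
            # Assumption: plain strings are written bare, e.g. '#firstname: John'.
            rendered = value
        else:
            # Numbers, booleans, lists, objects, and strings that contain
            # a colon are written in JSON syntax.
            rendered = json.dumps(value)
        lines.append(f"#{key}: {rendered}")
    return "\n".join(lines)


record = {
    "firstname": "John",
    "age": 30,
    "married": True,
    "hobbies": ["gaming", "running"],
    "url": "https://example.com",
}
print(to_md_input(record))
```

The resulting string can be passed to the tokenizer as the model input.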

### Example:
- Input: 
    ```text
    #firstname: John
    #lastname: Doe
    #age: 30
    #married: true
    #hobbies: ["gaming", "running"]
    #address: {"city": "Berlin", "zipcode": 10115}
    #url: "https://example.com"
    ```

- Generated JSON Output:
    ```json
    {
        "firstname": "John",
        "lastname": "Doe",
        "age": 30,
        "married": true,
        "hobbies": ["gaming", "running"],
        "address": {
            "city": "Berlin",
            "zipcode": 10115
        },
        "url": "https://example.com"
    }
    ```

### Another Example:
- Input: 
    ```text
    #name: Charlie
    #age: 29
    #isStudent: true
    #skills: ["Java", "Machine Learning"]
    #profile: {"github": "charlie29", "linkedin": "charlie-linkedin"}
    #height: 172.3
    ```

- Generated JSON Output:
    ```json
    {
        "name": "Charlie",
        "age": 29,
        "isStudent": true,
        "skills": ["Java", "Machine Learning"],
        "profile": {
            "github": "charlie29",
            "linkedin": "charlie-linkedin"
        },
        "height": 172.3
    }
    ```
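
The mapping shown in the two examples can also be sketched as a small rule-based parser, which may be handy as a reference when checking the model's output. This is not how the model works internally; `parse_md_input` is a hypothetical helper, and treating any value that is not valid JSON as a plain string is an assumption based on the examples above.

```python
import json


def parse_md_input(text: str) -> dict:
    """Rule-based reference conversion of '#key: value' lines (hypothetical helper)."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("#"):
            continue  # skip blank or unrelated lines
        # Split at the first colon only, so values such as URLs stay intact.
        key, _, raw = line[1:].partition(":")
        raw = raw.strip()
        try:
            # Numbers, booleans, lists, objects, and quoted strings.
            value = json.loads(raw)
        except json.JSONDecodeError:
            # Assumption: anything else is kept as a plain string.
            value = raw
        result[key.strip()] = value
    return result


sample = """#name: Charlie
#age: 29
#isStudent: true
#skills: ["Java", "Machine Learning"]
#profile: {"github": "charlie29", "linkedin": "charlie-linkedin"}
#height: 172.3"""

print(json.dumps(parse_md_input(sample), indent=4))
```

Comparing this dict with the parsed model output gives a simple exact-match check for a given input.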

## Load the Model

To use the model and perform inference, follow the steps below:

### Install Dependencies

```bash
pip install torch transformers datasets
```

### Run Inference

```python
import json

from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

# Load the tokenizer and model from the Hugging Face Hub
model_name = "yahyakhoder/MD2JSON-T5-V1"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example input: one "#key: value" pair per line
input_text = '''#firstname: John
#lastname: Doe
#age: 30
#married: true
#hobbies: ["gaming", "running"]
#address: {"city": "Berlin", "zipcode": 10115}
#url: "https://example.com"'''

# Tokenize the input and generate the output with beam search
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True, max_length=256)
outputs = model.generate(**inputs, max_length=256, num_beams=4, early_stopping=True)

# Decode the generated tokens and parse them as JSON
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
try:
    output_json = json.loads(result)
    print(json.dumps(output_json, indent=2, ensure_ascii=False))
except json.JSONDecodeError:
    # The generated text is not guaranteed to parse; show it for inspection
    print("Error during JSON conversion")
    print(result)
```

