---
license: apache-2.0
tags:
- text-to-json
- t5
- seq2seq
- text-generation
- json-conversion
- machine-learning
- nlp
base_model: t5-small
model_name: MD2JSON-T5-V1
version: V1
author: yahyakhoder
---
# MD2JSON-T5-V1: Text-to-JSON Converter with T5
This model uses the **T5 (Text-to-Text Transfer Transformer)** architecture, based on `t5-small`, to convert structured text into valid JSON objects.
## Description
The **MD2JSON-T5-V1** model is trained to interpret text in which each line holds one key-value pair: the key is prefixed with `#` and separated from its value by a colon (e.g., `#firstname: John`). It converts such input into a valid JSON object, which makes it useful wherever lightly structured text needs to be turned into JSON.
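To make the expected mapping concrete, here is a minimal rule-based sketch of the same transformation in plain Python. It is not how the model works internally; `parse_tagged_text` is a hypothetical helper, and it assumes one `#key: value` pair per line, with values that are either JSON literals or bare strings. It can serve as a reference when checking the model's output.

```python
import json


def parse_tagged_text(text: str) -> dict:
    """Rule-based reference for the `#key: value` -> JSON mapping (hypothetical helper)."""
    result = {}
    for line in text.splitlines():
        line = line.strip()
        if not line.startswith("#") or ":" not in line:
            continue  # skip blank or malformed lines
        # Split on the first colon only, so URLs and nested objects stay intact
        key, raw = line[1:].split(":", 1)
        raw = raw.strip()
        try:
            # Numbers, booleans, lists, objects, and quoted strings
            value = json.loads(raw)
        except json.JSONDecodeError:
            # Bare text such as `John` is kept as a string
            value = raw
        result[key.strip()] = value
    return result


sample = '#firstname: John\n#age: 30\n#married: true\n#hobbies: ["gaming", "running"]'
print(json.dumps(parse_tagged_text(sample), indent=2))
```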
### Example Input:
- Input:
```text
#firstname: John
#lastname: Doe
#age: 30
#married: true
#hobbies: ["gaming", "running"]
#address: {"city": "Berlin", "zipcode": 10115}
#url: "https://example.com"
```
- Generated JSON Output:
```json
{
  "firstname": "John",
  "lastname": "Doe",
  "age": 30,
  "married": true,
  "hobbies": ["gaming", "running"],
  "address": {
    "city": "Berlin",
    "zipcode": 10115
  },
  "url": "https://example.com"
}
```
### Another Example:
- Input:
```text
#name: Charlie
#age: 29
#isStudent: true
#skills: ["Java", "Machine Learning"]
#profile: {"github": "charlie29", "linkedin": "charlie-linkedin"}
#height: 172.3
```
- Generated JSON Output:
```json
{
  "name": "Charlie",
  "age": 29,
  "isStudent": true,
  "skills": ["Java", "Machine Learning"],
  "profile": {
    "github": "charlie29",
    "linkedin": "charlie-linkedin"
  },
  "height": 172.3
}
```
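Input in the format shown above can also be produced programmatically from existing data. The sketch below is one possible approach, not part of the model's tooling: `build_input` is a hypothetical helper, and it assumes strings are written bare (as in most of the examples) while all other values use JSON syntax.

```python
import json


def build_input(record: dict) -> str:
    """Serialize a dict into `#key: value` lines (hypothetical helper)."""
    lines = []
    for key, value in record.items():
        # Assumption: strings are written bare; other types use JSON syntax
        rendered = value if isinstance(value, str) else json.dumps(value, ensure_ascii=False)
        lines.append(f"#{key}: {rendered}")
    return "\n".join(lines)


print(build_input({"name": "Charlie", "age": 29, "isStudent": True}))
# prints:
# #name: Charlie
# #age: 29
# #isStudent: true
```

The resulting string can be passed to the tokenizer as the model input.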
## Load the Model
To use the model and perform inference, follow the steps below:
### Install Dependencies
```bash
pip install torch transformers datasets
```
### Run Inference
```python
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM
import torch
import json

# Load the tokenizer and model
model_name = "yahyakhoder/MD2JSON-T5-V1"  # Replace with your Hugging Face model path
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

# Example input: one `#key: value` pair per line
input_text = """#firstname: John
#lastname: Doe
#age: 30
#married: true
#hobbies: ["gaming", "running"]
#address: {"city": "Berlin", "zipcode": 10115}
#url: "https://example.com" """

# Tokenize and generate the output with beam search
inputs = tokenizer(input_text, return_tensors="pt", truncation=True, padding=True, max_length=256)
outputs = model.generate(**inputs, max_length=256, num_beams=4, early_stopping=True)

# Decode and convert to JSON
result = tokenizer.decode(outputs[0], skip_special_tokens=True)
try:
    output_json = json.loads(result)
    print(json.dumps(output_json, indent=2, ensure_ascii=False))
except json.JSONDecodeError:
    # The generated text is not guaranteed to be valid JSON
    print("Error during JSON conversion")
```