| --- |
| license: gpl-3.0 |
| language: |
| - en |
| base_model: |
| - openai-community/gpt2 |
| pipeline_tag: text-classification |
| tags: |
| - text-classification |
| - domotics |
| - DLMs |
| datasets: |
| - SiGiTechnologies/the-home-dataset-beta-1 |
| --- |
| # **PREDY 1.1** |
|
|
| Predy 1.1 is a fine-tuned version of GPT-2, currently available in English only. |
| Its goal is to predict the place (for example, the room) that a domotic command refers to. |
|
|
| ## Usage |
|
|
| This model is mainly intended for experimental purposes, but you can still use it by loading it with PyTorch. |
| To try it, run the Kaggle notebook "Inference for predy V1", which includes everything required to run inference with Predy. |
|
|
| ## System |
|
|
| Predy series 1 is still an experiment, but in the future it will be a fundamental part of light DLMs. |
|
|
| ## Implementation requirements |
|
|
| If you want to train this model on a custom dataset, we recommend these hyperparameters: |
| * Optimizer: AdamW |
| * Learning rate: 0.0005 |
| * Dataset length in tokens: >10000 |
| * Epochs: 20 |
|
|
| With these hyperparameters, you can reach up to 99% accuracy. |
|
|
| The inference requirements are low, since the model has only 124M parameters. The minimum requirements to run inference with Predy 1 are: |
| * RAM: > 2 GiB dedicated to the model (int4) |
| * CPU: even a Raspberry Pi 5 can run it smoothly |
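|
| As a rough sanity check on the RAM figure, the sketch below estimates the size of the weights alone for 124M parameters at a few common precisions. It is back-of-the-envelope arithmetic, not a measurement: it ignores the tokenizer, activations and runtime overhead, and the helper name `weight_mib` is made up for this example. |
|
```python
# Rough weight-memory estimate for a 124M-parameter model.
# Weights only: activations, tokenizer and runtime overhead are NOT included.
PARAMS = 124_000_000  # parameter count stated in this README

# Bits used to store one weight in each format.
BITS_PER_WEIGHT = {"fp32": 32, "fp16": 16, "int8": 8, "int4": 4}


def weight_mib(params: int, bits: int) -> float:
    """Return the size of `params` weights stored at `bits` bits each, in MiB."""
    return params * bits / 8 / 2**20


if __name__ == "__main__":
    for name, bits in BITS_PER_WEIGHT.items():
        print(f"{name}: {weight_mib(PARAMS, bits):.1f} MiB")
```
|
| In int4 the weights come to roughly 59 MiB, so the 2 GiB figure above leaves plenty of headroom for the runtime and the rest of the system. |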
|
|
| # Model Characteristics |
|
|
|
|
| # Data Overview |
|
|
| The dataset was synthetically generated with AI; to ensure quality, the model used was Gemini 3.1 Pro. |
| A line of the dataset looks like this: |
| `"turn on the basement lights",basement` |
| As Predy is still an experiment, it uses a CSV-structured dataset; in future versions the structure will be **JSONL-based**. |
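|
| To illustrate the two layouts, here is a minimal sketch that parses rows in the CSV form shown above and emits one JSON object per line. It assumes the file has no header row and exactly two columns; the keys `text` and `label` and the function name `csv_to_jsonl` are hypothetical, since the future JSONL schema is not specified here. |
|
```python
import csv
import io
import json

# One sample row in the CSV layout shown above: quoted command, then the place.
SAMPLE_CSV = '"turn on the basement lights",basement\n'


def csv_to_jsonl(csv_text: str) -> str:
    """Convert 'command,place' CSV rows to JSON Lines.

    The keys "text" and "label" are illustrative assumptions, not the
    dataset's actual schema.
    """
    reader = csv.reader(io.StringIO(csv_text), skipinitialspace=True)
    lines = []
    for row in reader:
        if len(row) != 2:
            continue  # skip blank or malformed rows
        command, place = row
        lines.append(json.dumps({"text": command.strip(), "label": place.strip()}))
    return "\n".join(lines)


if __name__ == "__main__":
    print(csv_to_jsonl(SAMPLE_CSV))
```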
|
|
| README in progress... |
|
|