---
base_model: LSXPrime/ProseFlow-v1-360M-Instruct
base_model_relation: quantized
language:
- en
library_name: gguf
pipeline_tag: text-generation
license: apache-2.0
datasets:
- LSXPrime/ProseFlow-Actions-v1
tags:
- text-generation
- instruction
- proseflow
- unsloth
- smollm
- writing-assistant
---

# ProseFlow-v1-360M-Instruct

**ProseFlow-v1-360M-Instruct** is a lightweight, experimental instruction-tuned model created for the [ProseFlow desktop application](https://github.com/LSXPrime/ProseFlow). It is a fine-tune of Hugging Face's [**SmolLM-360M-Instruct**](https://huggingface.co/HuggingFaceTB/SmolLM-360M-Instruct) and was created to explore the capabilities of smaller language models on a diverse set of text-processing tasks.

The model was fine-tuned on the [**ProseFlow-Actions-v1**](https://huggingface.co/datasets/LSXPrime/ProseFlow-Actions-v1) dataset.

**Note:** This model is provided for research, experimentation, and low-resource devices. For the best user experience in the ProseFlow application, the larger and more capable [`ProseFlow-v1-1.5B-Instruct`](https://huggingface.co/LSXPrime/ProseFlow-v1-1.5B-Instruct) model is strongly recommended.

## Model Description

ProseFlow is a universal AI text processor that allows users to create and execute custom AI "Actions" on text in any application. This model was an experiment to see whether a ~360M-parameter model could reliably perform the wide range of tasks defined in the training dataset.

### Performance and Capabilities

Evaluations show that while this model is extremely fast and has very low resource requirements, its capabilities are limited.

#### Strengths:

* **Extremely Lightweight:** Can run on devices with very limited RAM and computational power.
* **Strict Formatting Adherence (sometimes):** In cases where it understands the task, it can follow rigid formatting instructions (like creating a bulleted list) more strictly than its larger counterpart.
* **Simple Data Extraction:** It shows some capability in basic data extraction and formatting tasks, such as creating Markdown tables or extracting contact information.

#### Weaknesses & Limitations:

* **Poor Reasoning:** The model struggles significantly with tasks that require logical reasoning, inference, or multi-step problem-solving. It often fails on word problems and logic puzzles.
* **Limited Creativity:** It is not effective at creative writing tasks like continuing a story or generating novel content. Its outputs are often repetitive or nonsensical.
* **Instructional Failures:** The model frequently violates the "no extra text" rule by adding conversational chatter. In many cases, it fails the task entirely and repeats the input verbatim.
* **Hallucination:** On some tasks (e.g., `To Paragraph`), the model hallucinates content completely unrelated to the input.
* **Unreliable for Complex Tasks:** It is not suitable for complex tasks like code refactoring, bug finding, or drafting professional business correspondence.

## Provided Files & Quantization Details

This repository provides multiple versions of the model, allowing users to choose the best balance of performance and resource usage for their specific hardware. All quantized versions are provided in the GGUF format for broad compatibility.

| File Name (Quantization) | VRAM Usage (Approx.) | Performance                                        | Recommended Use Case                        |
|:-------------------------|:---------------------|:---------------------------------------------------|:--------------------------------------------|
| `Q8_0`                   | ~1 GB                | **Best Overall.** Nearly identical to FP16.        | **The recommended default for most users.** |
| `Q4_K_M`                 | ~900 MB              | **Low Quality.** Noticeable degradation in nuance. | For maximum speed on low-power devices.     |

**Note on Quantization:** To maintain the highest possible quality, the token embeddings and the final output layer were kept at F16 precision, and an importance matrix was used for calibration during the quantization process. This is why the quantized files are larger than might typically be expected; the method significantly improves their performance and coherence.

### Intended Use

This model is intended for **experimental use** and for users on **extremely resource-constrained systems** who are willing to accept a significant trade-off in performance and reliability. It may be suitable for a very limited subset of simple, repetitive text-formatting tasks.

It is designed to be used within the **ProseFlow desktop application**, but it is **not the recommended model for general use**.

## How to Use in ProseFlow

1. [Download and install the ProseFlow application](https://github.com/LSXPrime/ProseFlow/releases).
2. Navigate to the **Providers -> Local Provider** tab.
3. Click "Manage Models..." and select the desired version of `ProseFlow-v1-360M-Instruct` from the "Available for Download" list. **We recommend starting with `Q8_0`.**
4. Once downloaded, select it from the "My Models" list.
5. Set your "Primary Service Type" in ProseFlow to **Local**.
6. Be aware of the limitations described above when executing actions.

## Training Details

* **Base Model:** [HuggingFaceTB/SmolLM-360M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-360M-Instruct)
* **Dataset:** [LSXPrime/ProseFlow-Actions-v1](https://huggingface.co/datasets/LSXPrime/ProseFlow-Actions-v1)
* **Fine-tuning Library:** [Unsloth](https://github.com/unslothai/unsloth)
* **Fine-tuning Method:** Supervised fine-tuning using LoRA on a dataset of structured instruction-input-output triplets.
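
The instruction-input-output triplets map naturally onto a chat-style prompt at inference time. As an illustration only (the exact template ProseFlow applies is not shown here, and the ChatML-style `<|im_start|>`/`<|im_end|>` markers are an assumption based on the SmolLM family's chat format; verify against the model's tokenizer config), a prompt for one action might be assembled like this:

```python
def build_action_prompt(instruction: str, input_text: str) -> str:
    """Assemble a ChatML-style prompt from an instruction-input pair.

    Hypothetical sketch: the marker tokens are assumed from the SmolLM
    chat template, not confirmed from ProseFlow's source.
    """
    return (
        f"<|im_start|>system\n{instruction}<|im_end|>\n"
        f"<|im_start|>user\n{input_text}<|im_end|>\n"
        f"<|im_start|>assistant\n"  # generation continues from here
    )

prompt = build_action_prompt(
    "Summarize the following text as a bulleted list. Output only the list.",
    "ProseFlow runs custom AI actions on selected text in any application.",
)
print(prompt)
```

The trailing `assistant` header leaves the prompt open so the model's completion is the action's output, which the calling application can capture directly.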

## License

This model is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).