---
base_model: LSXPrime/ProseFlow-v1-360M-Instruct
base_model_relation: quantized
language:
- en
library_name: gguf
pipeline_tag: text-generation
license: apache-2.0
datasets:
- LSXPrime/ProseFlow-Actions-v1
tags:
- text-generation
- instruction
- proseflow
- unsloth
- smollm
- writing-assistant
---
# ProseFlow-v1-360M-Instruct
**ProseFlow-v1-360M-Instruct** is a lightweight, experimental instruction-tuned model created for the [ProseFlow desktop application](https://github.com/LSXPrime/ProseFlow). It is a fine-tune of Hugging Face's [**SmolLM-360M-Instruct**](https://huggingface.co/HuggingFaceTB/SmolLM-360M-Instruct), created to explore the capabilities of smaller language models on a diverse set of text-processing tasks.

The model was fine-tuned on the [**ProseFlow-Actions-v1**](https://huggingface.co/datasets/LSXPrime/ProseFlow-Actions-v1) dataset.
**Note:** This model is provided for research, experimentation, and low-resource devices. For the best user experience in the ProseFlow application, the larger and more capable [`ProseFlow-v1-1.5B-Instruct`](https://huggingface.co/LSXPrime/ProseFlow-v1-1.5B-Instruct) model is strongly recommended.
## Model Description
ProseFlow is a universal AI text processor that lets users create and execute custom AI "Actions" on text in any application. This model is an experiment in whether a ~360M-parameter model can reliably perform the wide range of tasks defined in the training dataset.
### Performance and Capabilities
Evaluations show that while this model is extremely fast and has very low resource requirements, its capabilities are
limited.
#### Strengths:
* **Extremely Lightweight:** Can run on devices with very limited RAM and computational power.
* **Strict Formatting Adherence (sometimes):** In some cases where it understands the task, it can follow rigid
formatting instructions (like creating a bulleted list) more strictly than its larger counterpart.
* **Simple Data Extraction:** It shows some capability in basic data extraction and formatting tasks, such as creating
Markdown tables or extracting contact information.
#### Weaknesses & Limitations:

* **Poor Reasoning:** The model struggles significantly with tasks that require logical reasoning, inference, or multi-step problem-solving. It often fails on word problems and logic puzzles.
* **Limited Creativity:** It is not effective at creative-writing tasks such as continuing a story or generating novel content; its outputs are often repetitive or nonsensical.
* **Instructional Failures:** The model frequently violates the "no extra text" rule by adding conversational chatter. In many cases, it fails the task entirely and repeats the input verbatim.
* **Hallucination:** On some tasks (e.g., `To Paragraph`), the model hallucinates content completely unrelated to the input.
* **Unreliable for Complex Tasks:** It is not suitable for complex tasks such as code refactoring, bug finding, or drafting professional business correspondence.

## Provided Files & Quantization Details

This repository provides multiple versions of the model, allowing users to choose the best balance of performance and resource usage for their specific hardware. All quantized versions are provided in the GGUF format for broad compatibility.

| File Name (Quantization) | VRAM Usage (Approx.) | Performance | Recommended Use Case |
|:-------------------------|:---------------------|:---------------------------------------------------|:--------------------------------------------|
| `Q8_0` | ~1 GB | **Best Overall.** Nearly identical to FP16. | **The recommended default for most users.** |
| `Q4_K_M` | ~900 MB | **Low Quality.** Noticeable degradation in nuance. | For maximum speed on low-power devices. |

**Note on Quantization:** To preserve the highest possible quality, the token embeddings and the final output layer were kept at F16 precision, and an importance matrix was used for calibration during quantization. The quantized files are therefore larger than typically expected, but this method significantly improves their performance and coherence.
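Outside the ProseFlow application, the GGUF files can also be driven directly (e.g., via llama.cpp). The base SmolLM-Instruct model uses a ChatML-style chat template, so prompts should be wrapped accordingly. Below is a minimal sketch of building such a prompt; the exact template markers are an assumption inherited from the base model, so verify them against the tokenizer's chat template before relying on this.

```python
# Sketch: build a ChatML-style prompt for an "Action" on some text.
# ASSUMPTION: the fine-tune inherits SmolLM-Instruct's <|im_start|>/<|im_end|>
# template; check the tokenizer config to confirm.

def build_prompt(instruction: str, text: str) -> str:
    """Wrap an action instruction and its target text in ChatML markers."""
    user_turn = f"{instruction}\n\n{text}"
    return (
        f"<|im_start|>user\n{user_turn}<|im_end|>\n"
        "<|im_start|>assistant\n"  # generation continues from here
    )

prompt = build_prompt("Summarize the following text.",
                      "GGUF is a binary file format for storing models.")
print(prompt)
```

The resulting string is what you would pass to the runtime's completion call, stopping generation on the `<|im_end|>` token.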
### Intended Use
This model is intended for **experimental use** and for users on **extremely resource-constrained systems** who are
willing to accept a significant trade-off in performance and reliability. It may be suitable for a very limited subset
of simple, repetitive text-formatting tasks.
It is designed to be used within the **ProseFlow desktop application**, but it is **not the recommended model for
general use**.
## How to Use in ProseFlow
1. [Download and install the ProseFlow application](https://github.com/LSXPrime/ProseFlow/releases).
2. Navigate to the **Providers -> Local Provider** tab.
3. Click "Manage Models..." and select the desired version of `ProseFlow-v1-360M-Instruct` from the "Available for
Download" list. **We recommend starting with `Q8_0`.**
4. Once downloaded, select it from the "My Models" list.
5. Set your "Primary Service Type" in ProseFlow to **Local**.
6. Be aware of the limitations described above when executing actions.
## Training Details
* **Base Model:** [HuggingFaceTB/SmolLM-360M-Instruct](https://huggingface.co/HuggingFaceTB/SmolLM-360M-Instruct)
* **Dataset:** [LSXPrime/ProseFlow-Actions-v1](https://huggingface.co/datasets/LSXPrime/ProseFlow-Actions-v1)
* **Fine-tuning Library:** [Unsloth](https://github.com/unslothai/unsloth)
* **Fine-tuning Method:** Supervised fine-tuning using LoRA on a dataset of structured instruction-input-output
triplets.
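For supervised fine-tuning, each structured triplet is typically flattened into a single chat-formatted training string before tokenization. The sketch below illustrates the idea; the field names (`instruction`, `input`, `output`) and the ChatML markers are assumptions about the dataset schema and base-model template, not a description of the actual training script.

```python
# Sketch: flatten one instruction-input-output triplet into a training example.
# ASSUMPTION: column names mirror the ProseFlow-Actions-v1 schema and the
# base model's ChatML template; adjust to the real dataset and tokenizer.

def format_example(row: dict) -> str:
    return (
        f"<|im_start|>user\n{row['instruction']}\n\n{row['input']}<|im_end|>\n"
        f"<|im_start|>assistant\n{row['output']}<|im_end|>\n"
    )

sample = {
    "instruction": "Convert the text to a bulleted list.",
    "input": "apples, oranges, pears",
    "output": "- apples\n- oranges\n- pears",
}
print(format_example(sample))
```

A mapping like this is usually applied over the whole dataset, after which only the assistant span is used as the loss target.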
## License
This model is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0).