---
license: mit
datasets:
- mengmeong/coding-skill-real-world-needs
language:
- en
base_model: nisten/Biggie-SmoLlm-0.15B-Base
pipeline_tag: text-generation
inference:
  parameters:
    model_file: meng-coding-skill.gguf
    temperature: 1
---
# Programming Skills Learning Path Model
This model is a fine-tuned version of the base model `nisten/Biggie-SmoLlm-0.15B-Base`, designed to generate a learning path for a skill from input text. It is particularly useful for identifying emerging trends and skill combinations in the rapidly evolving tech landscape.
## Usage & Limitations
![llama.cpp demo](meng-cli.gif)
The model is intended for:
- Deployment on limited CPU resources; it averages about 40 tokens per second (tps) on a single CPU core.
The model has the following limitations:
- The dataset might not capture the very latest tool developments in the programming world.
- It is not suited to chatbot use cases.
- It only returns its response as a JSON list.
Please note that this model was trained on a custom dataset and may reflect biases present in that data.
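
Because the model only returns a JSON list, downstream code usually needs to pull that list out of the raw completion text. The following is a minimal sketch, not the author's own pipeline. The names `parse_skill_list` and `generate_learning_path` are hypothetical, as are the `max_tokens=256` setting and the assumption that the `llama-cpp-python` bindings are used to load the GGUF file. The expected prompt format is not documented here, so the prompt is passed through unchanged.

```python
import json


def parse_skill_list(raw_output: str) -> list:
    """Return the first valid JSON list found in the model's raw text output."""
    decoder = json.JSONDecoder()
    start = raw_output.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value starting at `start`
            # and ignores any trailing text after it.
            value, _ = decoder.raw_decode(raw_output, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        # Not a valid list at this position; try the next "[".
        start = raw_output.find("[", start + 1)
    raise ValueError("no JSON list found in model output")


def generate_learning_path(prompt: str,
                           model_path: str = "meng-coding-skill.gguf") -> list:
    """Hypothetical helper: run the GGUF model on one CPU thread and parse the result."""
    # Imported lazily so parse_skill_list works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_threads=1, verbose=False)
    # temperature=1 mirrors the value in this card's metadata;
    # max_tokens=256 is an arbitrary assumption.
    result = llm(prompt, max_tokens=256, temperature=1.0)
    return parse_skill_list(result["choices"][0]["text"])
```

Raising `ValueError` when no list is found lets the caller decide whether to retry, since a model this small may occasionally emit truncated or malformed JSON.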
### Training Hyperparameters
- **Batch Size:** 4
- **Optimizer:** Experimental GrokAdamW
## Training Metrics
![Eval Loss](eval_loss.png)
![Eval Runtime](eval_runtime.png)
![Eval Samples per Second](eval_sample_per_secs.png)
![Eval Steps per Second](eval_sps.png)
![Loss on Train](train_loss.png)