mengmeong/meng-programming-skill-finetune

	---
	license: mit
	datasets:
	- mengmeong/coding-skill-real-world-needs
	language:
	- en
	base_model: nisten/Biggie-SmoLlm-0.15B-Base
	pipeline_tag: text-generation
	inference:
	  parameters:
	    model_file: meng-coding-skill.gguf
	    temperature: 1
	---
	# Programming Skills Learning Path Model

	This model is a fine-tuned version of the base model (`nisten/Biggie-SmoLlm-0.15B-Base`), designed to generate a learning path for a skill based on input text. It's particularly useful for identifying emerging trends and skill combinations in the rapidly evolving tech landscape.

	## Usage & Limitations

	![llama.cpp demo](meng-cli.gif)

	The model is intended for:
	- Deployment on limited CPU resources, averaging about 40 tokens per second (tps) on 1 CPU core

	The model has limitations:
	- The dataset might not capture the very latest tool developments in the programming world
	- The model is not suited to chatbot use cases
	- The model only returns its response as a JSON list

	Please note that this model was trained on a custom dataset and may reflect biases present in that data.
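Because the model returns its response as a JSON list, downstream code usually needs to pull that list out of the raw generated text. The sketch below is one possible way to do this in Python, not the author's own tooling. The helper names `parse_skill_path` and `generate_skill_path` are hypothetical, the prompt format is not documented here and is left to the caller, and the `llama-cpp-python` package is an assumed runtime (the demo above uses the llama.cpp CLI). The `max_tokens` value is an arbitrary choice; the temperature and model file name come from the card's metadata.

```python
import json


def parse_skill_path(raw_output: str) -> list:
    """Return the first JSON list found in the model's raw text output."""
    decoder = json.JSONDecoder()
    start = raw_output.find("[")
    while start != -1:
        try:
            # raw_decode parses one JSON value beginning at index `start`.
            value, _ = decoder.raw_decode(raw_output, start)
        except json.JSONDecodeError:
            value = None
        if isinstance(value, list):
            return value
        # Not a valid list here; try the next opening bracket.
        start = raw_output.find("[", start + 1)
    raise ValueError("no JSON list found in model output")


def generate_skill_path(prompt: str, model_path: str = "meng-coding-skill.gguf") -> list:
    """Hypothetical wrapper: run the GGUF file on one CPU thread and parse the result."""
    # Imported lazily so parse_skill_path works without llama-cpp-python installed.
    from llama_cpp import Llama

    llm = Llama(model_path=model_path, n_threads=1, verbose=False)
    # temperature=1 mirrors the card's metadata; max_tokens=256 is an assumption.
    result = llm(prompt, max_tokens=256, temperature=1.0)
    return parse_skill_path(result["choices"][0]["text"])
```

Keeping the parser separate from the generation call makes it possible to validate saved outputs without loading the model, and the `ValueError` gives callers a clear signal to retry when a sampled response is not valid JSON.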

	### Training Hyperparameters

	- Batch Size: 4
	- Optimizer: Experimental GrokAdamW

	## Training Metrics

	![Eval Loss](eval_loss.png)
	![Eval Runtime](eval_runtime.png)
	![Eval Samples per Second](eval_sample_per_secs.png)
	![Eval Steps per Second](eval_sps.png)
	![Training Loss](train_loss.png)