---
license: apache-2.0
---
# Welcome to my Computer Science Capstone Project!
This is the code for the training pipeline used during my multi-year Computer Science Capstone Project. It is a finetune of the most recent Command R model, trained with a custom Python training pipeline written from scratch.
My ultimate goal is to understand the process of training an LLM through the creation of an administrative assistant AI agent powered by my own custom model.
I started this project around the summer of my sophomore year in high school, when I was just getting around to studying the mechanics of LLMs. My school offers a CS capstone class where you work on a computer science project of your choice for the year. If taken prior to senior year, it can be repeated in later years to build a new project or continue a previous one.
# Technical Approach:
- Multi-task Training: Curated custom dataset batches across various administrative capabilities such as tool calling, summarization, and RAG
- Iterative Fine-tuning: Progressive training runs with a small learning rate to prevent catastrophic forgetting (learned this the hard way after losing 20 credits)
- Knowledge Preservation: Mixed subsets of previous datasets into each new run (replay mixing is sketched after this list)
- Quantization: 8-bit loading via BitsAndBytes for efficient training on Google Colab L4 GPUs (also sketched below, together with the small-learning-rate setup)
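Since 8-bit base weights are frozen during training, a bitsandbytes load is usually paired with LoRA adapters via peft; the sketch below shows that setup together with the small learning rate used for the iterative runs. The model ID, LoRA settings, and hyperparameters here are illustrative assumptions, not the exact values from my pipeline.

```python
# Minimal sketch: 8-bit loading + LoRA + a small learning rate
# (model ID and hyperparameters are illustrative, not my exact config)
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
    TrainingArguments,
)
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

model_id = "CohereForAI/c4ai-command-r-v01"  # assumed checkpoint; substitute the actual base

# Load the base weights in 8-bit via bitsandbytes to cut GPU memory
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto",
)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# 8-bit weights are frozen, so gradients flow through LoRA adapters instead
model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, LoraConfig(
    r=16,                                  # illustrative rank
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
))

# Deliberately small learning rate so each iterative run nudges the model
# gently, reducing the risk of catastrophic forgetting between phases
training_args = TrainingArguments(
    output_dir="checkpoints",
    learning_rate=5e-5,                    # illustrative; tune per run
    num_train_epochs=1,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
)
```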
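The knowledge-preservation step can be sketched the same way: concatenate each new phase's data with a shuffled subset of a previous phase before training. The file names and the 10% replay ratio below are placeholders, not my actual splits.

```python
# Sketch of replay mixing with Hugging Face datasets
# (file names and the 10% ratio are placeholders)
from datasets import concatenate_datasets, load_dataset

new_data = load_dataset("json", data_files="phase2_tool_calling.json", split="train")
old_data = load_dataset("json", data_files="phase1_summarization.json", split="train")

# Rehearse ~10% of the earlier phase so previously learned skills survive
replay_size = min(len(old_data), max(1, int(0.10 * len(new_data))))
replay = old_data.shuffle(seed=42).select(range(replay_size))

train_data = concatenate_datasets([new_data, replay]).shuffle(seed=42)
```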
# Some Challenges:
- On the very first training run, I forgot I was working with a dictionary and assigned the variables wrong, so every model was trained on the literal strings "question" and "answer" repeatedly (a reconstruction follows this list)
- Trying to train on long chain-of-thought data while heavily truncating the text resulted in barely coherent checkpoints
- CUDA dependencies were a struggle that cost a great many hours and nearly made me give up on quantization entirely
- Money management: I originally used expensive H100 GPUs from cloud providers before settling on Colab
- Finding tutorials: since the subject is so new, I couldn't find many tutorials aimed at younger students. Unsloth notebooks ended up being very useful.
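The original code for that first bug is gone, but the failure mode was roughly the following (a hypothetical reconstruction, not the actual pipeline code):

```python
# Hypothetical reconstruction of the first-run dictionary bug
example = {"question": "What is attention?", "answer": "A weighted lookup over tokens."}

# Buggy: assigns the key names themselves, not the values behind them,
# so every training sample collapses to the literal text 'question\nanswer'
q = "question"          # meant: example["question"]
a = "answer"            # meant: example["answer"]
text = f"{q}\n{a}"

# Fixed: index into the dictionary to get the actual values
q = example["question"]
a = example["answer"]
text = f"{q}\n{a}"
```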
# Model Rationale
- I originally planned to try Mistral Small 3 24B, but it was too large and expensive
- Qwen models felt too stiff in my testing, despite recommendations
- Cohere models are advertised as good at tool calling, and they seemed good in practice
- I emailed Cohere to ask whether they were okay with me using the model for things that could theoretically help me make money, and they said it was fine
- This is still a research project first and foremost, so non-commercial use wasn't really a dealbreaker for me.
# Current Goal?
- My current goal, this senior year, is phase 2 of the project: a custom agent built on the smolagents framework for the model to use in day-to-day life (a minimal sketch follows)
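As a preview of phase 2, here is a minimal smolagents sketch of the kind of agent I have in mind. The tool, model path, and prompt are placeholders, and the API calls reflect smolagents as I currently understand it rather than finished project code.

```python
# Minimal smolagents sketch (tool, model path, and prompt are placeholders)
from smolagents import CodeAgent, TransformersModel, tool

@tool
def get_todo_list() -> str:
    """Return the user's current to-do list as plain text."""
    return "1. Finish capstone write-up\n2. Email advisor about phase 2"

# Point the agent at the finetuned checkpoint from phase 1
model = TransformersModel(model_id="path/to/finetuned-command-r")
agent = CodeAgent(tools=[get_todo_list], model=model)

agent.run("Look at my to-do list and suggest what to tackle first.")
```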