---
pipeline_tag: question-answering
---

The goal of this model is to provide a fine-tuned Phi-2 (https://huggingface.co/microsoft/phi-2) model that has knowledge of the vintage NEXTSTEP operating system and can answer questions on the topic.

The model was trained on 35,439 question-answer pairs automatically generated from the NEXTSTEP 3.3 System Administrator documentation. A locally running Q8-quantized Orca2 13B model (https://huggingface.co/TheBloke/Orca-2-13B-GGUF) was used for training data generation. The generation was completely unsupervised, with only minimal sanity checks (such as ignoring data chunks of fewer than 100 tokens). The maximum context size for Orca2 is 4096 tokens, so a simple rule of splitting chunks over 3500 tokens (leaving room for the prompt instructions) was used. Chunking did not consider context, so text might be split mid-context. The evaluation set was generated with a similar method on 1% of the raw data using Llama 2 chat (https://huggingface.co/TheBloke/Llama-2-13B-chat-GGUF).
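The chunking rule described above can be sketched as follows. This is a minimal illustration only: whitespace splitting stands in for the real Orca2 tokenizer, and the constants mirror the limits stated in the card, not the actual pipeline code.

```python
# Sketch of the chunking rule: split documentation into chunks of at most
# 3500 tokens (headroom below Orca2's 4096-token context for the prompt
# instructions) and apply the sanity check of dropping chunks under 100 tokens.
# Whitespace splitting is a crude stand-in for the real tokenizer (assumption).

MAX_CHUNK_TOKENS = 3500
MIN_CHUNK_TOKENS = 100

def chunk_document(text: str) -> list[str]:
    tokens = text.split()  # stand-in for a model tokenizer
    chunks = []
    for start in range(0, len(tokens), MAX_CHUNK_TOKENS):
        chunk = tokens[start:start + MAX_CHUNK_TOKENS]
        if len(chunk) >= MIN_CHUNK_TOKENS:  # ignore tiny chunks
            chunks.append(" ".join(chunk))
    return chunks
```

Note that, as the card says, this splits purely by position: a chunk boundary can fall in the middle of a topic, so some generated QA pairs may come from truncated context.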

Trained locally on 2x RTX 3090 GPUs with vanilla DDP via HuggingFace Accelerate for 50 epochs.
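A typical way to start such a 2-GPU DDP run with Accelerate is shown below; the script name `train.py` is a hypothetical placeholder, as the actual training script is not part of this card.

```shell
# Launch a vanilla DDP training run across 2 local GPUs with Accelerate.
# "train.py" is a placeholder for the actual training script (not shown here).
accelerate launch --multi_gpu --num_processes 2 train.py
```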

Sample code for chatting with the model:
https://github.com/csabakecskemeti/ai_utils/blob/main/generate.py
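For reference, the base Phi-2 model card describes a simple "Instruct/Output" QA prompt style. Assuming the fine-tune keeps that format (the linked `generate.py` is the authoritative example), a prompt could be built like this:

```python
# Format a question in the "Instruct: ... Output:" prompt style from the base
# Phi-2 model card. Assumption: the fine-tuned model keeps this format; see
# the linked generate.py for the author's actual chat code.

def build_prompt(question: str) -> str:
    return f"Instruct: {question}\nOutput:"

# The resulting string would then be passed to the model (e.g. via the
# transformers text-generation pipeline) and the answer read after "Output:".
print(build_prompt("How do I add a user on NEXTSTEP 3.3?"))
```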