Objectives:
- Fine-tune GPT-2 125M for VHDL code generation and summarization
- Build a Gradio web app for running inference with the fine-tuned model
- Fine-tune a bigger model, Llama 3B, for VHDL code generation and summarization, and compare its performance with that of GPT-2
VHDL stands for VHSIC Hardware Description Language (yeah, VHSIC is also an acronym 🙂: Very High Speed Integrated Circuit). It is a language hardware designers use to describe and synthesize digital hardware.
I used the AWfaw/ai-hdlcoder-dataset-clean dataset to get web-scraped VHDL code. Training the model on VHDL code alone would let it learn the grammar of VHDL, but that doesn't serve our purpose, since the model also has to summarize code.
So I used GPT-3.5 to generate summaries for a set of VHDL files, and used the resulting code and summary pairs as my dataset.
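As a rough illustration of how code and summary pairs could be turned into training text for a causal language model like GPT-2, the sketch below joins each VHDL file with its summary using marker strings. The markers (`### VHDL:`, `### Summary:`), the function names and the sample entity are assumptions made for this example; the format actually used in LLM_for_HDL.ipynb may differ.

```python
# Hypothetical prompt markers; these are NOT taken from the notebook.
CODE_MARKER = "### VHDL:"
SUMMARY_MARKER = "### Summary:"
EOS = "<|endoftext|>"  # GPT-2's end-of-text token


def build_example(vhdl_code: str, summary: str) -> str:
    """Join one VHDL file and its summary into a single training string."""
    return (
        f"{CODE_MARKER}\n{vhdl_code.strip()}\n"
        f"{SUMMARY_MARKER}\n{summary.strip()}{EOS}"
    )


def build_prompt(vhdl_code: str) -> str:
    """Build the inference-time prompt: same layout, summary left open."""
    return f"{CODE_MARKER}\n{vhdl_code.strip()}\n{SUMMARY_MARKER}\n"


# Illustrative sample, written for this example only.
sample_code = """
entity and_gate is
  port (a, b : in bit; y : out bit);
end entity;
"""
sample_summary = "A two-input AND gate."

example = build_example(sample_code, sample_summary)
prompt = build_prompt(sample_code)
print(example)
```

Keeping the same layout at training and inference time means the model only has to continue the text after the summary marker. For code generation, the two fields could be swapped so that the summary comes first.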
LLM_for_HDL.ipynb contains the data preprocessing, dataset generation and GPT-2 fine-tuning. In Gradio_App_GPT_2_HDL.ipynb I built a Gradio web app to interact with the fine-tuned model.
Results:
- The LLM was able to learn the grammar of the VHDL language (reserved words, semicolons, etc.) but it was not able to generate meaningful code
- The model performed "fine" in summarizing small pieces of code (like a simple ROM block) but failed to do so with large ones
One possible reason is that GPT-2 125M is too small a model to summarize and generate complex VHDL code. So I went on to fine-tune the Llama 3B model.
Llama 3B is too big a model to train on a single T100 GPU instance with 15 GB of RAM, so I used QLoRA (quantized low-rank adapters; paper: https://arxiv.org/abs/2305.14314v1) to train it. The llama-3b.ipynb file has the code for fine-tuning Llama 3B. As can be seen in the training details, the fine-tuned Llama 3B clearly beats GPT-2, as one would expect from a model of its size.
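A back-of-the-envelope estimate shows why a 3B-parameter model does not fit in 15 GB for full fine-tuning, and why QLoRA helps. The sketch below assumes mixed-precision Adam (16 bytes of state per parameter) and ignores activations and quantization constants; the 4096 × 4096 layer shape and rank 8 are illustrative values, not the settings used in llama-3b.ipynb.

```python
GIB = 1024 ** 3


def full_finetune_bytes(n_params: int) -> int:
    """Mixed-precision Adam: fp16 weights (2) + fp16 gradients (2)
    + fp32 master weights (4) + two fp32 Adam moments (8) = 16 bytes/param."""
    return n_params * 16


def quantized_base_bytes(n_params: int) -> float:
    """Frozen base weights stored in 4 bits (0.5 byte each),
    ignoring the small overhead of quantization constants."""
    return n_params * 0.5


def lora_params(d_in: int, d_out: int, rank: int) -> int:
    """Trainable parameters added by one LoRA adapter:
    A is (rank x d_in) and B is (d_out x rank)."""
    return rank * (d_in + d_out)


n = 3_000_000_000  # "3B" taken at face value

full_gib = full_finetune_bytes(n) / GIB
base_gib = quantized_base_bytes(n) / GIB
adapter = lora_params(4096, 4096, 8)  # illustrative layer shape and rank
dense = 4096 * 4096                   # parameters in the full weight matrix

print(f"full fine-tuning state: {full_gib:.1f} GiB")
print(f"4-bit frozen base weights: {base_gib:.1f} GiB")
print(f"LoRA adapter is {adapter / dense:.2%} of the dense layer")
```

Under these assumptions the optimizer state for full fine-tuning alone is roughly three times the available memory, while the 4-bit base weights take up only a small fraction of it, which leaves room for the adapters, their optimizer state and activations.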