Instructions to use FinchResearch/Gurkha-copilot-1b with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- Transformers
How to use FinchResearch/Gurkha-copilot-1b with Transformers:
    # Use a pipeline as a high-level helper
    from transformers import pipeline

    pipe = pipeline("text-generation", model="FinchResearch/Gurkha-copilot-1b")

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM

    tokenizer = AutoTokenizer.from_pretrained("FinchResearch/Gurkha-copilot-1b")
    model = AutoModelForCausalLM.from_pretrained("FinchResearch/Gurkha-copilot-1b")
- Notebooks
- Google Colab
- Kaggle
- Local Apps
- vLLM
How to use FinchResearch/Gurkha-copilot-1b with vLLM:
Install from pip and serve model
    # Install vLLM from pip:
    pip install vllm

    # Start the vLLM server:
    vllm serve "FinchResearch/Gurkha-copilot-1b"

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "FinchResearch/Gurkha-copilot-1b",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker
    docker model run hf.co/FinchResearch/Gurkha-copilot-1b
- SGLang
How to use FinchResearch/Gurkha-copilot-1b with SGLang:
Install from pip and serve model
    # Install SGLang from pip:
    pip install sglang

    # Start the SGLang server:
    python3 -m sglang.launch_server \
      --model-path "FinchResearch/Gurkha-copilot-1b" \
      --host 0.0.0.0 \
      --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "FinchResearch/Gurkha-copilot-1b",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
Use Docker images
    docker run --gpus all \
      --shm-size 32g \
      -p 30000:30000 \
      -v ~/.cache/huggingface:/root/.cache/huggingface \
      --env "HF_TOKEN=<secret>" \
      --ipc=host \
      lmsysorg/sglang:latest \
      python3 -m sglang.launch_server \
        --model-path "FinchResearch/Gurkha-copilot-1b" \
        --host 0.0.0.0 \
        --port 30000

    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/completions" \
      -H "Content-Type: application/json" \
      --data '{
        "model": "FinchResearch/Gurkha-copilot-1b",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
      }'
- Docker Model Runner
How to use FinchResearch/Gurkha-copilot-1b with Docker Model Runner:
    docker model run hf.co/FinchResearch/Gurkha-copilot-1b
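The curl calls shown for the vLLM and SGLang servers can also be made from Python. The following is a minimal sketch using only the standard library; the helper names `build_completion_request` and `complete` are hypothetical, and it assumes a server is already running locally on port 8000 (vLLM) or 30000 (SGLang) and returns the usual OpenAI-style completions response.

```python
import json
import urllib.request

MODEL_ID = "FinchResearch/Gurkha-copilot-1b"


def build_completion_request(prompt, base_url="http://localhost:8000",
                             max_tokens=512, temperature=0.5):
    """Build a POST request for the OpenAI-compatible /v1/completions endpoint.

    Hypothetical helper: it mirrors the JSON body used in the curl examples.
    """
    payload = {
        "model": MODEL_ID,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": temperature,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the text of the first completion choice."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # Assumes the OpenAI-style response shape: {"choices": [{"text": ...}]}
    return body["choices"][0]["text"]


if __name__ == "__main__":
    # Use base_url="http://localhost:30000" for the SGLang server instead.
    print(complete("def fibonacci(n):", max_tokens=64))
```

Building the request separately from sending it makes it possible to inspect the payload without a running server.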
Model Card: Gurkha Coding Assistant
Overview
Name: Gurkha Coding Assistant
Model Type: Text Generation
Model Size: 1 billion parameters
Functionality: Code generation, code completion, text generation
Description
Gurkha is a coding assistant for generating and completing code. The model has 1 billion parameters and handles code snippets, code completions, and general text generation.
Use Cases
- Efficient code generation for multiple programming languages
- Accelerated code completion to streamline development workflows
- Automated generation of documentation and comments
- Creation of illustrative code samples and examples
- Exploring coding concepts through interactive code generation
Strengths
- Strong parameter base for robust performance
- Versatility in addressing diverse coding challenges
- Proficient generation of high-quality code
- Adaptability to different coding styles and languages
Limitations
- Focus is primarily on code-related tasks
- Complex tasks may require specific, detailed instructions to produce precise code
- Limited contextual understanding outside of coding domain
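One way to give the model specific instructions is to state the language, the task, and any constraints before the code to be completed. The sketch below is only an illustration: the model card does not document a prompt format, so the comment-style layout and the `build_prompt` helper are assumptions, and the `#` comment prefix suits Python-like languages only.

```python
def build_prompt(task, language, signature=None, constraints=()):
    """Assemble a comment-style prompt (hypothetical format, not from the model card)."""
    lines = [f"# Language: {language}", f"# Task: {task}"]
    # One line per constraint keeps each instruction explicit.
    for constraint in constraints:
        lines.append(f"# Constraint: {constraint}")
    # An optional function signature gives the model a concrete starting point.
    if signature:
        lines.append(signature)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(build_prompt(
        task="Return the n-th Fibonacci number iteratively",
        language="Python",
        signature="def fibonacci(n: int) -> int:",
        constraints=["No recursion", "Raise ValueError for negative n"],
    ))
```

The resulting string can be passed as the `prompt` field of a completion request or to a Transformers text-generation pipeline.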
Ethical Considerations
Gurkha's code generation adheres to ethical guidelines, producing content that is unbiased, respectful, and non-discriminatory. Users are encouraged to review generated code to ensure alignment with industry standards and coding best practices.
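As a first step in reviewing generated code, a syntax check can catch output that does not parse at all. The following sketch assumes the generated code is Python and uses the standard `ast` module; `check_python_syntax` is a hypothetical helper, and passing this check says nothing about correctness, security, or style, so it does not replace a manual review.

```python
import ast


def check_python_syntax(generated_code):
    """Return (True, "ok") if the code parses, else (False, <error description>)."""
    try:
        # ast.parse only parses the source; it never executes it.
        ast.parse(generated_code)
    except SyntaxError as error:
        return False, f"line {error.lineno}: {error.msg}"
    return True, "ok"


if __name__ == "__main__":
    sample = "def add(a, b):\n    return a + b\n"
    print(check_python_syntax(sample))
```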