Instructions to use rombodawg/Everyone-Coder-4x7b-Base with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

Libraries

How to use rombodawg/Everyone-Coder-4x7b-Base with Transformers:

```
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="rombodawg/Everyone-Coder-4x7b-Base")

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("rombodawg/Everyone-Coder-4x7b-Base")
model = AutoModelForCausalLM.from_pretrained("rombodawg/Everyone-Coder-4x7b-Base")
```

Local Apps

vLLM

How to use rombodawg/Everyone-Coder-4x7b-Base with vLLM:

Install from pip and serve model

```
# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "rombodawg/Everyone-Coder-4x7b-Base"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "rombodawg/Everyone-Coder-4x7b-Base",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker

```
docker model run hf.co/rombodawg/Everyone-Coder-4x7b-Base
```

SGLang

How to use rombodawg/Everyone-Coder-4x7b-Base with SGLang:

Install from pip and serve model

```
# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "rombodawg/Everyone-Coder-4x7b-Base" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "rombodawg/Everyone-Coder-4x7b-Base",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Use Docker images

```
docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "rombodawg/Everyone-Coder-4x7b-Base" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/completions" \
    -H "Content-Type: application/json" \
    --data '{
        "model": "rombodawg/Everyone-Coder-4x7b-Base",
        "prompt": "Once upon a time,",
        "max_tokens": 512,
        "temperature": 0.5
    }'
```

Docker Model Runner
How to use rombodawg/Everyone-Coder-4x7b-Base with Docker Model Runner:
```
docker model run hf.co/rombodawg/Everyone-Coder-4x7b-Base
```

How to prompt the merged model

by thomas-schillaci - opened Jan 30, 2024

Discussion

thomas-schillaci

Jan 30, 2024

Hi, thank you a lot for your work!
I'm wondering whether this is a completion model or an instruct model; I can't wrap my head around how the merged model can use the sub-models' system templates effectively.
I've been playing a bit with the model using Mistral Instruct's system template, and the results vary from good to weird.
For example:

What's the singleton pattern?

The singleton pattern is a design pattern that restricts the instantiation of a class to a single instance. This pattern is useful when exactly one object is needed to coordinate actions across the system.

The pattern dictates that a class should have a private constructor or setter to prevent instantiation from outside the class. A public static method is generally used to access the sole instance of the class. This method is responsible for creating the instance if it does not exist yet, and returning the existing instance if it does.

This pattern is useful in situations where you need to ensure that only one instance of an object exists throughout your application. It is frequently used with things like logging, database connections, or configuration settings. It can also be used to coordinate access to shared resources across different parts of your program.

[INST] code example [/INST]

Here's an example of a singleton pattern implementation in Java:
```
public class Singleton {

   // Declare a private static variable to hold a single instance of the class
   private static Singleton instance;

   // Declare a private constructor to prevent instantiation from outside the class
   private Singleton() {}

   // Declare a public static method to allow access to the sole instance
   public static Singleton getInstance() {
       if (instance == null) {
           instance = new Singleton();
       }
       return instance;
   }
}
```
This implementation ensures that only one instance of Singleton can exist at any given time. Accessing the instance is done through the getInstance() method. If the instance doesn't already exist, it is created inside the method. If it does exist, it is simply returned.
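
For reference, the exchange above follows the multi-turn layout that Mistral's instruct models document: each user message is wrapped in `[INST] ... [/INST]`, and each completed assistant reply is closed with `</s>`. The helper below is a minimal sketch of that layout only. `build_mistral_prompt` and its `add_bos` parameter are hypothetical names introduced for illustration, and nothing in this thread confirms that the merged model prefers this template.

```python
from typing import List, Optional, Tuple

# A turn is (user_message, assistant_reply); the reply is None for the
# final turn that the model is expected to answer.
Turn = Tuple[str, Optional[str]]


def build_mistral_prompt(turns: List[Turn], add_bos: bool = True) -> str:
    """Format a conversation in the Mistral Instruct style.

    Assumption: the layout is "<s>[INST] user [/INST] reply</s>[INST] ...".
    Set add_bos=False when the tokenizer or server already prepends "<s>",
    otherwise the beginning-of-sequence token would appear twice.
    """
    prompt = "<s>" if add_bos else ""
    for user_message, assistant_reply in turns:
        prompt += f"[INST] {user_message} [/INST]"
        if assistant_reply is not None:
            # Close a finished assistant turn with the end-of-sequence token.
            prompt += f" {assistant_reply}</s>"
    return prompt


if __name__ == "__main__":
    conversation = [
        ("What's the singleton pattern?", "The singleton pattern is a design pattern ..."),
        ("code example", None),
    ]
    print(build_mistral_prompt(conversation))
```

The resulting string can be passed as the `prompt` field of the `/v1/completions` requests shown earlier on this page; whether the server adds `<s>` itself depends on its tokenizer settings, so `add_bos` may need to be switched off.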

rombodawg

Owner Jan 30, 2024

I would lean more towards it being an instruct model. Considering the merged models are instruct models, I don't see why it would end up being a completion model.

thomas-schillaci

Jan 31, 2024

How do you prompt it? Do you use Mistral Instruct's system template? I'm a bit confused as TheBeagle and WizardMath are instruct models, whereas openchat and dolphin are chat models with different system templates.
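
To make the template mismatch concrete, here is a rough sketch of two chat formats commonly associated with the Dolphin and OpenChat model families: ChatML and the "GPT4 Correct User" layout. Both are assumptions based on those upstream projects' conventions, not something verified against this merge, and the special tokens (`<|im_start|>`, `<|im_end|>`, `<|end_of_turn|>`) may not exist in the merged model's tokenizer. The function names and the default system message are hypothetical.

```python
def chatml_prompt(user_message: str,
                  system_message: str = "You are a helpful coding assistant.") -> str:
    """ChatML-style prompt (assumed format for Dolphin-family models)."""
    return (
        f"<|im_start|>system\n{system_message}<|im_end|>\n"
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


def openchat_prompt(user_message: str) -> str:
    """OpenChat-3.5-style prompt (assumed format for OpenChat-family models)."""
    return f"GPT4 Correct User: {user_message}<|end_of_turn|>GPT4 Correct Assistant:"


# Candidate templates to try side by side against the same question.
CANDIDATE_TEMPLATES = {
    "chatml": chatml_prompt,
    "openchat": openchat_prompt,
}

if __name__ == "__main__":
    question = "What's the singleton pattern?"
    for name, render in CANDIDATE_TEMPLATES.items():
        print(f"--- {name} ---")
        print(render(question))
```

Sending the same question through each rendering and comparing the completions is one way to see which format, if any, the merged model follows most reliably.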

rombodawg

Owner Jan 31, 2024
