tiiuae/Falcon3-7B-Instruct

Text Generation · Transformers · Safetensors · llama · falcon3 · conversational · text-generation-inference

Instructions to use tiiuae/Falcon3-7B-Instruct with libraries, inference providers, notebooks, and local apps. Follow these links to get started.

  • Libraries
  • Transformers

    How to use tiiuae/Falcon3-7B-Instruct with Transformers:

    # Use a pipeline as a high-level helper
    from transformers import pipeline
    
    pipe = pipeline("text-generation", model="tiiuae/Falcon3-7B-Instruct")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    pipe(messages)

    # Load model directly
    from transformers import AutoTokenizer, AutoModelForCausalLM
    
    tokenizer = AutoTokenizer.from_pretrained("tiiuae/Falcon3-7B-Instruct")
    model = AutoModelForCausalLM.from_pretrained("tiiuae/Falcon3-7B-Instruct")
    messages = [
        {"role": "user", "content": "Who are you?"},
    ]
    inputs = tokenizer.apply_chat_template(
    	messages,
    	add_generation_prompt=True,
    	tokenize=True,
    	return_dict=True,
    	return_tensors="pt",
    ).to(model.device)
    
    outputs = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))
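    The final line above decodes only the newly generated text: generate() returns the prompt tokens followed by the continuation, so the slice outputs[0][inputs["input_ids"].shape[-1]:] drops the prompt. A toy sketch of that indexing with plain Python lists standing in for token tensors (the token ids here are made up):

    ```python
    # generate() returns prompt tokens + new tokens in one sequence, so
    # slicing from the prompt length onward keeps only the model's reply.
    prompt_ids = [101, 2040, 2024, 2017]      # stand-in prompt token ids
    generated = prompt_ids + [1045, 2572]     # stand-in output of generate()
    new_tokens = generated[len(prompt_ids):]  # mirrors outputs[0][inputs["input_ids"].shape[-1]:]
    print(new_tokens)  # → [1045, 2572]
    ```

    Decoding the full sequence instead would echo the user's prompt back in the printed output.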
  • Notebooks
  • Google Colab
  • Kaggle
  • Local Apps
  • vLLM

    How to use tiiuae/Falcon3-7B-Instruct with vLLM:

    Install vLLM from pip and serve the model:
    # Install vLLM from pip:
    pip install vllm
    # Start the vLLM server:
    vllm serve "tiiuae/Falcon3-7B-Instruct"
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:8000/v1/chat/completions" \
    	-H "Content-Type: application/json" \
    	--data '{
    		"model": "tiiuae/Falcon3-7B-Instruct",
    		"messages": [
    			{
    				"role": "user",
    				"content": "What is the capital of France?"
    			}
    		]
    	}'
    Or run it with Docker Model Runner:
    docker model run hf.co/tiiuae/Falcon3-7B-Instruct
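    The curl call above can also be issued from Python with only the standard library. A minimal sketch, assuming the vLLM server from the previous step is listening on localhost:8000; build_chat_request is a hypothetical helper written here for illustration, not part of vLLM:

    ```python
    import json
    from urllib import request

    def build_chat_request(base_url, model, user_content):
        """Build an OpenAI-compatible chat completion POST (no network I/O here)."""
        payload = {
            "model": model,
            "messages": [{"role": "user", "content": user_content}],
        }
        return request.Request(
            f"{base_url}/v1/chat/completions",
            data=json.dumps(payload).encode("utf-8"),
            headers={"Content-Type": "application/json"},
        )

    req = build_chat_request(
        "http://localhost:8000", "tiiuae/Falcon3-7B-Instruct",
        "What is the capital of France?",
    )
    # To actually send it (requires the running server):
    # with request.urlopen(req) as resp:
    #     print(json.loads(resp.read())["choices"][0]["message"]["content"])
    ```

    Separating request construction from sending keeps the payload easy to inspect before any network call is made.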
  • SGLang

    How to use tiiuae/Falcon3-7B-Instruct with SGLang:

    Install SGLang from pip and serve the model:
    # Install SGLang from pip:
    pip install sglang
    # Start the SGLang server:
    python3 -m sglang.launch_server \
        --model-path "tiiuae/Falcon3-7B-Instruct" \
        --host 0.0.0.0 \
        --port 30000
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
    	-H "Content-Type: application/json" \
    	--data '{
    		"model": "tiiuae/Falcon3-7B-Instruct",
    		"messages": [
    			{
    				"role": "user",
    				"content": "What is the capital of France?"
    			}
    		]
    	}'
    Or use the Docker image:
    docker run --gpus all \
        --shm-size 32g \
        -p 30000:30000 \
        -v ~/.cache/huggingface:/root/.cache/huggingface \
        --env "HF_TOKEN=<secret>" \
        --ipc=host \
        lmsysorg/sglang:latest \
        python3 -m sglang.launch_server \
            --model-path "tiiuae/Falcon3-7B-Instruct" \
            --host 0.0.0.0 \
            --port 30000
    # Call the server using curl (OpenAI-compatible API):
    curl -X POST "http://localhost:30000/v1/chat/completions" \
    	-H "Content-Type: application/json" \
    	--data '{
    		"model": "tiiuae/Falcon3-7B-Instruct",
    		"messages": [
    			{
    				"role": "user",
    				"content": "What is the capital of France?"
    			}
    		]
    	}'
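    Both servers speak the OpenAI chat completions format, so the reply text sits at choices[0].message.content in the response JSON. A minimal parsing sketch using a hand-written, abridged response body (a real response carries additional fields such as id, created, and usage):

    ```python
    import json

    # Abridged, hand-written example of an OpenAI-compatible chat response body.
    raw = json.dumps({
        "model": "tiiuae/Falcon3-7B-Instruct",
        "choices": [
            {"index": 0,
             "message": {"role": "assistant",
                         "content": "The capital of France is Paris."},
             "finish_reason": "stop"}
        ],
    })

    # Extract the assistant's reply from the first choice.
    reply = json.loads(raw)["choices"][0]["message"]["content"]
    print(reply)  # → The capital of France is Paris.
    ```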
  • Docker Model Runner

    How to use tiiuae/Falcon3-7B-Instruct with Docker Model Runner:

    docker model run hf.co/tiiuae/Falcon3-7B-Instruct

add scores for 1B

#3 by wdevazelhes · opened Dec 16, 2024
base: refs/heads/main ← from: refs/pr/3
+91 −138
initial commit (d5821dc5)
Upload tokenizer (fff4a801)
Upload LlamaForCausalLM (63bf7a8f)
Upload LlamaForCausalLM (2d3dc103)
Upload tokenizer_config.json (#1) (47bca49f)
docs(readme.md): init readme (2b69b8fc)
udpate config (4ed3ec12)
Update README.md (2abf5a83)
docs(readme): update template (a7daa486)
update lang (8b9bebc6)
Update README.md (0753ac9b)
Upload LlamaForCausalLM (79863c97)
Upload tokenizer (a035bab1)
Update README.md (7f590ee7)
Upload tokenizer (95d5f5a6)
Upload tokenizer (e4dabe08)
Clean up ec2 path (#2) (c61e8e0c)
Update config.json (ca6ceed0)
feat add pad_token (893fefdd)
feat add pad_token (52582163)
Update README.md (e37f5870)
docs(readme): benchs (15015d25)
feat: add license (2465d9cd)
docs(readme): update (c7b3473c)
docs(readme): fix (e8c736b4)
docs(readme): fix (d570153b)
docs(readme): update (3097aa69)
Update README.md (a64ebc07)
docs(readme): update (7aae4f39)
wdevazelhes · Dec 16, 2024
No description provided.

add scores for 1B (45031915)

wdevazelhes · Dec 16, 2024

pushed here by mistake, closing

wdevazelhes changed pull request status to closed · Dec 16, 2024

