Update chatNT.py
--- a/chatNT.py
+++ b/chatNT.py
@@ -720,8 +720,11 @@ class TorchMultiOmicsModel(PreTrainedModel):
     projected_bio_embeddings.append(proj)
     for key in output.keys():
         outs[f"{key}_{bio_seq_num}"] = output[key]
+    outs[f"bio_embeddings_list_{bio_seq_num}"] = proj
+
 
 projected_bio_embeddings = torch.stack(projected_bio_embeddings, dim=1)
+outs["projected_bio_embeddings"] = projected_bio_embeddings.clone()
 
 # decode
 logits = self.biobrain_decoder(
@@ -730,13 +733,6 @@ class TorchMultiOmicsModel(PreTrainedModel):
 )
 
 outs["logits"] = logits
-outs["projected_bio_embeddings"] = projected_bio_embeddings
-
-# Just for debugging
-print("(debug) remember to remove bio_embeddings storage")
-if projected_bio_embeddings is not None:
-    for i, embed in enumerate(bio_embeddings_list):
-        outs[f"bio_embeddings_list_{i}"] = embed
 
 return outs
 
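After this change, the output dict carries one `bio_embeddings_list_{bio_seq_num}` entry per biological sequence (now holding `proj`) alongside `projected_bio_embeddings` and `logits`. A caller that wants the per-sequence entries back in order could gather them roughly as sketched below. The helper name `collect_bio_embeddings` is hypothetical and not part of chatNT.py, the sketch assumes `bio_seq_num` is a non-negative integer, and plain strings stand in for the tensors.

```python
import re
from typing import Any, Dict, List

# Matches the per-sequence keys written by the patched forward pass.
_BIO_KEY = re.compile(r"^bio_embeddings_list_(\d+)$")


def collect_bio_embeddings(outs: Dict[str, Any]) -> List[Any]:
    """Hypothetical helper: return per-sequence entries ordered by index."""
    indexed = []
    for key, value in outs.items():
        match = _BIO_KEY.match(key)
        if match:
            indexed.append((int(match.group(1)), value))
    # Sort on the integer index only, so the values are never compared.
    indexed.sort(key=lambda pair: pair[0])
    return [value for _, value in indexed]


# Stand-in output dict; real values would be tensors from the model.
outs = {
    "logits": "L",
    "bio_embeddings_list_1": "e1",
    "bio_embeddings_list_0": "e0",
    "projected_bio_embeddings": "P",
}
embeddings = collect_bio_embeddings(outs)
```

Parsing the index as an integer, rather than sorting the keys as strings, keeps sequence 10 after sequence 9 if many sequences are passed.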