Model: InfosysEnterprise/NT-Java-1.1B
Update README.md
README.md
CHANGED
````diff
@@ -78,6 +78,8 @@ The model is intended for commercial use for Java programming tasks. The model p
 3. Code generation/Completion task in Java
 4. FIM task in Java
 
+## Sample inference code
+
 ### Generation
 ```Java
 # pip install -q transformers
@@ -93,7 +95,17 @@ inputs = tokenizer.encode("public class HelloWorld {\n public static void mai
 outputs = model.generate(inputs)
 print(tokenizer.decode(outputs[0]))
 ```
-###
+### Fill-in-the-middle
+Fill-in-the-middle uses special tokens to identify the prefix/middle/suffix part of the input and output:
+
+```Java
+input_text = "<fim_prefix>public class PalindromeChecker {\n public static boolean isPalindrome(String str) {\n <fim_suffix>return true;\n }\n<fim_middle>"
+inputs = tokenizer.encode(input_text, return_tensors="pt").to(device)
+outputs = model.generate(inputs)
+print(tokenizer.decode(outputs[0]))
+```
+
+### Quantized Versions through `bitsandbytes`
 * _Using 8-bit precision (int8)_
 
 ```java
````
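The fill-in-the-middle prompt added in this change is a single string in which `<fim_prefix>`, `<fim_suffix>`, and `<fim_middle>` mark where each part begins. The sketch below shows one way to assemble such a prompt and to pull the generated middle back out of the decoded text. The helper names `build_fim_prompt` and `extract_middle` are hypothetical, and the `<|endoftext|>` end marker is an assumption about the tokenizer rather than something the README states. No model is loaded here; the decoded string is a stand-in for `tokenizer.decode(outputs[0])`.

```python
# Special tokens taken from the README's fill-in-the-middle example.
FIM_PREFIX = "<fim_prefix>"
FIM_SUFFIX = "<fim_suffix>"
FIM_MIDDLE = "<fim_middle>"
# Assumption: the tokenizer ends generations with this marker.
END_OF_TEXT = "<|endoftext|>"


def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Hypothetical helper: lay out prefix and suffix in the order the README uses."""
    return f"{FIM_PREFIX}{prefix}{FIM_SUFFIX}{suffix}{FIM_MIDDLE}"


def extract_middle(decoded: str) -> str:
    """Hypothetical helper: return the text generated after <fim_middle>.

    `decoded` is expected to be the full decoded output, prompt included.
    """
    _, marker, tail = decoded.partition(FIM_MIDDLE)
    if not marker:
        raise ValueError("decoded text contains no <fim_middle> marker")
    # Drop the end-of-text marker and anything after it, if present.
    return tail.split(END_OF_TEXT, 1)[0]


if __name__ == "__main__":
    prefix = (
        "public class PalindromeChecker {\n"
        "    public static boolean isPalindrome(String str) {\n"
        "        "
    )
    suffix = "return true;\n    }\n"
    prompt = build_fim_prompt(prefix, suffix)

    # Stand-in for tokenizer.decode(outputs[0]); not real model output.
    fake_decoded = prompt + "// generated middle goes here\n        " + END_OF_TEXT
    middle = extract_middle(fake_decoded)

    # The completed source is prefix + middle + suffix.
    print(prefix + middle + suffix)
```

Keeping prompt assembly and post-processing in small functions makes it easy to swap the placeholder string for the real decoded output once the model and tokenizer are loaded.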