Update Nemo FW References
README.md CHANGED

@@ -40,7 +40,7 @@ Nemotron-4-340B-Instruct is a chat model intended for use for the English langua
 
 Nemotron-4-340B-Instruct is designed for Synthetic Data Generation to enable developers and enterprises for building and customizing their own large language models and LLM applications.
 
-The instruct model itself can be further customized using the [NeMo Framework](https://docs.nvidia.com/nemo-framework/index.html) suite of customization tools including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA, and more), and Model Alignment (SFT, SteerLM, RLHF, and more) using [NeMo-Aligner](https://github.com/NVIDIA/NeMo-Aligner).
+The instruct model itself can be further customized using the [NeMo Framework](https://docs.nvidia.com/nemo-framework/index.html) suite of customization tools including Parameter-Efficient Fine-Tuning (P-tuning, Adapters, LoRA, and more), and Model Alignment (SFT, SteerLM, RLHF, and more) using [NeMo-Aligner](https://github.com/NVIDIA/NeMo-Aligner). Refer to the [documentation](https://docs.nvidia.com/nemo-framework/user-guide/latest/llms/nemotron/index.html) for examples.
 
 **Model Developer:** NVIDIA
 
@@ -156,7 +156,7 @@ if response.endswith("<extra_id_1>"):
 print(response)
 ```
 
-2. Given this Python script, create a Bash script which spins up the inference server within the [NeMo container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo) (```docker pull nvcr.io/nvidia/nemo:24.
+2. Given this Python script, create a Bash script which spins up the inference server within the [NeMo container](https://catalog.ngc.nvidia.com/orgs/nvidia/containers/nemo) (```docker pull nvcr.io/nvidia/nemo:24.05```) and calls the Python script ``call_server.py``. The Bash script ``nemo_inference.sh`` is as follows,
 
 ```bash
 NEMO_FILE=$1
 
@@ -221,7 +221,7 @@ RESULTS=<PATH_TO_YOUR_SCRIPTS_FOLDER>
 OUTFILE="${RESULTS}/slurm-%j-%n.out"
 ERRFILE="${RESULTS}/error-%j-%n.out"
 MODEL=<PATH_TO>/Nemotron-4-340B-Instruct
-CONTAINER="nvcr.io/nvidia/nemo:24.
+CONTAINER="nvcr.io/nvidia/nemo:24.05"
 MOUNTS="--container-mounts=<PATH_TO_YOUR_SCRIPTS_FOLDER>:/scripts,MODEL:/model"
 
 read -r -d '' cmd <<EOF
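Two details in the final hunk above are easy to trip over: the Slurm `%j`/`%n` placeholders in `OUTFILE`/`ERRFILE` are expanded by Slurm, not by the shell, and `read -r -d '' cmd <<EOF` is the idiom for capturing a multi-line heredoc into one variable. A minimal standalone sketch of that pattern, with a hypothetical `/tmp/results` path standing in for the `<PATH_TO_YOUR_SCRIPTS_FOLDER>` placeholder:

```shell
#!/bin/bash
# Hypothetical stand-in for <PATH_TO_YOUR_SCRIPTS_FOLDER> in the README.
RESULTS=/tmp/results
mkdir -p "$RESULTS"

# %j (job ID) and %n (node number) are expanded by sbatch/srun, not by bash,
# so they remain literal in these variables until Slurm sees them.
OUTFILE="${RESULTS}/slurm-%j-%n.out"
ERRFILE="${RESULTS}/error-%j-%n.out"

# `read -r -d ''` slurps the whole heredoc into one variable; it returns a
# non-zero status at EOF, which is expected with -d '', hence the `|| true`.
read -r -d '' cmd <<EOF || true
echo "starting inference server"
EOF

echo "$cmd"
```

The captured `$cmd` string can then be handed to `srun --output="$OUTFILE" --error="$ERRFILE" ... bash -c "$cmd"`, which is the shape the script in the diff follows.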