fix typo
README.md
CHANGED
@@ -63,7 +63,7 @@ Model prediction is significantly faster on a GPU, and so usage of the `.to('cud
 
 Furthermore, the FLAN-T5 model architecture makes use
 of many normalization layers, as is common in the transformer architecture. By default, CyberSolve uses the T5 model's `T5LayerNorm` Python class; it is highly recommended that user install the Nvidia `Apex` package for Nvidia GPUs
-or the ROCm `Apex` package for AMD GPUs. Once installed, the model will default to using the `apex.normalization.FusedRMSNorm` class when computing the normalization layers. The `FusedRMSNorm` class from `apex` makes use of an optimized
+or the ROCm `Apex` package for AMD GPUs. Once installed, the model will default to using the `apex.normalization.FusedRMSNorm` class when computing the normalization layers. The `FusedRMSNorm` class from `apex` makes use of an optimized fused kernel
 that is much faster than the standard `T5LayerNorm` class, thereby significantly improving both inference and training.
 
 The base FLAN-T5 model is capable of answering a variety of prompts, but the domain-adapted CyberSolve LinAlg model is designed specifically for solving linear equations. As such, users must be considerate in their prompt
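
For readers who want to know which normalization path the README text in this hunk implies for their environment, here is a minimal sketch. It only checks whether a top-level package named `apex` is importable. It does not confirm that `FusedRMSNorm` was built with its fused kernels or that the model actually selects it. The function name `pick_norm_backend` is hypothetical and is not part of CyberSolve or `apex`.

```python
import importlib.util


def pick_norm_backend() -> str:
    """Guess which normalization class the README text implies will be used.

    Assumption: the presence of an importable top-level `apex` package is
    treated as a stand-in for "Apex is installed"; nothing is imported or run.
    """
    # find_spec returns None when no top-level package with this name is found
    if importlib.util.find_spec("apex") is not None:
        return "apex.normalization.FusedRMSNorm"
    # Fallback named in the README: the standard T5 layer norm class
    return "T5LayerNorm"


if __name__ == "__main__":
    print(pick_norm_backend())
```

Running the script prints one of the two class names, which gives a quick hint before loading the model on a GPU.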