MarioBarbeque committed on
Commit 17abd8d · verified · 1 Parent(s): 8302bae
Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -63,7 +63,7 @@ Model prediction is significantly faster on a GPU, and so usage of the `.to('cud
 
  Furthermore, the FLAN-T5 model architecture makes use
  of many normalization layers, as is common in the transformer architecture. By default, CyberSolve uses the T5 model's `T5LayerNorm` Python class; it is highly recommended that user install the Nvidia `Apex` package for Nvidia GPUs
- or the ROCm `Apex` package for AMD GPUs. Once installed, the model will default to using the `apex.normalization.FusedRMSNorm` class when computing the normalization layers. The `FusedRMSNorm` class from `apex` makes use of an optimized fushed kernel
+ or the ROCm `Apex` package for AMD GPUs. Once installed, the model will default to using the `apex.normalization.FusedRMSNorm` class when computing the normalization layers. The `FusedRMSNorm` class from `apex` makes use of an optimized fused kernel
  that is much faster than the standard `T5LayerNorm` class, thereby significantly improving both inference and training.
 
  The base FLAN-T5 model is capable of answering a variety of prompts, but the domain-adapted CyberSolve LinAlg model is designed specifically for solving linear equations. As such, users must be considerate in their prompt
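
Review note: the paragraph touched by this change says the model uses `apex.normalization.FusedRMSNorm` once `apex` is installed and falls back to `T5LayerNorm` otherwise. A minimal sketch of how a reader might check which case applies on their machine is below. The helper names `apex_fused_rmsnorm_available` and `expected_norm_class` are hypothetical and not part of CyberSolve or `transformers`. The check only detects whether the module can be located; it does not confirm that the fused kernel actually loads on a given GPU.

```python
import importlib.util


def apex_fused_rmsnorm_available() -> bool:
    """Return True if the `apex.normalization` module can be located."""
    try:
        # find_spec imports the parent package `apex` first, so a missing
        # install surfaces here as an ImportError (ModuleNotFoundError).
        return importlib.util.find_spec("apex.normalization") is not None
    except ImportError:
        return False


def expected_norm_class() -> str:
    """Hypothetical helper: name of the norm class the README says will be used."""
    if apex_fused_rmsnorm_available():
        return "apex.normalization.FusedRMSNorm"
    return "T5LayerNorm"


if __name__ == "__main__":
    print(f"Expected normalization class: {expected_norm_class()}")
```

Running the script before loading the model gives a quick indication of whether installing `apex` is still outstanding.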