Update README.md

README.md CHANGED

@@ -169,10 +169,14 @@ Exporting to ExecuTorch requires you clone and install [ExecuTorch](https://gith
 ## Convert quantized checkpoint to ExecuTorch's format
 ```
 python -m executorch.examples.models.phi_4_mini.convert_weights phi4-mini-8dq4w.bin phi4-mini-8dq4w-converted.bin
 ```
 ## Export to an ExecuTorch *.pte with XNNPACK
 ```
 PARAMS="executorch/examples/models/phi_4_mini/config.json"
@@ -188,7 +192,7 @@ python -m executorch.examples.models.llama.export_llama \
 ```
 ## Running in a mobile app
-The model can be run in a mobile app.  See [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this in iOS.
 On iPhone 15 Pro, the model runs at 17.3 tokens/sec and uses 3206 MB of memory.
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66049fc71116cebd1d3bdcf4/AEdAJjGK2lED7tr6seWGf.png)

 ## Convert quantized checkpoint to ExecuTorch's format
+ExecuTorch expects specific checkpoint key names at export time. The following script converts the quantized Hugging Face checkpoint to the key naming ExecuTorch expects.
 ```
 python -m executorch.examples.models.phi_4_mini.convert_weights phi4-mini-8dq4w.bin phi4-mini-8dq4w-converted.bin
 ```
+Once the checkpoint is converted, we can export to ExecuTorch's PTE format.
 ## Export to an ExecuTorch *.pte with XNNPACK
 ```
 PARAMS="executorch/examples/models/phi_4_mini/config.json"
 ```
 ## Running in a mobile app
+The PTE file can be run with ExecuTorch. See the [instructions](https://pytorch.org/executorch/main/llm/llama-demo-ios.html) for doing this on iOS.
 On iPhone 15 Pro, the model runs at 17.3 tokens/sec and uses 3206 MB of memory.
 ![image/png](https://cdn-uploads.huggingface.co/production/uploads/66049fc71116cebd1d3bdcf4/AEdAJjGK2lED7tr6seWGf.png)
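
For readers wondering what "converting checkpoint keys" involves, the idea can be sketched as a rename pass over a state dict. This is only an illustration: the rename rules and key names below are hypothetical and are not the mapping used by `executorch.examples.models.phi_4_mini.convert_weights`, and a plain dict stands in for a loaded checkpoint.

```python
import re

# Hypothetical rename rules (regex pattern -> replacement template).
# These are illustrative only, NOT the real ExecuTorch mapping.
HYPOTHETICAL_RULES = [
    (r"^model\.layers\.(\d+)\.self_attn\.o_proj\.(.+)$", r"layers.\1.attention.wo.\2"),
    (r"^model\.norm\.(.+)$", r"norm.\1"),
]


def rename_keys(state_dict, rules=HYPOTHETICAL_RULES):
    """Return a new dict whose keys are rewritten by the first matching rule."""
    converted = {}
    for key, value in state_dict.items():
        for pattern, template in rules:
            new_key, count = re.subn(pattern, template, key)
            if count:
                converted[new_key] = value
                break
        else:
            # Fail loudly so an unmapped weight is never silently dropped.
            raise KeyError(f"no rename rule matches checkpoint key: {key}")
    return converted


# Plain values stand in for tensors in this sketch.
example = {
    "model.layers.0.self_attn.o_proj.weight": "tensor-a",
    "model.norm.weight": "tensor-b",
}
converted_example = rename_keys(example)
```

Raising on an unmatched key is a deliberate choice in this sketch: a missing weight at export time is harder to diagnose than an early error during conversion.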