Instructions to use AesSedai/Step-3.5-Flash-GGUF with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- llama-cpp-python
How to use AesSedai/Step-3.5-Flash-GGUF with llama-cpp-python:
# !pip install llama-cpp-python from llama_cpp import Llama llm = Llama.from_pretrained( repo_id="AesSedai/Step-3.5-Flash-GGUF", filename="IQ2_S/Step-3.5-Flash-IQ2_S-00001-of-00003.gguf", )
llm.create_chat_completion( messages = "No input example has been defined for this model task." )
- Notebooks
- Google Colab
- Kaggle
- Local Apps Settings
- llama.cpp
How to use AesSedai/Step-3.5-Flash-GGUF with llama.cpp:
Install from brew
brew install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Install from WinGet (Windows)
winget install llama.cpp # Start a local OpenAI-compatible server with a web UI: llama-server -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M # Run inference directly in the terminal: llama-cli -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Use pre-built binary
# Download pre-built binary from: # https://github.com/ggerganov/llama.cpp/releases # Start a local OpenAI-compatible server with a web UI: ./llama-server -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M # Run inference directly in the terminal: ./llama-cli -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Build from source code
git clone https://github.com/ggerganov/llama.cpp.git cd llama.cpp cmake -B build cmake --build build -j --target llama-server llama-cli # Start a local OpenAI-compatible server with a web UI: ./build/bin/llama-server -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M # Run inference directly in the terminal: ./build/bin/llama-cli -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Use Docker
docker model run hf.co/AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
- LM Studio
- Jan
- Ollama
How to use AesSedai/Step-3.5-Flash-GGUF with Ollama:
ollama run hf.co/AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
- Unsloth Studio
How to use AesSedai/Step-3.5-Flash-GGUF with Unsloth Studio:
Install Unsloth Studio (macOS, Linux, WSL)
curl -fsSL https://unsloth.ai/install.sh | sh # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AesSedai/Step-3.5-Flash-GGUF to start chatting
Install Unsloth Studio (Windows)
irm https://unsloth.ai/install.ps1 | iex # Run unsloth studio unsloth studio -H 0.0.0.0 -p 8888 # Then open http://localhost:8888 in your browser # Search for AesSedai/Step-3.5-Flash-GGUF to start chatting
Using HuggingFace Spaces for Unsloth
# No setup required # Open https://huggingface.co/spaces/unsloth/studio in your browser # Search for AesSedai/Step-3.5-Flash-GGUF to start chatting
- Pi
How to use AesSedai/Step-3.5-Flash-GGUF with Pi:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Configure the model in Pi
# Install Pi: npm install -g @mariozechner/pi-coding-agent # Add to ~/.pi/agent/models.json: { "providers": { "llama-cpp": { "baseUrl": "http://localhost:8080/v1", "api": "openai-completions", "apiKey": "none", "models": [ { "id": "AesSedai/Step-3.5-Flash-GGUF:Q4_K_M" } ] } } }Run Pi
# Start Pi in your project directory: pi
- Hermes Agent new
How to use AesSedai/Step-3.5-Flash-GGUF with Hermes Agent:
Start the llama.cpp server
# Install llama.cpp: brew install llama.cpp # Start a local OpenAI-compatible server: llama-server -hf AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Configure Hermes
# Install Hermes: curl -fsSL https://hermes-agent.nousresearch.com/install.sh | bash hermes setup # Point Hermes at the local server: hermes config set model.provider custom hermes config set model.base_url http://127.0.0.1:8080/v1 hermes config set model.default AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Run Hermes
hermes
- Docker Model Runner
How to use AesSedai/Step-3.5-Flash-GGUF with Docker Model Runner:
docker model run hf.co/AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
- Lemonade
How to use AesSedai/Step-3.5-Flash-GGUF with Lemonade:
Pull the model
# Download Lemonade from https://lemonade-server.ai/ lemonade pull AesSedai/Step-3.5-Flash-GGUF:Q4_K_M
Run and chat with the model
lemonade run user.Step-3.5-Flash-GGUF-Q4_K_M
List all available models
lemonade list
| model_name,file_size_gb,bpw,Mean KLD_mean,0.01.235.876 I common_memory_breakdown_print,0.01.257.220 I common_params_fit_impl,0.01.257.224 I common_fit_params,0.01.277.599 I llama_model_loader,0.01.277.600 I llama_model_loader,0.01.277.604 I llama_model_loader,0.01.277.605 I llama_model_loader,0.01.277.606 I llama_model_loader,0.01.277.621 I llama_model_loader,0.01.277.627 I llama_model_loader,0.01.277.628 I llama_model_loader,0.01.277.630 I llama_model_loader,0.01.277.632 I llama_model_loader,0.01.277.633 I llama_model_loader,0.01.277.644 I llama_model_loader,0.01.293.704 I llama_model_loader,0.01.293.705 I llama_model_loader,0.01.293.706 I print_info,0.01.293.707 I print_info,0.01.331.421 I load,0.01.339.471 I load,0.01.339.542 I load,0.01.361.393 I load,0.01.361.407 I print_info,0.01.361.408 I print_info,0.01.361.409 I print_info,0.01.361.410 I print_info,0.01.361.417 I print_info,0.01.361.418 I print_info,0.01.361.419 I print_info,0.01.361.421 I print_info,0.01.361.422 I print_info,0.01.361.423 I print_info,0.01.361.424 I print_info,0.01.361.425 I print_info,0.01.361.426 I print_info,0.01.361.427 I print_info,0.01.361.429 I print_info,0.01.361.430 I print_info,0.01.361.432 I print_info,0.01.361.434 I print_info,0.01.361.435 I print_info,0.01.361.436 I print_info,0.01.361.437 I print_info,0.01.361.438 I print_info,0.01.808.645 I common_memory_breakdown_print,0.01.829.278 I common_params_fit_impl,0.01.829.282 I common_fit_params,0.01.850.076 I llama_model_loader,0.01.850.077 I llama_model_loader,0.01.850.081 I llama_model_loader,0.01.850.082 I llama_model_loader,0.01.850.083 I llama_model_loader,0.01.850.099 I llama_model_loader,0.01.850.106 I llama_model_loader,0.01.850.107 I llama_model_loader,0.01.850.108 I llama_model_loader,0.01.850.109 I llama_model_loader,0.01.850.110 I llama_model_loader,0.01.850.122 I llama_model_loader,0.01.866.219 I llama_model_loader,0.01.866.220 I llama_model_loader,0.01.866.221 I llama_model_loader,0.01.866.224 I llama_model_loader,0.01.866.226 I print_info,0.01.903.238 I load,0.01.911.208 I load,0.01.911.209 I load,0.01.911.279 I load,0.01.933.201 I load,0.01.933.217 I print_info,0.01.933.218 I print_info,0.01.933.219 I print_info,0.01.933.227 I print_info,0.01.933.229 I print_info,0.01.933.230 I print_info,0.01.933.232 I print_info,0.01.933.233 I print_info,0.01.933.234 I print_info,0.01.933.235 I print_info,0.01.933.236 I print_info,0.01.933.237 I print_info,0.01.933.238 I print_info,0.01.933.239 I print_info,0.01.933.240 I print_info,0.01.933.242 I print_info,0.01.933.243 I print_info,0.01.933.244 I print_info,0.01.933.245 I print_info,0.01.933.246 I print_info,0.01.933.247 I print_info,0.01.933.248 I print_info,0.01.933.249 I print_info,0.02.185.584 I common_memory_breakdown_print,0.02.207.362 I common_params_fit_impl,0.02.207.366 I common_fit_params,0.02.228.247 I llama_model_loader,0.02.228.248 I llama_model_loader,0.02.228.253 I llama_model_loader,0.02.228.254 I llama_model_loader,0.02.228.274 I llama_model_loader,0.02.228.280 I llama_model_loader,0.02.228.282 I llama_model_loader,0.02.228.283 I llama_model_loader,0.02.228.284 I llama_model_loader,0.02.228.285 I llama_model_loader,0.02.228.297 I llama_model_loader,0.02.244.540 I llama_model_loader,0.02.244.541 I llama_model_loader,0.02.244.542 I llama_model_loader,0.02.244.544 I llama_model_loader,0.02.244.545 I llama_model_loader,0.02.244.546 I print_info,0.02.244.547 I print_info,0.02.282.395 I load,0.02.290.487 I load,0.02.290.488 I load,0.02.290.570 I load,0.02.312.390 I load,0.02.312.405 I print_info,0.02.312.406 I print_info,0.02.312.407 I print_info,0.02.312.416 I print_info,0.02.312.417 I print_info,0.02.312.418 I print_info,0.02.312.420 I print_info,0.02.312.422 I print_info,0.02.312.423 I print_info,0.02.312.424 I print_info,0.02.312.425 I print_info,0.02.312.426 I print_info,0.02.312.427 I print_info,0.02.312.428 I print_info,0.02.312.429 I print_info,0.02.312.430 I print_info,0.02.312.431 I print_info,0.02.312.432 I print_info,0.02.312.433 I print_info,0.02.312.435 I print_info,0.02.312.436 I print_info,0.02.312.437 I print_info,0.02.312.438 I print_info,0.02.373.319 I common_memory_breakdown_print,0.02.392.992 I common_params_fit_impl,0.02.392.997 I common_fit_params,0.02.413.471 I llama_model_loader,0.02.413.472 I llama_model_loader,0.02.413.476 I llama_model_loader,0.02.413.477 I llama_model_loader,0.02.413.478 I llama_model_loader,0.02.413.494 I llama_model_loader,0.02.413.526 I llama_model_loader,0.02.413.527 I llama_model_loader,0.02.413.528 I llama_model_loader,0.02.413.530 I llama_model_loader,0.02.413.531 I llama_model_loader,0.02.413.532 I llama_model_loader,0.02.413.543 I llama_model_loader,0.02.430.064 I llama_model_loader,0.02.430.065 I llama_model_loader,0.02.430.068 I llama_model_loader,0.02.430.069 I llama_model_loader,0.02.430.070 I print_info,0.02.430.071 I print_info,0.02.467.724 I load,0.02.475.719 I load,0.02.475.720 I load,0.02.475.792 I load,0.02.494.731 I common_memory_breakdown_print,0.02.497.692 I load,0.02.497.708 I print_info,0.02.497.709 I print_info,0.02.497.710 I print_info,0.02.497.718 I print_info,0.02.497.719 I print_info,0.02.497.720 I print_info,0.02.497.722 I print_info,0.02.497.724 I print_info,0.02.497.725 I print_info,0.02.497.726 I print_info,0.02.497.727 I print_info,0.02.497.728 I print_info,0.02.497.729 I print_info,0.02.497.730 I print_info,0.02.497.731 I print_info,0.02.497.732 I print_info,0.02.497.733 I print_info,0.02.497.734 I print_info,0.02.497.735 I print_info,0.02.497.736 I print_info,0.02.497.737 I print_info,0.02.497.738 I print_info,0.02.497.739 I print_info,0.02.497.740 I print_info,0.02.516.789 I common_params_fit_impl,0.02.516.794 I common_fit_params,0.02.537.238 I llama_model_loader,0.02.537.239 I llama_model_loader,0.02.537.243 I llama_model_loader,0.02.537.244 I llama_model_loader,0.02.537.245 I llama_model_loader,0.02.537.261 I llama_model_loader,0.02.537.269 I llama_model_loader,0.02.537.271 I llama_model_loader,0.02.537.272 I llama_model_loader,0.02.537.274 I llama_model_loader,0.02.537.275 I llama_model_loader,0.02.537.287 I llama_model_loader,0.02.553.298 I llama_model_loader,0.02.553.299 I llama_model_loader,0.02.553.300 I llama_model_loader,0.02.553.302 I llama_model_loader,0.02.553.303 I llama_model_loader,0.02.553.304 I print_info,0.02.553.305 I print_info,0.02.590.387 I load,0.02.598.412 I load,0.02.598.413 I load,0.02.598.482 I load,0.02.620.317 I load,0.02.620.332 I print_info,0.02.620.333 I print_info,0.02.620.334 I print_info,0.02.620.343 I print_info,0.02.620.344 I print_info,0.02.620.347 I print_info,0.02.620.348 I print_info,0.02.620.350 I print_info,0.02.620.351 I print_info,0.02.620.352 I print_info,0.02.620.353 I print_info,0.02.620.354 I print_info,0.02.620.355 I print_info,0.02.620.375 I print_info,0.02.620.381 I print_info,0.02.620.382 I print_info,0.02.620.384 I print_info,0.02.620.385 I print_info,0.02.620.386 I print_info,0.02.620.387 I print_info,0.02.620.388 I print_info,0.02.620.390 I print_info,0.02.620.391 I print_info,0.02.620.392 I print_info,0.02.902.492 I common_memory_breakdown_print,0.02.922.749 I common_params_fit_impl,0.02.922.753 I common_fit_params,0.02.943.568 I llama_model_loader,0.02.943.569 I llama_model_loader,0.02.943.573 I llama_model_loader,0.02.943.574 I llama_model_loader,0.02.943.575 I llama_model_loader,0.02.943.590 I llama_model_loader,0.02.943.598 I llama_model_loader,0.02.943.599 I llama_model_loader,0.02.943.600 I llama_model_loader,0.02.943.602 I llama_model_loader,0.02.943.603 I llama_model_loader,0.02.943.616 I llama_model_loader,0.02.959.946 I llama_model_loader,0.02.959.947 I llama_model_loader,0.02.959.950 I llama_model_loader,0.02.959.951 I llama_model_loader,0.02.959.952 I print_info,0.02.959.953 I print_info,0.02.996.888 I load,0.03.004.844 I load,0.03.004.845 I load,0.03.004.912 I load,0.03.026.791 I load,0.03.026.806 I print_info,0.03.026.807 I print_info,0.03.026.808 I print_info,0.03.026.816 I print_info,0.03.026.817 I print_info,0.03.026.818 I print_info,0.03.026.820 I print_info,0.03.026.822 I print_info,0.03.026.823 I print_info,0.03.026.824 I print_info,0.03.026.825 I print_info,0.03.026.826 I print_info,0.03.026.827 I print_info,0.03.026.828 I print_info,0.03.026.829 I print_info,0.03.026.830 I print_info,0.03.026.831 I print_info,0.03.026.833 I print_info,0.03.026.834 I print_info,0.03.026.835 I print_info,0.1% KLD,0.1% Δp,0.23.453.225 I load_tensors,0.23.453.226 I load_tensors,0.23.453.231 I load_tensors,0.23.453.232 I load_tensors,0.23.453.233 I load_tensors,0.23.453.234 I load_tensors,0.24.591.768 I load_tensors,0.24.591.769 I load_tensors,0.24.591.775 I load_tensors,0.24.591.776 I load_tensors,0.24.591.777 I load_tensors,0.26.702.980 I llama_context,0.26.702.981 I llama_context,0.26.702.982 I llama_context,0.26.702.986 I llama_context,0.26.702.987 I llama_context,0.26.706.335 I llama_context,0.26.706.345 I llama_kv_cache_iswa,0.26.706.659 I llama_kv_cache,0.26.706.903 I llama_kv_cache,0.26.707.133 I llama_kv_cache,0.26.707.339 I llama_kv_cache,0.26.707.541 I llama_kv_cache,0.26.707.738 I llama_kv_cache,0.26.707.943 I llama_kv_cache,0.26.708.169 I llama_kv_cache,0.26.708.216 I llama_kv_cache,0.26.708.217 I llama_kv_cache_iswa,0.26.708.532 I llama_kv_cache,0.26.708.805 I llama_kv_cache,0.26.709.044 I llama_kv_cache,0.26.709.298 I llama_kv_cache,0.26.709.533 I llama_kv_cache,0.26.709.796 I llama_kv_cache,0.26.710.031 I llama_kv_cache,0.26.710.286 I llama_kv_cache,0.26.710.372 I llama_kv_cache,0.26.710.373 I llama_kv_cache,0.26.798.584 I sched_reserve,0.26.798.597 I sched_reserve,0.26.798.598 I sched_reserve,0.26.798.599 I sched_reserve,0.26.798.600 I sched_reserve,0.26.798.601 I sched_reserve,0.26.798.603 I sched_reserve,0.26.880.034 I system_info,0.26.892.668 I kl_divergence,0.27.948.440 I llama_context,0.27.948.441 I llama_context,0.27.948.442 I llama_context,0.27.948.446 I llama_context,0.27.948.447 I llama_context,0.27.951.817 I llama_context,0.27.951.828 I llama_kv_cache_iswa,0.27.952.154 I llama_kv_cache,0.27.952.405 I llama_kv_cache,0.27.952.609 I llama_kv_cache,0.27.952.809 I llama_kv_cache,0.27.953.024 I llama_kv_cache,0.27.953.220 I llama_kv_cache,0.27.953.441 I llama_kv_cache,0.27.953.634 I llama_kv_cache,0.27.953.676 I llama_kv_cache,0.27.953.677 I llama_kv_cache,0.27.953.678 I llama_kv_cache_iswa,0.27.953.970 I llama_kv_cache,0.27.954.238 I llama_kv_cache,0.27.954.492 I llama_kv_cache,0.27.954.747 I llama_kv_cache,0.27.955.002 I llama_kv_cache,0.27.955.252 I llama_kv_cache,0.27.955.940 I llama_kv_cache,0.27.956.195 I llama_kv_cache,0.27.956.280 I llama_kv_cache,0.27.956.281 I llama_kv_cache,0.28.049.214 I sched_reserve,0.28.049.227 I sched_reserve,0.28.049.228 I sched_reserve,0.28.049.229 I sched_reserve,0.28.049.230 I sched_reserve,0.28.049.231 I sched_reserve,0.28.049.232 I sched_reserve,0.28.128.953 I system_info,0.29.334.425 I kl_divergence,0.31.385.780 I load_tensors,0.31.385.786 I load_tensors,0.31.385.787 I load_tensors,0.31.385.788 I load_tensors,0.31.385.789 I load_tensors,0.31.385.790 I load_tensors,0.35.884.363 I llama_context,0.35.884.364 I llama_context,0.35.884.368 I llama_context,0.35.884.369 I llama_context,0.35.887.767 I llama_context,0.35.887.776 I llama_kv_cache_iswa,0.35.888.099 I llama_kv_cache,0.35.888.333 I llama_kv_cache,0.35.888.533 I llama_kv_cache,0.35.888.751 I llama_kv_cache,0.35.888.981 I llama_kv_cache,0.35.889.183 I llama_kv_cache,0.35.889.385 I llama_kv_cache,0.35.889.581 I llama_kv_cache,0.35.889.631 I llama_kv_cache,0.35.889.632 I llama_kv_cache,0.35.889.633 I llama_kv_cache_iswa,0.35.889.956 I llama_kv_cache,0.35.890.207 I llama_kv_cache,0.35.890.446 I llama_kv_cache,0.35.890.709 I llama_kv_cache,0.35.890.971 I llama_kv_cache,0.35.891.229 I llama_kv_cache,0.35.891.462 I llama_kv_cache,0.35.891.754 I llama_kv_cache,0.35.891.838 I llama_kv_cache,0.35.976.508 I sched_reserve,0.35.976.522 I sched_reserve,0.35.976.523 I sched_reserve,0.35.976.524 I sched_reserve,0.35.976.525 I sched_reserve,0.35.976.526 I sched_reserve,0.35.976.527 I sched_reserve,0.36.061.855 I system_info,0.37.282.248 I kl_divergence,0.39.012.315 I load_tensors,0.39.012.316 I load_tensors,0.39.012.322 I load_tensors,0.39.012.323 I load_tensors,0.39.012.324 I load_tensors,0.39.012.325 I load_tensors,0.45.512.844 I llama_context,0.45.512.845 I llama_context,0.45.512.846 I llama_context,0.45.512.850 I llama_context,0.45.512.851 I llama_context,0.45.516.024 I llama_context,0.45.516.034 I llama_kv_cache_iswa,0.45.516.352 I llama_kv_cache,0.45.516.617 I llama_kv_cache,0.45.516.837 I llama_kv_cache,0.45.517.039 I llama_kv_cache,0.45.517.239 I llama_kv_cache,0.45.517.437 I llama_kv_cache,0.45.517.650 I llama_kv_cache,0.45.517.849 I llama_kv_cache,0.45.517.891 I llama_kv_cache,0.45.517.893 I llama_kv_cache_iswa,0.45.518.192 I llama_kv_cache,0.45.518.454 I llama_kv_cache,0.45.518.694 I llama_kv_cache,0.45.518.978 I llama_kv_cache,0.45.519.214 I llama_kv_cache,0.45.519.466 I llama_kv_cache,0.45.519.703 I llama_kv_cache,0.45.522.124 I llama_kv_cache,0.45.522.211 I llama_kv_cache,0.45.522.212 I llama_kv_cache,0.45.603.406 I sched_reserve,0.45.603.420 I sched_reserve,0.45.603.421 I sched_reserve,0.45.603.422 I sched_reserve,0.45.603.423 I sched_reserve,0.45.603.424 I sched_reserve,0.45.603.425 I sched_reserve,0.45.686.996 I system_info,0.46.933.558 I kl_divergence,0.47.377.623 I load_tensors,0.47.377.629 I load_tensors,0.47.377.630 I load_tensors,0.47.377.631 I load_tensors,0.47.377.632 I load_tensors,0.54.560.160 I llama_context,0.54.560.161 I llama_context,0.54.560.162 I llama_context,0.54.560.168 I llama_context,0.54.560.169 I llama_context,0.54.563.291 I llama_context,0.54.563.301 I llama_kv_cache_iswa,0.54.563.602 I llama_kv_cache,0.54.563.832 I llama_kv_cache,0.54.564.055 I llama_kv_cache,0.54.564.270 I llama_kv_cache,0.54.564.471 I llama_kv_cache,0.54.564.670 I llama_kv_cache,0.54.564.872 I llama_kv_cache,0.54.565.074 I llama_kv_cache,0.54.565.117 I llama_kv_cache,0.54.565.118 I llama_kv_cache,0.54.565.119 I llama_kv_cache_iswa,0.54.565.443 I llama_kv_cache,0.54.565.699 I llama_kv_cache,0.54.565.947 I llama_kv_cache,0.54.566.197 I llama_kv_cache,0.54.566.431 I llama_kv_cache,0.54.566.678 I llama_kv_cache,0.54.566.926 I llama_kv_cache,0.54.567.175 I llama_kv_cache,0.54.571.585 I llama_kv_cache,0.54.652.807 I sched_reserve,0.54.652.820 I sched_reserve,0.54.652.821 I sched_reserve,0.54.652.822 I sched_reserve,0.54.652.823 I sched_reserve,0.54.652.824 I sched_reserve,0.54.652.825 I sched_reserve,0.54.736.184 I system_info,0.55.778.476 I kl_divergence,1.0% KLD,1.0% Δp,1.15.586.031 I load_tensors,1.15.586.037 I load_tensors,1.15.586.038 I load_tensors,1.15.586.039 I load_tensors,1.33.341.187 I llama_context,1.33.341.188 I llama_context,1.33.341.192 I llama_context,1.33.341.193 I llama_context,1.33.344.411 I llama_context,1.33.344.420 I llama_kv_cache_iswa,1.33.344.732 I llama_kv_cache,1.33.344.964 I llama_kv_cache,1.33.345.169 I llama_kv_cache,1.33.345.366 I llama_kv_cache,1.33.345.581 I llama_kv_cache,1.33.345.793 I llama_kv_cache,1.33.346.000 I llama_kv_cache,1.33.346.195 I llama_kv_cache,1.33.346.236 I llama_kv_cache,1.33.346.237 I llama_kv_cache,1.33.346.238 I llama_kv_cache_iswa,1.33.346.560 I llama_kv_cache,1.33.346.866 I llama_kv_cache,1.33.347.109 I llama_kv_cache,1.33.347.367 I llama_kv_cache,1.33.347.611 I llama_kv_cache,1.33.352.353 I llama_kv_cache,1.33.352.612 I llama_kv_cache,1.33.352.901 I llama_kv_cache,1.33.352.988 I llama_kv_cache,1.33.352.989 I llama_kv_cache,1.33.445.326 I sched_reserve,1.33.445.338 I sched_reserve,1.33.445.339 I sched_reserve,1.33.445.340 I sched_reserve,1.33.445.341 I sched_reserve,1.33.445.342 I sched_reserve,1.33.445.343 I sched_reserve,1.33.531.941 I system_info,1.33.544.818 I kl_divergence,1.57.364.920 I llama_perf_context_print,1.57.364.924 I llama_perf_context_print,1.57.364.925 I llama_perf_context_print,1.57.365.175 I common_memory_breakdown_print,1.57.573.874 I llama_perf_context_print,1.57.573.879 I llama_perf_context_print,1.57.574.130 I common_memory_breakdown_print,10.0% KLD,10.0% Δp,2.00.515.014 I llama_perf_context_print,2.00.515.017 I llama_perf_context_print,2.00.515.233 I common_memory_breakdown_print,2.07.545.088 I llama_perf_context_print,2.07.545.091 I llama_perf_context_print,2.07.545.092 I llama_perf_context_print,2.07.545.304 I common_memory_breakdown_print,2.19.557.466 I llama_perf_context_print,2.19.557.470 I llama_perf_context_print,2.19.557.471 I llama_perf_context_print,2.19.557.686 I common_memory_breakdown_print,2.54.375.390 I llama_perf_context_print,2.54.375.395 I llama_perf_context_print,2.54.375.396 I llama_perf_context_print,2.54.375.618 I common_memory_breakdown_print,25.0% Δp,5.0% KLD,5.0% Δp,75.0% Δp,90.0% KLD,90.0% Δp,95.0% KLD,95.0% Δp,99.0% KLD,99.0% Δp,99.9% KLD,99.9% Δp,"Cor(ln(PPL(Q)), ln(PPL(base)))",Maximum KLD,Maximum Δp,Mean KLD_std,Mean PPL(Q)-PPL(base)_mean,Mean PPL(Q)-PPL(base)_std,Mean PPL(Q)/PPL(base)_mean,Mean PPL(Q)/PPL(base)_std,Mean PPL(Q)_mean,Mean PPL(Q)_std,Mean PPL(base)_mean,Mean PPL(base)_std,Mean ln(PPL(Q)/PPL(base))_mean,Mean ln(PPL(Q)/PPL(base))_std,Mean Δp_mean,Mean Δp_std,Median KLD,Median Δp,Minimum KLD,Minimum Δp,RMS Δp_mean,RMS Δp_std,Same top p_mean,Same top p_std,file_path,file_size_gib,is_self_ref,mixture | |
| Step-3.5-Flash-IQ2_S (aes_sedai),69.18118572032002,2.78,0.488752,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-7344130321.0,97053.773503,1.41,-0.35,-3.2881,-935.3248,-1135.324096,-1235.3211264,-1635.322,-2335.328,-2435.32128,-2535.32128,-2735.0,-2935.321,-3335.323,-48.2,-51.6,-6362.0,-342.0,3.0,6.0,0.0,-1.0,-128007.0,818.0,0.822,0.0,4096.0,48.0,6.496969664969697e+95,512.0,128.0,8.121212812121282e+83,1024.0,0.0,0.0,0.0,1.0,2.0,1.0,0.0,196.11,199.38,1.0,1.0,256.0,0.0,-98.303,47.0,4949.0,6473.93,28890.9,68890.9,75819.7,,,,,,16.0,8192.0,1.0,5000000.0,8192.0,7.87,-512.0,64.0,164.0,232.0,364.0,432.0,564.0,632.0,732.0,128.0,512.0,160.0,1128.0,2160.0,3128.0,4160.0,5128.0,6160.0,7128.0,128.0,128.0,3073.12,13073.12,23073.12,53073.12,74861.25,9.0,88.144,4.848561200112841e+50,561512819216.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.2e-05,-90.797,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,23919.41,90485.13287233,34.0,-7344130321.0,,,,0.000533,-48.752,,,,,,,,,,,,,,,,-17.599,0.000125,-67.987,-0.005,1.451553,4.027,2.191823,13.423,4.030122,41.534,6.696032,75.322,79.94,15.07647,97.427,0.002225,1.356044,0.013584,1.561208,0.005153,3.77234,0.020688,2.416296,0.011058,0.44546,0.003301,-11.568,0.065,0.138796,-1.73,-0.000278,-99.975,27.232,0.081,76.564,0.112,kld/Step-3.5-Flash/wiki-test-raw/aes_sedai/Step-3.5-Flash-IQ2_S.md,64.43,False,Q6_K / IQ2_XS / IQ2_XS / IQ3_XXS | |
| Step-3.5-Flash-IQ3_S (aes_sedai),76.11755790336001,3.05,0.355228,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-7344130321.0,103668.773503,1.38,-1.0,-3.2881,-935.3248,-1135.324096,-1235.3211264,-1635.322,-2335.328,-2435.32128,-2535.32128,-2735.0,-2935.321,-3335.323,-46.322,-49.2,-51.6,-284.0,6.0,0.0,-1.0,-128007.0,818.0,0.822,35.0,262144.0,48.0,6.496969664969697e+95,1.0,128.0,8.121212812121282e+83,1024.0,1024.0,0.0,0.0,288.0,2.0,5000000.0,10000.0,1.0,0.0,196.11,199.38,128896.0,1.0,1.0,256.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-0.0,-96.393,,,,,,,47.0,4949.0,7103.93,39809.55,76134.7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,8192.0,8192.0,1.0,0.0,8192.0,7.87,-512.0,64.0,164.0,232.0,364.0,432.0,564.0,632.0,732.0,128.0,128.0,512.0,160.0,1128.0,2160.0,3128.0,4160.0,5128.0,6160.0,7128.0,128.0,128.0,3073.12,13073.12,43073.12,74861.25,321.38,9.0,92.844,4.848561200112841e+50,561512819216.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,7e-06,-84.064,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0.00031,-37.832,26262.11,34.0,-7344130321.0,,,,,,,,,,,,,-12.131,7.4e-05,-56.814,0.0,1.051053,4.65,1.636391,13.661,3.09273,40.996,5.265443,72.962,85.14,12.250976,96.137,0.001705,0.897332,0.0098,1.371367,0.003763,3.313628,0.017314,2.416296,0.011058,0.315808,0.002744,-8.559,0.057,0.089088,-0.879,-0.000229,-99.908,23.157,0.077,80.202,0.105,kld/Step-3.5-Flash/wiki-test-raw/aes_sedai/Step-3.5-Flash-IQ3_S.md,70.89,False,Q6_K / IQ2_S / IQ2_S / IQ3_S | |
| Step-3.5-Flash-IQ4_XS (aes_sedai),98.04336594944,3.93,0.127994,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-8565340321.0,,,,,,,,,,,,,,,,,,,,,,,,,,124455.773503,1.58,-1.0,-3.2881,-1035.32262144,-1135.324096,-1235.3211264,-1635.322,-2335.328,-2435.32128,-2535.32128,-2735.0,-2935.321,-3335.323,-47.327,-50.4,-51.8,-80392.0,-442.0,3.0,80.0,0.0,-1.0,-128007.0,818.0,0.822,35.0,262144.0,48.0,6.496969664969697e+95,64.0,128.0,128.0,8.121212812121282e+83,1024.0,1024.0,0.0,0.0,0.0,11264.0,0.0,2.0,1.0,128.0,0.0,196.11,199.38,1.0,1.0,256.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-1.1e-05,-79.297,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,4949.0,534.97,9220.76,312734.95,612769.09,77229.18,512.0,1.0,5000000.0,8192.0,7.87,-512.0,64.0,164.0,232.0,364.0,432.0,564.0,632.0,732.0,128.0,128.0,512.0,160.0,1128.0,2160.0,3128.0,4160.0,5128.0,6160.0,7128.0,128.0,3073.12,13073.12,23073.12,63073.12,321.38,9.0,79.294,4.848561200112841e+50,561512819216.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1e-06,-54.859,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,33507.92,34.0,-8565340321.0,7.1e-05,-16.541,,,,,,,,,,,,,,,,-3.875,1.7e-05,-28.634,0.099,0.373806,5.002,0.605387,12.134,1.266063,32.662,2.496434,59.866,94.28,6.904435,93.672,0.000696,0.268746,0.004419,1.111222,0.001776,2.685042,0.012863,2.416296,0.011058,0.105461,0.001598,-3.067,0.035,0.024985,-0.096,-0.000324,-99.895,13.655,0.059,88.359,0.085,kld/Step-3.5-Flash/wiki-test-raw/aes_sedai/Step-3.5-Flash-IQ4_XS.md,91.31,False,Q8_0 / IQ3_S / IQ3_S / IQ4_XS | |
| Step-3.5-Flash-Q4_K_M (aes_sedai),124.79027478528,5.01,0.040282,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-8565340321.0,149970.773503,1.76,-0.35,-3.2881,-1035.32262144,-1235.3211264,-1635.322,-2335.328,-2435.32128,-2535.32128,-2835.323,-2935.321,-3335.323,-46.322,-50.5,-51.8,-32287.0,-542.0,3.0,80.0,0.0,-1.0,-128007.0,818.0,0.822,35.0,262144.0,48.0,6.496969664969697e+95,512.0,128.0,8.121212812121282e+83,1024.0,1024.0,0.0,0.0,0.0,11264.0,0.0,2.0,1.0,262144.0,196.11,199.38,128007.0,201.0,128007.0,256.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-1.4e-05,-48.75,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,47.0,4949.0,534.97,116379.95,516379.95,78444.18,16.0,8192.0,1.0,5000000.0,8192.0,7.87,-512.0,64.0,164.0,232.0,364.0,432.0,564.0,632.0,732.0,128.0,512.0,160.0,1128.0,2160.0,3128.0,4160.0,5128.0,6160.0,7128.0,128.0,128.0,3073.12,23073.12,43073.12,74861.25,321.38,9.0,81.124,4.848561200112841e+50,561512819216.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-0.0,-28.253,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1.7e-05,-6.76,,,,43441.78,81858.34287233,34.0,-8565340321.0,,,,,,,,,-1.077,4e-06,-12.739,0.333,0.115885,4.336,0.190934,9.161,0.41228,22.512,0.879877,41.289,98.14,5.095066,79.233,0.000237,0.060867,0.002219,1.02519,0.000914,2.477162,0.011502,2.416296,0.011058,0.024878,0.000891,-0.66,0.02,0.006899,-0.003,-0.000235,-99.368,7.527,0.038,93.537,0.065,kld/Step-3.5-Flash/wiki-test-raw/aes_sedai/Step-3.5-Flash-Q4_K_M.md,116.22,False,Q8_0 / Q4_K / Q4_K / Q5_K | |
| Step-3.5-Flash-Q5_K_M (aes_sedai),149.06757742592004,5.98,0.015249,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-8565340321.0,173122.773503,1.94,-1.0,-3.2881,-935.3248,-1135.324096,-1235.3211264,-1635.322,-2235.32288,-2335.328,-2535.32128,-2735.0,-2835.323,-2935.321,-3335.323,-48.5,-51.8,-80392.0,-642.0,3.0,80.0,0.0,-1.0,-128007.0,818.0,,0.822,0.0,4096.0,48.0,6.496969664969697e+95,512.0,128.0,8.121212812121282e+83,1024.0,1024.0,0.0,0.0,11264.0,8.0,0.0,2.0,5000000.0,128.0,0.0,196.11,199.38,128896.0,1.0,1.0,256.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-1.5e-05,-29.849,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,4949.0,534.97,119687.45,419721.59,79546.68,16.0,8192.0,1.0,1.0,8192.0,7.87,-512.0,64.0,164.0,232.0,364.0,432.0,564.0,632.0,732.0,128.0,128.0,512.0,160.0,1128.0,2160.0,3128.0,4160.0,5128.0,6160.0,7128.0,128.0,3073.12,13073.12,23073.12,63073.12,321.38,9.0,81.144,4.848561200112841e+50,561512819216.0,-1e-06,-16.495,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,7e-06,-3.708,,,,,,,,52305.47,84821.51287233,34.0,-8565340321.0,,,,,-0.499,2e-06,-7.141,0.297,0.043011,3.004,0.070816,6.073,0.158925,14.797,0.353039,28.993,99.29,3.072086,76.47,9.8e-05,0.015516,0.00133,1.006421,0.000551,2.431812,0.011175,2.416296,0.011058,0.006401,0.000547,-0.185,0.012,0.002566,-0.0,-0.000281,-64.93,4.642,0.026,95.974,0.052,kld/Step-3.5-Flash/wiki-test-raw/aes_sedai/Step-3.5-Flash-Q5_K_M.md,138.83,False,Q8_0 / Q5_K / Q5_K / Q6_K | |
| Step-3.5-Flash-Q8_0 (aes_sedai),211.98884831232002,8.51,0.005594,-8565340321.0,233130.773503,0.81,-0.35,-3.2881,-935.3248,-1135.324096,-1235.3211264,-1635.322,-2235.32288,-2435.32128,-2535.32128,-2735.0,-2935.321,-3335.323,-47.327,-80518.0,3.0,80.0,0.0,-128007.0,818.0,0.822,35.0,0.0,4096.0,48.0,6.496969664969697e+95,512.0,128.0,8.121212812121282e+83,1024.0,1024.0,0.0,0.0,0.0,2.0,10000.0,0.0,196.11,199.38,128896.0,1.0,128007.0,256.0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-1.4e-05,-17.958,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,-1e-06,-9.247,4949.0,19570.76,328259.95,712404.18,8192.0,1.0,1.0,8192.0,7.87,-512.0,64.0,164.0,232.0,364.0,432.0,564.0,632.0,732.0,128.0,128.0,512.0,160.0,1128.0,2160.0,3128.0,4160.0,5128.0,6160.0,7128.0,128.0,128.0,3073.12,13073.12,23073.12,63073.12,74861.25,9.0,92.274,4.848561200112841e+50,561512819216.0,,,,,,,,2e-06,-2.024,,,,,,,,,,,,92237.15,80844.13287233,34.0,-8565340321.0,-0.236,0.0,-3.924,0.208,0.015086,1.876,0.025268,3.779,0.061369,9.11,0.150637,17.984,99.72,1.199049,65.162,4.1e-05,0.002891,0.000824,1.001197,0.000341,2.419187,0.011093,2.416296,0.011058,0.001196,0.000341,-0.029,0.007,0.000847,0.0,-0.000487,-59.321,2.764,0.019,97.516,0.041,kld/Step-3.5-Flash/wiki-test-raw/aes_sedai/Step-3.5-Flash-Q8_0.md,197.43,False, | |