# Running ollama on RX 6600
## Why Local LLMs?
I believe local LLMs are the better future, primarily for privacy reasons.
## Setup
| Component | Spec |
| --- | --- |
| OS | NixOS Unstable |
| CPU | R9 7950X |
| RAM | 64GB @ 6000MHz |
| GPU | RX 6600 |
## Configuration
As seen in my configuration.nix, I have ollama enabled as a service. The problem is that the service does not automatically make use of my GPU. While I could dig into how to pass an environment variable through the service configuration, it is easier to just run ollama from the command line:
OLLAMA_HOST="127.0.0.1:11444" HSA_OVERRIDE_GFX_VERSION=10.3.0 ollama serve
`OLLAMA_HOST` matters because the NixOS service is already sitting on the default port (11434), so we pick a different one to avoid a conflict. `HSA_OVERRIDE_GFX_VERSION` is the important variable: the RX 6600 is not officially supported by ROCm, and overriding its GFX version to 10.3.0 is what lets ollama use the GPU.
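If you would rather have the service itself use the GPU, the NixOS module should let you set the same variable declaratively. The snippet below is only a sketch: the `acceleration` and `environmentVariables` options are assumptions about the ollama module on your channel, so verify them against the NixOS options search before copying.

```nix
# Sketch only: assumes the ollama module exposes `acceleration` and
# `environmentVariables`; check the NixOS options for your channel.
services.ollama = {
  enable = true;
  acceleration = "rocm";  # use the ROCm-enabled build
  environmentVariables = {
    # Same override as on the command line: treat the RX 6600 as gfx1030.
    HSA_OVERRIDE_GFX_VERSION = "10.3.0";
  };
};
```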
## Usage
Simply run
OLLAMA_HOST="127.0.0.1:11444" ollama run llama2:latest
with your model of choice, of course. If you have radeontop installed, you should see VRAM usage spike to around 60%.
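Since the server is on a non-default port, anything that talks to ollama's HTTP API also needs to target that port explicitly. As a quick sanity check, assuming the standard /api/generate endpoint and the llama2 model pulled above:

```bash
# Ask the server on the custom port for a one-shot, non-streaming reply.
curl http://127.0.0.1:11444/api/generate -d '{
  "model": "llama2:latest",
  "prompt": "Why is the sky blue?",
  "stream": false
}'
```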
## Models that fit in the VRAM
Here is a list of models I tested that fit in the 6600's 8GB of VRAM at the time of writing:

- codegemma:7b
- llama2:7b
- zephyr:7b
- gemma:instruct
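If you want to double-check that a model really fits, you can grab a single radeontop sample while it is loaded. The `-d -` (dump to stdout) and `-l 1` (stop after one sample) flags should do it, but double-check them against your radeontop version.

```bash
# Dump one radeontop sample to stdout while a model is loaded,
# to see how much VRAM it actually occupies.
radeontop -d - -l 1
```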
## Models that do not work
These models seem not to work even though they should fit on the GPU:

- phi:2.7b
- wizardcoder:7b-python
- wizardcoder:7b-python-q4_0
Have fun!