Running ollama on RX 6600

Why Local LLMs?

I believe that local LLMs are the better future for privacy reasons.


OS NixOS Unstable
CPU R9 7950X
RAM 64 GB @ 6000 MHz
GPU RX 6600


As seen in my configuration.nix, I have ollama enabled as a service. The problem is that the service does not automatically use my GPU. While I could dig up how to change the configuration to include the required environment variable, it is easier to just run ollama from the command line.

  • OLLAMA_HOST is important here because the NixOS service already occupies the default port (11434), and the second instance must not conflict with it.
  • HSA_OVERRIDE_GFX_VERSION is the important environment variable to set, since it is what makes ROCm work with this GPU.
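The value you pass to HSA_OVERRIDE_GFX_VERSION depends on which gfx target your card reports. If rocminfo (part of the ROCm tools) is installed, you can check:

```shell
# Print the gfx target ROCm sees for your GPU; the RX 6600 reports gfx1032.
# Requires rocminfo and a ROCm-capable driver.
rocminfo | grep -o 'gfx[0-9a-f]*' | head -n 1
```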


Simply start a second server with the overrides set (the port 11435 here is arbitrary; any free port other than the service's 11434 works):

HSA_OVERRIDE_GFX_VERSION=10.3.0 OLLAMA_HOST=127.0.0.1:11435 ollama serve

and then, in another terminal, with your model of choice:

OLLAMA_HOST=127.0.0.1:11435 ollama run llama2:latest

The RX 6600 reports itself as gfx1032, which ROCm does not support out of the box; overriding to 10.3.0 makes it use the supported gfx1030 code path, which works fine on this card. If you have radeontop installed you should see the VRAM usage spike up to ~60%.
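radeontop can also run non-interactively; a one-shot sample in dump mode is handy if you want to log VRAM usage from a script rather than watch the TUI:

```shell
# Take a single radeontop sample and pull out the VRAM figure.
# -d - writes dump output to stdout; -l 1 stops after one sample.
radeontop -d - -l 1 | grep -o 'vram [0-9.]*%'
```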

Models that fit in VRAM

Here is a list of models I tested that fit in the RX 6600's VRAM as of [2024-04-13 Sat].

  • codegemma:7b
  • llama2:7b
  • zephyr:7b
  • gemma:instruct
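As a rough sanity check on why 7B models are about the ceiling here: ollama's default tags are typically q4_0 quantizations, i.e. roughly half a byte per weight, so a back-of-the-envelope estimate (ignoring KV cache and context overhead) looks like this:

```shell
# Back-of-the-envelope weight size for a 7B model at ~4-bit (q4_0)
# quantization: roughly 0.5 bytes per parameter. The RX 6600 has 8 GiB
# of VRAM, so the weights alone leave headroom for the KV cache.
params=7000000000
vram_bytes=$(( params / 2 ))                  # ~0.5 bytes per parameter
echo "$(( vram_bytes / 1024 / 1024 / 1024 )) GiB"   # prints "3 GiB"
```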

Models that do not work

These models do not seem to work, even though they should fit in VRAM:

  • phi:2.7b
  • wizardcoder:7b-python
  • wizardcoder:7b-python-q4_0

Have fun!