Yes, you need to install CUDA and MSVC for GPU support. But here's some good news! We just rolled our own GEMM functions, so llamafile doesn't have to depend on cuBLAS anymore. That means llamafile 0.4 (which I'm shipping today) will have GPU support on Windows that works out of the box, since dropping the cuBLAS dependency lets me compile a distributable DLL that depends only on KERNEL32.DLL. Oh, it'll also have Mixtral support :) https://github.com/Mozilla-Ocho/llamafile/pull/82
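
For anyone wondering what a GEMM function actually is: it's the general matrix multiply, C = alpha * A * B + beta * C, which is the routine cuBLAS would otherwise provide. Here's a rough plain-C sketch of the operation, just to show the math. It is not the code from the PR (the real kernels run on the GPU and are far more optimized), and the name `sgemm_naive` and the row-major, densely packed layout are my own assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Naive single-precision GEMM: C = alpha * A * B + beta * C.
 * A is m x k, B is k x n, C is m x n, all row-major and densely packed.
 * sgemm_naive is a hypothetical name for illustration only; it is not
 * a llamafile or cuBLAS function. */
static void sgemm_naive(size_t m, size_t n, size_t k, float alpha,
                        const float *A, const float *B,
                        float beta, float *C) {
    assert(A != NULL && B != NULL && C != NULL);
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            /* Dot product of row i of A with column j of B. */
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p) {
                acc += A[i * k + p] * B[p * n + j];
            }
            C[i * n + j] = alpha * acc + beta * C[i * n + j];
        }
    }
}
```

A function like this only needs basic arithmetic and memory access, which is why owning the GEMM code removes the need to link against a separate BLAS library.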