Can you add a recent build of llama.cpp (arm64) to the results pool? I'm really interested in comparing mlx to llama.cpp, but setting up mlx seems too difficult for me to do on my own.


I ran the benchmarks again several times to make sure the results were fair. My previous runs were also skewed by a different 30B model I'd forgotten I had loaded in the background.

LM Studio is an easy way to run both mlx and llama.cpp.

anemll [0]: ~9.3 tok/sec

mlx [1]: ~50 tok/sec

gguf (llama.cpp b5219) [2]: ~41 tok/sec

[0] https://huggingface.co/anemll/anemll-DeepSeekR1-8B-ctx1024_0...

[1] https://huggingface.co/mlx-community/DeepSeek-R1-Distill-Lla...

[2] (8bit) https://huggingface.co/unsloth/DeepSeek-R1-Distill-Llama-8B-...
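
For anyone who wants to sanity-check the mlx number outside LM Studio, here's a rough timing sketch in Python. It assumes the mlx-lm package (pip install mlx-lm); the repo id below is a placeholder for the actual model in [1], and generate()'s keyword arguments have shifted a bit between mlx-lm releases, so adjust as needed.

  # Rough tok/sec check with mlx-lm; a sketch, not a rigorous benchmark.
  import time
  from mlx_lm import load, generate

  # Placeholder repo id -- substitute the actual mlx model from [1].
  model, tokenizer = load("mlx-community/SOME-MODEL-ID")

  prompt = "Explain the Fourier transform in one paragraph."
  start = time.perf_counter()
  text = generate(model, tokenizer, prompt=prompt, max_tokens=256)
  elapsed = time.perf_counter() - start

  # Token count of the completion only; an off-by-one on special
  # tokens doesn't matter at this granularity.
  n_tokens = len(tokenizer.encode(text))
  print(f"{n_tokens} tokens in {elapsed:.1f}s -> {n_tokens / elapsed:.1f} tok/sec")

If I remember right, recent mlx-lm versions will also print prompt and generation tok/sec directly when you pass verbose=True to generate(), which is closer to what LM Studio reports.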


Thank you very much.



