Their killer feature is the --grammar option, which constrains which tokens the LLM can emit, making them great for bash scripts that do all manner of NLP classification work.
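A minimal sketch of the pattern, assuming llama.cpp's `llama-cli` binary is on the PATH; the model path, prompt, and grammar are placeholders:

```
# Write a GBNF grammar that forces the output to be one of three labels.
cat > classify.gbnf <<'EOF'
root ::= "positive" | "negative" | "neutral"
EOF

# Run the model with sampling constrained by the grammar; the reply can
# only ever be one of the three labels, so it's trivially parseable.
llama-cli -m ./model.gguf \
  --grammar-file classify.gbnf \
  -p "Classify the sentiment of: 'I loved it.' Answer:" \
  -n 4 2>/dev/null
```

Because the output space is fixed by the grammar, the result can be fed straight into a `case` statement without any fuzzy matching or retry logic.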
Otherwise I use ollama when I need a local LLM, vllm when I'm renting GPU servers, or the OpenAI API when I just want the best model.