I hate Google.
The LLM requested a search, so my code performed one. But Google changed the class names and my scraper stopped working. Wasted 6 hours of GPU processing.
I think llama.cpp is still the best one to use, since it can force the output to follow a grammar. So you won't get junk if the model decides to f*ck you up when your code requires a JSON response.
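For anyone curious, here's a minimal GBNF sketch of what that looks like (the field name is just illustrative) — it constrains the model to emit a single JSON object with one string field:

```
root   ::= "{" ws "\"answer\":" ws string ws "}"
string ::= "\"" [^"\\]* "\""
ws     ::= [ \t\n]*
```

Save it as e.g. `answer.gbnf` and pass it with `--grammar-file answer.gbnf` to llama-cli (or in the `grammar` field of a request to llama-server), and the sampler simply can't produce tokens outside that shape.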
Ollama is great for downloading and running different models over a REST API, but you can't enforce the output format.