I really wanted to like Gemini 3 Flash. I’ve been triple prompting between Grok 4.1, GPT 5.2 and Gemini 3 Flash Reasoning ever since it came out, and Gemini 3 Flash hallucinates too much to be usable. I finally stopped using it. So far I am very disappointed with it as a research tool.
Gemini 3 Flash is also more expensive than 2.5 Flash. Token prices are going up and will continue to go up.
That said, it does well on the benchmarks and Theo likes it:
The xAI guys made some comments when Gemini 3 came out suggesting that the approach Google was using was fundamentally flawed and that it would have a growing problem with hallucinations. This seems to be proving true. Long term, Grok may have some advantages as the “maximally truth seeking LLM”, which is its goal.

