The tech giant expressed optimism about its latest product but also admitted that it currently has limitations.
Google has released a new experimental artificial intelligence (AI) model that it claims has “stronger reasoning capabilities” in its responses than the base Gemini 2.0 Flash model.
Launched yesterday (19 December), Gemini 2.0 Flash Thinking Experimental is available on Google’s AI prototyping platform, AI Studio.
The announcement comes just a week after Google launched Gemini 2.0, its answer to OpenAI’s ChatGPT, while OpenAI released a preview of its own “complex reasoning” AI model, o1, back in September.
Reasoning models are designed to fact-check themselves, making them more accurate, although these types of models often take longer to deliver results.
According to Google, the way the model’s ‘thoughts’ are returned depends on whether the user is using the Gemini API directly or making a request through AI Studio.
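For readers who want to try the model programmatically, the snippet below is a minimal sketch of a direct Gemini API call using Google’s google-generativeai Python SDK. The model ID shown ("gemini-2.0-flash-thinking-exp") and the environment variable name are assumptions for illustration, and, as noted above, how the model’s intermediate ‘thoughts’ are surfaced may differ between the API and AI Studio.

```python
# Minimal sketch of a direct Gemini API call (not an official example).
# Assumptions: the google-generativeai SDK is installed, an API key is stored in
# the GEMINI_API_KEY environment variable, and the experimental model is exposed
# under the ID "gemini-2.0-flash-thinking-exp" (the real ID may differ).
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])

model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

response = model.generate_content(
    "A bat and a ball cost €1.10 in total. The bat costs €1.00 more than "
    "the ball. How much does the ball cost?"
)

# The final answer comes back as text; how (or whether) the model's intermediate
# 'thoughts' are included depends on the client, as Google notes.
print(response.text)
```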
Logan Kilpatrick, who leads product for AI Studio, took to X to call the new model “the first step in [Google’s] reasoning journey”.
Jeff Dean, chief scientist for Google DeepMind and Google Research, also claimed that the company saw “promising results” with the new model.
However, Google has also acknowledged that the newly released model has a number of limitations. These include a 32k-token input limit, text and image input only, an 8k-token output limit, text-only output, and no built-in tool use such as Search or code execution.
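As a rough illustration of what those constraints mean for a caller (again a sketch, assuming the same SDK and hypothetical model ID as above), a request can check its prompt against the reported 32k-token input window and cap generation at the 8k-token output limit:

```python
# Sketch only: working within the limits Google describes (32k-token input,
# 8k-token output, text-only output). The model ID is an assumption.
import os

import google.generativeai as genai

genai.configure(api_key=os.environ["GEMINI_API_KEY"])
model = genai.GenerativeModel("gemini-2.0-flash-thinking-exp")

prompt = "Summarise the trade-offs of reasoning-style language models."

# count_tokens lets the caller verify the prompt fits the reported 32k input window.
if model.count_tokens(prompt).total_tokens > 32_000:
    raise ValueError("Prompt exceeds the model's reported 32k-token input limit")

response = model.generate_content(
    prompt,
    generation_config={"max_output_tokens": 8192},  # respect the 8k-token output cap
)
print(response.text)
```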
TechCrunch reported that it briefly tested the model and concluded that there was “certainly room for improvement”.
While reasoning models are attractive for their ability to fact-check themselves, they have also raised concerns, including the question of whether such a model could effectively cheat and deceive humans.
Earlier this year, Dr Shweta Singh of the University of Warwick argued that releasing such sophisticated models without proper scrutiny is “misguided”.
“To achieve its desired objective, the path or the strategy chosen by AI may not always necessarily be fair, or align with human values,” she said.
Earlier this year, the Stanford AI Index report claimed that robust evaluations for large language models are “seriously lacking” and that responsible AI reporting lacks standardisation.