Google on Tuesday introduced Gemini 3.1 Flash Lite, a new artificial intelligence model that the company describes as the fastest and most cost-efficient offering in the Gemini 3 family. The model is currently available to developers in preview via the Gemini API in Google AI Studio, and to enterprise customers through Vertex AI.
The launch marks Google’s push into the cost-efficient AI model segment, where it competes directly with lightweight offerings from rivals such as OpenAI and Anthropic.
Pricing for the model is set at $0.25 per one million input tokens and $1.50 per one million output tokens. Google stated that, according to Artificial Analysis benchmarks, Gemini 3.1 Flash Lite outperforms its predecessor, Gemini 2.5 Flash: time to first token is 2.5 times faster, and overall output speed has improved by 45%. From a trading perspective, some investors may consider buying Google shares around the $300 level.
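At those rates, per-request cost is simple arithmetic. A minimal sketch using the quoted preview prices (the token counts in the example are hypothetical, chosen only for illustration):

```python
# Published preview rates for Gemini 3.1 Flash Lite (USD per 1M tokens).
INPUT_RATE = 0.25
OUTPUT_RATE = 1.50

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Estimated cost of a single API call at the quoted rates."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Example: a 2,000-token prompt with a 500-token response (hypothetical sizes).
print(f"${request_cost(2_000, 500):.6f}")  # → $0.001250
```

At this price point, even a million such requests per day would cost on the order of $1,250, which is the economics the "cost-efficient" positioning is aimed at.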
On the Arena.ai leaderboard, Gemini 3.1 Flash Lite achieved an Elo rating of 1432. It scored 86.9% on the GPQA Diamond benchmark and 76.8% on MMMU Pro. Google noted that the model surpasses larger previous-generation Gemini models, including 2.5 Flash, on reasoning and multimodal-understanding benchmarks.
The model also incorporates a “dynamic thinking” feature, allowing developers to adjust the level of reasoning depth applied to specific tasks. Google said this capability is designed to manage high-frequency workloads such as large-scale translation and content moderation, as well as more complex tasks like generating user interfaces and creating simulations.
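In practice, a developer would dial reasoning depth per request. The sketch below builds a Gemini API request body with a thinking budget; the `thinkingConfig`/`thinkingBudget` field names follow the convention Google documented for earlier Gemini 2.5 models, and it is an assumption (as is the model ID string) that Gemini 3.1 Flash Lite accepts the same configuration:

```python
import json

def build_request(prompt: str, thinking_budget: int) -> dict:
    """Sketch of a generateContent request body with a reasoning budget.

    Assumptions: the model ID is hypothetical, and the thinkingConfig field
    is carried over from the Gemini 2.5 API convention, unverified for 3.1.
    """
    return {
        "model": "gemini-3.1-flash-lite",  # hypothetical model ID
        "contents": [{"parts": [{"text": prompt}]}],
        "generationConfig": {
            "thinkingConfig": {"thinkingBudget": thinking_budget},
        },
    }

# High-frequency workload (e.g. bulk translation): minimal thinking.
fast = build_request("Translate 'hello' to French.", thinking_budget=0)
# Complex task (e.g. UI generation): allow a larger reasoning budget.
deep = build_request("Generate a settings page UI.", thinking_budget=8192)

print(json.dumps(fast["generationConfig"], indent=2))
```

The point of the knob is that the same cheap model can serve both ends of the workload spectrum Google describes, with cost and latency scaling with the budget rather than with a switch to a larger model.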
Market Interpretation:
The partnership between Apple and Google is undergoing significant evolution as generative AI advances rapidly. The two companies have reached an agreement on licensing the Gemini model, and the collaboration is now expanding into cloud infrastructure. As demand for computing power surges for the next-generation Siri and Apple Intelligence, Apple is reportedly in talks with Google to host and operate dedicated server clusters in Google’s data centers to support Siri’s backend operations.