Upgraded Again! Google Unveils an Enhanced Gemini 3 Deep Think Model for Scientific Challenges — Shares Rise Against the Market Trend
Google has launched a product that could redefine the rules of the AI race — a major upgrade to the Gemini 3 “Deep Think” reasoning mode.

Recently, Wall Street’s attitude toward the AI narrative has undergone a fundamental shift. Investors no longer applaud ambitious roadmaps, nor are they willing to blindly fund hundred-billion-dollar capital expenditure plans. What the market now demands is proof — evidence that the money being burned is turning into tools capable of solving real-world problems. Google’s newly released Gemini 3 Deep Think upgrade arrives precisely at this inflection point in market sentiment.

Benchmark Scores Highlight the Weight of the Upgrade

On ARC-AGI-2 — a benchmark designed to test the core reasoning capabilities of artificial general intelligence and deliberately resistant to “training data memorization” — Gemini 3 Deep Think achieved an accuracy rate of 84.6%, verified by the ARC Prize Foundation.

For comparison: Claude Opus 4.6 (Thinking Max) scored 68.8%, GPT-5.2 (Thinking xhigh) achieved 52.9%, and three months ago Gemini 3 Pro Preview stood at just 31.1%.

On “Humanity’s Last Exam” — an extreme test compiling PhD-level interdisciplinary knowledge — the model scored 48.4% without external tools, significantly outperforming GPT-5.2’s 34.5%. More important than the absolute number is the context: an independent study released the previous week showed that the average failure rate of the seven most advanced frontier models on this benchmark was as high as 85.2%.

On the competitive programming platform Codeforces, the model's Elo rating surged to 3455. To put that in perspective, a rating above 3000 is legendary even among elite human competitors, and a rating of 3455 implies consistent gold-medal competitiveness in most timed algorithm competitions. The model also achieved gold-medal-level performance at the 2025 International Mathematical Olympiad.
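To give a rough sense of what a 455-point gap over the 3000 mark means, the sketch below applies the standard Elo expected-score formula. It is only an illustration: it assumes the textbook logistic Elo model with a 400-point scale, which may differ in detail from how Codeforces actually computes ratings, and the opponent ratings other than 3000 are hypothetical values chosen for comparison.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A against B under the textbook Elo logistic model.

    Assumption: standard 400-point scale; not necessarily the exact
    formula used by any particular competition platform.
    """
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


if __name__ == "__main__":
    model_rating = 3455  # rating quoted in the article
    # 3000 is the threshold mentioned in the article; the others are hypothetical.
    for opponent_rating in (3000, 3200, 3455):
        expected = elo_expected_score(model_rating, opponent_rating)
        print(f"vs {opponent_rating}: expected score {expected:.3f}")
```

Under this model, an expected score near 0.5 means evenly matched opponents, while values approaching 1.0 indicate that the higher-rated side would be expected to finish ahead in the large majority of head-to-head comparisons.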

Concrete Demonstrations of Capability

Google highlighted a particularly tangible application: converting hand-drawn sketches into 3D-printable model files. Users can draw a rough diagram, and Deep Think analyzes the shapes, constructs complex geometric models, and generates files suitable for additive manufacturing. This is no longer a “concept demo” of potential usefulness — it directly enters the multi-billion-dollar computer-aided design software market.

Another compelling validation comes from academia. Rutgers University mathematician Lisa Carbone used Deep Think to review a technical mathematics paper, and the model identified a subtle logical flaw that had not been caught during the human peer-review process. This is no longer merely an assistive tool; it is becoming a parallel verifier of intellectual labor. In a world where millions of scientific papers are published annually and qualified reviewers are scarce, the commercial and societal value of this capability may be significantly underestimated.

Google emphasized that this upgrade was developed in close collaboration with scientists and researchers. That statement deserves careful reading. Over the past two years, large-model development has largely been driven by architectural engineering: larger parameter counts, longer context windows, more efficient attention mechanisms. But scientific research operates differently: problems often lack clear boundaries, data is incomplete, and answers may be multiple or evolving. This differs fundamentally from standardized tasks like code generation, document summarization, or customer service automation. The Deep Think upgrade shows verifiable performance gains in chemistry, physics (including theoretical physics), and other scientific fields.

It May Even Reshape AI Competition

Viewed within a longer industrial cycle, this release marks a critical turning point. The dimension of competition among AI giants is shifting from “who has the smartest model” to “who can provide higher-density productivity tools for professional intellectual work.”

OpenAI holds a first-mover advantage with GPT-5.2. Microsoft leverages deep integration between Azure and OpenAI to dominate enterprise access. Anthropic has built a moat around safety alignment. Google’s newly revealed card is this: in the hardest-to-automate and most intellectually demanding domain — scientific research — it is currently leading.

This is not an isolated model release. It is a signal that Google is integrating DeepMind’s foundational research capabilities, Google Cloud’s compute infrastructure, and Gemini’s productization engine into a vertically integrated solution for high-intellectual-density industries. Its competitors are not only OpenAI but also long-standing professional software firms that have survived through knowledge asymmetry and tool complexity.

This represents a genuinely scarce asset in the current AI investment narrative. Compute power can be replicated, and parameters can be scaled. But deeply embedding frontier models into professional workflows, so that end users tangibly feel a leap in efficiency, requires deep understanding of vertical domains, long-term collaboration with research communities, and design excellence that reduces product complexity to the point of requiring no manual.

That is the real battleground of the next phase of AI.

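Acuity Trading is a London-based fintech company founded in 2013 that specializes in AI-driven alternative data and sentiment analysis for trading and investing. It has reshaped the online trading experience with visual news and sentiment tools, and it continues to apply the latest AI research and technology to deliver alpha-generating alternative data and highly interactive trading tools.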