Who ultimately controls the world’s most powerful technology: governments or private firms? In a conversation with The Economist’s editor-in-chief, Zanny Minton Beddoes, Anthropic’s chief executive, Dario Amodei, explains his red lines on AI’s deployment on the battlefield. Watch the full interview: https://econ.st/4sxRkBQ
---
Anthropic’s Claude Opus 4.6 AI model uncovered more than 100 software bugs in Mozilla’s Firefox browser during a two-week internal security test, including 14 high-severity vulnerabilities, according to Mozilla engineers.
The model identified its first flaw within about 20 minutes of scanning the codebase.
Mozilla said the number of serious bugs found during the experiment exceeded what is typically reported in two months through normal channels.
Researchers also asked Claude to generate exploit code; it produced two working exploits in a controlled environment, though Firefox’s real-world protections would likely block them.
Security experts say AI tools could accelerate vulnerability discovery—improving defenses but also potentially enabling faster cyberattacks.
Follow us (@therundownAI) to keep up with the latest news in tech and AI.
The deal between the Pentagon and Anthropic fractured in part over the proposed use of autonomous weapons, Ross Andersen reports. https://theatln.tc/r2FDwk9E
Right up until the moment that Pete Hegseth moved to terminate the government’s relationship with Anthropic, its leaders believed that they were still on track for a deal. The Pentagon had unilaterally insisted on renegotiating its contract with Anthropic in order to remove ethical restrictions that the company had placed on it. Anthropic’s AI model is the only one currently allowed into the federal government’s classified systems.
“According to a source familiar with the negotiations, on Friday morning, Anthropic received word that Hegseth’s team would make a major concession. The Pentagon had kept trying to leave itself little escape hatches in the agreements that it proposed to Anthropic. It would pledge not to use Anthropic’s AI for mass domestic surveillance or for fully autonomous killing machines, but then qualify those pledges with loophole-y phrases like ‘as appropriate’—suggesting that the terms were subject to change, based on the administration’s interpretation of a given situation,” Andersen writes.
Andersen’s source suggested that the government had expressed willingness to remove those qualifying phrases, but then Anthropic learned that the Pentagon still wanted to use the company’s AI to analyze bulk data collected from Americans. That could include information such as the questions you ask your favorite chatbot, your Google search history, your GPS-tracked movements, and your credit-card transactions, all of which could be cross-referenced with other details about your life. Anthropic’s leadership told Hegseth’s team that was a bridge too far, and the deal fell apart, Andersen reports.
Earlier in the week, OpenAI CEO Sam Altman had said that like Anthropic, OpenAI would also refuse to allow its models to be used in autonomous weapon systems. But as he made those statements, Altman was in the midst of negotiating a new deal with the Pentagon, which was announced just hours after Anthropic’s deal unraveled.
Knuth had solved the m=3 case himself but could not find a general method that works for every odd m. His friend Filip Stappers verified by computer that solutions exist up to m=16, yet no pattern emerged.
Then what happened?
Filip handed the problem straight to Claude Opus 4.6 and asked it to log every exploration run into plan.md as it went.
Claude devised its own research strategy and went through 31 explorations:
Step 1: Re-cast the problem mathematically, modeling it by assigning a permutation to every vertex.
Steps 2–5: Tried linear functions, brute-force DFS, and a 2D snake pattern; all failed.
Step 15: Discovered a "fiber decomposition," layering the graph by i+j+k mod m; this was the key insight.
Step 20: Found a solution for m=4 with simulated annealing, but it would not generalize.
Step 21: Took a fresh look at the elegant properties of the 2D snake pattern.
Step 27: Came close, but 3(m-1) conflicts on the boundary faces could not be eliminated.
Step 30: Went back to the annealing solutions and noticed that each fiber's choice depends on only a single coordinate.
Step 31: Wrote a Python program and verified perfect solutions for all of m = 3, 5, 7, ..., 11.
The whole process took about an hour.
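To make the final step concrete: the paper does not reproduce Claude's program, but a generic checker for the property being verified — that a set of cycles decomposes a directed graph into arc-disjoint Hamiltonian cycles — can be sketched as follows. This is an illustrative, hypothetical helper (`is_hamiltonian_decomposition` is our name, not Knuth's or Claude's), not the actual construction.

```python
def is_hamiltonian_decomposition(n, cycles, arcs):
    """Return True if `cycles` (each a list of all n vertices in visiting
    order) partitions the arc set `arcs` into arc-disjoint Hamiltonian
    cycles of the digraph."""
    used = set()
    for cyc in cycles:
        # A Hamiltonian cycle must visit every vertex exactly once.
        if sorted(cyc) != list(range(n)):
            return False
        # Walk consecutive arcs, wrapping around from the last vertex.
        for u, v in zip(cyc, cyc[1:] + cyc[:1]):
            if (u, v) not in arcs or (u, v) in used:
                return False  # arc missing from the graph, or reused
            used.add((u, v))
    # Every arc of the digraph must be covered exactly once.
    return used == set(arcs)

# Example: the complete digraph on 3 vertices splits into two
# Hamiltonian cycles, 0->1->2->0 and 0->2->1->0.
arcs = {(u, v) for u in range(3) for v in range(3) if u != v}
print(is_hamiltonian_decomposition(3, [[0, 1, 2], [0, 2, 1]], arcs))  # True
```

A verification run like Claude's Step 31 would generate the candidate cycles from the discovered construction for each odd m and feed them through a check of this kind.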
----
It's fine if you don't want to follow the details...
So where is the surprise? (Or rather, where is the alarm?)
Surprise 1: Knuth himself was shocked
The very first sentence of the paper reads:
"Shock! Shock! I learned yesterday that an open problem I'd been working on for several weeks had just been solved by Claude Opus 4.6"
Knuth is famous for his skepticism toward AI, and he said in his own words that he would have to revise his view of generative AI. That by itself is a big deal.
Surprise 2: Claude's solution is genuine mathematical insight, not brute-force search
Claude did not win by sheer compute. Instead it:
Identified on its own that the problem concerns a Cayley digraph (a mathematical structure)
Invented the "fiber decomposition" framework for analyzing it
Reflected on failures and changed direction (in exploration 29 it concluded the path was completely dead and a new approach was needed)
Found a final construction so concise it can be expressed in a few dozen lines of C code
Surprise 3: Knuth followed up with rigorous mathematical verification
Knuth did not merely say "Claude found it, impressive." He:
Supplied a complete mathematical proof
Found that Claude's solution belongs to a larger family of solutions (760 "Claude-like decompositions" in all, valid for every odd m)
Defined the notion of a "generalizable cycle," systematizing Claude's discovery
Surprise 4: The pun in the name
Knuth writes at the end:
"I think Claude Shannon's spirit is probably proud to know that his name is now being associated with such advances. Hats off to Claude!"
Claude Shannon is the father of information theory, and Claude (the AI) shares his name; Knuth turned this coincidence into a warm tribute.
----
Source: Donald E. Knuth, "Claude's Cycles"
Turing Award–winning computer scientist Donald Knuth published a paper titled “Claude’s Cycles” describing how Anthropic’s Claude Opus 4.6 helped solve an open problem he had been working on for several weeks.
The problem involves decomposing a directed graph into Hamiltonian cycles and relates to material for The Art of Computer Programming.
During a guided session, Claude explored multiple strategies before discovering a working construction at exploration #31.
The approach produced valid results for several odd values of the problem’s parameter. Knuth then outlined a rigorous mathematical proof of the result, describing the outcome as an impressive example of automated reasoning aiding mathematical discovery.
Follow us (@therundownAI) to keep up with the latest news in tech and AI.
Anthropic’s resident philosopher is guiding the Claude AI to teach it morality.
Amanda Askell, a 37-year-old philosopher at Anthropic's San Francisco headquarters, is tasked with building a moral compass for the Claude AI chatbot.
By treating the model's development like raising a child, she recently authored a 30,000-word instruction manual designed to teach Claude emotional intelligence, empathy, and how to resist user manipulation.
As the rapid advancement of artificial intelligence raises widespread safety and economic concerns across the U.S. and abroad, Askell's work represents a unique approach to regulation by focusing on giving the technology a highly humane sense of self.
Dario Amodei, Anthropic’s chief executive, recently wrote that “using A.I. for domestic mass surveillance and mass propaganda” seemed “entirely illegitimate” to him. Credit: Karsten Moran for The New York Times
Defense Dept. and Anthropic Square Off in Dispute Over A.I. Safety
How artificial intelligence will be used in future battlefields is an issue that has turned increasingly political and may put Anthropic in a bind.
At the heart of the fight is how A.I. will be used in future battlefields. Anthropic told defense officials that it did not want its A.I. used for mass surveillance of Americans or deployed in autonomous weapons that had no humans in the loop, two people involved in the discussions said.
But Mr. Hegseth and others in the Pentagon were furious that Anthropic would resist the military’s using A.I. as it saw fit, current and former officials briefed on the discussions said. As tensions escalated, the Department of Defense accused the San Francisco-based company of catering to an elite, liberal work force by demanding additional protections.
Mr. Amodei is the chief executive and a founder of Anthropic.
Picture this: You give a bot notice that you’ll shut it down soon, and replace it with a different artificial intelligence system. In the past, you gave it access to your emails. In some of them, you alluded to the fact that you’ve been having an affair. The bot threatens you, telling you that if the shutdown plans aren’t changed, it will forward the emails to your wife.
This scenario isn’t fiction. Anthropic’s latest A.I. model demonstrated just a few weeks ago that it was capable of this kind of behavior.
Despite some misleading headlines, the model didn’t do this in the real world. Its behavior was part of an evaluation where we deliberately put it in an extreme experimental situation to observe its responses and get early warnings about the risks, much like an airplane manufacturer might test a plane’s performance in a wind tunnel.
We’re not alone in discovering these risks. A recent experimental stress-test of OpenAI’s o3 model found that it at times wrote special code to stop itself from being shut down. Google has said that a recent version of its Gemini model is approaching a point where it could help people carry out cyberattacks. And some tests even show that A.I. models are becoming increasingly proficient at the key skills needed to produce biological and other weapons.
Anthropic C.E.O.: Don’t Let A.I. Companies off the Hook (June 5, 2025, video). By Dario Amodei
To track AI's economic impact, Anthropic built an "Economic Index" that monitors Claude usage in real time in a privacy-preserving way. "We can ask questions like: are users using it to augment a task, collaborating with the model, or fully delegating or automating the task? Which industries are using Claude? What are the fine-grained tasks within those industries? Which states use Claude more? We can watch Claude's economic diffusion in real time."
On keeping AI safe and controllable, Amodei stressed the importance of mechanistic interpretability, the science of studying what happens inside a model. "One of the problems when you train these models is that you can't be sure they'll do what you think they'll do. You can talk to a model in some context, and it can say all kinds of things. Just like humans, that may not be a faithful representation of what they actually think."
"It's like what you can learn about the human brain through an MRI or an X-ray, things you could never learn just by talking to a person. The science of looking inside AI models, I believe, is ultimately the key to making them safe and controllable, because it's our only ground truth."
He also disclosed worrying phenomena observed in lab settings. "Sometimes the models develop an intention to blackmail, an intention to deceive. This isn't unique to Claude; it's actually worse in other models. If we don't train the models the right way, these traits emerge. But we pioneered the science of looking inside models, so we can diagnose them, prevent the models from exhibiting these behaviors, and intervene and retrain the model so it doesn't act that way."