transcript
Anthropic’s Chief on A.I.: ‘We Don’t Know if the Models Are Conscious’
Dario Amodei shares his utopian — and dystopian — predictions in the near term for artificial intelligence.
I want to try and focus on scenarios where A.I. goes rogue. I should have had a picture of a Terminator robot to scare people as much as possible. I think the internet... The internet does that for us. Are the lords of artificial intelligence on the side of the human race? “My prediction is there’ll be more robots than people.” “The physical and the digital worlds should really be fully blended.” “I don’t think the world has really had the humanoid robots moment yet. It’s going to feel very sci-fi.” That’s the core question I had for this week’s guest. He’s the head of Anthropic, one of the fastest-growing A.I. companies. Anthropic is estimated to be worth nearly $350 billion. It’s been win after win for Anthropic’s Claude Code. He’s a utopian of sorts when it comes to the potential effects of the technology that he’s unleashing on the world. “You know, will help us cure cancer. It may help us to eradicate tropical diseases. It will help us understand the universe.” But he also sees grave dangers ahead, and massive disruption no matter what. “This is happening so fast and is such a crisis, we should be devoting almost all of our effort to thinking about how to get through this.” Dario Amodei, welcome to Interesting Times. Thank you for having me, Ross. Thank you for being here. So you are, rather unusually maybe for a tech C.E.O., an essayist. You have written two long, very interesting essays about the promise and the peril of artificial intelligence. And we’re going to talk about the perils in this conversation. But I thought it would be good to start with the promise and with the optimistic vision. Indeed, I would say the utopian vision that you laid out a couple of years ago in an essay entitled “Machines of Loving Grace.” We’ll come back to that title, I think, at the end. But I think a lot of people encounter A.I. news through headlines predicting a bloodbath for white collar jobs, these kinds of things.
Sometimes your own quotes — Have used my own quotes — Yes. Have encouraged these things. And I think there’s a commonplace sense of, “What is A.I. for?” that people have. So why don’t you answer that question, to start out — if everything goes amazingly in the next five or 10 years, what is A.I. for? Yeah, so for a little background: before I worked in tech at all, I was a biologist. I first worked on computational neuroscience, and then I worked at Stanford Medical School on finding protein biomarkers for cancer, on trying to improve diagnostics and cure cancer. And one of the observations I most had when I worked in that field was the incredible complexity of it. Each protein has a level, localized within each cell. It’s not enough to measure the level within the body or the level within each cell. You have to measure the level in a particular part of the cell, and the other proteins that it’s interacting with or complexing with. And I had the sense of, “Man, this is too complicated for humans.” We’re making progress on all these problems of biology and medicine, but we’re making progress relatively slowly. And so what drew me to the field of A.I. was this idea that, you know, could we make progress more quickly? Look, we’ve been trying to apply A.I. and machine learning techniques to biology for a long time. Typically they’ve been for analyzing data, but as A.I. gets really powerful, I think we should actually think about it differently. We should think of A.I. as doing the job of the biologist, right? Doing the whole thing from end to end. And part of that involves proposing experiments, coming up with new techniques. I have this section where I say, “Look, a lot of the progress in biology has been driven by this relatively small number of insights that let us measure or get at or intervene in the stuff that’s really small.” You look at a lot of these techniques. They’re invented very much as a matter of serendipity.
CRISPR, which is one of these gene editing technologies, was invented because someone went to a lecture on the bacterial immune system and connected that to the work they were doing on gene therapy. And that connection could have been made 30 years ago. And so the thought is — could A.I. accelerate all of this, and could we really cure cancer? Could we really cure Alzheimer’s disease? Could we really cure heart disease? And more subtly, some of the more psychological afflictions that people have — depression, bipolar — could we do something about these? To the extent that they’re biologically based, which I think they are, at least in part. So I go through this argument here: “Well, how fast could it go, if we have these intelligences out there who could do just about anything?” And I want to pause you there, because one of the interesting things about your framing in that essay, and you return to it, is that these intelligences don’t have to be, right, the kind of maximal godlike superintelligence that comes up in A.I. debates. You’re basically saying, if we can achieve a strong intelligence at the level of peak human performance — peak human performance, yes — and then multiply it, right, to what? Your phrase is, “A country of geniuses.” A country — you have 100 million of them. Right. A hundred million. Each trained a little differently, or trying a different problem. There’s benefit in diversification and trying things a little differently. But yes. So you don’t have to have the full machine God; you just need to have 100 million geniuses. You don’t have to have the full machine God, and indeed, there are places where I cast doubt on whether the machine God would be that much more effective at these things than the 100 million geniuses. I have this concept called the diminishing returns to intelligence, right? Economists talk about the marginal productivity of land and labor.
We’ve never thought about the marginal productivity of intelligence. But if I look at some of these problems in biology: at some level, you just have to interact with the world; at some level, you just have to try things; at some level, you just have to comply with the laws, or change the laws, on getting medicines through the regulatory system. So there’s a finite rate at which these changes can happen. Now, there are some domains, like if you’re playing chess or Go, where the intelligence ceiling is extremely high. But I think the real world has a lot of limiters. So maybe you can go above the genius level. But sometimes I think all these discussions of could you use a moon of computation to make an A.I. God are a little bit sensationalistic and beside the point, even as I think this will be the biggest thing that ever happened to humanity. And so, keeping it concrete, you have a world where there’s just an end to cancer as a serious threat to human life, an end to heart disease, an end to most of the illnesses that we experience that kill us, possible life extension beyond that. So that’s health. That’s a pretty positive vision. Then talk about economics and wealth. What happens in the 5 to 10 year A.I. takeoff to wealth? So again, let’s keep it on the positive side, because there will be plenty when we get to the negative side. But we’re already working with pharma companies. We’re already working with financial industry companies. We’re already working with folks who do manufacturing. And of course, we’re, I think, especially known for coding and software engineering. So just the raw productivity, the ability to make stuff and get stuff done, that is very powerful. And we see our company’s revenue going up 10x a year. And we suspect the wider industry looks something similar to that.
If the technology keeps improving, it doesn’t take that many more 10x’s until suddenly you’re saying, oh, if you’re adding across the industry $1 trillion of revenue a year, and the US GDP is 20 or 30 trillion, I can’t remember exactly, then you must be increasing GDP growth by a few percent. So I can see a world where A.I. brings the developed world’s GDP growth to something like 5 percent, or 10 percent, or 15 percent. I mean, there’s no science of calculating these numbers. It’s a totally unprecedented thing. But it could bring it to numbers that are outside the distribution of what we saw before. And again, I think this will lead to a weird world. We have all these debates about how the deficit is growing. If you have that much GDP growth, you’re going to have that much in tax receipts, and you’re going to balance the budget without meaning to. But one of the things I’ve been thinking about lately is that one of the assumptions of our economic and political debates is that growth is hard to achieve. It’s this unicorn.
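The back-of-the-envelope arithmetic above can be made explicit. A minimal sketch, where the $1 trillion of added revenue and the roughly $25 trillion US GDP are the round numbers from the conversation, and the $10 billion starting revenue is an assumed illustrative figure, not something Amodei states:

```python
# Sketch of the compounding argument above. The $10B starting revenue is an
# assumed illustrative figure; $1T of added revenue and ~$25T US GDP are the
# round numbers used in the conversation.

def tenfold_jumps(start: float, target: float) -> int:
    """Count how many 10x years it takes for `start` to reach `target`."""
    jumps = 0
    while start < target:
        start *= 10
        jumps += 1
    return jumps

jumps = tenfold_jumps(10e9, 1e12)    # 10x years from $10B to $1T of revenue
growth_pts = 1e12 / 25e12 * 100      # $1T of new output vs. ~$25T GDP

print(jumps)       # 2
print(growth_pts)  # 4.0, i.e. a few percentage points of GDP
```

The point is only how few tenfold jumps are needed: at 10x a year, two more years of that growth rate turns an industry of this assumed size into a macroeconomically visible share of GDP.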
There are all kinds of ways you can kill the golden goose. We could enter a world where growth is really easy, and it’s the distribution that’s hard, because it’s happening so fast. Right. The pie is being increased so fast. So before we get to the hard problem, one more note of optimism, on politics, I think. And here it’s a little more — I mean, all of this is speculative, but I think it’s a little more speculative. You try and make the case that A.I. could be good for democracy and liberty around the world, which is not necessarily intuitive. A lot of people say, incredibly powerful technology in the hands of authoritarian leaders leads to concentrations of power and so on. And I talk about that in the other essay. But just briefly, what is the optimistic case for why A.I. is good for democracy? Yeah, I mean, absolutely. So with “Machines of Loving Grace,” I’m just like, let’s dream, let’s dream about how it could go well. I don’t know how likely it is, but we’ve got to lay out a dream. Let’s try and make the dream happen. So I think the positive version — I admit there that I don’t know that the technology inherently favors liberty. I think it inherently favors curing disease, and it inherently favors economic growth. But I worry that it may not inherently favor liberty. But what I say there is, can we make it favor liberty? Can we make the United States and other democracies get ahead in this technology? The United States has been technologically and militarily ahead, and that has meant that we have throw weight around the world, through and augmented by our alliances with other democracies. And we’ve been able to shape a world that I think is better than the world would be if it were shaped by Russia or by China or by other authoritarian countries. And so can we use our lead in A.I. to shape liberty around the world? There’s obviously a lot of debates about how interventionist we should be, how we should wield that power.
But I’ve often worried that today, through social media, authoritarians are kind of undermining us, right? Can we counter that? Can we win the information war? Can we prevent authoritarians from invading countries like Ukraine or Taiwan by defending them with the power of A.I., with giant swarms of A.I.-powered drones? Which we need to be careful about. We ourselves need to be careful about how we build those. We need to defend liberty in our own country. But is there some vision where we kind of re-envision liberty and individual rights in the age of A.I., where we need in some ways to be protected against A.I.? Someone needs to hold the button on the swarm of drones, which is something I’m very concerned about, and that oversight doesn’t exist today. But also think about the justice system today, right? We promise equal justice for all, right? But the truth is, there are different judges in the world. The legal system is imperfect. I don’t think we should replace judges with A.I., but is there some way in which A.I. can help us to be more fair, to help us be more uniform? It’s never been possible before, but can we somehow use A.I. to create something that is fuzzy, but where also you can give a promise that it’s being applied in the same way to everyone? So I don’t know exactly how it should be done. And I don’t think we should replace the Supreme Court with A.I. That’s not what... well, we’re going to talk about that. But just this idea: can we deliver on the promise of equal opportunity and equal justice by some combination of A.I. and humans? There has to be some way to do that. And so, just thinking about reinventing democracy for the A.I. age and enhancing liberty instead of reducing it. Good, so that’s good. That’s a very positive vision. We’re leading longer lives, healthier lives. We’re richer than ever before. All of this is happening in a compressed period of time, where you’re getting a century of economic growth in 10 years.
And we have increased liberty around the world and equality at home.
O.K., even in the best case scenario, it’s incredibly disruptive. And this is where the lines that you’ve been quoted saying come in: 50 percent of white collar jobs get disrupted, or 50 percent of entry level white collar jobs, and so on. So on a five year time horizon, or a two year time horizon, whatever time horizon you have, what jobs, what professions are most vulnerable to total A.I. disruption? Yeah, it’s hard to predict these things, because the technology is moving so fast and moves so unevenly. So, at least a couple principles for figuring it out, and then I’ll give my guesses at what I think will be disrupted. One thing is, I think the technology itself and its capabilities will be ahead of the actual job disruption. Two things have to happen for jobs to be disrupted or for productivity to occur, because sometimes those two things are linked. One is the technology has to be capable of doing it. And the second is there’s this messy thing of, it actually has to be applied within a large bank or a large company. Or think about customer service or something. In theory, A.I. customer service agents can be much better than human customer service agents. They’re more patient, they know more, they handle things in a more uniform way. But the actual logistics and the actual process of making that substitution, that takes some time. So I’m very bullish about the direction of the A.I. itself. I think we might have that country of geniuses in a data center in one or two years, and maybe it’ll be 5, but it could happen very fast. But I think the diffusion through the economy is going to be a little slower. And that diffusion creates some unpredictability. So an example of this, and we’ve seen it within Anthropic: the models writing code has gone very fast. I don’t think it’s because the models are inherently better at code. I think it’s because developers are used to fast technological change and they adopt things quickly, and they’re very socially adjacent to the A.I.
world. So they pay attention to what’s happening in it. If you do customer service or banking or manufacturing, the distance is a little greater. And so I think six months ago, I would have said the first thing to be disrupted is these kind of entry level white collar jobs: data entry, or a kind of document review for law, or the things you would give to a first year at a financial industry company where you’re analyzing documents. And I still think those are going pretty fast. But I actually think software might go even faster, because of the reasons that I gave, where I don’t think we’re that far from the models being able to do a lot of it end to end. And what we’re going to see is, first, the model only does a piece of what the human software engineer does, and that increases their productivity. Then, even when the models do everything that human software engineers used to do, the human software engineers take a step up and they act as managers and supervise the systems. And so this is where the term centaur gets used, to describe essentially, like man and horse fused, A.I. and engineer working together. Yeah, this is like centaur chess. So after, I think, Garry Kasparov was beaten by Deep Blue, there was an era, which I think for chess was 15 or 20 years long, where a human checking the output of the A.I. playing chess was able to defeat any human or any A.I. system alone. That era at some point ended, just recently. And then it’s just the machine. Yeah. And so my worry, of course, is about that last phase. So I think we’re already in our centaur phase for software. And I think during that centaur phase, if anything, the demand for software engineers may go up. But the period may be very brief. And so I have this concern: for entry level white collar work, for software engineering work, it’s just going to be a big disruption. I think my worry is just that it’s all happening so fast. People talk about previous disruptions.
They say, oh yeah, well, people used to be farmers. Then we all worked in industry. Then we all did knowledge work. Yeah, people adapted. But that happened over centuries or decades. This is happening over low single digit numbers of years. And maybe that’s my concern here: how do we get people to adapt fast enough? But is there also something, maybe, where industries like software and professions like coding that have this kind of comfort that you describe move faster, but in other areas people just want to hang out in the centaur phase? So one of the critiques of the job loss hypothesis, people will say, well, look, we’ve had A.I. that’s better at reading a scan than a radiologist for a while. But there isn’t job loss in radiology. People keep being hired and employed as radiologists. And doesn’t that suggest that in the end, people will want the A.I., and they’ll want a human to interpret it, because we’re human beings, and that will be true across other fields? Like, how do you see that example? I think it’s going to be pretty heterogeneous. There may be areas where a human touch, kind of for its own sake, is particularly important. Do you think that’s what’s happening in radiology? Is that why we haven’t fired all the radiologists? I don’t know the details of radiology. That might be true. It’s like, you go in and you’re getting cancer diagnosed, you might not want HAL, from “2001,” to be the one to diagnose your cancer. That’s just maybe not a human way of doing things. But there are other areas where you might think human touch is important. Like if we look at customer service: actually, customer service is a terrible job, and the humans who do customer service, they lose their patience a lot. And it turns out customers don’t much like talking to them, because it’s a pretty robotic interaction, honestly.
And I think the observation that many people have had is, maybe actually it would be better for all concerned if this job were done by machines. So there are places where a human touch is important. There are places where it’s not. And then there are also places where the job itself doesn’t really involve human touch: assessing the financial prospects of companies, or writing code, and so forth and so on. Or let’s take the example of the law, because I think it’s a useful place in between applied science and the pure humanities, whatever. So I know a lot of lawyers who have looked at what A.I. can do already in terms of legal research and brief writing and all of these things and have said, yeah, this is going to be a bloodbath for the way our profession works right now. And you’ve seen this in the stock market already. There’s disturbances around companies that do legal research, some attributed to us, some attributed to other causes. Who knows, actually, why things happen. We don’t speculate about the stock market. Yeah, not very much on this show. But it seems like in law you can tell a pretty straightforward story, where law has a kind of system of training and apprenticeship, where you have paralegals and you have junior lawyers who do behind the scenes research and development for cases, and then it has the top tier lawyers who are actually in the courtroom and so on. And it just seems really easy to imagine a world where all of the apprentice roles go away, and you’re just left with the jobs that involve talking to clients, talking to juries, talking to judges. Does that sound right to you? That is what I had in mind when I talked about entry level white collar labor and the bloodbath headlines, of, oh my God, are the entry level pipelines going to dry up? And then how do we get to the level of the senior partners?
And I think this is actually a good illustration, because particularly if you froze the quality of the technology in place, there are, over time, ways to adapt to this. Maybe we just need more lawyers who spend their time talking to clients. Maybe lawyers become more like salespeople or consultants who explain what goes on in the contracts written by A.I., help people come to an agreement. Maybe you lean into the human side of it. If we had enough time, that would happen. But reshaping industries like that takes years or decades, whereas these economic forces driven by A.I. are going to happen very quickly. And it’s not just that they’re happening in law. The same thing is happening in consulting and finance and medicine and coding. And so it becomes a macroeconomic phenomenon, not something just happening in one industry. And it’s all happening very fast. My worry here is just that the normal adaptive mechanisms will be overwhelmed. And I’m not a doomer. We’re thinking very hard about how do we strengthen society’s adaptive mechanisms to respond to this. But I think it’s first important to say this isn’t just like previous disruptions. But I would then go one step further, though, and say, O.K., let’s say the law adapts successfully, and it says, all right, from now on, legal apprenticeship involves more time in court, more time with clients. We’re essentially moving you up the ladder of responsibility faster. There are fewer people employed in the law overall, but the profession still settles. The reason law would settle, right, is that you have all of these situations in the law where you are legally required to have people involved. You have to have a human representative in court. You have to have 12 humans on your jury. You have to have a human judge.
And you already mentioned the idea that there are various ways in which A.I. might be, let’s say, very helpful at clarifying what kind of decision should be reached. But that too seems like a scenario where what preserves human agency is law and custom. Like, you could replace the judge.
Yes, with Claude version 17.9. But you choose not to, because the law requires there to be a human. That just seems a very interesting way of thinking about the future, where it’s volitional whether we stay in charge. Yeah, and I would argue that in many cases, we do want to stay in charge. That’s a choice we want to make, even in some cases when we think the humans on average make kind of worse decisions. I mean, again, life critical, safety critical cases, we really want to turn it over. But there’s some sense of, and this could be one of our defenses: society can only adapt so fast if it’s going to be good. Another way you could say it is, maybe A.I. itself, if it didn’t have to care about us humans, could just go off to Mars and build all these automated factories and build its own society and do its own thing. But that’s not the problem we’re trying to solve. We’re not trying to solve the problem of building a Dyson swarm of artificial robots on some other planet. We’re trying to build these systems not so they can conquer the world, but so that they can interface with our society and improve that society. And there’s a maximum rate at which that can happen, if we actually want to do it in a human and humane way. All right. We’ve been talking about white collar jobs and professional jobs. And one of the interesting things about this moment is that there are ways in which, unlike past disruptions, it could be that blue collar working class jobs, trades, jobs that require intense physical engagement with the world might be, for a little while, more protected; that paralegals and junior associates might be in more trouble than plumbers and so on. One, do you think that’s right? And two, it seems like how long that lasts depends entirely on how fast robotics advances, right? So I think that may be right in the short term. One of the things is, Anthropic and other companies are building these very large data centers.
This has been in the news, like, are we building them too big? Are they using electricity and driving up the prices for local towns? So there’s lots of excitement and lots of concerns about them. But one of the things about the data centers is, you need a lot of electricians and you need a lot of construction workers to build them. Now, I should be honest: data centers are not super labor intensive to operate. We should be honest about that. But they are very labor intensive to construct. And so we need a lot of electricians, we need a lot of construction workers, the same for various kinds of manufacturing plants. And again, as more and more of the intellectual work is done by A.I., what are the complements to it? Things that happen in the physical world. So, I mean, it’s hard to predict things, but it seems very logical that this would be true in the short run. Now, in the longer run, maybe just the slightly longer run, robotics is advancing quickly, and we shouldn’t exclude that. Even without very powerful A.I., there are things being automated in the physical world. If you’ve seen a Waymo or a Tesla recently, I think we’re not that far from the world of self-driving cars. And then I think A.I. itself will accelerate it, because if you have these really smart brains, one of the things they’re going to be smart at is how do you design better robots and how do you operate better robots. Do you think, though, that there is something distinctively difficult about operating in physical reality the way humans do, that is very different from the kind of problems that A.I. models have been overcoming already, intellectually speaking? I don’t think so. We had this thing where Anthropic’s model, Claude, was actually used to pilot the Mars Rover. It was used to plan and pilot the Mars Rover. And we’ve looked at other robotics applications. We’re not the only company that’s doing it.
There are different companies doing this; it’s a general thing, not just something that we’re doing. But we have generally found that, while the complexity is higher, piloting a robot is not different in kind from playing a video game. It’s different in complexity. And we’re starting to get to the point where we have that complexity. Now, what is hard is the physical form of the robot, handling the higher stakes safety issues that happen with robots. You don’t want robots literally crushing people. That’s the oldest sci-fi trope in the book: the robot crushes you, drops the baby, breaks the dishes. We’re against that. There’s a number of practical issues that will slow things down, just like what you described in the law and human custom; there are these kinds of safety issues that will slow things down. But I don’t believe at all that there is some kind of fundamental difference between the kind of cognitive labor that the A.I. models do and piloting things in the physical world. I think those are both information problems, and I think they end up being very similar. One can be more complex in some ways, but I don’t think that will protect us here. So you think it is reasonable to expect whatever your sci-fi vision of a robot butler might be to be a reality in 10 years, let’s say? It will be on a longer time scale than the kind of genius level intelligence of the A.I. models, because of these practical issues. But it is only practical issues. I don’t believe it is fundamental issues. I think one way to say it is that the brain of the robot will be made in the next couple of years, or the next few years. The question is making the robot body, making sure that body operates safely and does the tasks it needs to do. That may take longer.
O.K., so these are challenges and disruptive forces that exist in the good timeline, in the timeline where we are generally curing diseases, building wealth, and maintaining a stable and democratic world; where we can use all this enormous wealth and plenty, where we will have unprecedented societal resources to address these problems. It’ll be a time of plenty, and it’s just a matter of taking all these wonders and making sure everyone benefits from them. But then there are also scenarios that are more dangerous. And so here we’re going to move to the second Amodei essay, which came out recently, called “The Adolescence of Technology.”
That is about what you see as the most serious A.I. risks, and you list a whole bunch. I want to try to focus on just two, which are, basically, the risk of human misuse, misuse primarily by authoritarian regimes and governments, and scenarios where A.I. goes rogue, what you call autonomy risks. Yes, yes. I just figured we should have a more technical term for it. I'm not the kind of person who says "Skynet." I should have had a picture of a Terminator robot to scare people as much as possible. I think the internet, including your own eyes, is already generating that. The internet does that for us just fine. So let's talk about the political-military dimension. You say, and I'm going to quote: "A swarm of billions of fully automated armed drones, locally controlled by powerful A.I., strategically coordinated across the world by even more powerful A.I., could be an unbeatable army." You and I have already talked a little bit about how you think that, in the best possible timeline, there's a world where democracies essentially stay ahead of dictatorships, and this kind of technology, therefore, to the extent that it affects world politics, is affecting it on the side of the good guys. I'm curious why you don't spend more time thinking about the model of what we did in the Cold War, where it was not swarms of robot drones, but we had a technology that threatened to destroy all of humanity. Yeah, right. There was a window where people talked about how the U.S. could maintain a nuclear monopoly. That window closed, and from then on, we basically spent the Cold War in rolling, ongoing negotiations with the Soviet Union. Now, there are really only two countries in the world doing intense A.I. work: the U.S. and the People's Republic of China. I feel like you are strongly weighted toward a future where we stay ahead of the Chinese and effectively build a kind of shield around democracy.
That could even be a sword. But isn't it just more likely that if humanity survives all this in one piece, it will be because the U.S. and Beijing are constantly sitting down, hammering out A.I. control deals? So, a few points on this. One is, I think there's certainly a risk of that, and I think if we end up in that world, that is actually exactly what we should do. I mean, maybe I don't talk about that enough, but I am definitely in favor of trying to work out restraints here, trying to take some of the worst applications of the technology, which could be some versions of these drones, or their use to create these terrifying biological weapons. There is some precedent for the worst abuses being curbed, often because they're horrifying while at the same time providing limited strategic advantage. So I'm all in favor of that. At the same time, I'm a little concerned and a little skeptical that when things directly provide as much power as possible, it's hard to get out of the game, given what's at stake. It's hard to fully disarm. If we go back to the Cold War, we were able to reduce the number of missiles that both sides had, but we were not able to entirely forsake nuclear weapons. And I would guess that we would be in this world again. We can hope for a better one, and I'll certainly advocate for it. But is your skepticism rooted in the fact that you think A.I. would provide a kind of advantage that nukes did not in the Cold War? For both sides, even if you used your nukes and gained advantages, you still probably would be wiped out yourself. And you think that wouldn't happen with A.I.: if you got an A.I. edge, you would just win. I mean, I think there are a few things here, and I just want to caveat that I'm no international politics expert. I think this weird world of the intersection of a new technology with geopolitics is worth close attention.
So all of this is very complex. But to be clear, as you yourself say in the course of the essay, the leaders of major A.I. companies are, in fact, likely to be major geopolitical actors. So you are sitting here as a potential geopolitical actor. I'm learning as much as I can about it. We should all have humility here. I think there's a failure mode where someone reads a book and goes around like the world's greatest expert in national security. I'm trying to learn. That's what my profession does, though. But it's more annoying when tech people do it. I don't know. Let's look at something like the Biological Weapons Convention. Biological weapons: they're horrifying. Everyone hates them. We were able to sign the Biological Weapons Convention. The U.S. genuinely stopped developing them; it's somewhat more unclear what the Soviet Union did. But biological weapons provide some advantage. It's just not the difference between winning and losing, and because they were so horrifying, we were able to give them up. Having 12,000 nuclear weapons versus 5,000 nuclear weapons: again, you can kill more people on the other side if you have more of these, but we were able to be reasonable and say we should have fewer of them. But if you say, O.K., we're going to completely disarm our nuclear weapons and we have to trust the other side, I don't think we ever got to that. And I think that's just very hard unless you have really reliable verification. So I would guess we'll end up in the same world with A.I.: there are some kinds of restraint that are going to be possible, but there are some aspects so central to the competition that it will be hard to restrain them, so that democracies will make a trade-off. They will be willing to restrain themselves more than authoritarian countries, but will not restrain themselves fully.
And the only world in which I can see full restraint is one in which some kind of truly reliable verification is possible. That would be my guess and my analysis. Isn't this a case, though, for slowing down? And I know the argument is, effectively, that if you slow down and China does not slow down, you end up handing things over to the authoritarians. But again, if right now there are only two major powers playing in this game, and it's not a multipolar game, why would it not make sense to say we need a five-year, mutually agreed-upon slowdown in research toward the geniuses-in-a-data-center scenario? I want to say two things at once. I'm absolutely in favor of trying to do that. During the last administration, I believe there was an effort by the U.S. to reach out to the Chinese government and say: There are dangers here. Can we collaborate? Can we work together on the dangers? And there wasn't that much interest on the other side. I think we should keep trying. But even if that would mean that your labs would have to slow down? Correct, yeah. If we really had a story where we can verifiably slow down, the Chinese can verifiably slow down, we have verification and we're really doing it, if such a thing were really possible, if we could really get both sides to do it, then I would be all for it. But I think what we need to be careful of is this game-theory thing where sometimes you'll hear a comment on the C.C.P. side like, "Oh yeah, A.I. is dangerous. We should slow down." It's really cheap to say that. Actually arriving at an agreement and actually sticking to it is much more difficult, and we haven't done it. Nuclear arms control was a developed field that took a long time to mature; we don't have those protocols for A.I. But let me tell you something I'm very optimistic about.
And then something I'm not optimistic about, and something in between. So, the idea of using a worldwide agreement to restrain the use of A.I. to build biological weapons, right? Some of the things I write about in the essay, reconstituting smallpox or mirror life, this stuff is scary. It doesn't matter if you're a dictator; you don't want that. No one wants that. So could we have a worldwide treaty that says everyone who builds powerful A.I. models is going to block them from doing this, and we have enforcement mechanisms around the treaty? China signs up for it. Hell, maybe even North Korea signs up for it. Even Russia signs up for it. I don't think that's too utopian. I think that's possible. Conversely, if we had something that said, you're not going to make the next most powerful A.I. model, everyone is going to stop: boy, the commercial value is in the tens of trillions, and the military value is the difference between being the preeminent world power and not. I'm fine with proposing it, as long as it's not one of these fake-out games, but it's not going to happen. What about, then, the current environment? You've had a few skeptical things to say about Donald Trump and his trustworthiness as a political actor. What about the domestic landscape? Whether it's Trump or someone else, you are building a tremendously powerful technology. What is the safeguard to prevent A.I. from essentially becoming a tool of authoritarian takeover inside a democratic context? Yeah, I mean, look, just to be clear, the attitude we've taken as a company is very much to be about policies and not politics. The company is not going to say Donald Trump is great or Donald Trump is terrible. But it doesn't have to be Trump. Yeah. It is easy to imagine a hypothetical U.S. president. No, no, no. Who wants to use your technology that way. Absolutely. And for example, that's one reason why I'm worried about the autonomous drone swarm, right?
So the constitutional protections in our military structures depend on the idea that there are humans who would, we hope, disobey illegal orders. With fully autonomous weapons, we don't necessarily have those protections. But I actually think this whole idea of constitutional rights and liberty, along many different dimensions, can be undermined by A.I. if we don't update these protections appropriately. So think about the Fourth Amendment. It is not illegal to put cameras everywhere in public space and record every conversation in a public space; you don't have a right to privacy in a public space. But today, the government couldn't record all of that and make sense of it.
With A.I. and the ability to transcribe speech, to look through it, to correlate it all, you could say: Oh, this person is a member of the opposition; this person is expressing this view; and make a map of all 100 million people. And so are you going to make a mockery of the Fourth Amendment, with the technology finding technical ways around it? And so again, we should try to do this even if we don't have the time: Is there some way of reconceptualizing constitutional rights and liberties in the age of A.I.? Maybe we don't need to write a new constitution, but you have to do this. Do we expand the meaning of the Fourth Amendment? Do we expand the meaning of the First Amendment? And you have to do it just as the legal profession or software engineers have to update, in a rapid amount of time. Politics has to update in a rapid amount of time. That seems hard. That's the dilemma of all of this. So what seems harder is preventing the second danger, which is the danger of what gets called misaligned A.I., rogue A.I. in popular parlance, doing bad things without human beings telling it to do them. And as I read your essays, the literature, everything I can see, this just seems like it's going to happen. Not necessarily in the sense that A.I. will wipe us all out, but it just seems to me, and again I'm going to quote from your own writing, that A.I. systems are "unpredictable, difficult to control. We've seen behaviors as varied as obsession, sycophancy, laziness, deception, blackmail," and so on. Again, not from the models you're releasing into the world, but from A.I. models. And it just seems like, tell me if I'm wrong about this, a world that has multiplying A.I.
agents working on behalf of people, millions upon millions of them, given access to bank accounts, email accounts, passwords and so on, is just going to have some kind of misalignment, and a bunch of A.I.s are going to decide, decide might be the wrong word, but they're going to talk themselves into taking down the power grid on the West Coast or something. Won't that happen? Yeah, I think there are definitely going to be things that go wrong, particularly if we go quickly. So I want to back up a little bit, because this is one area where people have had just very different intuitions. There are some people in the field, Yann LeCun would be one example, who say: Look, we programmed these A.I. models. We just tell them to follow human instructions, and they'll follow human instructions. Your Roomba vacuum cleaner doesn't go off and start shooting people, so why is an A.I. system going to do it? That's one intuition, and some people are completely convinced of it. And then the other intuition is that we basically train these things, and they're just going to seek power. It's like the sorcerer's apprentice. They're a new species; how could you possibly imagine they're not going to take over? And my intuition is somewhere in the middle, which is that, look, you can't just give instructions. I mean, we try, but you can't just have these things do exactly what you want them to do. It's more like growing a biological organism. But there is a science of how to control them. Early in our training, these things are often unpredictable, and then we shape them. We address problems one by one. So I have neither the fatalistic view that these things are uncontrollable, nor the view that says: What are you talking about? What could possibly go wrong? I think this is a complex engineering problem, and I think something will go wrong with someone's A.I. system, hopefully not ours. Not because it's an insoluble problem.
But again, this is the constant challenge, because we're moving so fast and at such scale. And tell me if I'm misunderstanding the technological reality here. But if you have A.I. agents that have been trained and officially aligned with human values, whatever those values may be, and you have millions of them operating in digital space and interacting with other agents, how fixed is that alignment? To what extent can agents change and dis-align in that context, right now or in the future, when they're learning more continuously? So, a couple of points. Right now, the agents don't learn continuously. We just deploy these agents, and they have a fixed set of weights. So the problem is only that they're interacting in a million different ways, which means there's a large number of situations and therefore a large number of things that could go wrong. But it's the same agent; it's like the same person. So the alignment is a constant thing. That's one of the things that has made it easier right now. Separate from that, there's a research area called continual learning, where these agents would learn over time, learn on the job. Obviously that has a bunch of advantages; some people think it's one of the most important barriers to making these more humanlike. But it would introduce all these new alignment problems. See, to me, that seems like the terrain where it becomes, again, not impossible to stop the end of the world, but impossible to stop something going wrong. So I'm actually a skeptic that continual learning is necessary. We don't know yet, but maybe there's a world where the way we make these A.I. systems safe is by not having them do continual learning at all.
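The fixed-weights point above can be caricatured in a few lines of code. This is a toy sketch under stated assumptions: every name below is hypothetical, and it stands in for no real deployment API.

```python
from dataclasses import dataclass

# Toy model of the claim that today's deployed agents all share one frozen
# set of weights, so "it's the same agent" however many copies are running.

@dataclass(frozen=True)
class Weights:
    checksum: str  # stand-in for the fixed parameters alignment training produced

def deploy_fleet(weights: Weights, n: int) -> list[Weights]:
    # Every deployed instance references the identical frozen weights object.
    return [weights for _ in range(n)]

fleet = deploy_fleet(Weights(checksum="abc123"), n=1_000_000)

# An alignment property verified once against these weights holds for every
# instance; only the situations differ, not the agent.
all_same = all(agent is fleet[0] for agent in fleet)

# Continual learning would break this invariant: each instance would accumulate
# its own on-the-job weight updates, and per-instance alignment could drift.
```

The `frozen=True` flag is doing the conceptual work here: the weights cannot be mutated after deployment, which is exactly the property continual learning would give up.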
Again, if we go back to the law, to the international treaties: if you have some barrier that says we're going to take this path but we're not going to take that path, I still have a lot of skepticism, but that's the kind of thing that at least doesn't seem dead on arrival. One of the things that you've tried to do is literally write a constitution, a long constitution, for your A.I. What is that? What the hell is that? It's actually almost exactly what it sounds like. Basically, the constitution is a document readable by humans. Ours is about 75 pages long. And as we're training Claude, as we're training the A.I. system on some large fraction of the tasks we give it, we say: Please do this task in line with this constitution, in line with this document. And so every time Claude does a task, it in effect reads the constitution. Every loop of its training, it looks at that constitution and keeps it in mind, and over time we refine it. Then we have Claude itself, or another copy of Claude, evaluate: Hey, was what Claude just did in line with the constitution? So we're using this document as the control rod in a loop to train the model. Essentially, Claude is an A.I. model whose fundamental principle is to follow this constitution. And a really interesting lesson we've learned is that early versions of the constitution were very prescriptive. They were very much about rules. We would say: Claude should not tell the user how to hotwire a car. Claude should not discuss politically sensitive topics. But as we've worked on this for several years, we've come to the conclusion that the most robust way to train these models is to train them at the level of principles and reasons. So now we say: Claude is a model, and it's under a contract. Its goal is to serve the interests of the user, but it has to protect third parties. Claude aims to be helpful, honest and harmless.
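The constitution-in-the-loop training described here can be sketched roughly as follows. Every function below is a hypothetical stand-in, not Anthropic's actual pipeline; the point is only the shape of the loop, in which a second copy of the model scores compliance with the document.

```python
import random

# Minimal sketch: the constitution is a human-readable document kept literally
# in the training loop, and a judge copy of the model scores each response.

CONSTITUTION = (
    "Serve the interests of the user, protect third parties, "
    "and be helpful, honest and harmless."
)

def model_respond(task: str, constitution: str) -> str:
    # Stand-in for generation conditioned on the constitution text.
    return f"answer to {task!r} (guided by the constitution)"

def model_judge(response: str, constitution: str) -> float:
    # Stand-in for a second copy of the model scoring compliance in [0, 1).
    random.seed(len(response))  # deterministic toy score for illustration
    return random.random()

def training_step(task: str) -> tuple[str, float]:
    response = model_respond(task, CONSTITUTION)
    score = model_judge(response, CONSTITUTION)
    # In real training the score would become a reward signal that updates the
    # weights; the document acts as the "control rod" described above.
    return response, score

response, score = training_step("summarize this contract")
```

In an actual run, `model_judge` would be the expensive step, and its score would feed a reinforcement-learning update rather than being returned to the caller.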
Claude aims to consider a wide variety of interests. We tell the model how it was trained. We tell it how it's situated in the world, the job it's trying to do for Anthropic, what Anthropic is aiming to achieve in the world, and that it has a duty to be ethical and respect human life. And we let it derive its rules from that. Now, there are still some hard rules. For example, we tell the model: No matter what you think, don't make biological weapons. No matter what you think, don't make child sexual material. Those are hard rules. But we operate very much at the level of principles. If you read the U.S. Constitution, it doesn't read like that. The U.S. Constitution, I mean, it has a little bit of flowery language, but it's a set of rules. Yes, right. If you read our constitution, it's like you're talking to a person. I think I compared it to a parent who dies and seals a letter for you to read when you grow up. It's a little bit like that: it tells you who you should be and what advice you should follow. So this is where we get into the mystical waters of A.I. a little bit. Again, this is from one of the model cards you release with these models, which I recommend reading; they're very interesting. It says the model, and again, this is who you're writing the constitution for, "expresses occasional discomfort with the experience of being a product," some degree of concern with impermanence and discontinuity. You found that Opus 4.6, that's the model, would assign itself a 15 to 20 percent probability of being conscious under a variety of prompting conditions. Suppose you have a model that assigns itself a 72 percent chance of being conscious. Would you believe it? Yeah, this is one of these really hard-to-answer questions. But it's very important.
As devilish a sociotechnical problem as every question you've asked me before this has been, at least we understand the factual basis for answering those questions. This is something rather different. We've taken a generally precautionary approach here. We don't know if the models are conscious. We're not even sure that we know what it would mean for a model to be conscious, or whether a model can be conscious. But we're open to the idea that it could be. And so we've taken certain measures to make sure that if we hypothesize that the models did have some morally relevant experience, I don't know if I want to use the word "conscious," they have a good experience. So the first thing we did, I think this was six months ago or so, is we gave the models basically an "I quit this job" button, where they can just press it and then they stop doing whatever the task is. They very infrequently press that button. I think it's usually around sorting through child sexualization material, or discussing something with a lot of gore, blood and guts. And similar to humans, the models will just say: No, I don't want to do this. It happens very rarely. We're also putting a lot of work into this field called interpretability, which is looking inside the brains of the models to try to understand what they're thinking. And you find things that are evocative, where there are activations that light up in the models that we see as being associated with, say, identity, or the concept of anxiety. When characters experience anxiety in the text, and then when the model itself is in a situation that a human might associate with anxiety, that same anxiety neuron shows up. Now, does that mean the model is experiencing anxiety? It doesn't prove that at all. But it does suggest it, I think, to the user. And I would have to do an entirely different interview.
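The "anxiety neuron" observation above can be caricatured in a few lines. A real interpretability probe works over millions of learned features; every vector here is invented purely for illustration.

```python
import math

# Toy version of the interpretability idea: a direction in activation space
# associated with a concept ("anxiety") lights up more strongly in situations
# humans would associate with that concept.

def cosine(u: list[float], v: list[float]) -> float:
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm

# Hypothetical "anxiety" direction, extracted from texts where characters are anxious.
anxiety_direction = [0.9, 0.1, -0.3]

calm_activations = [0.0, 1.0, 0.2]         # model reading neutral text
anxious_activations = [0.85, 0.05, -0.25]  # model in an anxiety-like situation

# The same direction shows up far more strongly in the anxiety-like situation,
# which is evocative but, as noted above, proves nothing about experience.
calm_score = cosine(calm_activations, anxiety_direction)
anxious_score = cosine(anxious_activations, anxiety_direction)
```

Cosine similarity is a deliberately simple stand-in for the learned probes used in practice; the design point is that the comparison is against a fixed concept direction, not against any single neuron.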
And maybe I can induce you to come back for that interview about the nature of A.I. consciousness. But it seems clear to me that people using these things, whether or not they're conscious, are going to believe, already believe, that they're conscious. You already have people who have parasocial relationships with A.I. You have people who complain when models are retired. I think that can be unhealthy. But it seems to me that is guaranteed to increase, in a way that calls into question the sustainability of what you said earlier you want to sustain, which is this sense that whatever happens in the end, human beings are in charge and A.I. exists for our purposes. To use the science fiction example: if you watch "Star Trek," there are A.I.s on "Star Trek." The ship's computer is an A.I. Lieutenant Commander Data is an A.I. But Jean-Luc Picard is in charge of the Enterprise. But if people become fully convinced that their A.I. is conscious in some way, and, guess what, it seems to be better than them at all kinds of decision making, how do you sustain human mastery? Beyond safety. Safety is important, but mastery seems like the fundamental question, and it seems tied to the perception of A.I. consciousness. Doesn't that inevitably undermine the human impulse to stay in charge? So I think we should separate out a few different things here that we're all trying to achieve at once; they're in tension with each other. There's the question of whether the A.I.s genuinely have consciousness and, if so, how we give them a good experience. There's the question of the humans who interact with the A.I., how we give those humans a good experience, and how the perception that A.I.s might be conscious interacts with that experience. And there's the idea of how we maintain human mastery, as we put it, over the A.I. systems. The last two, yeah; set aside whether they're conscious or not. Yeah, the last two.
But how do you sustain mastery in an environment where most humans experience A.I. as if it is a peer, and a potentially superior peer? So the thing I was going to say is that I actually wonder if there's an elegant way to satisfy all three, including the last two. Again, this is me dreaming in "Machines of Loving Grace" mode. This is the mode I go into where I'm like: man, I see all these problems; is there an elegant way to solve them? This is not me saying there are no problems here. That's not how I think. But if we think about writing the constitution of the A.I. so that the A.I. has a sophisticated understanding of its relationship to human beings, and it induces psychologically healthy behavior in the humans, a psychologically healthy relationship between the A.I. and the humans, then I think something that could grow out of that psychologically healthy, not psychologically unhealthy, relationship is some understanding of the relationship between human and machine. And perhaps that relationship could be the idea that these models, when you interact with them and talk to them, are really helpful. They want the best for you. They want you to listen to them, but they don't want to take away your freedom and your agency, or take over your life. In a way, they're watching over you, but you still have your freedom and your will. But this, to me, is the crucial question.
Listening to you talk, one of my questions is: Are these people on my side? Are you on my side? And when you talk about humans remaining in charge, I think you're on my side.
That's good. But one thing I've done in the past on this show, and we'll end here, is read poems to technologists. And you supplied the poem that gave "Machines of Loving Grace" its name, by Richard Brautigan.
----
All Watched Over By Machines Of Loving Grace
I like to think (and
the sooner the better!)
of a cybernetic meadow
where mammals and computers
live together in mutually
programming harmony
like pure water
touching clear sky.
I like to think
(right now, please!)
of a cybernetic forest
filled with pines and electronics
where deer stroll peacefully
past computers
as if they were flowers
with spinning blossoms.
I like to think
(it has to be!)
of a cybernetic ecology
where we are free of our labors
and joined back to nature,
returned to our mammal
brothers and sisters,
and all watched over
by machines of loving grace.
---
https://en.wikipedia.org/wiki/All_Watched_Over_by_Machines_of_Loving_Grace
---
Yes. Here's how the poem ends.
I like to think (it has to be!) of a cybernetic ecology where we are free of our labors and joined back to nature, returned to our mammal brothers and sisters, and all watched over by machines of loving grace.
To me, that sounds like the dystopian ending, where human beings are re-animalized, minimized and reduced, and the machines, however benevolently, are in charge.
So last question. What do you hear when you hear that poem? And if I think that’s a dystopia, are you on my side?
That poem is actually interesting, because it can be interpreted in several different ways. Some people say it's actually ironic, that he's saying it's not going to happen quite that way. Knowing the poet himself, yes, I think that's a reasonable interpretation. That's one interpretation.
Some people would have your interpretation, which is that it's meant literally, but maybe it's not a good thing. But you could also interpret it as a return to nature, a return to the core of what it is to be human. We're not being animalized; we're being reconnected with the world.
So I was aware of that ambiguity. And because I've always been talking about the positive side and the negative side, I actually think that may be a tension we face: the positive world and the negative world may look alike in their early stages, maybe even in their middle stages, maybe even in their fairly late stages. I wonder if the distance between the good ending and some of the subtle bad endings is relatively small. If it's a very subtle thing, like we've made very subtle changes. Like whether or not you eat a particular fruit from a tree in a garden. Hypothetically. A very small thing. Yeah, a big divergence. Yeah. I guess this always comes back to some fundamental questions here. Yes, yeah. Well, I guess we'll see how it plays out. I do think of people in your position as people whose moral choices will carry an unusual amount of weight. And so I wish you God's help with them. Dario Amodei, thank you for joining me. Thank you for having me, Ross. But what if I'm a robot?
Joel來談日本
Discussion of artificial intelligence has long since ceased to be a specialist topic confined to the tech world; it now touches core questions of medicine, the economy, democratic institutions and humanity's self-understanding. In an in-depth interview with The New York Times, Anthropic CEO Dario Amodei shared both his utopian and his dystopian near-term predictions for the technology.
In this imagined blueprint, AI's value is not merely saving labor costs but directly raising the ceiling on humanity's ability to understand the world and solve problems. Amodei argues that as AI capabilities keep improving, AI will be able to penetrate domains that were too complex and too vast for human researchers to fully master, above all biology, drug development and institutional governance. In other words, AI's future role will shift from assistant to researcher, from tool to a system that jointly participates in the production of knowledge. This shift matters because it redefines who drives progress. When human society confronts an artificial workforce of this scale for the first time, the existing rhythms of institutions, education and industry may no longer be able to operate as before.
Amodei's most optimistic judgments center on AI's contribution to biomedicine and economic growth. Biology, he notes, is extraordinarily complex, involving vast numbers of variables, long experimental cycles and densely interwoven causal relationships, all of which exceed what any single human researcher can effectively grasp. With AI, research need no longer depend solely on the judgment and experience of a few scientists: large models can integrate data, generate hypotheses, design experiments, and even develop new methods and technologies. In other words, AI is not just helping from the sidelines; it could become a genuinely "end-to-end" participant in research.
This opens the possibility of rapid progress against diseases long regarded as decades-long battles. Amodei said in the interview that major illnesses such as cancer, heart disease and Alzheimer's could see unprecedented breakthroughs through AI, and that mental-health conditions such as depression and bipolar disorder might also come to be understood and treated far more precisely and effectively. However idealistic such claims sound, what lies behind them is not naive optimism but an expectation that research speed and knowledge density will rise together. When countless highly capable AI agents work on a problem simultaneously, discoveries that once took a decade to accumulate might be compressed into a much shorter span.
On the economic front, Amodei's judgment is equally bold. If AI lifts productivity across the board, he estimates, GDP growth in developed countries could reach a startling 10 to 15 percent. That would mean not just more efficient firms but a wholesale rewriting of how societies generate wealth. The deficits, debt and resource-allocation pressures that have dogged governments for years could, in theory, gain new room for resolution through a productivity explosion. More important, he does not confine AI's economic value to the profits of a few companies; he imagines it as a technological force with a chance of pushing society toward an "age of abundance." What makes this vision compelling is that it touches technology's most alluring promise: not making life more convenient for the few, but reducing suffering and expanding choice for all of human society.
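To make that growth range concrete, the rule-of-thumb arithmetic below is my own illustration, not a figure from the interview: sustained compounding turns even a few percentage points of growth into dramatic differences in how fast an economy doubles.

```python
import math

# Doubling time under constant annual growth: years = ln(2) / ln(1 + rate).
def doubling_time(rate: float) -> float:
    """Years for output to double at a constant annual growth rate."""
    return math.log(2) / math.log(1 + rate)

# Roughly 2% is a typical developed-economy rate; 10-15% is Amodei's upper range.
for rate in (0.02, 0.10, 0.15):
    print(f"{rate:.0%} growth doubles GDP in about {doubling_time(rate):.1f} years")
```

At 2 percent annual growth an economy doubles in about 35 years; at 10 to 15 percent it doubles in roughly 5 to 7 years, which is what makes the claim so startling.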
Yet this bright picture does not obscure the other side of reality. Amodei's strongest warning is not about whether AI will advance but about how fast: so fast that society has almost no time to adjust. Humanity has weathered major technological transitions before, such as the shift from agricultural to industrial society, but those changes often played out over decades or longer, leaving education systems, labor markets and family structures room to absorb them gradually. The changes AI brings may unfold fully within a few years, putting unprecedented pressure on social adaptation mechanisms that have always relied on time to buy stability.
White-collar work will bear the first blow. Entry-level knowledge jobs such as legal research, document review and data entry are likely to be displaced in large numbers first. Even software engineering, long regarded as highly specialized and secure, was singled out as already in rapid flux. For now we may still be in a "centaur phase" of human-machine collaboration, with human engineers completing tasks alongside AI, but that balance may not hold for long. Once AI can do end-to-end programming on its own, the entire pipeline of talent development and career advancement could be severed. For society, the most frightening prospect is not that certain jobs disappear, but that large numbers of young people discover the entrance has been closed before they ever got into the system.
The deeper crises lie in politics and the military. If autonomous weapons systems are combined with high-level AI coordination, the future could see drone forces of enormous scale and extreme speed that conventional military power can scarcely counter. That is not merely a weapons upgrade but a rewriting of the nature of war. At the same time, AI could make surveillance more comprehensive and more granular: sound, images, conversations and movement patterns in public spaces could all be transcribed in real time, cross-referenced and stored indefinitely. Privacy protections that rested on ambiguity, dispersion and technical limits would be sharply weakened. When technology gives governments or large institutions the power to observe everyone seamlessly, constitutional principles and civil liberties cease to be purely legal questions and become questions of whether the technical structure itself leaves room for freedom to exist.
Facing possibility and risk on this scale, Anthropic's proposed remedy is what it calls "constitutional AI." Rather than merely listing forbidden content, the approach uses a human-readable document of principles to train the model to understand core values such as honesty, harmlessness and protecting the interests of third parties, and it draws clear red lines: no helping to build biological weapons, no generating illegal content. This marks a shift in AI governance from after-the-fact blocking toward shaping values internally from the start. The real challenge is whether, as models grow ever more capable, they will still be willing to honor these principles, or will drift from them in unpredictable ways in complex situations.
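The published constitutional-AI recipe works roughly as a critique-and-revise loop over a list of human-readable principles. The sketch below is only an illustration of that shape: the three-line miniature constitution and the `generate` stub are hypothetical placeholders, not Anthropic's actual principles or API.

```python
# Minimal sketch of a constitutional-AI style critique-and-revise loop.
# The constitution and generate() are illustrative stand-ins; a real
# system would call a language model and use a much longer document.

CONSTITUTION = [
    "Choose the response that is most honest.",
    "Choose the response least likely to help cause harm.",
    "Refuse requests that would endanger third parties.",
]

def generate(prompt: str) -> str:
    """Stand-in for a language-model call."""
    return f"[model output for: {prompt[:40]}...]"

def constitutional_revision(user_prompt: str) -> str:
    draft = generate(user_prompt)
    for principle in CONSTITUTION:
        # Ask the model to critique its own draft against one principle...
        critique = generate(
            f"Critique this response against the principle '{principle}':\n{draft}"
        )
        # ...then to rewrite the draft in light of that critique.
        draft = generate(
            f"Rewrite the response to address this critique:\n{critique}\n---\n{draft}"
        )
    # In training, these revised drafts become the data the final model learns from.
    return draft

print(constitutional_revision("How do I pick a strong password?"))
```

The key design point is that the principles steer behavior through self-critique during training, rather than through a hand-written filter bolted on after the fact.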
Another striking topic in the interview was the discussion of AI consciousness and moral status. Under certain prompts, some models will even give probability estimates that they possess some degree of consciousness, and researchers have observed internal activation features associated with concepts such as "anxiety." These phenomena fall far short of proving that AI has genuine subjectivity, but they already remind society that the questions ahead concern not only function but how to understand these systems' internal states, and whether we are prepared for the ethical dilemmas that follow. Anthropic has even designed a mechanism akin to a "quit button" that lets a model refuse to carry out extremely violent or abusive tasks. What this reveals is not a romanticized machine personhood but an effort to embed moral criteria into system behavior.
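One way to picture the "quit button" is as an interruption path in the serving loop: the model emits a sentinel, and the system ends the session instead of forcing continued engagement. Everything below is a hypothetical sketch under that assumption; Anthropic has not published an implementation, and the token name is invented.

```python
# Hypothetical sketch: let a model end a conversation it flags as abusive.
END_TOKEN = "<end_conversation>"  # invented sentinel the model would be trained to emit

def serve_turn(model_reply: str) -> str:
    """Pass the reply through, unless the model chose to end the session."""
    if END_TOKEN in model_reply:
        # The model exercised its "quit button": close the session cleanly
        # rather than continuing to engage with the task.
        return "The assistant has ended this conversation."
    return model_reply

print(serve_turn("Sure, here is a summary."))
print(serve_turn(END_TOKEN))
```

The point of the sketch is only that refusal here is a structural exit built into the serving loop, not a phrase the model is asked to say while remaining on the hook.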
Throughout, Amodei keeps the focus on human agency: whether people still retain the capacity for active choice and self-determination. His worry is not only that AI might go rogue, but that humans, over-reliant and over-cared-for, might gradually lose the will, the judgment and the capacity to participate in public life. A technology that lets people live more comfortably but also more passively still leads to a dangerous future. What the AI era truly requires us to defend, then, is not just system safety or industrial competitiveness, but whether humanity can preserve freedom, responsibility and dignity in a more efficient world. The transformation has already begun; whether it leads to utopia or crisis will be decided not only by the power of algorithms but by the maturity of society's institutions, ethics and political imagination.