Apple's AI Strategy: Apple Datacenters, On-device, Cloud, And More
When to run on-device, in Apple datacenters, or in the cloud with OpenAI; plus deal economics
Dylan Patel
Nvidia continues to ramp their production to service the world's insatiable demand for GPUs, and yet our Accelerator Model's extensive checks show Apple's purchases of GPUs are quite minuscule. In fact, they aren't even a top-10 customer. Furthermore, while all eyes are on WWDC, Apple is only announcing AI there, not shipping it. The question on everyone's mind is... what the heck is Apple doing in AI?
Mark Gurman laid out the features Apple is announcing at WWDC. Furthermore, there are a variety of rumors floating around from others, so let's get to the bottom of what's really happening, how, and what Apple can do.
The first thing is that multiple sources have reported that Apple is ramping up production of its M-series processors this year to record volumes. This is primarily Apple's M2 Ultra SKU, which is two M2 Max SoCs stitched together with what Apple calls "UltraFusion." Note that Apple's M3 Ultra was cancelled.
Ultrafusion is Apple’s marketing name for using a local silicon interconnect (bridge die) to connect the two M2 Max chips in a package. The two chips are exposed as a single chip to many layers of software. M2 Ultra utilizes TSMC’s InFO-LsI packaging technology. This is a similar concept as TSMC’s CoWoS-L that is being adopted by Nvidia’s Blackwell and future accelerators down the road to make large chips. The only major difference between Apple and Nvidia’s approaches are that InFO is chip-first vs CoWoS-L is chip-last process flow, and that they are using different types of memory.
What’s curious about Apple’s production increases is that there is nothing on the demand front to support a sudden increase in M2 Ultra shipments. M2 Ultra is only used in high-end Mac Studio and Mac Pro products. There has been no meaningful refresh of these products in a year, and there are no plans for one any time soon. Furthermore, no new products are shipping with it either.
Demand for high-end desktop PCs and Macs remains quite sluggish compared to peak-COVID demand, yet 2024 production of the chips that would purportedly power these high-end Macs will be significantly higher than in the last few years, even though there is nothing to suggest consumer demand will soak up all these units.
How do we reconcile this?
This additional production of M2 Ultras is consistent with recent reporting from the WSJ and Bloomberg about Apple using their own silicon in their own datacenters for serving AI to Apple users.
Furthermore, Apple has extensive expansion plans for their own datacenter infrastructure. We are tracking 7 different datacenter sites with over 30 buildings for Apple as well as their planned buildout. Their total capacity is doubling in a relatively short period of time.
Above is Apple’s soon to be largest datacenter site. Currently they only have 1 datacenter there, but many will be coming up next year. Our datacenter model has more details on the upcoming Apple datacenters.
The other indication that Cupertino is serious about their AI hardware and infrastructure strategy is that they made a number of major hires a few months ago. This includes Sumit Gupta, who joined to lead cloud infrastructure at Apple in March. He's an impressive hire: he was at Nvidia from 2007 to 2015 and was involved in the beginning of Nvidia's foray into accelerated computing. After working on AI at IBM, he joined Google's AI infrastructure team in 2021 and eventually became the product manager for all Google infrastructure, including the Google TPU and Arm-based datacenter CPUs.
He's been heavily involved in AI hardware at Nvidia and Google, which are both the best in the business and the only companies deploying AI infrastructure at scale today. This is the perfect hire.
Given this context, let's look at what Apple is doing with their current and future in-house chips and external chips. We will look into Apple's ongoing deal with Nvidia. We will also discuss what Apple can run on-device, what runs in their cloud, and when they have to go to an external AI service provider. The economics of the deal differ from the $20B Google search deal, but not in the way you'd think. We will also discuss how Apple can offer this to customers and grow revenue.
To begin with: the M2 Ultra for AI servers is a bad idea.
There is a common narrative that Apple's M-series chips have good AI performance. This is only true in the context of other client chips for on-device AI. The competition there uses significantly worse memory architectures in their laptops and desktops: existing Intel, AMD, and Qualcomm based laptops all have a 128-bit memory bus, while Apple offers much wider memory bus widths than can be had on other vendors' CPUs.
As such, Apple's memory bandwidth crushes the traditional CPU vendor competition. Other laptops can come with Nvidia GPUs that have memory bandwidth comparable to Apple's, but Nvidia goes with a lower-cost GDDR6-based memory architecture versus Apple's high-cost LPDDR architecture, which requires far more chip shoreline area.
The side effect of this is that Nvidia's client GPUs are capped at quite small memory sizes, meaning these gaming GPUs cannot hold in memory the capable models that Apple can, such as LLAMA 3 70B.
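The capacity argument is easy to sanity-check with weights-only arithmetic. The sketch below counts only model weights (ignoring KV cache and activations), and the 24 GB figure stands in for a typical high-end client gaming GPU rather than any specific SKU:

```python
# Rough check: can LLAMA 3 70B's weights fit in a given memory pool?
# Weights-only estimate; real serving also needs KV cache and activations.

def weight_footprint_gb(params_billions: float, bytes_per_param: float) -> float:
    """Approximate weight memory in GB (1 GB = 1e9 bytes)."""
    return params_billions * bytes_per_param

# LLAMA 3 70B at common precisions
for bytes_per_param, label in [(2.0, "FP16/BF16"), (1.0, "INT8"), (0.5, "INT4")]:
    gb = weight_footprint_gb(70, bytes_per_param)
    fits_gaming_gpu = gb <= 24    # typical client gaming GPU memory
    fits_m2_ultra = gb <= 192     # M2 Ultra's maximum unified memory
    print(f"{label}: ~{gb:.0f} GB -> fits 24GB GPU: {fits_gaming_gpu}, "
          f"fits 192GB M2 Ultra: {fits_m2_ultra}")
```

Even at aggressive 4-bit quantization, 70B parameters (~35 GB) exceed a 24 GB client GPU, while the M2 Ultra's 192 GB of unified memory holds the model comfortably at full FP16.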
This does not extend to cloud-based AI performance. While on-device is primarily concerned with whether a model can even be served, the cloud cares about economics. Here, while raw bandwidth and capacity are important, the number of FLOPS begins to matter much more because many users are served concurrently via batching. High batch sizes can reduce the cost of inference (tokenomics) by more than 10x.
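The batching arithmetic can be sketched with round, hypothetical numbers (the chip figures and $/hour rate below are invented for illustration, not measured): in the bandwidth-bound decode phase, each step streams the full weights once no matter how many users are batched, so cost per token falls roughly linearly with batch size — until the chip runs out of FLOPS, which is exactly where a FLOPS-poor chip hits its ceiling:

```python
# Illustrative decode-phase tokenomics under a bandwidth-bound assumption.
# Each decode step must stream the model weights once regardless of batch
# size, so serving B users per step amortizes that fixed bandwidth cost.
# (This sketch ignores the FLOPS ceiling that eventually caps batch size.)

def cost_per_token(weight_bytes: float, bandwidth_bytes_s: float,
                   batch_size: int, chip_cost_per_s: float) -> float:
    """Dollar cost per generated token when decode is bandwidth-bound."""
    step_time_s = weight_bytes / bandwidth_bytes_s  # one decode step, any batch
    tokens_per_step = batch_size                    # one token per user per step
    return chip_cost_per_s * step_time_s / tokens_per_step

# Hypothetical chip: 140 GB of FP16 weights, 3 TB/s memory, $2/hour amortized
weights, bw, dollars_per_s = 140e9, 3e12, 2 / 3600
for b in (1, 8, 64):
    print(f"batch {b:>2}: ${cost_per_token(weights, bw, b, dollars_per_s):.2e}/token")
```

Going from batch 1 to batch 64 cuts cost per token by 64x in this idealized model, consistent with the article's "more than 10x" claim; a chip without the FLOPS to sustain large batches never captures that amortization.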
The reality is the M2 Ultra is the best house in a bad neighborhood, and it cannot compare to datacenter GPUs. While it is behind the competition in memory bandwidth, the more important gap is the significantly lower FLOPS, and therefore fewer concurrent users.
Here Apple has a minuscule number of FLOPS in their GPU. Thankfully they also have the Neural Engine. One strategy we have seen for running LLMs on Apple devices is running the multi-layer perceptron on the Neural Engine and the attention mechanism on the GPU. Note there is a fabric bandwidth issue here, so the total will not come anywhere close to the sum of the two units' FLOPS.
Regardless, even if you could magically add up the GPU and Neural Engine, you are still ~35x to ~85x off what datacenter GPUs can do. This means the capability to achieve high batch sizes is limited, and the number of users served per chip is dramatically reduced. The M2 Ultra would be lucky to serve 4-6 users per chip for LLAMA 70B, whereas batch sizes of 64+ are achieved regularly on GPUs.
This is just a per-unit comparison, which does not include one of the most important variables: cost. Apple can get the M2 Ultra without paying the hefty margin a merchant silicon or custom design partner would take. Apple's costs are in the ~$2,000 range for two M2 Max dies, the InFO-LSI packaging, and 192GB of LPDDR. For comparison, an H100 is 10x the cost.
Given the 10x cost difference and the more-than-10x difference in performance, Apple is hard pressed to make the M2 Ultra cost effective even for a LLAMA-3 70B class model.
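The break-even arithmetic is simple enough to write down, taking the article's ~$2,000 and 10x figures at face value (both are rough estimates, not audited numbers):

```python
# Cost per unit of serving throughput, normalizing M2 Ultra performance to 1.0.
# At exactly 10x cost and 10x performance the two chips break even; any
# performance gap beyond 10x tips the economics toward the datacenter GPU.

def cost_per_perf(unit_cost_usd: float, relative_perf: float) -> float:
    """Dollars spent per unit of relative serving performance."""
    return unit_cost_usd / relative_perf

m2_ultra = cost_per_perf(2_000, 1.0)    # ~$2,000 BOM, baseline performance
h100 = cost_per_perf(20_000, 10.0)      # ~10x the cost, assumed 10x the perf
print(m2_ultra, h100)
```

Since the article argues the real performance gap is well beyond 10x once batching is considered, the H100's cost-per-performance comes out ahead despite its far higher sticker price.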
Furthermore, this comparison doesn't hold when models scale beyond a single chip. Compute does not simply scale linearly, and M-series SoCs in particular are not designed to scale like this. The only chip-to-chip interconnect is the UltraFusion bridge, used to fuse two M2 Max dies into one M2 Ultra. This is nothing like Nvidia's NVLink with high-speed SerDes for chip-to-chip scale-up. We have spoken poorly of other firms' chip-to-chip IO, such as Amazon's Trainium 2 and the MI300X, but these are space age compared to what Apple has.
While Apple can create a decent amount of aggregate compute per dollar, not so far off from just buying Nvidia GPUs, getting the FLOPS to work effectively as a single cluster for training will be impossible, and inference will be relegated to roughly LLAMA-3-sized models at human speech speeds.
Furthermore, Apple will never be able to run models with hundreds of billions of parameters on M2 Ultras.
Why is Apple doing something so plainly suboptimal? Apple's AI team must already realize this. Why not take the default option and invest more in Nvidia Hoppers and Blackwells like everyone else? Or even use more TPUs; after all, one of the few things Apple has publicly shown from their Foundational Models team is the AXLearn framework, which builds on top of Google's Jax and XLA.
Business decisions are supposed to be rational in theory, but in practice they are often made by individuals who can have biases, or even worse, grudges. Apple has certainly held grudges against one of their most important suppliers: Qualcomm. Apple just has no readily available alternative to Qualcomm modem chipsets. They are trying to develop their own modem but do not seem able to overcome the technical and legal hurdles until 2027 at minimum. Otherwise, Qualcomm would already be eliminated from the iPhone's BOM.
This leads us to an older Apple grudge people might not be familiar with: Nvidia. These days, everyone is in awe of Nvidia’s flawless engineering execution. There’s recency bias afoot, as Nvidia has made their fair share of major engineering mistakes in the past. One of the biggest was the “bumpgate” fiasco which happened in 2006-2009. A quick refresher: Nvidia’s entire 55nm and 65nm line of GPUs during that time had extremely high premature failure rates of 40% due to high thermals and poor package design. The bumps between the chip and package substrate were prone to cracking due to stress, leading to unacceptable failure rates.
This applied to chips in the GeForce 6000, 7000, 8000, and 9000 series as well as various mobile chipsets. Laptops shipped by Apple, Dell, and HP that contained Nvidia chipsets were all affected. What was worse was Nvidia's handling of the situation. After Nvidia initially refused to take responsibility, Apple, Dell, and HP brought a class action lawsuit against Nvidia, which Nvidia eventually settled, agreeing to replace the faulty GPUs that had shipped.
This severely damaged the Apple and Nvidia relationship, with Nvidia no longer being designed into any Apple socket after that. Apple quite literally went with the much worse power/performance AMD GPUs at the time and even came up with a custom GPU with AMD that used HBM in laptops.
This is the historical baggage with Nvidia that could be giving Apple pause to rely on Nvidia again.
If the only use case for Apple were serving models in chatbot-style applications, Apple would be acting irrationally. In reality, Apple is doing much more than just serving their users chatbots or personal assistants.
Apple's goal here is to integrate all their data and services together with AI. This means their chip needs to run the full iOS / MacOS stack. Users will have near digital twins of their operating system, applications, and data on the device and in Apple's cloud. This requires not only AI compute performance, but also all of Apple's sauce around their CPU cores and silicon-to-software stack.
Features reported by Mark Gurman, such as transcribing voice memos, retouching photos with AI, and making Spotlight searches faster and more reliable, can be done on the iPhone today. Furthermore, Apple plans to launch features such as suggested replies to emails and text messages, which can be done with tiny models on device or smaller models on M2 Ultra in the cloud. Siri will likely have to run in the cloud for the model to be capable enough and to stream to the Apple Watch. GenAI emojis are also reported to be a feature, but those can be trivially done on device.
Another major feature Apple wants to launch is smart recaps: summaries of missed notifications, text messages, web pages, news articles, documents, notes, and other forms of media. This also doesn't require a frontier model, and as such can run on M2 Ultra in the cloud, with some of it on device. To be clear, none of these features are groundbreaking; most are already in Google and Meta apps and various Android phones.
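The resulting split could be sketched as a toy dispatcher. The feature-to-tier mapping below is a hypothetical reading of the reporting above, not Apple's actual design:

```python
# Toy routing table for where each reported feature might be served.
# Tiers: on-device small models, Apple's own M2 Ultra datacenters, or an
# external frontier-model provider. Mapping is illustrative only.

ON_DEVICE = "on-device"
APPLE_CLOUD = "Apple datacenter (M2 Ultra)"
EXTERNAL = "external provider"

ROUTES = {
    "genai_emoji": ON_DEVICE,              # trivially done on device
    "voice_memo_transcription": ON_DEVICE,
    "photo_retouch": ON_DEVICE,
    "spotlight_search": ON_DEVICE,
    "suggested_replies": ON_DEVICE,        # tiny models suffice
    "smart_recap": APPLE_CLOUD,            # no frontier model required
    "siri": APPLE_CLOUD,                   # needs a bigger model, streams to Watch
    "frontier_chatbot": EXTERNAL,          # e.g. the reported OpenAI deal
}

def route(feature: str) -> str:
    """Return the serving tier for a feature (default: keep data on device)."""
    return ROUTES.get(feature, ON_DEVICE)
```

Defaulting unknown features to on-device mirrors the privacy pitch: data leaves the device only when the model genuinely cannot run locally.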
Apple’s big pitch here is they run your data securely in their datacenters, and don’t kick your sensitive data off to third party clouds.
M2 Ultras are just the short-term solution while Apple develops something better. The M3 Ultra was cancelled, and we do not yet see the M4 Ultra going into production, so that may be a dead generation as well.
The Neural Engine is not currently optimized for some of the calculations required by LLMs. Furthermore, fabric bandwidth to the Neural Engine is severely starved, and as such it would need a major rework to be capable of running language models.
Apple will not be going to a custom silicon provider for help with their AI chip. We could see them licensing high speed SerDes and making their own datacenter focused chip, but that’s still years away and still in whiteboarding territory.
As such, this year and next year we will still see Apple using their beefed-up laptop and desktop Apple silicon.
Apple has shown some model efforts, but certainly nothing that comes close to GPT-4, Gemini, or Claude. Apple needs to offer frontier models. At the end of the day, they simply haven't spun up the compute power or talent to train their own frontier model. However, they need to be able to serve AI to their userbase, and one that is tailored to Apple's values and ethos.
Yes, iPhone users can download ChatGPT from the AppStore, and that's good enough today. But that will not significantly change the smartphone market share equation. It would be very out of character for vertically integrated Apple to not have their own Apple-branded take on the next consumer internet paradigm.
Gurman has reported that Apple has struck a deal with OpenAI. He has also said they are looking to strike a deal with Google. He has also reported that there are discussions with Anthropic. This would involve an Apple wrapper around the service on Apple’s devices, with a new system prompt that is consistent with the Apple image and brand.
There is another business model consideration: how this affects Apple's big search revenue. Google pays Apple $20 billion a year to be the default search engine on Apple devices. This is unbelievably valuable to Google, as they make that payment back multiple times over through the ads they serve against the search queries Apple users make.
If users are going to use ChatGPT, LLAMA, or Claude for answers instead of traditional search, that will cannibalize the Google search revenue that Apple currently gets a cut of. To be fair, Apple still participates in users' premium subscription fees for GenAI services if the transaction is completed via in-app purchase, but in the future publishers may force users to subscribe outside the app (like Netflix does) to avoid App Store fees. Furthermore, there is extreme regulatory interest in both the Google search exclusivity deal and the App Store revenue cut. Apple needs another path.
It is better strategically for Apple to have some control instead of just being the dumb hardware through which users access AI. However, to monetize effectively and recoup the search revenue lost in Google's case, Apple or their partners will need to resort to ads.
While the cost to serve the model will be higher, conversion is also much higher for GenAI ads. GenAI-based search supported by ads is a viable business model. The problem is that Apple won't allow these to be served in their personal assistant, given the zero control and the massive data privacy issues.
As such, there won't be an associated revenue stream for the provider to offset a service being offered for free. If that's the case, then Apple is in a pickle. The service has to be paid for so that the provider makes money and Apple takes a cut.
A purely paid subscription to OpenAI or Google with revenue share is also not an option, as that would relinquish too much control. Instead, Apple can juice short-term and long-term revenue with these features. They can push it by offering it free to all purchasers of the new Pro series iPhone for a period of time. Furthermore, they can offer it as part of a more expensive Apple One subscription. On the back end, Apple can kick some money to OpenAI or Google in the form of usage-based pricing that is priced lower than the public API due to the volume deal.
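A hedged sketch of that arrangement's arithmetic follows; every number below is invented purely for illustration, as neither the subscription price nor the discounted provider rate has been reported:

```python
# Hypothetical flow: Apple charges users via a bundle, then pays the
# provider a volume-discounted usage-based rate instead of public API
# prices. All figures are illustrative placeholders.

def apple_margin(subscribers: int, monthly_fee: float,
                 tokens_per_user: float, public_api_rate: float,
                 volume_discount: float) -> float:
    """Monthly gross margin for Apple under a usage-priced provider deal."""
    revenue = subscribers * monthly_fee
    provider_cost = (subscribers * tokens_per_user
                     * public_api_rate * (1 - volume_discount))
    return revenue - provider_cost

# e.g. 10M subscribers at $5/month, 100k tokens per user per month,
# $10 per 1M tokens public rate, 50% volume discount
print(apple_margin(10_000_000, 5.0, 100_000, 10 / 1_000_000, 0.5))
```

The structural point is that Apple's margin depends on the gap between a flat subscription and a discounted per-token cost, which is why a volume deal priced below the public API matters so much to the economics.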
When this launches, Apple will likely see a huge flux of customers trying it out, and that will put a lot of pressure on AI infrastructure. OpenAI and Google have to build out to be ready for that massive influx of traffic. There will be a huge spike; the open question is retention, but regardless, the AI hardware must be in place for the spike in usage.