Mohamed Awad, Arm, and Mark Lohmeyer, Google | Google Cloud Next ’24
[Savannah Peterson]
Introduction
Good afternoon, cloud community, and welcome back to fabulous Las Vegas. We’re midway through day two of three days here at Google Cloud Next. My name’s Savannah Peterson, joined by CUBE founder and fabulous host, John Furrier.
Google Cloud Next ’24
John, what a great day for us.
[John Furrier]
Great day, and we have a great set of guests coming up to break down the biggest news of the show, I think. They’ve got a lot of headlines.
[Savannah Peterson]
I think you’re not the only person.
[John Furrier]
Custom Silicon Axion
Certainly the custom silicon, Axion, from Google. Real signal to the market that more horsepower is coming, and the innovation for the developers. So it should be a great segment.
[Savannah Peterson]
Yeah, it’s going to be an absolutely great segment. On that note, Mohamed and Mark, thank you both so much for being here.
[Mark Lohmeyer]
Thank you.
[Savannah Peterson]
I can imagine this is one of your bigger weeks of the year. How has the reception been?
[Mohamed Awad]
Super, super exciting.
[Mark Lohmeyer]
Yeah, no, I mean, look, we’re really excited about this announcement. I think the customer reaction, the partner reaction, and the ecosystem reaction has been incredibly positive. So we’re really excited to talk more about this with you today.
[00:01:01]
[Savannah Peterson]
Yeah, all right. So let’s dig in. Just in case folks haven’t read the news from yesterday, what did we announce? Mohamed, go for it.
[Mohamed Awad]
Yeah, I mean, Google announced they’ve got their own custom Arm-based processor that they built. So it’s super cool. First one, right? First one out the gate. It’s Neoverse V2-based. And we’re super excited to have partnered with them in making it happen. I mean, Mark and his team have been fantastic through the whole process. So super, super excited to get it out there.
[Savannah Peterson]
Mark, how did you know that this was the next development for Google?
[Mark Lohmeyer]
Well, you know, it’s interesting you ask. Google has a rich history of custom silicon and systems development for specific workloads. You know, five generations of TPUs, three generations of video coding units, multiple generations of processors that go into Pixel phones. And so we were really excited to apply that sort of engineering prowess to, in many ways, a bigger space and a bigger problem, which is general purpose data center computing, right? And so that’s why we’re super excited about Axion.
[00:02:00]
And we think it can solve a whole new set of challenges for our customers.
[John Furrier]
Data Center 2.0
So when TK, Thomas Kurian, was announcing this in the keynote, I’m like, okay, Arm’s involved. We talked last year, kind of hinted a little bit in this direction. But I like what he said. He said, quote, Axion processors, Arm-based CPU, thank you very much, Arm, designed for the data center. And I’m like, hmm, data center. We’re back to the data center. So, you know, cloud obviously had a success, Google Cloud. But the role of the data center is changing significantly. What’s the motivation behind some of these design criteria with the new custom processor? Again, general purpose for the enterprise, I get that.
AI Systems and Data Centers
But how is the enterprise changing? Because we’re seeing AI systems emerge. So it’s not your yesterday’s data center. It’s a different data center. Of course, Google is in the data center business. You have data centers. So can you guys share what this means for the data center redo or 2.0 or whatever version we’re on? What’s the new data center look like?
[Mohamed Awad]
Yeah, I mean, it’s super interesting. And I think, you know, the reality is that the data center is trying to deal with all these issues around performance, around sustainability, around efficiency.
[00:03:09]
These data centers are built. There’s a certain amount of power that’s brought into them. And you’re trying to cram as much performance and as much efficiency out of them as possible so you can kind of keep up with, you know, AI and all these new workloads. And what’s interesting about what Google has done is by, you know, building silicon from the ground up, and in this case, general purpose compute, what they’ve been able to do is start to optimize what that infrastructure looks like so you can get the most efficiency out of it. You can get the most performance out of it. And, you know, Arm, for our part, I mean, we’re a 30-year-old company, right? And we’ve been building efficient CPUs for a long, long time. And so, you know, this is kind of right in our sweet spot in terms of enabling them to kind of go off and do that.
[John Furrier]
Energy Efficiency
And not to put a plug in for Arm, but, I mean, we’ve been tracking this for over a decade. The energy aspect of what you guys have done has also been a big, notable thing. A big part of the constraint now is power and efficiency, and getting the most performance with the least amount of energy exerted.
[00:04:07]
[Mark Lohmeyer]
Workload-Optimized Infrastructure
Absolutely. So, you know, building on what Mohamed said, at Google we really design these systems based on a strategy we call workload-optimized infrastructure. And basically what that means is we look at the needs of each and every workload and then we design it at a systems level across compute, storage, and networking to meet the unique needs of each and every workload. And to your earlier point, we’re seeing tremendous growth in customer interest in scaling this next generation of applications, many of which are powered by AI. Now, when you do that, in most cases, those applications also need to have general purpose compute that goes with them, right, to serve all the other aspects of those applications. And so performance becomes really critical, and energy efficiency and cost become really critical. And so we’re really, I think, breaking new ground here together from that perspective. The Axion processors will have 30% higher performance than the fastest Arm processors available in the cloud today.
[00:05:00]
They’ll have 50% higher performance than comparable current-generation x86 processors and 60% better energy efficiency than comparable x86-based instances. So these benefits are huge if you think about our customers looking to scale their workloads.
[John Furrier]
Roadmap for Processors
On the nanometer side, what’s the roadmap look like for the processors?
[Mark Lohmeyer]
Yeah, so we’re not getting into the specifics of nanometers with a chip, et cetera. It’s more about the benefits that we deliver to our customers here. But the thing I will add is that that systems level point is really important, right? And so we take these fantastic Axion processors, but then we combine them with other elements of Google Cloud. For example, we have our Titanium offload system, and so we’re able to offload some of the infrastructure operations to those Titanium chips and systems, and that enables the full capabilities of the Axion processor to be used for those workloads. And so it’s these types of things that really ultimately deliver the outcome.
[John Furrier]
System Design Architecture
So you guys are essentially doing some system design architecture around the role of compute for, say, because you’ve got HPC and AI have come together, right?
[00:06:01]
So we’ve seen that with TPUs and TPU configurations. So we’re looking at kind of a melting pot of re-architecting.
Re-Architecting the Data Center
Can you guys share more?
[Mohamed Awad]
Yeah, and that’s such an important point, right? I think the point that Mark made is such an important one, that what we’re seeing is that folks like Google, what they’re doing is they’re re-architecting the entire infrastructure, the entire data center to really think about, how do I optimize for my workloads? Nobody knows them better than them, right? How do they optimize the entire system? And that really starts down at the micro-architecture level. So we’re actually, you know, yesterday was a moment in time. We’ve been working with Google for literally years now on things like micro-code and micro-architecture optimization to ensure those workloads are optimized. And then they’ve been thinking about it from that micro-architecture all the way up through the system and the networking and how it’s going to interact with other chips in the system, et cetera.
[Savannah Peterson]
Partnership Duration and AI Hype
You just mentioned the duration of the partnership there, and I’m curious. Obviously AI is having a moment, and workload-optimized silicon is brilliant, frankly, and it’s absolutely vital for this next technological evolution.
[00:07:11]
If you can share, were you collaborating on this before the hype curve really started to peak, or was there a bit of timing that coincided with that?
[Mark Lohmeyer]
Collaboration before AI Hype
We’ve been collaborating together for a number of years now, so it was actually before the AI demand curve really hit us all together. But just to add to one thing Mohamed said earlier, I think what’s really important is the applications that this is able to power. And so Google for many years has been leveraging Arm for our internal services, things like Bigtable, Spanner, BigQuery. We also leverage it for things like the YouTube ads platform. And so we have experience there about how to optimize end-to-end those software services on top of the infrastructure. And now we’re super excited about being able to supercharge all of those to the next level with the Axion capabilities that we’ve talked about before.
[00:08:00]
So it’s for our internal use within Google, but then the fantastic thing is we also expose that as a cloud service to our external customers so they can get the same benefits for a broad range of general purpose applications.
[John Furrier]
Arm Advantages
That’s awesome. Great partnership with Arm. Arm has some advantages. They’ve got the mature OSS and the ecosystem. But before we get to that, I want to go back to the end-to-end real quick, if you don’t mind.
End-to-End Workloads
We’re seeing a pattern with people who are deploying and pushing production workloads and AI. They’re usually pretty well-defined end-to-end, and they’re going to need performance. So they’re going to have agents around them. So they’re saying, I want an end-to-end workflow. It’s pretty baked. It may change a little bit, but not much.
Custom Silicon for End-to-End Workflows
They know what they’re going to need. Is that a good use case for this kind of custom, I won’t say custom, oh, custom silicon, custom infrastructure? You guys are looking at this as providing a solution to that, right?
[Mark Lohmeyer]
Oh, yeah, absolutely. So if I take a specific example, right? Let’s say you have an existing application, and maybe that application is serving up some content to your users, right? Fantastic. Now you might want to add an agent into that application or add some new generative AI service into that application to sort of supercharge its capabilities, right?
[00:09:05]
Now when you do that, you’re going to drive a ton of additional end-customer demand for that service. What does that mean behind the scenes for the infrastructure? Well, you want to have a common set of cloud services and platforms. You want to be able to consume them through a Kubernetes engine like GKE, but then under the covers, be able to power that general-purpose compute with great capabilities like Axion, and maybe power the AI part of it with our Cloud GPUs or Cloud TPUs. And so it’s really at that systems level that we’re able to deliver just the right type of infrastructure for the needs of every part of every application.
[John Furrier]
Workloads and Resource Pools
Do you see workloads being sent to different resource pools, almost like a scheduler, like, hey, I’m going to do a prompt answer that might be I/O, but then, hey, I want to go do some reasoning, and then go off and do a custom set of high performance?
[Mark Lohmeyer]
Oh, yeah, yeah, absolutely. So that is becoming a very, very dynamic world, right? And so if you think about the users using these applications, they’re going to exercise them in different ways. As they exercise them in different ways, we want to be able to have the infrastructure sort of respond to that dynamic nature of the request, and do that with great performance, great cost, and great efficiency.
[00:10:09]
And so having Axion-based instances in our portfolio allows us to do that at greater performance, greater scale, lower cost than ever before.
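[Editor’s note]
The idea in this exchange, sending each part of a request to the pool of infrastructure that suits it, can be pictured as a tiny dispatcher. What follows is only an illustrative sketch in Python, not a description of how Google Cloud schedules work. The pool names, the `Request` type, and the `route` function are all hypothetical and were not mentioned in the interview.

```python
from dataclasses import dataclass

# Hypothetical pool names, invented for illustration only.
# They are not real Google Cloud identifiers.
POOLS = {
    "serving": "general-purpose-cpu-pool",  # web, application, database tiers
    "inference": "accelerator-pool",        # model inference
}


@dataclass
class Request:
    kind: str      # "serving" or "inference"
    payload: str


def route(request: Request) -> str:
    """Return the name of the pool that should handle this request."""
    try:
        return POOLS[request.kind]
    except KeyError:
        # Fail loudly rather than guessing a pool for an unknown kind.
        raise ValueError(f"unknown request kind: {request.kind!r}")
```

A real scheduler would also weigh load, cost, and placement. The sketch only shows the lookup from request type to resource pool.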
[Savannah Peterson]
Open Source and Kubernetes
Which is what everyone’s asking for right now, quite literally. Everyone wants that easy button, give it to me quicker and faster. You mentioned Kubernetes, and I can’t help but dig in a little bit on open source.
Open Source History
Yeah. Google’s got a long history of empowering the open source community. How does that play into this partnership?
[Mohamed Awad]
Yeah, so, yeah, Google’s got a long history of working open source. Arm’s got a long history of working open source as well. If you look at the infrastructure space specifically, we’ve literally been investing in the open source community for well over a decade now. Google’s been investing alongside us for a good chunk of that time as well. And what’s really interesting right now is that we started off over a decade ago trying to get that ecosystem going, but now we’re really at a point where tens of thousands of enterprise users are actually using Arm-based instances every day.
[00:11:00]
We’ve got hundreds of open source software packages which are readily available, where Arm’s a first-class citizen and is being supported. And so, really, we’re at a point now where the flywheel has really started spinning in a pretty massive way. So it’s a really interesting time to be part of that ecosystem. We actually believe it’s one of our strongest attributes. That coupled with the idea that you can take technology from us that’s performant and efficient, and then customize it at a system level to go off and build whatever your solution is and custom tailor it to your workloads and your infrastructure.
[John Furrier]
Machine Learning Apps and ML Ops
So machine learning apps, that brings up the whole machine learning apps. We heard on theCUBE here yesterday a lot of talk about ML Ops. As you get more generative AI, you’re going to still have more model operations, whatever you want to call it now, but ML Ops is becoming essentially AI Ops, whatever you want to call it. As these apps get built, you guys offer a compiler, I think, OpenXLA.
[00:12:00]
Yes. That’s in your tool chain.
OpenXLA and Arm Project
Did you guys open source that? Is that an Arm project?
[Mohamed Awad]
I don’t know that we open sourced it specifically, but we’re certainly contributors in lots of different areas, and OpenXLA is an area that we participate in.
[John Furrier]
Edge Apps and Inference
Okay, so my question is about developers. If I’m a developer and I want to build an Edge app, Savannah and I were riffing on this this morning, like, okay, if I want to build a lightweight Edge app, but I need to do inference and all kinds of cool stuff, how does this processor fit into that picture, or does it?
[Mark Lohmeyer]
Yeah, I can take a shot if you’d like. So if you think about any app, certainly an Edge app, as we were talking about before, those applications will have some services that are really the inferencing associated with the model, right? And they’ll have other more general purpose serving capabilities, right? There’s a web server, there’s an application server, there might be a database server behind it. So now you think about serving that composite application, you want to have just the right infrastructure behind each of those elements of that application. And so within Google, we have great support for GPUs and TPUs that are fantastic for serving those models in a very high performance efficient way.
[00:13:06]
We support OpenXLA together with our partners at Arm and many others as part of that, so it’s really critical. But together with that, you have all of the other aspects of that application. The web servers, maybe a Java server, you’ve got the backend database. All of those need to operate at great performance too. They need to operate at the same speed and the same performance as the AI model part of that. And so, really, Axion is incredibly complementary to those other ML-based, AI-based services that we offer as well.
[Mohamed Awad]
Edge Applications and Arm Partnerships
I just want to add one thing, which is when you talk about edge applications and that sort of thing, I think it’s important to say that Arm partners ship billions of devices into edge applications every year. And what we’re seeing quite often is that the folks that are developing those applications, let’s say it’s a car or let’s say it’s an industrial gateway or robotics device, they want binary compatibility in terms of their development environment.
[00:14:03]
They want to be able to go do sophisticated testing and they want to be able to do it in the cloud so they can take advantage of things like CI/CD pipelines, et cetera. And so the fact that they’re running that same architecture allows them to run that same code in both places. That’s a huge area that we’re seeing a lot of traction in.
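[Editor’s note]
The point about running the same architecture in the cloud and on the device can be made concrete with a small check that a test pipeline might run before executing Arm binaries. This is a minimal sketch in Python, assuming the host reports its architecture through the standard library’s `platform.machine()`. The helper name `is_arm64` and the set of accepted names are illustrative choices, not something described in the interview.

```python
import platform
from typing import Optional

# Names commonly reported for 64-bit Arm hosts, e.g. "aarch64" on Linux
# and "arm64" on macOS. This set is an assumption and may not be complete.
ARM64_NAMES = {"aarch64", "arm64"}


def is_arm64(machine: Optional[str] = None) -> bool:
    """Return True if the given (or current) machine name is 64-bit Arm."""
    name = machine if machine is not None else platform.machine()
    return name.lower() in ARM64_NAMES


if __name__ == "__main__":
    # A pipeline step could use this to decide whether binaries built for
    # an Arm-based edge device can run natively on the current host.
    print("64-bit Arm host:", is_arm64())
```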
[John Furrier]
Brand Relationships
You guys are two big brands in the tech space. Obviously Arm, well known for the work you guys do at many levels, from high performance in the data center to the edge and billions of devices. Google obviously, everyone knows what Google’s doing at many levels.
Relationship between Arm and Google
Huge brands. For the people that aren’t inside the ropes of the industry, talk about the relationship. What does Arm bring to the table? What’s Google bring to the table? The results, the chip, I get that. You design chips together. What’s the difference? How do you explain it to someone who’s not in the industry?
Arm and Google Contributions
What does Arm bring to the table? And Google, where do you guys meet? How does this all come together?
[Mohamed Awad]
Why don’t you start and I’ll jump in. Sure. Obviously, at the end of the day, Arm looks at some of these markets and some of these technologies from an end-to-end perspective.
[00:15:00]
We look at what does it mean to develop a CPU and a compute subsystem and interconnects for applications like the data center and automotive and IoT and all these different use cases. We’ve got a tremendous amount of experience in that and we go off and we spend quite a bit in the software associated with that. But Google lives some of this on a day-to-day basis. They have the big iron. We deliver those platforms into a company like Google and then they go off and really start to hone it in and make it applicable and relevant and optimized for their particular use case. Custom silicon.
[Mark Lohmeyer]
Partnership at Multiple Levels
Yeah, absolutely. We’ve had a fantastic partnership over many, many years. We work together at so many different levels. So clearly at the silicon level, Google designs systems leveraging the Arm architecture, Neoverse V2, but it goes beyond that. It’s also at the software layer as we were talking about before, which is really, really critical to our customers because it speaks to time to value. If you can take an existing application, an existing ISV that’s already been optimized for Arm and now you can easily bring it into Google Cloud and run it on Axion and get the benefits, that’s a huge time to market value and that is based on the great partnership that we have together at the software layer.
[00:16:12]
And then beyond that from a Google Cloud perspective, we surround that with the rest of the system. The storage that goes with it, the network, the Kubernetes consumption surface, the optimization of our applications on top of that and really deliver that as an end-to-end system. So it’s a deep, deep partnership at its heart that makes all of those bigger benefits possible.
[John Furrier]
Purpose-Built Custom Silicon
I mean, you’re basically taking Google’s objective design criteria with Arm’s capabilities together, hence custom silicon is born out of it. So it’s basically purpose-built for good stuff.
[Mark Lohmeyer]
Google’s Expertise in Large-Scale Applications
Exactly, and as you mentioned before, if you think about Google, we have experience building and operating and optimizing applications that are leveraged by billions of consumers around the world, right? And so that expertise we have around what it takes to run applications at large scale, we’re able to leverage that, take that and use that as we design together what Axion needs to look like and do.
[00:17:06]
[John Furrier]
Benefits for Users
So now I’m like, okay, what’s in it for me? Performance, what are some of the benefits going to come from this? How do you guys see this?
Output of the Investment
Obviously huge investment, huge decision, huge development, what’s going to be the output of this? What’s the range of scope to benefits for us?
[Mark Lohmeyer]
Benefits for Cloud Customers
I mean, maybe I can start. So first we think this is going to be huge for our cloud customers, right? They are looking to scale their business, they’re looking to scale their applications to support that. When they do that, they need to do it with great performance, with cost effectiveness because that drives the profitability of their business, and energy efficiency is incredibly important. And so with Axion, we can enable that value to them incredibly rapidly because they can bring all their existing Arm-optimized applications into that environment incredibly quickly. So we think this is going to be transformative for a broad range of our cloud customers. And then also for us internally within Google, we’re going to be leveraging this for, as I mentioned before, our consumer-based services.
[00:18:03]
So we’re super excited about those benefits that we can deliver.
[Mohamed Awad]
Benefits for Enterprise Users
Yeah, I mean, I think Mark hit on it. I mean, I think the biggest things are, if you think about what enterprises and what users are looking for, they’re looking for performance, they’re looking to get the job done, and they’re looking to go off and hit some of those efficiency and sustainability goals. And really, Axion, leveraging that Arm-based Neoverse platform, is delivering that.
[Savannah Peterson]
Sustainability and Efficiency
I think that’s so powerful and I think that’s so important. Here we are always trying to do things faster and adopt and now we’ve got massive amounts of data. If we’re not doing it more sustainably and more efficiently, we’re running ourselves straight into the ground. So it’s such a great synergy between both companies.
Future Expectations
Closing question for you. Now that it’s out in the wild and you’ve made the big announcement, years in the making, which is very exciting, what do you hope that you can say at the next Google Cloud Next about how it’s being used in the wild?
[Mark Lohmeyer]
Customer Adoption and Success Stories
You want to take it? I’ll go first, maybe? Sure.
[Savannah Peterson]
I love how you both deferred there. You’re like, I need a second.
[Mark Lohmeyer]
So, you know, first of all, I’d say, look, we’re really looking forward to launching our Axion-based instances later this year, getting those into the hands of our cloud customers.
[00:19:08]
If you fast forward, let’s say, a year from now to next year, I’m looking forward to having hundreds, thousands of customers being able to share the benefits that that provided to them, to their businesses, to their applications, and how it was really enabling them to transform the value they could add to their customers. So I’m really looking forward to that customer adoption. I think the feedback we’ve already gotten from our early adopter customers, companies like Elastic, like Snap, like many others, has been extremely positive. And so we’re looking to build on that and, of course, help many, many more customers going forward.
[Mohamed Awad]
Yeah, and I would just add to that, I think it’s really about those specific use cases, about those specific customer success stories that have come about, and that we can stand up and say, hey, this came about and this customer achieved that level of success, they had that level of efficiency gain, that much better performance, they could complete their job, they saved this much. It’s those sorts of things that I think I’m really looking forward to sharing with everybody.
[00:20:01]
Conclusion
Great stuff.
[Savannah Peterson]
Yeah, really great stuff, and what an exciting time. Mark, Mohamed, thank you both so much for being here, and congratulations again on what will undoubtedly be a legendary Next for you. John, always a pleasure with your insights, and thank all of you for tuning in wherever you may be viewing today. We’re here in Las Vegas, Nevada, at Google Cloud Next. My name’s Savannah Peterson, you’re watching theCUBE, the leading source for enterprise tech news.