Ep 2. The Mainframe’s Role in Generative AI for the Enterprise | AI Insights and Innovation

[David Linthicum]

Introduction

Welcome to AI Insights and Innovation, your go-to podcast for the latest news, trends, and insights in artificial intelligence, including generative AI. Join us as we explore groundbreaking developments, interview leading experts, and provide in-depth commentary and advice on how you can find AI success with your organization. I’m your host, David Linthicum, author, speaker, B-list geek, longtime AI systems architect, and analyst at ThoughtCube Research.

Mainframe’s Role in Generative AI

Let’s get into the topic. So this is an interesting one from this week. I’m fresh back from the Broadcom Analyst Conference, Analyst Forum that they put on in Boston, and got a lot of information on the current state of the mainframe. Obviously Broadcom, big into mainframe software. They bought CA, obviously Broadcom just bought VMware.

[00:01:00]

So up and coming, older organization made of older piece parts, but becoming a larger player in the IT universe. So what I decided to do this week is look at the role that mainframes play in the world of generative AI, because that comes up as a constant question for me. People ask, where should their generative AI systems run? Where should they store their training data, or where should they source their training data from? Where should they run their inference engines? Lots of great questions, and the answer is going to be, well, it depends. You have to look at the different tools and technologies that are out there, including the mainframe, in trying to figure out which ones are going to be the right platform for the different components of a generative AI architecture. And obviously the three legs of the stool there are going to be the training of the models, the running of inferences on the models, and the ability to have interfaces into the models: APIs, chatbots, natural language processors, things like that.

[00:02:06]

Mainframe Architecture

So let’s talk about these architectures. First, let’s figure out what a mainframe is. Many of you who are probably listening to this podcast haven’t worked with mainframes. I’m 62 years old, obviously started my career doing mainframes and obviously moved over to distributed computing, PCs, client/server, service-oriented architecture, cloud computing, web-based development, all kinds of platforms, but cut my teeth on development for mainframe systems. In fact, the first mainframe systems that I worked on were AI-based systems, and I did Lisp-based programming, things like that, leveraging mainframe systems for that because it was kind of the only game in town. They were the only processors around that were powerful enough to run AI systems. I’m talking about the 1986, ’87 timeframe. But here we are in 2024.

What is a Mainframe?

So what does this world look like? Well, first let’s talk about what mainframes are, and I asked ChatGPT to give us a definition of a mainframe.

[00:03:05]

A mainframe is a large, powerful computer designed to process complex calculations and manage vast amounts of data. It can handle multiple tasks and support numerous users simultaneously without impacting performance or security. Mainframes are essential for critical applications that require high levels of reliability, availability, and serviceability. They are commonly used by large enterprises in industries such as finance, healthcare, and government for tasks that involve intensive data processing, transaction processing, and large-scale computations. These are huge machines. They measure their memory in terabytes, so it’s kind of another level of play. And mainframes have also changed their role over the years. We used to use them primarily as centralized processors. Now they can act as storage servers, database servers, different process servers. They can participate in very complex distributed systems, inclusive of cloud-based systems that are working with mainframe-based systems.

[00:04:03]

In fact, many of the hybrid clouds that I see out there are going to be mainframes that are leveraging public cloud providers and the ability to partition applications and data sets between public cloud providers and mainframe systems.

Mainframe in Generative AI Systems

But let’s get back to the topic at hand. How do they work with generative AI systems? Well, first and foremost, people are pushing out guidelines, even some of the mainframe vendors out there, in terms of how you should leverage a mainframe in processing generative AI. Some of them are saying, do the training of the models out on the cloud platforms and do the inferences on the mainframe, and sometimes vice versa. There is no hard and fast definition about how to use this stuff, and the architecture that you derive, leveraging mainframes or not, is going to be largely dependent on your requirements. Keep that in mind. Anytime someone has some sort of an overarching, foundational set of criteria for how you should leverage a platform for a certain type of processing, in this case generative AI processing, it’s going to be wrong a great many times, because it doesn’t account for your particular use case or how you’re leveraging this technology.

[00:05:19]

So keep that in mind. Anytime anybody has very general advice, normally that general advice is not going to be right in every scenario. So you have to take an architectural eye to this stuff to figure out how you’re going to make it work.

Mainframe as a Data Service

So mainframes have a function within generative AI, because in many instances, that’s where the training data is going to be stored. Some people looking at implementing these systems are looking at copying the data from the mainframe systems to some sort of an intermediate storage device, perhaps doing some reformatting or rudimentary realignment of the data so it can be leveraged as training data.

[00:06:01]

In other words, they’re adding additional steps between the transactional state of the data and the ability for the data to be leveraged as legitimate training data for generative AI systems. The assertion I make is that normally doesn’t need to be the case; you can leverage mainframe systems as servers for training data. And if you’re worried about the format and the way in which it’s producing the information, there’s lots of interfaces, lots of middleware, lots of storage systems where you can access mainframe data any number of ways. We can look at object-based databases and structured databases and relational databases, and all those sorts of things are within the realm of possibility. So when you’re thinking about copying the data or moving the data in some sort of an ETL kind of a scenario, to some sort of an intermediate storage system or intermediate database just for the purpose of using it as training data, think long and hard about what you’re doing. The mainframe system can be an awesome producer of data, and it can do so at scale; as we just mentioned, there’s a huge amount of processing power within these systems.

[00:07:10]

And the ability to see these things as a data service that you’re able to leverage this training data for generative AI systems is going to be probably not a bad architectural approach in many instances.
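To make the data-service idea concrete, here’s a minimal sketch in Python of the kind of in-place reformatting I’m describing: decoding fixed-width EBCDIC records, the way a mainframe data set might expose them, straight into training-ready rows instead of ETL-copying everything first. The record layout and field names here are hypothetical, invented purely for illustration; real layouts would come from your COBOL copybooks.

```python
# Hypothetical fixed-width record layout: (offset, length, field name).
# These fields are invented for illustration; real layouts come from
# your copybooks.
LAYOUT = [(0, 10, "account_id"), (10, 20, "customer_name"), (30, 8, "balance")]

def decode_record(raw: bytes) -> dict:
    """Decode one EBCDIC (code page 037) fixed-width record into a dict."""
    text = raw.decode("cp037")  # EBCDIC -> Unicode via the stdlib codec
    return {name: text[start:start + length].strip()
            for start, length, name in LAYOUT}

def records_to_training_rows(raw_records) -> list:
    """Turn raw mainframe records into training-ready rows on the fly,
    with no intermediate copy of the data set."""
    return [decode_record(r) for r in raw_records]
```

For example, a record built as `("0000012345" + "John Q Public".ljust(20) + "00042.50").encode("cp037")` decodes to plain Unicode fields, ready to feed a training pipeline without an intermediate store.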

Mainframe Scalability and Performance

But of course, your particular architecture is based on your needs, what you’re looking to do, and dealing with security and governance and all the things that come into play there. We’re doing this for a couple of reasons. Mainframe systems have huge scalability and performance. So if we’re looking to get at massive amounts of information, and that’s normally what we’re doing with training data, the processing of that data and the very complex computations that are necessary for advanced AI models are going to be able to be handled by the mainframe systems. So if we need to do manipulation of the data, complex calculations, those sorts of things.

[00:08:06]

Again, the mainframe will be a very good candidate for running those sorts of things to support a generative AI system.

Mainframe Use Case Variations

And so the case for using the mainframe varies. The canned architectures that we’re seeing out there, as I mentioned before, you have to be very careful with those. If people are giving you general guidance without understanding your own requirements, normally they’re not considering everything that needs to be considered in looking at your architecture, and they may be misaligning the role of the mainframe for your particular use case. So look at your architecture holistically. And normally, if a mainframe is going to be involved, you already have one in place. This is not about adding a mainframe just for the processing of training data. This is about the majority of your data within an enterprise, and this is the case within many enterprises, already existing on that mainframe system, because they need the computationally intensive characteristics of a mainframe, very dense processing, things like that, the ability to do a lot of things very quickly in a short amount of time.

[00:09:14]

And we’re going to find that leveraging a mainframe as a training data service is going to be indicated many times, I think many more times than people originally thought.

Mainframe’s Role in Generative AI

So the mainframe has a role. It’s dependent on your use case. But never count the mainframe out of your generative AI framework just because it’s a mainframe. There seems to be that kind of take out there. People look for generative AI primarily to run on the cloud, which is perfectly fine, but it’s very expensive to do so, or to run on-premises, which is also perfectly fine. But include a lot of your existing systems in that particular architecture so we’re not being additive to the cost and additive to the complexity. People typically have mainframes.

[00:10:01]

So this is not about working around the mainframe. This is about working within the mainframe to build your generative AI system.

Working Within the Mainframe

You’re going to find it’s going to be cheaper, better, more effective, and more efficient if you look at these sorts of architectures. So seamless integration with modern technologies, mainframes are able to integrate with any number of systems. Back in the day when I first started, they were very difficult to integrate as data services and they were centralized processors. So everybody was using mainframes via green screens, 3270 terminals, things like that. Those days are over. So they’re big, huge, honking processors that exist within your data center. They’re able to do lots of stuff very quickly and they’re easy to integrate.

Seamless Integration with Modern Technologies

Broadcom obviously is a software company, has many different integration technologies, middleware technologies to communicate with the mainframes, but there’s lots of companies out there that still support it. And so the accessibility of the mainframe is never going to be a problem. We can have it communicating with the cloud providers, we can have it communicating with on-premises systems, we can have it communicating with traditional LAMP stack systems, microcomputers, edge-based systems, things like that.

[00:11:09]

And we don’t even know it’s a mainframe. It has seamless integration; we’re leveraging the processing, we’re leveraging the data, and we’re able to do so through common API systems that are very easy to implement. So keep that in mind: integrating with mainframe systems is not going to be a problem; the capabilities and the software are there. This is 2024, this is not 1984 anymore. All those problems have been solved, and the mainframes that we used 20 or 30 years ago are very different from the mainframes we use today. We use open systems, things like that.
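As a sketch of what this looks like from the consuming side, here’s a minimal Python example of streaming training records out of a paged data service, where the actual client call, whether that’s a REST request, a JDBC query, or whatever your middleware exposes, is left as a stand-in function. The interface is hypothetical, for illustration only, not any particular vendor’s API.

```python
from typing import Callable, Iterator

def iter_training_records(fetch_page: Callable[[int, int], list],
                          page_size: int = 1000) -> Iterator[dict]:
    """Stream records from a paged data service one at a time, so the
    training pipeline never needs a full local copy of the data set.

    fetch_page(offset, limit) stands in for whatever client call your
    integration middleware actually provides; it is a hypothetical
    interface used here for illustration.
    """
    offset = 0
    while True:
        page = fetch_page(offset, page_size)
        if not page:        # empty page signals the end of the data set
            return
        yield from page
        offset += len(page)
```

The point of the generator shape is that the consuming side, the model training job, neither knows nor cares that the data service happens to be fronting a mainframe.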

Future Trends of Mainframe Technology

This isn’t going to be a mainframe promotional podcast, but this is going to be looking at the opportunities for leveraging mainframes as part of the generative AI systems and that certainly is going to be the case. So what are the future trends of this technology? Well, I think that the mainframes aren’t going away.

[00:12:02]

Obviously, we’re moving into multiple different directions, including cloud computing, edge-based systems, some of the newer on-premises systems, hyperconverged architectures, all these things that are starting to emerge that exist on-premises as well as in the cloud.

Heterogeneous Systems

One thing we’re not doing is moving to a particular system only. So this is not about cloud-only architecture. I think in many cases, we found that over-utilizing cloud computing has a tendency to be very expensive. So moving forward, this is about moving to a world that’s going to be very heterogeneous. Mainframes are going to live in it, cloud computing is going to live in it, edge-based systems are going to live in it, wearable computing like my wristwatch is going to live in it. And the ability to create an application instance like a generative AI system across many different platforms is going to be your path to success, also your path to the most optimized architecture, which is going to lead you to optimize the cost, which is going to lead you to bringing the most value back to the business.

[00:13:08]

All Platforms Are Not Suited for All Use Cases

So if you’re thinking about leveraging one platform over another, you’re going to find that all platforms are not suited for all use cases. There’s reasons to use mainframes, there’s reasons to use the cloud, there’s reasons to use different services in the cloud, different processors and platforms that exist within on-premises systems. All those things are there as an architectural framework that you can assemble and compile to create something that’s going to be most optimized for the problems you’re looking to solve for the business.

Generative AI Architectures Across Platforms

In the case of generative AI, I’m finding that as we’re building these architectures, they’re never based on one platform. They’re going to be running across different platforms, across different data services for the training data, different systems for the inference engines, and that depends on what you’re looking for and that type of processing.

[00:14:01]

Aspects are running on cloud, aspects are running on-premises, and sometimes it’s running all on-premises. And all those things should be architectural options that you’re considering. This should never be only one particular platform.

Avoiding Platform Exclusivity

If you’re doing that, chances are you’re doing it wrong. You’re not getting to the most cost-effective mechanisms for making that happen. So look at all of your platforms in your portfolio and their ability to do what they do best.

Mainframe for Data Services and Inference Processing

In this case, look at the mainframe for some of the data services for the training data, perhaps even doing some inference processing on the mainframe systems, along with other systems that exist on-premises in your existing data center, sitting next to your mainframe, and assets that exist on the cloud or with managed service providers or co-location providers or whatever you’re looking at. Systems in the future are never going to exist on one particular platform.

No Single Platform Solution

There’s never going to be one particular platform that’s going to be the answer for every problem that we’re looking to solve out there. And I think that’s what people are looking for.

[00:15:00]

The Best Platform for Generative AI Systems

People come up to me at conferences after I speak, and they always ask me, what’s the best cloud or what’s the best platform for generative AI systems? Can’t answer that question. It depends on lots of different variables and how you’re leveraging that technology, the kind of business problems you’re looking to solve, the kind of assets you have in your current configuration, your as-is state, where your to-be state is going, your IT strategy, your security parameters, regulatory pressures, compliance, things like that. All these things come into play in driving the kind of solutions you’re looking to get to.

Conclusion

So that’s it in a nutshell. And so hopefully that made sense to you as kind of pragmatic advice on how we should approach generative AI systems in light of existing systems you may have around, including the mainframe. There’s not a reason to eliminate the mainframes. There’s a reason to look at their viable participation in this particular architecture. That’s what I’m arguing here. What that participation is depends on the problem you’re looking to solve.

[00:16:01]

Looking at All System Aspects

But if there’s one message you take out of this podcast, it’s that we’re going to have to look at all aspects of the systems, all aspects of the current problem domain we’re building for, and look at creating the most optimized architecture with all these assets that are in front of us. And so it’s a complex process to go through. I think architecture in general is a very complex thing to do.

Complexity of Generative AI Systems

Generative AI systems are no easier. They’re large distributed systems at the end of the day, with different problems that they bring into the equation in terms of what we need to solve: I/O, massive processing, things like that, looking at using GPUs or CPUs, all these things that come into it. So we’re going to have to be creative in how we solve these problems.

Creative Solutions

This is not about throwing money and resources and GPUs at the problem. This is about using the most appropriate infrastructure, the most appropriate architecture to solve the particular problems at hand.

Using Appropriate Infrastructure and Architecture

That’s the point I’m trying to make.

[00:17:01]

Thank You

But anyway, thank you very much.