Microsoft Build 2024: CEO Nadella Declares ‘A Golden Age Of Systems’
‘I still remember distinctly the first time Win32 was discussed … .Net, Azure. These are moments that I’ve marked my life with. And it just feels like we’re, yet again, at a moment like that,’ Microsoft CEO Satya Nadella said in his keynote at Build 2024.
Microsoft Chairman and CEO Satya Nadella declared “a golden age of systems” in his Build 2024 keynote Tuesday, taking time to walk an audience of developers through the tech giant’s ongoing innovations in artificial intelligence across infrastructure, data, tooling, applications and more.
The Redmond, Wash.-based vendor has made breakthroughs in getting computers to understand humans rather than requiring laypeople to understand computers, and in getting computers to assist humans in reasoning, planning and taking action, Nadella said.
“This is, like, maybe the golden age of systems,” Nadella said. “What’s really driving it? I always come back to the scaling laws. Just like Moore’s Law helped drive the information revolution, the scaling laws of DNNs [deep neural networks] are really—along with the model architecture, interesting ways to use data, generate data—driving this intelligence revolution.”
[RELATED: Microsoft Build 2024: 5 Things To Know About Copilot+ PCs]
While Intel co-founder Gordon Moore observed that the number of transistors on an integrated circuit would double roughly every two years with minimal cost increases, the DNN scaling law sees “doubling every six months,” Nadella said.
Nadella made multiple comparisons to tech history to illustrate the gravity of the AI moment. Microsoft’s Windows Copilot Runtime will do for AI what Win32 did for graphical user interfaces, he said. And the rollout of Copilot across organizations reminds Nadella of the start of the PC era, with both “democratizing expertise.”
As for scaling, Microsoft has added 30 times the supercomputing power to Azure in the past six months, Nadella said. And Microsoft-backed OpenAI’s GPT-4 model is now 12 times less expensive and six times faster than when it launched.
Looking ahead, Nadella predicted that the next exciting iteration of Copilot work will involve users creating agents that can perform work in the background asynchronously. “That’s one of the key things that’s going to really change in the next year,” he said.
Although the two men didn’t share the stage, speaking after Nadella Tuesday was Sam Altman, CEO of OpenAI, the creator of ChatGPT, Dall-E and other popular generative AI tools.
Altman remarked that AI models keep getting smarter and that the hype has reached heights seen during the mobile phone revolution of the late 2000s, when every company spoke about its mobile capabilities and whether it had a mobile app. “A few years later, no one said they were a mobile company because it was like table stakes,” Altman said.
In an observation that could apply to not just developers but executives and solution providers as well, Altman said that adopting AI, like adopting other evolutionary technologies, “doesn’t get you out of the hard work of building a great product or a great company or a great service.”
“AI alone is a new enabler, but it doesn’t automatically break the rules of business,” Altman said. “You still have to figure out how you are going to build enduring value in whatever you are doing. And it is easy to lose sight of that in the excitement of the gold rush.”
Here is more of what Nadella said during his Build 2024 keynote.
A ‘Golden Age Of Systems’
I still remember distinctly the first time Win32 was discussed … .Net, Azure. These are moments that I’ve marked my life with.
And it just feels like we’re, yet again, at a moment like that. It is just that the scale, the scope is so much deeper, so much broader this time around. Every layer of the tech stack is changing.
Everything—from the power draw and the cooling layer of the data center to the NPUs and the edge—is being shaped by these new workloads. These distributed, synchronous … workloads are reshaping every layer of the tech stack.
But if you think about even going all the way back to the beginning of modern computing … there have been two real dreams we’ve had.
First is can computers understand us instead of us having to understand computers? And second, in a world where we have this ever-increasing information of people, places and things, as you digitize more artifacts … can computers help us reason, plan and act more effectively on all that information? … And here we are, I think that we have real breakthroughs on both fronts. … This is maybe the golden age of systems.
What’s really driving it? I always come back to the scaling laws. Just like Moore's Law helped drive the information revolution, the scaling laws of DNNs are really—along with the model architecture, interesting ways to use data, generate data—driving this intelligence revolution.
You could say Moore's Law was probably more stable in the sense that it was scaling at maybe 15 months, 18 months. We now have these things that are … doubling every six months.
What we have, though, with the scaling laws is a new natural user interface that is multimodal—that means it supports text, images and video as input and output.
We have memory that retains important context, recalls both our personal knowledge and data across our apps and devices. We have new reasoning and planning capabilities that help us understand very complex context and complete complex tasks while reducing the cognitive load on us.
But what stands out to me as I look back at this past year is how you all as developers have taken all of these capabilities and applied them, quite frankly, to change the world around us. … The rate of diffusion is unlike anything I’ve seen in my professional career. And it’s just increasing.
Windows Copilot Runtime
What Win32 was to the graphical user interface, we believe the Windows Copilot Runtime will be for AI. It starts with our Windows Copilot Library, a collection of ready-to-use local APIs. … This includes no-code integrations for Studio Effects, things like creative filters, teleprompter, voice focus and much more.
But of course, if you want to access these models … you can directly call them through APIs. We have 40-plus models available out of the box, including Phi-Silica, the newest member of our family of small language models, which we specifically designed to run locally on the NPUs in Copilot+ PCs, bringing lightning-fast local inference to the device.
The other thing is that the Copilot Library also makes it easy for you to incorporate RAG [retrieval-augmented generation] inside of your applications with on-device data. It gives you the right tools to build a vector store within your app.
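Nadella didn’t detail the Copilot Library’s RAG APIs on stage, but the pattern he describes is simple to picture. Here is a minimal, hypothetical Python sketch of an on-device vector store; the toy embed() function stands in for whatever local embedding model an app would actually use, and none of these names come from the Copilot Library itself:

```python
import numpy as np

def embed(text: str, dim: int = 128) -> np.ndarray:
    """Toy stand-in for a local embedding model: hash words into a fixed-size vector."""
    v = np.zeros(dim)
    for word in text.lower().split():
        v[hash(word) % dim] += 1.0
    return v

class LocalVectorStore:
    """Minimal in-memory vector store using cosine similarity."""

    def __init__(self):
        self.texts, self.vectors = [], []

    def add(self, text: str) -> None:
        v = embed(text)
        self.vectors.append(v / np.linalg.norm(v))  # normalize once at insert time
        self.texts.append(text)

    def search(self, query: str, k: int = 3) -> list[str]:
        q = embed(query)
        q = q / np.linalg.norm(q)
        scores = np.array(self.vectors) @ q  # cosine similarity against every chunk
        top = np.argsort(scores)[::-1][:k]
        return [self.texts[i] for i in top]

# RAG in miniature: retrieved chunks get prepended to the prompt before local inference.
store = LocalVectorStore()
store.add("Copilot+ PCs ship with an NPU for local inference.")
store.add("DirectML exposes GPUs and NPUs to ML frameworks on Windows.")
print(store.search("What runs models on the NPU?", k=1))
```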
… We will be natively supporting PyTorch and the new WebNN [Web Neural Network] framework through Windows DirectML. Native PyTorch support means thousands of OSS [open-source software] models will just work out of the box on Windows, making it easy for you to get started.
In fact, with WebNN, web developers finally have a web-native machine learning framework that gives them direct access to both GPUs and NPUs. … Both PyTorch and WebNN are available in developer preview today.
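For the PyTorch path, Microsoft already publishes a torch-directml package that exposes DirectML as a PyTorch device. A minimal sketch, assuming pip install torch-directml on a DirectX 12-capable machine (whether a given NPU is visible through DirectML depends on driver support):

```python
import torch
import torch_directml  # pip install torch-directml; requires a DirectX 12-capable device

dml = torch_directml.device()  # DirectML-backed accelerator (GPU, or NPU where supported)

# Any ordinary PyTorch module can be moved onto the DirectML device.
model = torch.nn.Linear(16, 4).to(dml)
x = torch.randn(1, 16).to(dml)
y = model(x)                   # inference runs on the accelerator
print(y.to("cpu"))
```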
The Copilot Stack
We’ve always been a platform company. And our goal is to build the most complete, end-to-end stack—from infrastructure to data to tooling to the application extensibility so that you can apply the power of this technology to build your own applications. … We have the most complete, scalable AI infrastructure that meets your needs in this AI era.
We’re building Azure as the world’s computer. We have the most comprehensive global infrastructure, with 60-plus data center regions, more than any other cloud provider.
Over the past year we’ve expanded our data center regions and AI capacity from Japan to Mexico, from Spain to Wisconsin. We’re making our best-in-class AI infrastructure available everywhere. … At the silicon layer, we are dynamically able to map workloads to the best accelerator AI hardware so that we have the best performance.
And our custom I/O [input/output] hardware and server designs allow us to provide dramatically faster networking, remote storage and local storage throughput.
This end-to-end approach is really helping us get to unprecedented scale. In fact, last November, we announced the most powerful AI supercomputer in the cloud for training, using just a small fraction of our infrastructure. And over the past six months, we’ve added 30 times that supercomputing power to Azure. It is crazy to see the scale.
Nvidia, AMD Partnerships
We are not just scaling training fleets. We are scaling our inference fleet around the world, quadrupling the number of countries where Azure AI Services are available. … We offer the most complete selection of AI accelerators, including from Nvidia and AMD as well as our own Azure Maia, all dynamically optimized for the workloads.
That means whether you’re using Microsoft Copilot or building your own copilot apps, we ensure that you get the best accelerator performance and the best cost. … You see this in what has happened with GPT-4. It’s 12X cheaper and 6X faster since it launched. … It all starts, though, with this very deep, deep partnership with Nvidia, which spans the entirety of the Copilot stack, across all of the hardware innovation as well as the system software innovation.
Together, we offer Azure confidential computing on GPUs to really help you protect sensitive data around the AI models end to end. … We will be among the first cloud providers to offer Nvidia’s Blackwell GPUs, in B100 as well as GB200 configurations.
And we’re continuing to work with them to train and optimize both large language models like GPT-4o as well as small language models. … Beyond the hardware, we are bringing Nvidia’s key enterprise platform offerings, like Omniverse Cloud and DGX Cloud, to Azure, with deep integration with the even broader Microsoft Cloud.
For example, Nvidia recently announced that their DGX Cloud integrates natively with Microsoft Fabric. That means you can train those models using DGX Cloud with full access to Fabric data. And Omniverse APIs will be available first on Azure for developers to build their industrial AI solutions.
We are the first cloud to deliver general availability of VMs based on AMD MI300X AI accelerators. It’s a big milestone for both AMD and Microsoft. … It offers the best price/performance on GPT-4 inference.
And we will continue to move forward with Azure Maia. In fact, our first clusters are live. And soon, if you’re using Copilot or one of the Azure OpenAI Services, some of your prompts will be served using Maia hardware.
Microsoft Cobalt, OpenAI
Beyond AI, our end-to-end systems optimization also makes the development of cloud-native apps better.
Six months ago, we announced our first general-purpose Arm-based compute processor, Microsoft Cobalt. … I’m really excited to announce the public preview of Cobalt-based VMs.
Cobalt is already being used for video processing and permissions management in Microsoft 365, and it is helping power billions of conversations on services like Microsoft Teams.
And we’re delivering that same Arm-based performance and efficiency to many customers. … In our most recent benchmark data and tests, our Cobalt 100 VMs delivered up to 40 percent better performance than any other generally available Arm-based VM. So we are very, very excited about getting Cobalt into the market.
Let’s move up the stack to the foundation models. With Azure AI, we offer the broadest selection of frontier and open-source models, including LLMs and SLMs, so you can choose the model that makes the most sense for your unique needs and your application needs.
In fact, more than 50,000 organizations use Azure AI today. It is great momentum. And it all starts with our most strategic, most important partnership with OpenAI. Just last week, OpenAI announced GPT-4o, their latest multimodal model, which was trained on Azure.
It is an absolute breakthrough. It has text, audio, image and video as input and output. It can respond and just have a humanlike conversation that’s fast and fluid. It can even be interrupted midsentence. … As OpenAI innovates, our promise is that we will bring all that innovation to Azure too.
In fact, the same day that OpenAI announced GPT-4o, we made the model available for testing on Azure OpenAI Service. … I’m excited to say that it’s generally available on Azure OpenAI. What this means is that now we can have these groundbreaking apps that all of you can build using this capability.
One of the coolest things is that now any app, any website can essentially be turned into a full multimodal, full duplex conversational canvas. … I really want to thank the OpenAI team for their partnership and really their responsible approach to innovation, helping our industry going forward.
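For developers, the general availability means a GPT-4o deployment can be called through the standard openai Python SDK. Here is a minimal sketch; the endpoint, key, API version and deployment name are placeholders for your own Azure OpenAI resource:

```python
from openai import AzureOpenAI  # pip install openai

# Placeholders: substitute your own Azure OpenAI resource values.
client = AzureOpenAI(
    azure_endpoint="https://YOUR-RESOURCE.openai.azure.com",
    api_key="YOUR-KEY",
    api_version="2024-06-01",
)

# Chat completion against a GPT-4o deployment; image input uses the
# same content-parts format as the OpenAI API.
response = client.chat.completions.create(
    model="YOUR-GPT-4O-DEPLOYMENT",  # your deployment name, not the model family name
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "Describe this image in one sentence."},
            {"type": "image_url", "image_url": {"url": "https://example.com/photo.jpg"}},
        ]},
    ],
)
print(response.choices[0].message.content)
```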
Phi-3 Model Family Updates
We are bringing lots and lots of other models as well, from Cohere, Databricks, Deci, Meta, Mistral and Snowflake, all to Azure AI.
We want to support the broadest set of models from every country, every language. … We’re bringing models from Cohere, G42, NTT Data, Nixtla, as well as many more, as models-as-a-service, because that’s the way you can easily get to managed AI models. … We are adding not just large language models; we’re also leading the small language model revolution.
Our Phi-3 family of SLMs is the most capable and most cost-effective. These models outperform models of the same size, and even the next size up, across a variety of language, reasoning, coding and math benchmarks. … We are adding new models to the Phi-3 family to add even more flexibility across that quality-cost curve.
We’re introducing Phi-3-vision, a 4.2-billion-parameter multimodal model with language and vision capabilities. It can be used to reason over real-world images, generate insights and answer questions about images. … We are also making our 7-billion-parameter Phi-3-small and 14-billion-parameter Phi-3-medium models available. With Phi, you can build apps that span the web, Android, iOS, Windows and the edge.
They can take advantage of local hardware when available and fall back on the cloud when not, really simplifying what we as developers have to do to support multiple platforms using one AI model.
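That local-first, cloud-fallback pattern isn’t tied to any one API. As a rough Python sketch, assuming the public Phi-3 checkpoint on Hugging Face for the local path and a hypothetical cloud_generate() helper standing in for a hosted endpoint:

```python
# Sketch of local-first inference with a cloud fallback. The local path uses
# Hugging Face transformers (pip install transformers torch accelerate) and
# needs a recent transformers release with native Phi-3 support.
def cloud_generate(prompt: str) -> str:
    # Hypothetical stand-in: call a hosted Phi-3 or Azure OpenAI endpoint here.
    return "<response from a hosted model>"

def generate(prompt: str) -> str:
    try:
        import torch
        from transformers import pipeline
        if not torch.cuda.is_available():
            raise RuntimeError("no local accelerator available")
        pipe = pipeline(
            "text-generation",
            model="microsoft/Phi-3-mini-4k-instruct",  # public Phi-3 checkpoint
            device_map="auto",
        )
        return pipe(prompt, max_new_tokens=128)[0]["generated_text"]
    except Exception:
        # No accelerator, no weights, or an inference error: fall back to the cloud.
        return cloud_generate(prompt)

print(generate("In one sentence, why do small language models matter on-device?"))
```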
Azure AI Studio
Azure AI Studio now is generally available. It’s an end-to-end development environment to build, train and fine-tune AI models and do so responsibly.
It includes built-in support for what is perhaps the most important feature in this age of AI, which is AI safety. Azure AI Studio includes state-of-the-art safety tooling for everything from detecting hallucinations in model outputs to risk and safety monitoring.
It helps you understand which inputs and outputs are triggering content filters. Prompt shields, by the way, detect and block these prompt-injection attacks. … We are adding new capabilities, including custom categories, so that you can create unique filters for prompts and completions, with rapid deployment options, which I think is super important if an emerging threat appears as you deploy these models into the real world.
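Much of this tooling is configured inside Azure AI Studio, but the underlying Azure AI Content Safety service can also be called directly. A minimal sketch with the azure-ai-contentsafety Python SDK, with the endpoint and key as placeholders for your own resource:

```python
from azure.ai.contentsafety import ContentSafetyClient  # pip install azure-ai-contentsafety
from azure.ai.contentsafety.models import AnalyzeTextOptions
from azure.core.credentials import AzureKeyCredential

# Placeholders: substitute your own Content Safety resource values.
client = ContentSafetyClient(
    "https://YOUR-RESOURCE.cognitiveservices.azure.com",
    AzureKeyCredential("YOUR-KEY"),
)

# Analyze a piece of text against the built-in harm categories; each category
# comes back with a severity score you can threshold on before serving output.
result = client.analyze_text(AnalyzeTextOptions(text="some model output to screen"))
for item in result.categories_analysis:
    print(item.category, item.severity)
```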
Beyond Azure AI Studio, we recognize that there are advanced applications where you need much more customization of these models for very specific use cases. … Azure AI custom models will give you the ability to train a custom model that’s unique to your domain and to your data, which is perhaps proprietary.
The same builders and data scientists who have been working with OpenAI and brought all of the Phi advances to you will work with all of you to build out these custom models.
The output will be domain-specific. It will be multitask and multimodal, and best-in-class as defined by benchmarks, including perhaps even the specific language proficiency that may be required.
Microsoft Fabric
We now have over 11,000 customers, including leaders in every industry, who are using Fabric. It’s fantastic to see the progress. … With Fabric, you get everything you need in a single, integrated SaaS platform.
It’s deeply integrated at its most fundamental level: compute and storage are unified, your experiences are unified, governance is unified and, more importantly, the business model is unified. … We’re introducing real-time intelligence in Fabric.
Customers today have more and more real-time data coming from IoT systems and telemetry systems.
In fact, cloud applications themselves are generating lots of data. But with Fabric, anyone can unlock actionable insights across your entire data estate. … We’re making it even easier to design, build and interoperate with Fabric from your own applications.
In fact, we’re building out a new app platform with Fabric Workload Development Kit. … This is the first time where the analytics stack is really a first-class app platform as well.
And beyond Fabric, we are integrating the power of AI across the entirety of the data stack. There is no question that RAG is core to any AI-powered application, especially in the enterprise today.
And Azure AI Search makes it possible to run RAG at any scale, delivering highly accurate responses using state-of-the-art retrieval systems.
And with built-in OneLake integration, Azure AI Search will automatically index your unstructured data, too. And it is also integrated into Azure AI Studio to support bringing your own embedding model, for example.
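The retrieval half of that RAG pipeline can be as small as the sketch below, using the azure-search-documents Python SDK: fetch the top-matching chunks for the user’s question, then ground the chat prompt in them. The service endpoint, index name, key and the “content” field name are placeholders for your own index:

```python
from azure.core.credentials import AzureKeyCredential
from azure.search.documents import SearchClient  # pip install azure-search-documents

# Placeholders: substitute your own search service, index and key.
search = SearchClient(
    endpoint="https://YOUR-SEARCH.search.windows.net",
    index_name="your-index",
    credential=AzureKeyCredential("YOUR-KEY"),
)

def retrieve(question: str, k: int = 3) -> list[str]:
    """Fetch the top-k chunks for the question; 'content' is a placeholder field name."""
    return [doc["content"] for doc in search.search(search_text=question, top=k)]

question = "What is our parental leave policy?"
context = "\n\n".join(retrieve(question))
# Ground the model: retrieved chunks go into the prompt ahead of the question.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {question}"
```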
GitHub Copilot Workspace
With [GitHub] Copilot Workspace … we are an order of magnitude closer to a world where any person can go from idea to code in an instant.
You start with an issue. It creates a spec based on its deep understanding of your codebase. It then creates a plan which you can execute to generate the code across the full repo. … At every point in this process … you are in control. You can edit it.
And that’s fundamentally a new way of building software. We’re looking forward to making it much more broadly available in the coming months. … We’re really thrilled to be announcing GitHub Copilot extensions.
Now you can customize GitHub Copilot with capabilities from third-party services, whether it’s Docker, Sentry or many, many more.
And of course, we have a new extension for Azure, too: GitHub Copilot for Azure. You can instantly deploy to Azure and get information about your Azure resources just using natural language. And what Copilot did for coding, we’re now doing for [infrastructure] and [operations].
Copilots Akin To PC Era Start
We built Copilot so that you have the ability to tap into the world’s knowledge as well as the knowledge inside of your organization and act on it.
Copilot has had a remarkable impact. It’s democratizing expertise across organizations. … It reminds me of the very beginning of the PC era where work, the work artifact and the workflow were all changing.
And it’s just really having broad enterprise business process impact. … Since no two business processes are the same, with Copilot Studio you can now extend and customize Copilot for your business processes and workflows. … We are introducing Copilot connectors in Copilot Studio so you can ground Copilot with data from across the [Microsoft] Graph, from Power Platform, Fabric and Dataverse. … You now have all the third-party connectors for SaaS applications from Adobe, Atlassian, ServiceNow, Snowflake and many, many more.
This makes the process of grounding Copilot in first- and third-party line-of-business data just a wizard-like experience, enabling you to quickly incorporate your own organizational knowledge and data.
We are also extending Copilot beyond a personal assistant to become a team assistant. … You will be able to invoke Team Copilot wherever you collaborate: it can be in Teams, it can be in Loop, it can be in Planner. … It can be your meeting facilitator when you’re in Teams, creating agendas, tracking time, taking notes for you.
Or a collaborator writing chats, surfacing the most important information, tracking action items, addressing unresolved issues. And it can even be your project manager, ensuring that every project that you’re working on as a team is running smoothly.
These capabilities will all be available in preview later this year. And we’re not stopping there.
With Copilot Studio, anyone can build copilots that have agent capabilities, work on your behalf, and independently and proactively orchestrate tasks for you. Simply provide your copilot a job description, or choose from one of our pre-made templates … and Copilot will work in the background and act asynchronously for you.
That’s one of the key things that’s going to really change in the next year. You are going to have copilots plus agents with this async behavior.
You can delegate authority to copilots to automate long-running business processes. Copilot can even ask for help when it encounters situations that it doesn’t know how to handle.