Malek El Khazen
Data, AI & IoT Cloud Solution Architect at Microsoft
By Malek El Khazen – edited text by OpenAI
In the AI world, the obsession with “bigger” has driven an arms race for larger models, faster chips, and sprawling data center setups. But bigger doesn’t mean better. The future will reward precision, efficiency, and purpose-built solutions over sheer scale.
Smaller models are emerging as the cornerstone of this shift. We're transitioning from overly prompt-heavy frameworks to something more streamlined and agentic in design. It's not just about saving compute costs; it's about crafting systems that deliver smarter results. Training these smaller models, however, remains compute-intensive, and multimodality demands keep pushing innovation in chip design. Whether examining closed-source giants or open-source initiatives, the debate isn't about size; it's about deployment strategy. Open- and closed-source GenAI models are converging in performance, yet their distinctions persist, mirroring the open-versus-closed dynamics long familiar from security and enterprise software. Enterprises prioritize accountability, SLAs, and deployment ecosystems, so the decision becomes a trade-off based on specific needs rather than a binary choice.
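To make the "prompt-heavy versus agentic" distinction concrete, here is a minimal, vendor-neutral sketch in Python. The call_small_model function and the tool registry are hypothetical placeholders rather than any specific product API; the point is the control flow, where a compact model routes each step to a narrow tool instead of one sprawling prompt.

```python
# Minimal agentic loop: a small model routes work to narrow, purpose-built tools
# instead of relying on one giant prompt. call_small_model is a hypothetical
# placeholder for any compact LLM endpoint.

def call_small_model(task: str, context: str) -> str:
    """Hypothetical call to a small, task-focused model; returns a tool name."""
    # In practice this would be an API call; here we fake a routing decision.
    return "summarize" if "report" in task else "lookup"

TOOLS = {
    "summarize": lambda ctx: f"summary of: {ctx[:40]}...",
    "lookup": lambda ctx: f"records matching '{ctx}'",
}

def run_agent(task: str, context: str) -> str:
    tool_name = call_small_model(task, context)   # the model decides the next step
    return TOOLS[tool_name](context)              # a small, specialized tool executes it

if __name__ == "__main__":
    print(run_agent("prepare the quarterly report", "Q4 energy usage figures"))
```

The design choice worth noting is that the model's job shrinks to routing and reasoning, while narrow tools do the heavy lifting, which is what keeps the compute footprint small.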
The Supply Chain Crunch and the Slow Path to Sustainability
Data centers are under strain. GPU demand, led by Nvidia, continues to soar, while AI servers face supply chain bottlenecks and environmental challenges. Renewable energy receives much attention, yet the reality is more nuanced. Advances in battery storage, from solid-state to ultracapacitors, alongside hydrogen and nuclear (notably Small Modular Reactors), hold promise. However, significant regulatory barriers, particularly lengthy licensing processes, hinder progress. Streamlining these approvals will be crucial for meaningful advancements in the next six to seven years.
Looking ahead, cold plate cooling and energy-efficient DCIM (Data Center Infrastructure Management) will play pivotal roles. Solutions such as anomaly detection, real-time monitoring, and peak shaving are already reducing costs and environmental impact. These incremental yet impactful strategies defined much of 2024 and will likely accelerate in FY25. Efficiency isn't about scaling down; it's about scaling smarter and optimizing how we use resources to meet environmental goals.
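As a rough illustration of the monitoring side of this, the sketch below flags anomalous power draw with a rolling z-score and "shaves" readings above a peak threshold, with the excess assumed to come from local storage. The window size, the 3-sigma rule, the peak limit, and the sample data are all illustrative assumptions, not a reference DCIM implementation.

```python
# Toy DCIM-style telemetry pass: rolling z-score anomaly detection plus simple
# peak shaving. All thresholds and readings are invented for the example.
from statistics import mean, stdev

PEAK_LIMIT_KW = 950          # assumed contractual peak; excess is served from battery
WINDOW = 5                   # rolling window size for anomaly detection

def process(readings_kw):
    alerts, grid_draw = [], []
    for i, kw in enumerate(readings_kw):
        window = readings_kw[max(0, i - WINDOW):i]
        if len(window) >= 3:
            mu, sigma = mean(window), stdev(window)
            if sigma > 0 and abs(kw - mu) / sigma > 3:   # 3-sigma anomaly
                alerts.append((i, kw))
        grid_draw.append(min(kw, PEAK_LIMIT_KW))         # cap what the grid supplies
    return alerts, grid_draw

if __name__ == "__main__":
    sample = [900, 905, 910, 1200, 915, 980, 1010, 902]
    print(process(sample))
```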
Healthcare’s Lesson: Precision Saves Lives
Healthcare remains a proving ground for AI’s transformative potential. Consider the AI-powered sepsis detection system from Johns Hopkins. This model wasn’t about scale—it excelled by focusing on precision. By combining patient histories, lab results, and real-time symptoms, it provided actionable warnings that saved lives.
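To show the shape of that precision-first approach, here is a deliberately tiny risk-scoring sketch that blends history, labs, and vitals into a single early-warning score. The feature weights and the 0.7 alert threshold are made-up illustrative numbers; this is not the Johns Hopkins model, only the general pattern of combining a few targeted signals.

```python
# Illustrative early-warning score: combine a handful of signals into one
# probability and alert above a threshold. Weights and threshold are invented
# for the example; a real system would learn them from clinical data.
import math

WEIGHTS = {"prior_sepsis": 1.2, "lactate_high": 1.5, "heart_rate_high": 0.8, "temp_abnormal": 0.6}
BIAS = -2.0
ALERT_THRESHOLD = 0.7

def risk_score(patient: dict) -> float:
    z = BIAS + sum(WEIGHTS[k] * patient.get(k, 0) for k in WEIGHTS)
    return 1 / (1 + math.exp(-z))          # logistic squash to a 0-1 risk

def should_alert(patient: dict) -> bool:
    return risk_score(patient) >= ALERT_THRESHOLD

if __name__ == "__main__":
    p = {"prior_sepsis": 1, "lactate_high": 1, "heart_rate_high": 1, "temp_abnormal": 0}
    print(round(risk_score(p), 2), should_alert(p))
```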
Microsoft’s Project Sparrow extends this ethos to ecological challenges. Sparrow, a lightweight AI-driven edge device, collects and transmits vital ecological data from remote and isolated regions. This compact, targeted approach exemplifies how smaller models and specialized devices can deliver outsized results when thoughtfully deployed. The message is clear: impact isn’t a function of size but of focus.
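The same logic applies at the edge. Below is a hypothetical sketch of the collect-buffer-transmit loop a Sparrow-like device might run: read sensors, hold readings locally, and flush a batch only when a link is available. The read_sensors, link_available, and uplink functions are stand-ins for real drivers and radios, not Sparrow's actual implementation.

```python
# Hypothetical edge telemetry loop: buffer sensor readings locally and transmit
# in batches when connectivity allows. read_sensors() and uplink() are stand-ins.
import random
import time

def read_sensors() -> dict:
    # Stand-in for real sensor drivers (audio, temperature, humidity, ...)
    return {"ts": time.time(), "temp_c": round(random.uniform(15, 35), 1)}

def link_available() -> bool:
    return random.random() > 0.5             # pretend connectivity is intermittent

def uplink(batch: list) -> None:
    print(f"transmitted {len(batch)} readings")

def run(cycles: int = 10, batch_size: int = 3) -> None:
    buffer = []
    for _ in range(cycles):
        buffer.append(read_sensors())
        if len(buffer) >= batch_size and link_available():
            uplink(buffer)
            buffer.clear()                    # keep the on-device footprint small

if __name__ == "__main__":
    run()
```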
The Copilot Effect: Redefining Everyday AI
Microsoft’s Copilot Vision is redefining productivity. By integrating innovations such as Click to Do, advanced search, and real-time browser interactions, it turns AI into an executive aide that intuitively enhances workflows and reshapes the digital workspace.
Testing Microsoft Copilot+ PCs has revealed an important shift. While their NPUs deliver more than 40 TOPS of on-device AI compute, the true transformation lies in how that capability enhances productivity. This underscores a broader trend: designing technology for outcomes, not raw performance. Copilot exemplifies this principle by delivering impactful results with efficiency at its core.
Humanoid Robotics: Purpose-Driven Progress
The robotics sector continues to affirm that “bigger doesn’t mean better.” Purpose-built humanoid robots from Boston Dynamics and Tesla demonstrate this. These machines are not universal but are optimized for specific tasks like manufacturing, logistics, or maintenance.
Tesla’s Optimus focuses on repetitive tasks, leveraging the Full Self-Driving platform. It exemplifies the agentic framework, where robots are tailored for niche applications, whether in bioscience, advanced reasoning, or data center operations. The challenge is not limited to software: hardware design for multifunctionality remains an obstacle, making a truly general-purpose, superhuman robot extremely unlikely within the next five years. Current capabilities allow robots to excel at three or four tasks, but hardware innovation will be essential for broader applications. Robotics is moving into a domain of high-performance specialization, where each system is designed for precision rather than breadth.
Quantum Computing: A Niche Revolution, Not the Next Big Thing
Quantum computing is generating excitement, with breakthroughs from Microsoft and Google, such as logical qubits and the Google Willow chip. While these advancements are noteworthy, quantum remains a niche technology—a trajectory similar to the metaverse a few years ago.
Building a quantum ecosystem requires more than hardware. Cooling systems, compatibility frameworks, drivers, and software support present formidable challenges. If swapping an Nvidia GPU for an AMD one is challenging today, integrating quantum hardware into enterprise environments is exponentially harder. For now, quantum will remain limited to specialized research and niche applications. Its value lies in opening new research avenues rather than replacing existing systems. It is significant, but mainstream adoption remains several years away.
The Way Forward: Smaller, Smarter, Better
As we transition into FY25, the narrative of “bigger is better” will continue to lose relevance. The most impactful advancements in AI, data centers, robotics, and quantum computing will emphasize precision and thoughtful design over scale.
The approach remains clear: focus on solutions tailored to the problem at hand. This is not only about resource optimization but also about fostering innovation that avoids unnecessary complexity. FY25 will build on the momentum of iterative advancements that reshape longevity in healthcare, productivity through AI integration, and forecasting capabilities across industries. Change will not come as a single transformative leap but through steady, focused innovation, laying the foundation for a sustainable and impactful technological future.