AI Infrastructure

Discussion with Dr. Benjamin Lee, Professor @ Penn Engineering

By Yash Samat • March 3, 2026 at 10:52 PM EST
AI Infrastructure and Sustainability

Benjamin Lee is a Professor in the Department of Electrical and Systems Engineering and the Department of Computer and Information Science at the University of Pennsylvania. He is also a visiting researcher at Google in the Global Infrastructure Group. Dr. Lee's research focuses on computer architecture (e.g., microprocessors, memories, datacenters), energy efficiency, and environmental sustainability. He builds interdisciplinary links to machine learning and algorithmic economics to better design and manage computer systems.

Dr. Lee was an Assistant and then Associate Professor at Duke University. He completed postdoctoral research at Stanford University and received his S.M. and Ph.D. from Harvard University and his B.S. from the University of California, Berkeley. He has held visiting research positions at Meta AI (formerly Facebook AI Research), Microsoft Research, Intel Corporation, and Lawrence Livermore National Laboratory.

A great deal of conversation around artificial intelligence (AI) is centered on models. Larger models. Faster models. More capable models.

However, the systems that make those models possible receive less attention. In reality, AI runs on infrastructure: massive datacenters full of processors, memory, networking equipment, cooling systems, and power delivery, all operating continuously to serve computation at enormous scale.

Earlier today, I attended a talk by Dr. Benjamin Lee focused on a question likely to define the next phase of computing: how do we sustain the explosive growth of computation required by modern AI systems?

The challenge is not just computational performance. It's energy, water, carbon emissions, and the physical limits of infrastructure.

One reason the problem is becoming urgent is that the dominant source of AI computation may soon shift. Most of today's spending on compute goes into training large models. But once a widely adopted application emerges, the bulk of computation will likely come from inference: serving model predictions to users in real time.

Training a model happens relatively infrequently, even though each run may require massive computational resources. Inference, on the other hand, happens every time a user interacts with a system. If AI systems become widely embedded in everyday software, they could eventually serve trillions of requests per day. At that scale, the infrastructure required to support inference could dwarf what was needed to train the models.
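To make that concrete, here is a minimal back-of-envelope sketch, not from the talk, comparing a one-time training cost against ongoing inference. Every number is an assumed, illustrative value: a training run in the range of published GPT-3-scale estimates, a fraction of a watt-hour per query, and a billion queries per day.

```python
# Back-of-envelope comparison of one-time training energy vs. ongoing inference.
# All numbers are illustrative assumptions, not figures from the talk.

TRAINING_ENERGY_MWH = 1_300   # assumed one-time training cost (GPT-3-scale estimates)
ENERGY_PER_QUERY_WH = 0.3     # assumed energy per inference request
QUERIES_PER_DAY = 1e9         # assumed daily request volume at wide adoption

daily_inference_mwh = QUERIES_PER_DAY * ENERGY_PER_QUERY_WH / 1e6  # Wh -> MWh
days_to_match_training = TRAINING_ENERGY_MWH / daily_inference_mwh

print(f"Inference energy per day: {daily_inference_mwh:,.0f} MWh")
print(f"Days of serving needed to match training: {days_to_match_training:.1f}")
```

Under these assumptions, serving matches the entire training budget in about four days. The specific values matter less than the structure: training is a fixed cost, while inference scales with adoption.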

When Computation Outpaces Efficiency

For decades, the computing industry has relied on two trends that quietly enabled rapid progress.

Moore's Law allowed engineers to pack more transistors into smaller spaces, steadily reducing the cost per transistor. Dennard Scaling ensured that as transistors shrank, power density remained roughly constant. Together, these trends allowed computing performance to increase without a proportional rise in energy consumption.
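The relationship can be sketched with the standard dynamic-power model, P = C·V²·f. Under ideal Dennard scaling by a factor k, capacitance and voltage both shrink by k while frequency rises by k, so power per transistor falls as 1/k² exactly as density rises as k². The toy calculation below uses normalized units, not figures from the talk.

```python
# Toy illustration of ideal Dennard scaling (normalized units).
# Dynamic power per transistor: P = C * V^2 * f.

def power_density(k, C=1.0, V=1.0, f=1.0):
    C, V, f = C / k, V / k, f * k         # ideal scaling by factor k
    per_transistor = C * V**2 * f         # falls as 1/k^2
    transistors_per_area = k**2           # density rises as k^2
    return per_transistor * transistors_per_area

for k in (1, 2, 4, 8):
    print(f"k={k}: normalized power density = {power_density(k):.2f}")
```

Once voltage stopped scaling in the mid-2000s, the V² term stopped shrinking and power density began to climb; that is the slowdown the next paragraph refers to.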

Those trends, however, have slowed.

At the same time, demand for computation has accelerated at a dramatic rate. Most computing today takes place within datacenters, where efficiency improvements have allowed total electricity usage to grow modestly while delivered computation has grown much faster.

But efficiency improvements alone can't keep pace indefinitely with exponential demand.

The question that engineers face today is not just how to build faster processors, but also how to build systems capable of sustaining massive computational growth without overwhelming the energy systems that support them.

The Hidden Resources Behind AI

Energy is the most obvious resource consumed by datacenters, but it isn't the only one.

Cooling systems rely heavily on water. Some cooling approaches use evaporative techniques that consume water directly. But much larger volumes are consumed indirectly in generating the electricity that powers these systems.

One example from the talk illustrates how deeply embedded these resource costs can be: generating a single email with GPT can require around 500 mL (half a liter) of water once the water used to generate the system's electricity is included.
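A rough reconstruction shows how a figure of that magnitude can arise. The sketch below is mine, not the talk's; it assumes an energy cost per long response, an on-site water-use effectiveness (WUE), and a water intensity for grid electricity, all illustrative values chosen to land in the reported range.

```python
# Rough reconstruction of a water-per-response estimate; all inputs assumed.
# Counts on-site cooling water (WUE) plus water embedded in electricity (EWIF).

ENERGY_PER_RESPONSE_KWH = 0.14   # assumed energy for one long generation
WUE_L_PER_KWH = 0.5              # assumed on-site water use effectiveness
EWIF_L_PER_KWH = 3.1             # assumed water intensity of grid electricity

water_l = ENERGY_PER_RESPONSE_KWH * (WUE_L_PER_KWH + EWIF_L_PER_KWH)
print(f"Estimated water per response: {water_l * 1000:.0f} mL")  # ~500 mL
```

Notice that the indirect term dominates: most of the water footprint hides upstream in electricity generation rather than in the datacenter's own cooling towers.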

Clearly, AI infrastructure is not just a digital phenomenon. It's a physical system with environmental consequences.

The carbon footprint of AI also operates on two levels. There is embodied carbon, produced during the manufacturing of hardware such as processors and memory, and there is operational carbon, generated by the electricity used to power running systems.

Operational carbon currently dominates the footprint of most AI systems, though both components are growing as demand for computing increases.
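A minimal lifecycle sketch illustrates the split. All values below are assumptions for a single hypothetical accelerator, not numbers from the talk: a manufacturing footprint, a service life, an average power draw, and a grid carbon intensity.

```python
# Lifecycle carbon sketch for one hypothetical accelerator; all inputs assumed.

EMBODIED_KG_CO2 = 150         # assumed manufacturing (embodied) footprint
LIFETIME_YEARS = 4            # assumed service life
AVG_POWER_KW = 0.5            # assumed average power draw
GRID_KG_CO2_PER_KWH = 0.35    # assumed grid carbon intensity

hours = LIFETIME_YEARS * 365 * 24
operational = AVG_POWER_KW * hours * GRID_KG_CO2_PER_KWH
total = EMBODIED_KG_CO2 + operational

print(f"Embodied:    {EMBODIED_KG_CO2:8,.0f} kg CO2e ({EMBODIED_KG_CO2 / total:.0%})")
print(f"Operational: {operational:8,.0f} kg CO2e ({operational / total:.0%})")
```

On a largely decarbonized grid, the operational term shrinks and embodied carbon claims a growing share of the total, a tradeoff the research agenda below returns to.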

The Sustainability Tradeoffs

Designing sustainable computing infrastructure is a problem filled with tradeoffs.

Reducing water consumption increases electricity demand. Increasing reliance on renewable energy introduces variability in power supply. Batteries can help smooth fluctuations in renewable energy production, but they introduce additional material and infrastructure costs.

Even the pursuit of efficiency can produce unintended consequences. This effect is described as Jevons Paradox: improvements in efficiency can reduce the cost of using a resource, which in turn leads to greater overall consumption.

Computing appears to follow a similar pattern. As systems become more and more efficient, the total demand for computation continues to rise.
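A toy demand model shows how the arithmetic of Jevons Paradox works. The elasticity value below is an assumption, not an empirical estimate: whenever demand grows faster than cost falls (elasticity greater than 1), total consumption rises even as each unit of computation gets cheaper.

```python
# Toy Jevons Paradox model; the elasticity value is an assumption.
# Energy use = demand / efficiency, where demand responds to the unit cost.

def total_energy(efficiency_gain, elasticity=1.5):
    unit_cost = 1.0 / efficiency_gain                  # efficiency lowers cost
    demand = (1.0 / unit_cost) ** elasticity           # elastic demand response
    return demand / efficiency_gain                    # net resource use

for gain in (1.0, 2.0, 4.0):
    print(f"{gain:.0f}x efficiency -> {total_energy(gain):.2f}x total energy")
```

With this assumed elasticity, a 4x efficiency gain doubles total energy use rather than cutting it, which is exactly the pattern the paradox describes.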

Powering the Future of AI

One possible solution frequently discussed in technology circles is nuclear energy. Scaled nuclear power could theoretically provide stable, carbon-free electricity for datacenters operating around the clock.

However, the path toward widespread nuclear deployment is complicated.

Many startups are developing small modular reactors, but the ecosystem remains fragmented. Scaling nuclear energy would likely require coordinated government policy, regulatory frameworks, and long-term planning around nuclear waste management. In other words, the problem is not purely technological. It is also political and institutional.

Another idea that surfaces periodically is whether energy used in computation can somehow be reused in the same way water is recycled in many industrial systems.

One approach involves adiabatic circuits, which attempt to recover some of the energy used during computation rather than dissipating it as heat. In principle, this can improve energy efficiency. In practice, however, such circuits operate more slowly than conventional designs, which limits their usefulness for high-performance computing systems.
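The speed-versus-savings tradeoff can be seen in the textbook charging model, which is my gloss rather than a detail from the talk. Conventionally charging a node capacitance C to voltage V dissipates about ½CV² in the switch resistance, while ramping the supply slowly over a time T much longer than RC dissipates roughly (RC/T)·CV². The component values below are assumed; the point is only that the savings grow as the circuit slows down.

```python
# Adiabatic vs. conventional charging of a node capacitance; assumed values.
# Conventional switching dissipates ~0.5*C*V^2 per charge; ramping the supply
# over a time T >> R*C dissipates roughly (R*C/T) * C * V^2 instead.

R, C, V = 1e3, 1e-15, 1.0      # assumed resistance (ohm), capacitance (F), volts

conventional = 0.5 * C * V**2
for t_over_rc in (10, 100, 1000):
    T = t_over_rc * R * C
    adiabatic = (R * C / T) * C * V**2
    print(f"T = {t_over_rc:>4} RC: adiabatic energy = "
          f"{adiabatic / conventional:.3f}x conventional")
```

Cutting dissipation by 10x requires charging roughly 10x more slowly, which is why the technique struggles in high-performance systems.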

The result is a recurring theme in infrastructure engineering: theoretical efficiency does not always translate into practical scalability.

A Research Agenda for Sustainable Computing

Looking further ahead, Dr. Lee shared a research agenda for making computing sustainable at scale. One of the central challenges is balancing embodied carbon with operational carbon.

Efforts to reduce one often increase the other. For example, building additional renewable infrastructure or deploying new hardware can reduce operational emissions, but doing so also increases the embodied carbon associated with manufacturing and construction. Designing sustainable systems therefore requires thinking about the full lifecycle of computing infrastructure.

Dr. Lee outlined four areas where progress will be necessary over the coming years: measuring the lifecycle impact of computing technologies, designing carbon-efficient hardware and software, managing datacenters to optimize energy and water use, and educating the next generation of engineers and policymakers who will ultimately shape these systems.

The Infrastructure Era of AI

The talk reinforced an idea that is becoming increasingly clear as AI systems scale.

The future of AI will not be determined solely by advances in machine learning algorithms. It will also be shaped by the physical systems that support computation: datacenters, energy grids, cooling technologies, and the policies that govern them.

The next phase of AI innovation may depend just as much on electrical engineering, infrastructure planning, and environmental policy as it does on advances in neural networks.

Understanding those systems will be essential for anyone trying to understand where AI is actually going.