The AI industry has a cost problem that nobody wants to talk about honestly.

Not training costs. Everyone loves talking about training costs. "GPT-4 cost $100 million to train!" "Gemini Ultra cost even more!" These numbers get thrown around at conferences and in pitch decks because they sound impressive and they're a one-time expense. You train the model, you're done.

The real money pit is inference. Every time you ask ChatGPT a question, every time Gemini summarizes an email, every time Copilot suggests a line of code — that's inference. And it happens billions of times per day, every single day, with the meter running.

I've spent the last month digging through annual reports, earnings calls, and infrastructure analyses from the big three AI spenders. Here's what the numbers actually look like.

## The Capex Explosion

Let's start with what these companies are spending to build the infrastructure.

Microsoft's capital expenditures hit $55.7 billion in fiscal year 2025, up from $44.5 billion the prior year. A significant chunk of that went to AI datacenter buildouts. CFO Amy Hood told analysts on their Q2 2025 earnings call that "over half of our capital expenditure is going toward long-lived assets that will serve us for 15 years or more" — meaning the buildings, the power infrastructure, the cooling systems. The rest goes to shorter-lived assets like the GPUs themselves.

Google's parent Alphabet spent $75.3 billion in capex in 2025, with CFO Anat Ashkenazi confirming that AI infrastructure represented the largest portion. They'd already committed $100 billion over the following years. For context, Alphabet's total revenue in 2025 was around $400 billion. They're spending roughly a fifth of their revenue on infrastructure.

Meta went from $28 billion in capex in 2024 to approximately $60-65 billion in 2025, with Mark Zuckerberg explicitly stating on an earnings call that the company was "investing aggressively in AI infrastructure."
Meta's unique position — they don't sell cloud services — means every dollar of that spend needs to be justified by improvements to their own products: better ad targeting, better content recommendation, better user engagement.

Add it all up and the three largest AI spenders alone are putting well over $200 billion per year into infrastructure. That's before you count Amazon, Oracle, and the dozens of smaller players.

## What a Datacenter Actually Costs

Let's break down what it actually takes to build and run an AI-optimized datacenter, because the numbers are wild.

**Land and construction.** A 100-megawatt datacenter — which is modest by current standards — costs roughly $1-2 billion to build from scratch. That includes the building, the electrical infrastructure, the cooling systems, and the security. Larger campuses being built by Microsoft and Google in places like Virginia, Texas, and the Nordics run 500MW to 1GW and cost $5-10 billion or more.

**Power infrastructure.** This is where it gets interesting. AI datacenters are power-hungry in ways that traditional cloud datacenters never were. A single NVIDIA DGX H100 server (eight GPUs plus networking) draws about 10.2 kilowatts. A rack holding several such systems, with networking and storage, can pull 40-60kW. A traditional server rack pulls maybe 7-10kW. So AI racks consume 4-8x more power per rack than conventional compute.

SemiAnalysis estimated that global datacenter critical IT power demand would surge from 49 GW in 2023 to roughly 96 GW by 2026, with AI consuming about 40 GW of that. That's the equivalent power consumption of a mid-sized country. The Netherlands uses about 17 GW of peak electricity.

**Electricity costs.** The hyperscalers negotiate power purchase agreements (PPAs) that typically range from $0.03-0.06 per kilowatt-hour in favorable locations. At $0.04/kWh, a 100MW facility running at 80% utilization costs about $28 million per year just in electricity. A 1GW campus? $280 million per year. In electricity alone.
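That electricity arithmetic is simple enough to sketch directly. Here's a minimal calculator using the illustrative figures from this section — the capacity, utilization, and PPA rate are assumed inputs, not measured values:

```python
# Rough annual electricity cost for an AI datacenter.
# All inputs are illustrative figures, not measured data.

HOURS_PER_YEAR = 8760

def annual_power_cost(capacity_mw, utilization, price_per_kwh):
    """Annual electricity cost in dollars for a facility of the given
    IT capacity (MW), average utilization (0-1), and PPA rate ($/kWh)."""
    kwh = capacity_mw * 1000 * utilization * HOURS_PER_YEAR
    return kwh * price_per_kwh

# A 100 MW facility at 80% utilization and a $0.04/kWh PPA:
print(f"100 MW: ${annual_power_cost(100, 0.80, 0.04):,.0f}")   # ~$28 million/yr
# A 1 GW campus under the same assumptions:
print(f"1 GW:   ${annual_power_cost(1000, 0.80, 0.04):,.0f}")  # ~$280 million/yr
```

Note that this is IT load only; once you add cooling and power-conversion overhead (PUE), the grid bill is higher still.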
But here's the thing — power isn't always available. Microsoft signed a 20-year deal to restart a unit at the Three Mile Island nuclear plant in Pennsylvania. Google signed agreements with Kairos Power for small modular nuclear reactors. Amazon bought a nuclear-powered datacenter campus from Talen Energy. These companies are literally building their own power generation because the grid can't keep up.

**GPUs.** This is the line item that makes CFOs lose sleep. NVIDIA H100 GPUs cost roughly $25,000-30,000 each. A DGX H100 system (8 GPUs plus networking) runs about $200,000. The newer B200 and GB200 systems cost even more.

A 100MW AI cluster might contain 20,000-30,000 GPUs. At $25K each, that's $500-750 million in GPU costs alone. And GPUs depreciate. NVIDIA releases new architectures every 18-24 months that are 2-3x more efficient. Today's $30,000 H100 becomes tomorrow's discount hardware.

Microsoft alone is estimated to have deployed over 600,000 H100-class GPUs by the end of 2025. At $25K each, that's $15 billion in GPU costs — and that's before the B200 rollout.

**Cooling.** AI GPUs generate enormous heat. Traditional air cooling doesn't cut it at the power densities we're talking about. The industry is rapidly shifting to liquid cooling — either direct-to-chip liquid cooling or full immersion cooling. Retrofit costs to add liquid cooling to an existing datacenter run $2-5 million per megawatt. New builds design it in from the start, but the equipment and plumbing add 15-25% to construction costs.

**Staffing.** A large datacenter campus employs 50-200 full-time staff for operations, security, and maintenance. At fully loaded costs of $150-200K per employee per year, that's $10-40 million annually. Not huge relative to the other costs, but not nothing.

## The Training vs. Inference Split

Here's where the economics get really interesting, and where most of the public conversation misses the point.

Training a frontier model is expensive.
Estimates for GPT-4's training run range from $60-100 million. Gemini Ultra probably cost more. GPT-5 and its successors will cost hundreds of millions. These are big numbers. But training is a one-time cost. You train the model once. Maybe you do a few runs. Then you're done.

Inference is forever. OpenAI reportedly serves over 100 million weekly active users. Every query requires GPU compute. Every response needs to be generated token by token. And the trend is toward longer, more complex interactions — agentic workflows that might make dozens of API calls to complete a single task.

The cost of inference depends on the model size, the input length, the output length, and the hardware. But rough numbers: running a GPT-4-class model costs about $0.01-0.03 per query for a typical conversation turn. That might not sound like much, but multiply it by hundreds of millions of queries per day and you're burning through millions of dollars daily.

Microsoft's AI revenue in fiscal Q2 2025 was driven primarily by Azure OpenAI Service and Copilot subscriptions. But here's the dirty secret: the margins on AI inference are thin. Traditional Azure cloud services run at 60-70% gross margins. AI inference services? Much lower. When Satya Nadella says AI revenue is growing fast, the follow-up question nobody asks loudly enough is: "Yes, but at what margin?"

This is why every lab is racing to make inference cheaper. Mixture-of-experts architectures (like Mistral and DeepSeek use) route queries to smaller specialist subnetworks, using a fraction of the total parameters for each query. Speculative decoding generates candidate tokens in parallel. Quantization shrinks model weights from 16-bit to 8-bit or 4-bit precision. KV-cache optimization reduces redundant computation. Each technique shaves 20-50% off inference costs.

The inference cost curve is declining, but usage is growing faster.
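Of those techniques, quantization is the easiest to reason about on the back of an envelope: weight memory scales linearly with precision, and fewer GPUs pinned down holding weights means cheaper serving. A toy sketch — the 70B parameter count is a hypothetical model size, not any specific product:

```python
# Memory required just to hold model weights at different precisions.
# Illustrative only: real serving also needs KV-cache, activations,
# and headroom, so actual GPU counts are higher.

def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """GB needed to store n_params weights at the given precision."""
    return n_params * bits_per_weight / 8 / 1e9

params = 70e9  # hypothetical 70B-parameter dense model
for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(params, bits):.0f} GB")
# 16-bit: 140 GB, 8-bit: 70 GB, 4-bit: 35 GB. Each halving of precision
# halves the minimum memory footprint for the weights.
```

An H100 has 80 GB of memory, so at 16-bit this hypothetical model needs at least two GPUs just for its weights, while at 4-bit it fits on one — which is roughly where those 20-50% serving-cost reductions come from.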
Every time inference gets cheaper, companies find new ways to use it — longer context windows, multi-step reasoning, real-time processing. It's a treadmill.

## The Unit Economics Nobody Talks About

Let me walk through some rough unit economics that explain why AI companies are both growing fast and burning cash.

**GitHub Copilot.** Microsoft charges $10/month for individual Copilot and $19 per seat for business plans. Reports suggest the average user costs Microsoft $40-80/month in compute. Even $19/month in revenue against $40-80/month in costs means Microsoft is losing money on every single Copilot subscriber. They're doing it to build market share and drive Azure consumption. But it's a subsidy, not a business.

**ChatGPT Plus.** OpenAI charges $20/month. Heavy users who ask dozens of GPT-4 questions per day probably cost $30-60/month to serve. Light users cost much less. The average might break even or lose slightly. The $200/month Pro tier for power users probably has healthier margins, but the user base is much smaller.

**Google Gemini.** Google gives Gemini away for free on search results. Every AI Overview costs Google compute without generating direct additional revenue. The bet is that better search results keep users on Google and maintain ad revenue. But it's adding cost to a business that was already profitable without it.

This is the fundamental tension in AI economics right now. The companies building AI are, in many cases, spending more to serve it than they're earning from it. They're betting that costs will come down (they will), that willingness to pay will go up (maybe), and that AI will become so embedded in everything that it'll be impossible to unwind (probably).

## Where the Money Actually Goes

If you break down total AI infrastructure spending into buckets, it looks roughly like this:

**40-45%: Servers and GPUs.** The silicon is the biggest single expense. NVIDIA's data center revenue hit $115 billion in fiscal year 2025.
Almost all of that came from AI GPUs purchased by hyperscalers, cloud providers, and enterprises.

**25-30%: Construction and real estate.** Buildings, power substations, transformers, generators, fiber optic connections. This is the "long-lived assets" category Amy Hood mentioned. These costs are rising because of competition for suitable sites and supply chain constraints on electrical equipment. High-voltage transformers now have lead times of 18-36 months.

**15-20%: Power and electricity.** Operational electricity costs plus the capital cost of power purchase agreements, on-site generation, and grid connections. This share is growing as AI workloads consume more power per rack.

**5-10%: Cooling systems.** Liquid cooling infrastructure, chillers, heat exchangers. The shift from air cooling to liquid cooling is adding capital costs but improving efficiency.

**3-5%: Networking.** InfiniBand switches, high-speed optical transceivers, spine-leaf fabric. AI training clusters need all-to-all GPU communication at enormous bandwidth. NVIDIA's InfiniBand and Ethernet switches for AI are a multi-billion-dollar business on their own.

**2-3%: Operations and staffing.** People, maintenance, security, software tools.

## The Geography of Cost

Where you build matters enormously.

Northern Virginia's "Data Center Alley" hosts the largest concentration of datacenters on earth. But the region is hitting power limits. Dominion Energy, the local utility, has warned that it can't guarantee power availability for new large loads before 2030 in some areas. The bottleneck isn't building the datacenter. It's getting electricity to it.

This is pushing buildouts to new geographies. Iowa, where Facebook and Google both have massive campuses, offers cheap wind power and a cool climate. Texas offers abundant cheap power from its independent grid but brings risks — the 2021 winter storm reminded everyone that the Texas grid has its own problems.
The Nordic countries — Sweden, Finland, Norway, Iceland — offer cheap hydroelectric power and natural cooling that reduces energy costs by 30-40%.

Microsoft has been particularly aggressive in international expansion, with major AI datacenter projects in Australia, the UK, Japan, and Germany. Google has campuses in the Netherlands, Belgium, Finland, and Chile. The economics of each location depend on power costs, construction costs, proximity to users (for latency-sensitive inference), and the regulatory environment.

## The Sustainability Question

Here's something the AI industry would rather you not think about too carefully.

SemiAnalysis projects that datacenters will consume about 4.5% of global electricity generation by 2030, up from roughly 2% in 2023. AI is the primary driver of that increase.

The hyperscalers have all made carbon neutrality commitments. Google claims to have been carbon neutral since 2007 and has committed to running on 24/7 carbon-free energy by 2030. Microsoft has pledged to be carbon negative by 2030. But their AI buildouts are making these targets harder to hit. Google's 2024 environmental report showed that its total greenhouse gas emissions had increased 48% since 2019, largely due to datacenter energy consumption.

There's a real tension between "we need to build AI infrastructure as fast as possible" and "we need to reduce our carbon footprint." Nuclear power is the most honest answer — it's carbon-free and reliable — which is why we're seeing these deals with nuclear operators. But new nuclear capacity takes years to come online. In the short term, many AI datacenters are running on natural gas. In some cases, they're running on temporary gas turbine generators while waiting for grid connections.

The AI industry's dirty secret is that the "clean energy" narrative applies to the company's overall portfolio, not necessarily to the specific datacenter running your inference queries.
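To put those percentages in absolute terms, assume global generation of roughly 30,000 TWh per year — a round-number assumption on my part, not a figure from the SemiAnalysis report. The projected share increase then implies datacenter consumption more than doubling:

```python
# Translate the cited shares of global electricity into absolute energy.
# GLOBAL_GENERATION_TWH is an assumed round number, held constant for
# simplicity even though generation itself will grow by 2030.

GLOBAL_GENERATION_TWH = 30_000  # assumption: ~30,000 TWh generated per year

share_2023 = 0.02   # datacenters: roughly 2% of generation in 2023
share_2030 = 0.045  # projected: about 4.5% by 2030

twh_2023 = GLOBAL_GENERATION_TWH * share_2023   # ~600 TWh
twh_2030 = GLOBAL_GENERATION_TWH * share_2030   # ~1,350 TWh
print(f"2023: ~{twh_2023:,.0f} TWh -> 2030: ~{twh_2030:,.0f} TWh "
      f"({twh_2030 / twh_2023:.2f}x)")
```

Several hundred extra terawatt-hours per year is new demand on the scale of a large industrialized country, which is why the nuclear deals above exist.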
## What This Means

The economics of AI infrastructure are unsustainable at current margins. Something has to give.

The most likely outcome is that inference costs continue to drop — through hardware improvements (NVIDIA's next-gen chips, custom ASICs like Google's TPUs and Amazon's Trainium), software optimization (better inference engines, more efficient architectures), and scale (unit costs decrease as clusters get larger).

At the same time, prices will go up. The era of "free" or heavily subsidized AI is ending. OpenAI charges $20/month for ChatGPT Plus and has stacked a $200/month Pro tier on top of it. Microsoft is raising Copilot prices. Google will eventually have to charge for Gemini usage that currently rides free on Search.

The companies that survive the infrastructure cost squeeze will be the ones that can do three things: build efficient hardware (NVIDIA, Google TPU, Amazon Trainium), optimize inference (the entire MLOps ecosystem), and generate enough revenue per user to cover the cost of serving them.

Right now, nobody's figured out that last part. But they're spending $200+ billion per year betting that they will.