Token bill comes due: Industry scrambles to manage runaway AI costs


Across the industry, companies are beginning to balk at the price of AI. Uber blew through its entire 2026 AI development budget by April. Microsoft canceled its developers’ Claude Code licenses months after enabling them. A Priceline employee told TechCrunch that routine Cursor contract renewals have become four to five times more expensive.

Although per-token prices have fallen, the push for broader AI adoption and increasingly autonomous agents has driven token consumption ever higher. Companies that gorged themselves on “all-you-can-eat” subscriptions in early 2025 are now scrambling to understand where their money is going, pull back on spending, and see if they can salvage some ROI from the wreckage of their balance sheets.

At the same time, a market is forming to meet them there. Startups, established vendors and a new standards body are racing to provide companies with the tools and language to track what they spend.

“Six months ago, I was having a conversation with a client and it was all about, ‘What can it do? Is it good enough?’” Alexander Embrikus, president of the OpenAI Foundation, told TechCrunch at an event in New York City this week. “Our conversations are never about that now. The conversations now are about, ‘Hey, we’re spending a lot. What visibility do you have? What audit capability do you have? What token controls do you have? How efficient are your models?’”

Against this backdrop, the Linux Foundation this week unveiled plans for the Tokenomics Foundation, a new standards body that aims to bring the same cost discipline to AI tokens that FinOps brought to cloud spending.

“In April and May, I started hearing from companies: ‘Oh my God, we’re three times over our entire token budget for 2026, and we’re only in April,’” J.R. Storment, executive director of the FinOps Foundation, a project of the Linux Foundation, told TechCrunch. “We started hearing about existential crises, and the whole conversation shifted from tokenmaxxing and ‘hurry up’ to ‘We need guardrails, how can we control this?’”

The outcry across the tech world follows intense pressure from CEOs who pushed their teams to use the best models and move quickly, regardless of cost. New models released in November, such as Anthropic’s Claude Opus 4.5, OpenAI’s GPT-5.1, and Google’s Gemini 3 Pro, brought significant improvements to agentic tools, doubling consumption. That is how one company reportedly found itself with a $500 million Claude bill after it forgot to set usage limits for employees.

“It’s like a cocaine epidemic,” says Chris Reed, senior director of IT finance at Priceline, when asked about the pricing problem with AI. “They let you try it to get you addicted to it, and now you’re kind of indebted to it.”

Vitaly Gordon, CEO of engineering operations platform Faros AI, said he recently spoke to a CTO who told him: “One of my engineers spent $40,000 on tokens last month, and I really don’t know if I should stop him or if I should go and tell everyone to be like him.”

A March survey by Faros found that among 20,000 developers, output was rising, but so were bugs and rework. Likewise, Jellyfish, an engineering management platform, found that engineers who used the most tokens were nearly twice as productive as those who used AI the least, but spent 10 times as many tokens getting there.

Nicholas Arcolano, head of research at Jellyfish, told TechCrunch via email that AI spending is growing largely because of agentic features, with consumption per developer rising about 18.6 times in nine months. Taken together, these figures make the productivity picture murkier than the spending suggests.

“Whether or not the excessive spending pays off comes down to the ultimate business value of the charged tokens (i.e. revenue), which most companies still cannot measure,” Arcolano said.

At least part of the measurement problem is the sheer scale at which AI is being used today.

“Tracking cloud costs is a data problem of hundreds of millions of rows per month,” Storment said. “Tracking token costs is a data problem of trillions of rows per month. You can’t just paste that into any spreadsheet or even a basic tool. You have to fundamentally rethink your tools, specifications, and accounting systems to do that.”

At Priceline, Reed already sees discrepancies. He noted mismatches between vendors’ reported usage and Priceline’s internal data.

“I started my career managing telecom expenses, and I see all the same similarities, from telecom to cloud to AI,” he said. “Anytime you introduce something new, there are billing errors and opportunities for audit and improvement.”

A market is starting to form around this problem. There are pure-play companies, like Pay-i, that track, measure, and optimize the costs and performance of GenAI investments. Paid, meanwhile, lets developers track costs, measure usage, and bill users based on actual value rather than subscription fees.

Then there are companies like Jellyfish, Waydev, and Faros AI, all of which provide AI agent monitoring to prove the ROI of developer tools. Most of the 180 vendors within the FinOps Foundation are leaning into this area, Storment says.

Companies with existing distribution are also adding new features to take advantage of this new market. Ramp recently moved into AI spend management; Datadog and New Relic have taken on services like cloud cost management, token-level observability, and GPU monitoring. At next week’s FinOps

Tiffany Luck, partner at NEA, believes token efficiency and observability will likely be added at the “harness or application layer.” She pointed to Factory, a startup that makes enterprise AI agents, which this week launched a model router that automatically selects the right model for each task.

Gordon expects frontier labs and other model providers to adopt OpenRouter-style optimization to route queries to cheaper models, a trend already seen in enterprise cloud billing.

“The financial reporting of how much you spend on Anthropic, even if you call the model Opus, some of the spending will be on Sonnet or Haiku, because they are smart enough to do that,” Gordon said. “I think this is going to become more and more a thing.”

But all of these tools are created without a common language or common definitions of what a token costs, what it produces, and how to compare spending across vendors. This is where Tokenomics hopes to prove useful.

The foundation is building a basic definition and framework for the “token economy”; open standards, specifications, and metrics for AI token use and billing; and new metrics for AI economics, such as cost per intelligence or tokens per watt. It also plans to set metrics for token factory effectiveness and consumption efficiency. The group plans an official launch in July, and is about to announce more members at the FinOps

“Token economics are fundamentally more abstract and arcane than anything we have managed at this scale before,” Nishant Gupta, chief availability officer at Salesforce, said in a statement. “It requires a different operating force than the one the industry has built for the cloud.”

Goldman Sachs projects that global token usage will increase 24-fold by 2030. Businesses that are already over budget need solutions now, and the first enterprise product is still months away.

“We may have created a steam engine, but we still haven’t discovered the assembly line,” Gordon said.

According to Arcolano, the smart move is broad, moderate adoption.

“The best return on investment comes from moving the broad middle from low usage to moderate usage, not pushing heavy users up,” he said.

Russell Brandom and Tim Fernholz contributed to this report.
