Artificial intelligence stakeholders are going all out to make the technology profitable as China raises the stakes, cutting model token prices to meet surging global demand and positioning itself as an AI cost-control center, with the HKSAR acting as a ‘bridgehead’. Luo Weiteng reports.

Editor’s note: A new compute-driven economy is taking shape as AI scales up. This series examines how China is building across the stack and how Hong Kong SAR connects it to global markets.
Not long after global investors fretted over fears of an artificial intelligence job apocalypse and huge tech disruptions, chipmaker boss Jensen Huang called for Wall Street to focus on a “new economic reality”.
“Compute equals revenues,” he repeated the mantra four times during the February earnings call for Nvidia’s best quarter in history.
Co-founder of the first enterprise ever to surpass a market capitalization of $5 trillion, a milestone reached in October, Huang, donning his signature leather jacket, has been intensely promoting what could become an iron rule for all companies in the realm of machine-generated intelligence. China, at the same time, is leveraging its massive reserves of cheap electricity and computing to process and “export” tokens — the basic unit of AI processing — as a much-awaited sustainable path to AI monetization.
Amid a growing crunch in computational power, monetization-hungry AI players worldwide are yearning for an “inflection point” to validate stretched stock valuations and exorbitant capital spending, and to put the AI boom on a sustainable footing. This month, AI safety and research company Anthropic blocked Claude subscriptions from powering third-party agent tools, such as OpenClaw, whose meteoric rise triggered a huge increase in token demand and posed a critical question: Who will keep paying for all this compute, and at what price?
Weekly data from OpenRouter — one of the most widely used platforms where global developers access AI models through a unified application programming interface (API) — showed that among the 10 most-used models by token consumption, Chinese models have overtaken their United States peers in the past month, with 37 trillion tokens consumed against 12 trillion. According to Internet Data Centers, token rates for mainstream Chinese models are nearly one-sixth to one-tenth of those charged by comparable international rivals.
At the root of the pricing disparity is China’s lower costs for energy and computing infrastructure. Electricity generated in the country’s western regions — from solar installations in the Gobi Desert to wind farms across its grasslands — is channeled through an extensive grid into large-scale computing clusters. There, it is converted into tokens and transmitted globally to meet surging demand.
Yet, this is far from a simple tale of affordable Chinese alternatives undercutting premium US services. China is quietly positioning itself across the value chain — from energy and compute to models and output. These dynamics are not reserved for deep-pocketed tech giants and AI laboratories. In the Hong Kong Special Administrative Region, some nimble, mid-market challengers are already riding the wind.
In what McKinsey & Company hailed as a “generative AI breakout year”, Hong Kong Nanzhuo Data Technology (NZData) was founded in Guangzhou, Guangdong province, in 2023, and subsequently set up shop in the SAR.
Bryan Fan, general manager of NZData, has seen the company’s core business — designing and operating tailor-made AI solutions for clients, ranging from filmmakers to traditional Chinese medicine enterprises — generate 80 percent of revenues. Yet its “big bet over the next three to five years” is on something more fluid — turning underutilized compute into a shared resource that can be effectively deployed, easily accessed, and eventually delivered across borders from Hong Kong.
Wasted compute has become a silent tax on AI worldwide. Despite the billions poured into AI infrastructure, most graphics processing units (GPUs) sit idle or underused for much of their lifecycle. In-house GPU clusters remain idle 40 percent of the time on average, according to Shenzhen-based cloud service provider SeaArea.
Arvin Luo, marketing manager of NZData, calls their centralized platform an “orchestrator” seeking to pool isolated pockets of compute scattered across data centers into meaningful, ready-to-allocate resources.
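At its simplest, such an “orchestrator” registers spare GPU capacity across sites and fills allocation requests from whichever nodes have headroom. The sketch below is purely illustrative — the node names, greedy allocation strategy and rollback logic are assumptions for clarity, not a description of NZData’s actual platform.

```python
# Minimal sketch of a compute "orchestrator": pools idle GPU capacity
# from several data centers and allocates it to incoming jobs.
# Entirely illustrative; not NZData's actual system.
from dataclasses import dataclass, field

@dataclass
class Node:
    name: str
    total_gpus: int
    free_gpus: int

@dataclass
class Pool:
    nodes: list = field(default_factory=list)

    def register(self, node: Node) -> None:
        self.nodes.append(node)

    def allocate(self, gpus_needed: int):
        """Greedily fill the request from nodes with spare GPUs;
        roll back and return None if the pool cannot cover it."""
        grant = []
        for node in self.nodes:
            if gpus_needed == 0:
                break
            take = min(node.free_gpus, gpus_needed)
            if take:
                node.free_gpus -= take
                gpus_needed -= take
                grant.append((node.name, take))
        if gpus_needed:  # partial failure: restore what was taken
            for name, take in grant:
                next(n for n in self.nodes if n.name == name).free_gpus += take
            return None
        return grant

pool = Pool()
pool.register(Node("guizhou-dc", total_gpus=8, free_gpus=3))
pool.register(Node("inner-mongolia-dc", total_gpus=16, free_gpus=5))
print(pool.allocate(6))   # request spans both sites
print(pool.allocate(10))  # exceeds remaining capacity -> None
```

The point of the sketch is the pooling itself: a request larger than any single site’s spare capacity can still be served by stitching together idle slices, which is the “meaningful, ready-to-allocate resources” idea in miniature.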
Such a strategy echoes a deliberate, top-down architectural shift, with Chinese policymakers urging a unified national computing network, larger intelligent computing clusters and tighter coordination in how computing power is built and used.
National compute synergy
This year’s Government Work Report placed “compute-electricity synergy” as a national priority aimed at aligning data center deployment with energy resources and addressing geographic mismatches between supply and demand through initiatives like west-to-east power transmission and the “East Data, West Computing” project.
“The nation-spanning computing network is already taking shape,” notes Fan. “Our platform is designed to plug into the system to draw on a more robust and versatile array of resources. Without it, we would have to navigate a patchwork of standards, interfaces and nodes across different providers.”
He sees a standardized, interoperable network as a supercharger for moving compute across regions and taking that capacity overseas in a more efficient manner.
For Michael Ho, Hong Kong-based partner and the head of digital assets platform Asia Pacific at Oliver Wyman, efficiency explains how the new economics of compute and tokens work in China.
“The country’s AI developers have prioritized efficiency to make the most of less powerful hardware than that of their US peers, who have broader access to a cutting-edge, power-hungry AI chip lineup,” he says.
Such production efficiency shows up in more competitive output per watt — the metric Nvidia’s Huang sees as “the single most important thing for the top line of companies”. On the other side of the equation is the cost per million tokens, driven primarily by compute assets and the electricity needed to run data centers — a combination Ho believes has continued to tilt the advantage towards China.
According to Ho, the electricity price for data centers in China can be as low as 0.2 to 0.3 yuan ($0.044) per kilowatt-hour, or roughly a quarter to a third of rates in the US and Europe. Even at this global price trough, Luo estimates that energy costs alone still account for nearly 40 percent of the total compute bill in China.
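To see how these two figures combine into a cost per million tokens, a back-of-envelope calculation helps. The electricity price and the 40 percent energy share come from the article; the per-accelerator power draw, inference throughput and the 1.0 yuan/kWh Western benchmark are hypothetical round numbers chosen only to illustrate the arithmetic, not real operating data.

```python
# Back-of-envelope estimate of total compute cost per million tokens,
# scaling the electricity bill up by its share of the overall bill.
# Grounded figures: ~0.25 yuan/kWh power, energy ~40% of total cost.
# Hypothetical figures: GPU power draw, throughput, Western power price.

ELECTRICITY_YUAN_PER_KWH = 0.25   # article: 0.2-0.3 yuan/kWh in China
ENERGY_SHARE_OF_BILL = 0.40       # article: energy ~40% of compute bill
GPU_POWER_KW = 0.7                # hypothetical: ~700 W per accelerator
TOKENS_PER_SECOND = 2_000         # hypothetical inference throughput

def cost_per_million_tokens(price_kwh, power_kw, tokens_per_s, energy_share):
    """Estimated total cost (yuan) to serve one million tokens."""
    seconds = 1_000_000 / tokens_per_s          # time to emit 1M tokens
    energy_kwh = power_kw * seconds / 3600      # electricity consumed
    electricity_cost = energy_kwh * price_kwh
    return electricity_cost / energy_share      # gross up to total bill

cn = cost_per_million_tokens(ELECTRICITY_YUAN_PER_KWH, GPU_POWER_KW,
                             TOKENS_PER_SECOND, ENERGY_SHARE_OF_BILL)
us = cost_per_million_tokens(1.0, GPU_POWER_KW,   # ~4x the Chinese rate
                             TOKENS_PER_SECOND, ENERGY_SHARE_OF_BILL)
print(f"China: ~{cn:.3f} yuan per million tokens")
print(f"US/EU: ~{us:.3f} yuan per million tokens")
```

Because the model is linear in the power price, a fourfold electricity discount flows straight through to a fourfold cost advantage per million tokens — the mechanism behind the pricing gap described above, whatever the true throughput figures are.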
NZData sources the bulk of its compute from the Chinese mainland, tapping into the vast renewable energy reserves of the northwest. The mathematics is straightforward. “For Hong Kong clients, connecting to mainland data centers over local installations could slash costs by as much as 60 percent,” says Fan.
The abundance of green energy harvested from the renewable-rich hinterlands not only powers high-density clusters in Guizhou province and the Inner Mongolia autonomous region capable of handling massive training and inference workloads, but also serves as a primary hedge against the spiraling costs of making AI. This positions the mainland as NZData’s compute base and cost-control center, says Luo.
What makes Hong Kong the logical counterpart? Luo describes the city as a “calculated bridgehead” between Chinese supply and global demand. The SAR’s primary allure is its infrastructure, especially its status as the Asia-Pacific’s premier network exchange center.
“With over 30 international undersea fiber optic cables linking directly to major global economies, Hong Kong offers the top-tier bandwidth and ultra-low latency required for international scale,” he says.
Equally important is Hong Kong’s regulatory positioning. Under “one country, two systems”, the SAR imposes fewer constraints on cross-border data flows and aligns more closely with international standards, including the European Union’s General Data Protection Regulation. Hong Kong’s Personal Data (Privacy) Ordinance is one of Asia’s longest-standing comprehensive data protection laws, making the city more compatible with the requirements of multinational enterprises.
“We are essentially piping the mainland’s ‘digital crude’ through Hong Kong’s high-speed, compliant hub to a hungry global market,” says Luo.
To Fan, Hong Kong’s sophisticated legal and arbitration regime is more of a structural necessity for navigating the geopolitical complexities of compute export.
Currently, the appetite for mainland-sourced computing is primarily confined to Chinese enterprises expanding via the SAR, catering to their specific needs to keep data onshore. But for the overseas market, routing computing powered by Chinese infrastructure would often require data to flow through China. In practice, Fan points out, most of their clients opt for local alternatives to satisfy data sovereignty rules and internal compliance constraints.
Economics is not the only defining factor, says Ho. Data governance rules, AI security systems and local vendor ecosystems make it hard for China’s cost-effective compute to become a regional utility story.
Overseas expansion paths
Citing the growing share of Chinese models on OpenRouter, Ho believes the figures are telling, but represent only a subset of the market. The platform’s users, typically individual developers and small firms (with Chinese users comprising a single-digit share), tend to be more price-sensitive and less concerned about how data moves across borders. For them, Chinese tokens offer the world’s best “bang for the buck”. Martin Casado, a general partner at Silicon Valley venture capital firm Andreessen Horowitz, estimates that 16 to 24 percent of startup pitches are running Chinese open-source models.
Up to 90 percent of global token consumption, however, comes from larger corporate users operating within closed cloud ecosystems. In this high-stakes enterprise arena, US models maintain a lead, Ho notes.
As AI becomes critical infrastructure, compliance demands tend to rise, not fall, warns Victoria Mio, head of Greater China equities at Janus Henderson Investors.
Yet, regulatory hurdles are not closing off China AI’s overseas expansion. They are reshaping the map. Each pathway, says Mio, is calibrated to balance “where China has defensible advantages” and “where regulatory friction is manageable”.
This recalibration comes as the industry itself evolves. Deloitte says AI today is no longer just the massive, capital-intensive undertaking of training bigger models. It is about real-time, latency-sensitive inference at scale, with production systems, multi-step reasoning and autonomous agents on track to account for approximately two-thirds of all AI compute in 2026.
“The real battleground this year is the industrialization of inference — costs, throughput, utilization and infrastructure bottlenecks,” says Mio.
API-based token export, therefore, stands as “the clearest pathway”. Beyond that, building regional compute centers in Southeast Asia offers another viable route. Singapore has emerged as an interconnection hub, while Malaysia’s Johor state has rapidly scaled up as an overflow market.
From data centers, engineering, procurement and construction services to power and cooling systems, and network connectivity, Mio frames this route as “China’s AI supply chain internationalizing into the nearest region where demand is exploding and constraints are lower, aligning with both data residency and economics”.
In the past year, Fan has seen demand growing firsthand, with Southeast Asian economies racing to scale up compute infrastructure. NZData is targeting Singapore as a foothold.
Further up the stack, Mio points to a pathway linking token growth with content creation and agentic enterprise automation — categories where customers care more about results than where the model comes from, and have proven to monetize well internationally.
For instance, Chinese tech giant ByteDance’s cloud unit Volcano Engine recently launched API services for public access to its most advanced video generation model, Seedance 2.0, whose output quality has thrilled global early adopters, with costs pared back to roughly one yuan per second of generated video.
For Mio, the “often underappreciated pathway” lies in “picks-and-shovels” exports. High-density data-center design, advanced cooling, power management, optical interconnects and systems integration — these infrastructure, connectivity and efficiency technologies “remain investable and exportable” even when model-level geopolitics become fraught, she says.
“The biggest misreading in China’s AI ‘going global’ narrative is that technology can simply travel abroad, overlooking geopolitical fractures,” says Pang Ming, distinguished senior research fellow of the National Institution for Finance and Development. “What is underestimated, instead, is the export of operational know-how and algorithmic services under localized compute deployment.”
“Even when hardware sits outside China, control over the system and the business loop can still be with Chinese companies — a form of technology export in its own right,” says Pang. “It’s time to rethink ‘going global’ not as the movement of physical assets, but as the expansion of systems and standards.”
Contact the writer at sophialuo@chinadailyhk.com
