A dramatic pricing realignment is reshaping the artificial intelligence inference market, with Chinese providers slashing prices to levels that appear unsustainable for many competitors. The catalyst is DeepSeek V4, whose aggressive pricing has opened a stark divide between North American and Chinese AI services, according to AI Weekly.

The scale of the disruption is remarkable. DeepSeek V4-Pro's cached input tokens now cost just $0.0036 per million tokens, roughly 15 to 30 times cheaper than comparable American offerings. This isn't merely a promotional discount. Rather, it reflects a fundamental repricing of what inference services should cost in markets where computational efficiency and scale economics drive down unit costs.
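To put those figures in concrete terms, the arithmetic can be sketched in a few lines of Python. This is only an illustration: the $0.0036 price and the 15x to 30x multiple come from the reporting above, while the function names and the assumption that the multiple applies to the same cached-input price category are hypothetical.

```python
# Price quoted in the article: USD per one million cached input tokens.
CACHED_INPUT_PRICE_PER_MILLION = 0.0036


def cost_usd(tokens: int, price_per_million: float) -> float:
    """Cost in USD of processing `tokens` tokens at a per-million-token price."""
    return tokens / 1_000_000 * price_per_million


def implied_comparison_range(
    price_per_million: float,
    low_multiple: float = 15,
    high_multiple: float = 30,
) -> tuple[float, float]:
    """Per-million price range implied for competitors by a 15x-30x gap.

    Assumption: the multiple compares like-for-like cached input pricing,
    which the article does not state explicitly.
    """
    return price_per_million * low_multiple, price_per_million * high_multiple


if __name__ == "__main__":
    billion_tokens = 1_000_000_000
    print(f"1B cached input tokens: ${cost_usd(billion_tokens, CACHED_INPUT_PRICE_PER_MILLION):.2f}")
    low, high = implied_comparison_range(CACHED_INPUT_PRICE_PER_MILLION)
    print(f"Implied comparable price: ${low:.3f} to ${high:.3f} per million tokens")
```

Under these assumptions, a billion cached input tokens would cost about $3.60, and the comparable American price would fall somewhere around $0.054 to $0.108 per million tokens.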

Forced Adaptation Across the Sector

The pressure extends well beyond DeepSeek itself. Xiaomi, the smartphone manufacturer that also operates cloud services, cut its token pricing by 99 percent to remain competitive. The move is notable because Xiaomi's Q1 net profit had already declined 43.1 percent, suggesting that aggressive repricing now functions as a structural necessity rather than a discretionary business choice.

According to AI Weekly, this pattern signals that Chinese providers have entered a new competitive phase in which pricing floors are set by technological efficiency gains rather than traditional profit margin requirements. Companies without superior model efficiency or infrastructure advantages face a hard choice: accept razor-thin margins or exit the market.

Alternative Business Models Emerge

Not all competitors are accepting the race to near-zero pricing. MiniMax, another significant player in the Chinese AI landscape, recently unveiled a hybrid billing structure that suggests the per-million-token pricing model may have reached its practical limit. The new approach aims to diversify revenue beyond simple inference compute, potentially through usage-based tiers, feature-based pricing, or differentiated service levels that keep operations profitable even as commodity token prices collapse.

This tactical shift reflects an industry-wide recognition: sustainable competition in AI inference requires differentiation beyond raw computational efficiency. Value-added services, specialized model fine-tuning, context length management, and integrated platform features may become the actual profit drivers while inference itself becomes a loss leader or break-even service.

Global Market Implications

The widening price gap between North American and Chinese providers creates distinct market segments. American enterprises may continue paying premium prices for domestically hosted services because of data residency requirements, latency preferences, or regulatory concerns. Meanwhile, startups and cost-conscious developers increasingly have an incentive to architect applications around Chinese inference services, regardless of where they are physically located.

This bifurcation could accelerate the decoupling of AI infrastructure along geopolitical lines. Rather than a single global AI compute market, two parallel ecosystems may crystallize, each with its own cost structure, performance characteristics, and regulatory environment.

The sustainability of these price points remains unclear. Whether Chinese providers can maintain $0.0036-per-million pricing over the long term while funding continued model development, infrastructure expansion, and competitive research depends on one question: are they buying market share at near-zero margins, or does their cost structure genuinely support these rates? For competitors watching their margins compress, the answer will determine whether adaptation remains possible or exit becomes inevitable.