In the first article of this series, I focused on the “classic” SaaS model: ARR, ARPU, churn, CAC, LTV and CAC payback. The conclusion was simple: SaaS is still viable in 2026, but only when the underlying economics are treated with discipline.
AI-based SaaS products add a new layer of complexity on top of that picture. On the surface, they look similar: subscription pricing, recurring revenue, a cloud-based product. Under the hood, however, the cost structure is very different. Every user action that triggers a large language model (LLM) has a direct, variable cost. And those costs are far from trivial at scale.
In this article, I will unpack the hidden economics of AI SaaS: how tokens translate into money, how this affects ARR, ARPU and margins, and why many apparently attractive AI products are more fragile than they look.
1. How AI SaaS Really Makes (and Spends) Money
Let us start from the basics. A typical AI SaaS product has the following layers:
- Interface and UX: the app the user sees (editor, dashboard, chatbot, plugin, etc.).
- Orchestration and prompt logic: how inputs are transformed into prompts, tools and calls to different models.
- Model layer: calls to LLMs (e.g. GPT-style models) and possibly other models (vision, embeddings).
- Data and storage: documents, vectors, logs, user settings, analytics.
- Security, governance and compliance: identity, permissions, audit trails.
Revenue still comes mostly from subscriptions:
- Monthly / annual plans per user or per seat.
- Sometimes usage-based components (credits, generations, tokens).
The crucial difference is in COGS (Cost of Goods Sold). In a classic SaaS, COGS is dominated by infrastructure and some third-party services. In an AI SaaS, COGS includes:
- API calls to LLM providers (or the cost of running your own models).
- Additional infrastructure and storage to support embeddings and context-heavy workloads.
- Sometimes GPU usage for fine-tuning or custom models.
Every serious founder of an AI SaaS needs to be able to answer a simple question:
How much does an average active user cost me in LLM usage per month?
If that number is vague, the business is flying blind.
2. From Prompt to Invoice: How Tokens Become a Cost Line
Most LLM providers bill based on tokens, a rough unit of text. An AI SaaS typically incurs token costs in at least three areas:
- Input tokens: the user’s content + system prompts + context.
- Output tokens: the generated response.
- Embeddings / retrieval: when documents are embedded and queried.
2.1 A simple example: generating a document or slide deck
Imagine an AI app that helps users generate drafts of documents or slide decks. A typical interaction might involve:
- User provides a brief or uploads some content.
- The app adds system instructions and history.
- The LLM generates several sections, then revisions.
Roughly speaking (figures just for illustration):
- Total input + output per “generation”: ~5,000–7,000 tokens
- A heavy user might perform 50–100 such generations per month
- A light user might only do 5–10
That means monthly token usage per user can easily span an order of magnitude:
- Low usage: 30,000–50,000 tokens
- Medium usage: 200,000–300,000 tokens
- High usage: 500,000+ tokens
If your pricing and plans do not reflect this variance, you may have a problem.
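The arithmetic from tokens to dollars is simple enough to sketch in a few lines of Python. This is a minimal illustration using the figures above; the function name and the per-million-token price are my own illustrative choices, not any provider's actual rates.

```python
def monthly_llm_cost(tokens_per_generation, generations_per_month, price_per_million_tokens):
    """Estimate a user's monthly LLM cost (USD) from usage volume and model pricing."""
    tokens = tokens_per_generation * generations_per_month
    return tokens / 1_000_000 * price_per_million_tokens

# A heavy user: ~6,000 tokens per generation, 100 generations per month,
# at an illustrative 1.00 USD per million tokens
print(monthly_llm_cost(6_000, 100, 1.00))  # 0.6 (USD/month)
```

Running this for your own plans, with real provider pricing, is the fastest way to see whether your heaviest usage profile is priced in.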
3. An AI SaaS Example: Revenue vs LLM Costs
Let us build a simplified model for an AI SaaS product that looks, conceptually, like a “smart document” or “AI presentation” tool.
3.1 Base assumptions
We assume:
- ARPU: 20 USD/month (individual or SMB plan).
- Paid users: 5,000 in Year 1, with moderate growth afterwards.
- LLM pricing: we will not rely on a specific provider or model; instead, we consider three cost scenarios:
  - Cheap model: 0.20 USD per million tokens (e.g., distilled or smaller models).
  - Mid-range model: 1.00 USD per million tokens.
  - Premium model: 5.00 USD per million tokens.
For token usage per user per month, we define three profiles:
- Light: 50,000 tokens
- Average: 200,000 tokens
- Heavy: 600,000 tokens
In practice, your user base will be a mix of these.
3.2 LLM cost per active user
We can now estimate a per-user monthly LLM cost under different scenarios.
Table 1 – LLM cost per user per month (USD)

| Model cost (USD / 1M tokens) | Usage profile (tokens / month) | Tokens in millions | Cost per user / month (USD) |
|---|---|---|---|
| 0.20 | 50,000 (Light) | 0.05 | 0.01 |
| 0.20 | 200,000 (Average) | 0.20 | 0.04 |
| 0.20 | 600,000 (Heavy) | 0.60 | 0.12 |
| 1.00 | 50,000 (Light) | 0.05 | 0.05 |
| 1.00 | 200,000 (Average) | 0.20 | 0.20 |
| 1.00 | 600,000 (Heavy) | 0.60 | 0.60 |
| 5.00 | 50,000 (Light) | 0.05 | 0.25 |
| 5.00 | 200,000 (Average) | 0.20 | 1.00 |
| 5.00 | 600,000 (Heavy) | 0.60 | 3.00 |
At first glance, these amounts may look small. But they are pure variable cost per user. At scale, they become material.
If you charge 20 USD/month per user:
- Under a cheap-model scenario, LLM cost is negligible compared to ARPU.
- Under a premium-model scenario with heavy users, LLM cost alone can eat 15% of ARPU (3 USD out of 20) before infrastructure, support or anything else.
If you are selling to enterprises at 40–60 USD per user, you have more room. If your price point is closer to 10 USD or less, you have less.
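The figures in Table 1 can be reproduced with a short loop, which also makes it easy to plug in your own provider's prices and your own usage profiles. The prices and profiles below are the illustrative ones from this section, not real provider rates.

```python
# Reproduce Table 1: LLM cost per user per month under each scenario.
model_prices = {"cheap": 0.20, "mid": 1.00, "premium": 5.00}              # USD per 1M tokens (illustrative)
usage_profiles = {"light": 50_000, "average": 200_000, "heavy": 600_000}  # tokens per user per month

for model, price in model_prices.items():
    for profile, tokens in usage_profiles.items():
        cost = tokens / 1_000_000 * price
        print(f"{model:8s} {profile:8s} {cost:5.2f} USD/user/month")
```

Replacing the two dictionaries with measured values from your own billing and telemetry turns this from an illustration into a live cost model.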
4. Gross Margin in AI SaaS: The Real Picture
In an AI SaaS, COGS includes at least:
- LLM/API costs.
- Cloud infrastructure (compute, storage, networking).
- Third-party services tightly tied to the product.
Let us rebuild a simplified margin picture using the example above.
4.1 Base revenue and cost per user
Assume:
- Subscription price (ARPU): 20 USD/month.
- Non-LLM infra and other variable costs: 1.50 USD/user/month.
- We compare three LLM cost regimes: cheap, mid, premium.
- We assume an “average” user: ~200,000 tokens/month.
Table 2 – Estimated monthly margin per user (USD)

| Scenario | ARPU | LLM cost | Other variable costs | Total COGS | Gross margin (USD) | Gross margin (%) |
|---|---|---|---|---|---|---|
| Cheap model (0.20/M) | 20.00 | 0.04 | 1.50 | 1.54 | 18.46 | 92.3% |
| Mid model (1.00/M) | 20.00 | 0.20 | 1.50 | 1.70 | 18.30 | 91.5% |
| Premium model (5.00/M) | 20.00 | 1.00 | 1.50 | 2.50 | 17.50 | 87.5% |
In this example, margins look very comfortable. However, three important caveats:
- Heavy users may consume many more tokens than the “average”, lifting LLM costs to 2–3 USD/user/month or more.
- We have assumed non-LLM infra and support are well controlled; in reality, these can grow quickly.
- If your pricing is aggressive (e.g. 8–12 USD/month) and your LLM cost per user creeps upward, the margin picture can deteriorate fast.
The point is not that AI SaaS is doomed. On the contrary, if you manage usage and model choice carefully, you can still operate with gross margins above 80–85%. The risk lies in ignoring these dynamics.
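A small gross-margin helper makes it easy to re-run the Table 2 calculations under your own assumptions. The numbers in the example are the illustrative ones from this section.

```python
def gross_margin(arpu, llm_cost, other_variable_costs):
    """Return (gross margin in USD, gross margin in %) per user per month."""
    cogs = llm_cost + other_variable_costs
    margin_usd = arpu - cogs
    return margin_usd, margin_usd / arpu * 100

# Premium-model scenario from Table 2: ARPU 20, LLM cost 1.00, other variable costs 1.50
usd, pct = gross_margin(20.0, 1.00, 1.50)
print(f"{usd:.2f} USD, {pct:.1f}%")  # 17.50 USD, 87.5%
```

Feeding in the heavy-user LLM costs from Table 1 instead of the "average" figure is a quick way to stress-test the comfortable-looking margins above.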
5. ARR, ARPU, CAC and AI Costs: Combining the Two Worlds
The next step is to integrate AI-specific costs into the SaaS economics we discussed in Article 1.
5.1 Revisiting ARR and ARPU
In an AI SaaS:
- ARR still reflects your recurring revenue base.
- ARPU is still central, but it now has a cost counterpart that scales more directly with usage.
Two questions become critical:
- Is your ARPU high enough to comfortably absorb your worst-case LLM costs per user plus all other variable costs?
- Is your pricing aligned with actual value and usage, or are your heavy users quietly eroding your margins?
For example, if:
- ARPU = 15 USD/month
- Heavy LLM usage = 3 USD/month
- Other variable costs = 2 USD/month
Then:
- Total COGS/user/month = 5 USD → gross profit = 10 USD → margin = 66.7%
That is no longer “classic SaaS margin territory”. It is more like a traditional software + service business. It may still be viable, but the economics are fundamentally different from a 90% gross margin SaaS.
5.2 CAC and CAC payback in AI SaaS
CAC in AI SaaS is still driven by:
- Paid acquisition
- Content and brand
- SDRs and outbound
- Partnerships
However, AI products often enjoy a strong initial pull (they are “hot”), which can mask deeper problems:
- High trial sign-ups with low conversion to paid.
- Short-term usage spikes followed by drop-off.
- Over-investment in features that impress, but are rarely used in practice.
If CAC per user is, for example, 150 USD, and your monthly gross profit per user is:
- 18 USD in a high-margin scenario → payback ≈ 8–9 months
- 10 USD in a lower-margin scenario → payback ≈ 15 months
Then pricing and usage patterns tied to AI costs directly affect your ability to fund growth.
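The payback figures above follow from a one-line calculation; sketching it as a function keeps the link between AI costs and growth funding explicit. The CAC and gross-profit values are the illustrative ones from this section.

```python
def cac_payback_months(cac, monthly_gross_profit):
    """Months of per-user gross profit needed to recover customer acquisition cost."""
    return cac / monthly_gross_profit

print(round(cac_payback_months(150, 18.30), 1))  # high-margin scenario: ~8.2 months
print(round(cac_payback_months(150, 10.00), 1))  # lower-margin scenario: 15.0 months
```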
6. Sensitivity Analysis: Where AI SaaS Can Go Wrong
It is useful to look at how a few parameters can change the economics of an AI SaaS.
Let us assume:
- ARPU = 20 USD/month.
- CAC per user = 150 USD.
- Non-LLM variable costs = 1.50 USD/month.
- Base token usage = 200,000 tokens/month/user.
- Mid model pricing = 1.00 USD per million tokens.
6.1 Base case
From earlier calculations:
- LLM cost ≈ 0.20 USD/month/user.
- Total COGS ≈ 1.70 USD/month.
- Gross profit ≈ 18.30 USD/month.
- CAC payback ≈ 150 / 18.30 ≈ 8.2 months.
6.2 What if usage doubles?
If average usage per user goes from 200,000 to 400,000 tokens/month (e.g. due to new features or more intensive workflows), LLM cost becomes:
- 0.40 USD/user/month instead of 0.20.
COGS ≈ 1.90 USD; gross profit ≈ 18.10 USD; CAC payback barely changes. That looks harmless.
However, if your heavy users triple or quadruple usage, or if your user base shifts toward heavy usage without a corresponding increase in ARPU, things change quickly.
6.3 What if the effective model cost doubles?
If the average cost per million tokens for your model goes from 1.00 to 2.00 USD—either because you switch to a more capable model or because you use a more complex orchestration—then:
- LLM cost per user/month (at 200,000 tokens) goes from 0.20 to 0.40 USD.
- Total COGS: 1.90 USD.
- Gross profit: 18.10 USD.
- Payback: 150 / 18.10 ≈ 8.3 months.
At these absolute levels, the change is still manageable. The real risk appears at lower ARPU and higher usage.
6.4 The dangerous zone: low ARPU and heavy usage
Consider a pricing model where:
- ARPU = 10 USD/month (typical for B2C or prosumer tools).
- Non-LLM variable costs = 1.50 USD.
- Heavy usage: 600,000 tokens/month.
- LLM cost: 0.60 USD/month at 1.00 USD per million tokens.
Then:
- Total COGS ≈ 2.10 USD.
- Gross profit ≈ 7.90 USD.
- Gross margin ≈ 79%.
On paper, that still looks respectable. But:
- Your gross profit per user is less than half of the previous example.
- If CAC per paying user is, say, 60–80 USD, CAC payback can easily drift beyond 9–10 months.
- If you ever need to move to a more expensive model or support more complex operations, margin can drop into the 60–70% range.
At that point, you are no longer playing the high-margin SaaS game. You are closer to a blended software + services model, with a cost structure that depends heavily on external providers.
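The scenarios in this section can be explored systematically with one small unit-economics function. This sketch compares the base case with the "dangerous zone" above; all figures, including the default cost parameters, are the illustrative ones used throughout this article.

```python
def unit_economics(arpu, tokens, price_per_m, other_costs=1.50, cac=150.0):
    """Per-user monthly economics: (gross profit USD, gross margin %, CAC payback months)."""
    llm_cost = tokens / 1_000_000 * price_per_m
    gross_profit = arpu - llm_cost - other_costs
    return gross_profit, gross_profit / arpu * 100, cac / gross_profit

# Base case vs the "dangerous zone": low ARPU with heavy usage (and a lower, B2C-style CAC)
for label, arpu, tokens, price, cac in [
    ("base",   20.0, 200_000, 1.00, 150.0),
    ("danger", 10.0, 600_000, 1.00, 75.0),
]:
    gp, margin, payback = unit_economics(arpu, tokens, price, cac=cac)
    print(f"{label}: {gp:.2f} USD profit, {margin:.1f}% margin, {payback:.1f} mo payback")
```

Sweeping `tokens` and `price_per_m` across plausible ranges is exactly the kind of regular sensitivity analysis argued for below.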
7. What Founders of AI SaaS Must Control From Day One
From my perspective, there are at least five things every AI SaaS founder—and every investor looking at these businesses—should insist on.
7.1 A clear view of token economics per user
- Average tokens per user per month, broken down by plan and segment.
- Distribution of usage (not just averages): what does the top 10% of users look like?
- Mapping of each major feature to its token cost.
7.2 Pricing aligned with value and cost
- Plans that differentiate features based on value and cost (not just number of projects or “AI calls”).
- Clear guardrails for “free tier”: which features are allowed, with what caps.
- Mechanisms to prevent a small number of users from consuming disproportionate resources at low price points.
7.3 A model strategy that avoids lock-in and cost shocks
- A realistic plan to combine cheaper models for routine tasks and more powerful models for premium actions.
- Awareness of the risk of being locked into a single provider or model with no negotiable alternatives.
- Experiments with model distillation or hybrid architectures when scale justifies it.
7.4 Integration of AI costs into core SaaS metrics
- Incorporating LLM and AI-related costs directly into COGS and margin analysis.
- Looking at ARR, ARPU, CAC, LTV and CAC payback with AI costs considered explicitly, not as an afterthought.
- Running regular sensitivity analyses: what happens if model prices change, usage increases, or ARPU must be adjusted?
7.5 A sober narrative towards investors and the market
- Avoiding the temptation to present AI as a magic margin expander when it is, in fact, a new cost centre.
- Framing AI as a value driver that justifies higher ARPU and better retention, not just as a shiny feature.
- Being explicit about both upside and risks: performance, costs, regulation, and competition.
8. Conclusion: AI Does Not Cancel Gravity
AI is powerful and, in many cases, genuinely transformative. But it does not cancel financial gravity. An AI SaaS is still a SaaS: it lives or dies on its economics.
If there is one message I would highlight for founders and investors, it is this:
The most interesting AI SaaS businesses in 2026 are not the ones with the most impressive demos, but the ones that can explain, in concrete terms, how tokens, models and usage translate into sustainable margins and healthy payback.
In the next article of this series, I will switch to the investor’s perspective:
- What do investors really look for in AI SaaS businesses?
- How do they read ARR, CAC and margins in this new context?
- And how should founders present their AI story so that it stands up to due diligence?
For now, the priority for anyone building or funding AI SaaS is clear: bring token economics and SaaS economics into the same conversation, and do it early.