StrategyAI News

Your 'Cheap' AI Just Got More Expensive: What Gemini 3.5 Flash's Pricing Means for Your Marketing Budget

Everyone told you AI would keep getting cheaper. Then Google launched a 'budget' model that costs three times more per token than the one it replaced. Here's what that actually means for how you plan spend.

TL;DR

Google's new Gemini 3.5 Flash is cheaper than its Pro model but noticeably pricier than the old Flash it replaces. That breaks the 'AI gets cheaper forever' assumption a lot of marketing budgets are quietly built on. The fix isn't panic. It's understanding where your AI costs actually live and planning for usage, not just subscriptions.

$1.50 / $9: Flash price per million tokens (input / output)
~3x: pricier than the old Flash
86.4%: marketing teams now using AI
19.2%: running AI agents end-to-end

At Google I/O 2026, Gemini 3.5 Flash launched at roughly $1.50 per million input tokens and $9 per million output tokens, several times the price of the previous Flash model. As more of your marketing work shifts to AI agents that run tasks in the background, your costs scale with usage, not with a flat monthly fee. This is a planning problem, and it's very solvable.

What actually happened at Google I/O

On 19 May 2026, Google launched Gemini 3.5 Flash at its I/O conference. [3] Flash is Google’s “fast and affordable” line, the model you’re meant to use for high-volume, everyday tasks. The interesting part wasn’t the speed. It was the price tag.

Gemini 3.5 Flash came in at roughly $1.50 per million input tokens and $9 per million output tokens. [1] Cheaper than Google’s premium Pro model, yes. But several times more expensive per token than the Flash model it replaced. [2] One headline summed up the mood: a faster model “with higher prices but no generational leap.” [5]

If “tokens” makes your eyes glaze over, here’s the plain version: a token is a chunk of text, roughly three-quarters of a word. You pay per chunk going in (your prompt) and per chunk coming out (the AI’s answer). The more you use AI, the more chunks you rack up. And Google just raised the per-chunk price on the model most people will use most.
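To make the per-chunk arithmetic concrete, here is a rough sketch in Python. The prices are the approximate launch figures quoted above. The function names and the 600-word prompt and 300-word answer are made up for illustration, and the three-quarters-of-a-word rule is only an approximation, so treat the result as a ballpark.

```python
# Approximate Gemini 3.5 Flash launch prices quoted in this article (USD per million tokens).
INPUT_PRICE_PER_M = 1.50
OUTPUT_PRICE_PER_M = 9.00

# Rule of thumb from the text: one token is roughly three-quarters of a word.
WORDS_PER_TOKEN = 0.75


def words_to_tokens(words: int) -> int:
    """Rough token estimate for a given word count."""
    return round(words / WORDS_PER_TOKEN)


def call_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD of one request: tokens going in plus tokens coming out."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Hypothetical example: a 600-word prompt and a 300-word answer.
prompt_tokens = words_to_tokens(600)
answer_tokens = words_to_tokens(300)
cost = call_cost(prompt_tokens, answer_tokens)
print(f"{prompt_tokens} tokens in, {answer_tokens} tokens out: ${cost:.4f} per call")
```

One call costs a fraction of a cent. Note that in this example the answer costs more than the prompt even though it is half as long, because output tokens are priced six times higher than input tokens.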

This isn’t a Google story, really. It’s a signal. The race to the bottom on AI pricing has a floor, and we may have just hit it.

Why 'AI keeps getting cheaper' is a half-truth

For two years the industry narrative has been simple: models get better and cheaper every few months, so just wait and your costs will fall. That was true. It’s now only half true, and the half that’s changing matters for your budget.

Here’s what’s actually happening. The price of a given level of capability keeps dropping. But the capability you want keeps moving up. Newer models do more (they reason, they use tools, they run agentic tasks), and those capabilities cost more to run. So the sticker price of “the model everyone’s using right now” isn’t falling in a straight line. Sometimes, like with Flash this month, it goes up. [4]

The shift in plain terms

It’s like phone plans. The cost per gigabyte of data fell for years. But your bill didn’t shrink, because you started streaming video, then using it for everything. AI is following the same path: cheaper per unit, more units consumed, bigger total bill.

And consumption is climbing fast. Across the industry, 86.4% of marketing teams now use AI in some part of their workflow, and nearly one in five are already running AI agents to automate work end-to-end. [6] Agents are the key word, because agents consume tokens at a completely different scale than a person typing the occasional prompt.

The budget math most teams are getting wrong

Most marketing teams budget for AI the way they budget for software: a flat monthly subscription, one line in the spreadsheet, done. That model is about to break.

The old way: “We pay $20 a head for ChatGPT, times the team, that’s our AI cost.” Clean and predictable.

The new way, as more work moves to agents and automated workflows: your cost scales with usage. A workflow that drafts and personalises a thousand emails consumes far more than one person writing five. An agent that monitors competitors daily runs whether or not anyone’s watching. The bill tracks activity, not headcount.

This is the trap. Teams pilot an AI workflow, love the results, scale it across every campaign, and then get surprised by a usage bill that grew with the success. The workflow didn’t get more expensive. You just used it more, which was the whole point.

The reframe

Stop asking “what does the AI tool cost?” Start asking “what does this AI workflow cost per use, and how often will we run it?” That single question turns AI from an unpredictable line item into something you can actually forecast.
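Here is one way that question could look as a quick calculation. This is a sketch, not a billing tool: the prices are the approximate Flash figures from this article, and every workflow name, token count, and run count below is a placeholder you would replace with numbers measured in your own pilot.

```python
from dataclasses import dataclass

INPUT_PRICE_PER_M = 1.50   # USD per million input tokens (approximate, from this article)
OUTPUT_PRICE_PER_M = 9.00  # USD per million output tokens (approximate, from this article)


@dataclass
class Workflow:
    name: str
    input_tokens_per_run: int    # assumed average; measure this in a pilot
    output_tokens_per_run: int   # assumed average; measure this in a pilot
    runs_per_month: int

    def cost_per_run(self) -> float:
        return (self.input_tokens_per_run * INPUT_PRICE_PER_M
                + self.output_tokens_per_run * OUTPUT_PRICE_PER_M) / 1_000_000

    def monthly_cost(self) -> float:
        return self.cost_per_run() * self.runs_per_month


# Hypothetical workflows; every number here is a placeholder.
workflows = [
    Workflow("One person drafting emails", 1_000, 500, 5),
    Workflow("Personalised email campaign", 1_000, 500, 1_000),
    Workflow("Daily competitor monitor", 200_000, 20_000, 30),
]

for wf in workflows:
    print(f"{wf.name}: ${wf.cost_per_run():.4f} per run, ${wf.monthly_cost():.2f} per month")
```

The first two workflows cost exactly the same per run. The only difference is how often they run, and that alone makes the campaign 200 times more expensive per month. That is the sense in which the bill tracks activity, not headcount.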

Where your real AI costs are hiding

The subscription is rarely the expensive part. Here’s where the money actually goes, in roughly the order it surprises people.

High-volume automation. Personalising emails at scale, generating product descriptions for a big catalogue, running ad-copy variations: anything that multiplies across thousands of items. This is where token costs add up quietly, because each item is cheap and the volume is invisible until the invoice arrives.

Always-on agents. A monitoring agent or a “research this every morning” task runs on a schedule whether it finds anything useful or not. Set up ten of those across the team and you’re paying for a lot of background activity.

The expensive-model-by-default habit. Plenty of teams pipe every task through the most powerful (and priciest) model out of caution. Most marketing tasks (drafting, summarising, reformatting) run perfectly well on cheaper models. Using a frontier model to rewrite a subject line is like taking a taxi to your mailbox.

Rework. Vague prompts produce mediocre output that someone has to fix or regenerate. Bad prompting isn’t just a quality problem, it’s a cost problem, because every regeneration is another bill.

This is also why the free and open options matter more than ever. Genuinely capable free models like DeepSeek V4 are worth knowing about precisely because they change the cost equation for routine work.

What to actually do about it

None of this means pump the brakes on AI. It means spend like a grown-up. Five concrete moves:

1. Match the model to the task. Use cheaper, faster models for routine drafting and summarising. Save the premium models for genuinely hard reasoning. Most tools let you pick. This one change alone can cut a bill meaningfully.

2. Track usage, not just spend. Once a month, look at which workflows consume the most. You’ll often find that one or two automations account for most of the consumption, and that a good share of their tasks didn’t need a top-tier model.

3. Tighten your prompts. A clear, well-structured prompt gets it right the first time. Fewer regenerations, lower cost, better output. Prompting skill is now a budget skill.

4. Set a usage ceiling before you scale. Before rolling a workflow across every campaign, estimate cost per run times expected volume. If the number scares you, redesign the workflow now rather than discovering the problem on the invoice.

5. Don’t over-automate. Some tasks genuinely don’t need an always-on agent. A weekly manual run might cost a fraction of a daily automated one and be just as useful.
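Moves 1, 3, and 4 can be combined into a single back-of-the-envelope check. The sketch below is illustrative only. The "flash" prices are the approximate figures quoted in this article. The "cheaper-tier" entry is a hypothetical model priced at one third of that, in line with the roughly 3x gap described earlier, and is not a real price list. The rollout size, token counts, rework rate, and $150 ceiling are also assumptions.

```python
# USD per million tokens as (input, output).
# "flash" uses the approximate launch prices quoted in this article.
# "cheaper-tier" is a hypothetical model at one third of that price.
PRICES = {
    "flash": (1.50, 9.00),
    "cheaper-tier": (0.50, 3.00),
}


def projected_monthly_cost(model, input_tokens, output_tokens, runs, regen_rate=0.0):
    """Estimate monthly spend for a workflow.

    regen_rate is the assumed share of runs that must be regenerated
    (0.25 means one in four outputs is redone), so rework shows up as cost.
    """
    price_in, price_out = PRICES[model]
    cost_per_run = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
    return cost_per_run * runs * (1 + regen_rate)


def within_ceiling(cost, ceiling):
    """True if the projected cost fits under the usage ceiling."""
    return cost <= ceiling


# Hypothetical rollout: 50,000 product descriptions a month, $150 usage ceiling.
CEILING = 150.0
for model in PRICES:
    for regen_rate in (0.0, 0.25):
        cost = projected_monthly_cost(model, 2_000, 600, 50_000, regen_rate)
        verdict = "ok" if within_ceiling(cost, CEILING) else "over ceiling, redesign first"
        print(f"{model}, {regen_rate:.0%} rework: ${cost:,.2f} ({verdict})")
```

With these placeholder numbers, only one combination fits under the ceiling: the cheaper tier with no rework. Switching models is the biggest lever, and tightening prompts is what keeps the cheaper option under budget.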

Building a cost-aware AI stack

If you’re a lean team, you don’t need a sprawling toolkit. You need a tiered one. Think of your AI spend in three layers.

The free layer. For one-off drafting, brainstorming, and quick questions, free tiers of the major tools handle a surprising amount. Don’t pay for what you use occasionally.

The workhorse layer. One or two paid tools for the work your team does daily. This is where Claude, ChatGPT, or Gemini earn their subscription. If you want a sense of how much practical work a single tool can absorb, our breakdown of Claude for small business workflows is a useful map.

The automation layer. The agents and high-volume workflows. This is the layer to watch like a hawk, because it’s the one that scales with usage. Pilot small, measure cost per run, then scale deliberately.

The teams who’ll win the next year aren’t the ones spending the most on AI. They’re the ones who know exactly what each layer costs and refuse to pay frontier prices for mailbox-distance tasks. We saw the same discipline pay off at Google Marketing Live 2026: the wins went to teams who matched the tool to the job.

What to do this quarter

Here’s your action list, and none of it requires a finance degree.

This week, open your AI tools and find the usage or billing dashboard. Just look at it. Most teams have never checked which workflows consume the most, and the answer is usually surprising.
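If your tool lets you export usage data, even a few lines of Python can show which workflows dominate. This is a minimal sketch under a big assumption: the export layout (workflow name plus tokens per run) and all the figures below are invented for illustration, and real dashboards differ by vendor.

```python
from collections import Counter

# Hypothetical usage export: (workflow name, tokens consumed) per logged run.
usage_log = [
    ("email-personalisation", 1_200_000),
    ("competitor-monitor", 300_000),
    ("email-personalisation", 900_000),
    ("subject-line-rewrites", 150_000),
    ("competitor-monitor", 350_000),
    ("blog-summaries", 100_000),
]


def usage_by_workflow(log):
    """Total tokens per workflow, largest consumer first."""
    totals = Counter()
    for workflow, tokens in log:
        totals[workflow] += tokens
    return totals.most_common()


ranked = usage_by_workflow(usage_log)
grand_total = sum(tokens for _, tokens in ranked)
for workflow, tokens in ranked:
    print(f"{workflow}: {tokens:,} tokens ({tokens / grand_total:.0%} of total)")
```

The workflow at the top of that list is the first one to audit.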

This month, run the “match the model to the task” audit. List your recurring AI tasks, and for each one ask: does this genuinely need the most powerful model, or will a cheaper one do? Downgrade the ones that don’t.

This quarter, change how you budget. Add a “usage” line next to your “subscription” line, and start estimating cost per workflow before you scale anything. That’s the habit that keeps AI an asset instead of a surprise.

The era of assuming AI just gets cheaper is over. That’s not bad news. It just means AI is becoming a normal operating cost, the kind you plan for, optimise, and control. The marketers who treat it that way now will be the ones who aren’t blindsided later.

Frequently asked questions

Did Gemini 3.5 Flash really get more expensive?

Yes and no. Gemini 3.5 Flash launched at around $1.50 per million input tokens and $9 per million output tokens, which is cheaper than Google's premium Pro model but several times more expensive than the previous Flash model it replaced. So compared with the older budget option, costs went up.

What is a token, and why does it affect my marketing budget?

A token is a small chunk of text, roughly three-quarters of a word. AI tools charge per token going in (your prompt) and per token coming out (the response). The more AI work you do, especially high-volume tasks like mass email personalisation, the more tokens you consume and the higher your bill, regardless of any flat subscription fee.

Does this mean I should use AI less?

No. It means you should use it more deliberately. Match cheaper models to routine tasks, save premium models for genuinely hard work, tighten your prompts to avoid costly regenerations, and estimate the cost of a workflow before you scale it across every campaign.

How should a small marketing team budget for AI in 2026?

Think in three layers: free tools for occasional one-off work, one or two paid tools for daily tasks, and a closely watched automation layer for agents and high-volume workflows. Add a 'usage' line to your budget alongside the subscription line, since agentic work scales with activity, not headcount.

Where do most teams overspend on AI?

Three places: routing every task through the most expensive model by default, running always-on agents that consume tokens whether or not they produce value, and rework caused by vague prompts that need regenerating. Fixing these usually cuts cost without reducing output.

About this guide

This guide is for marketing leaders and small-team owners trying to plan AI spend in a market where pricing no longer only moves downward. It explains the Gemini 3.5 Flash pricing shift in plain terms and turns it into a practical budgeting framework. Pricing details are drawn from Google's I/O 2026 announcement and reputable industry reporting.

Hina Mian, Co-Founder, Future Factors AI

Hina brings 10+ years of marketing strategy and brand growth experience to the AI conversation. She helps businesses and teams cut through the noise and apply AI where it actually matters. Future Factors offers AI Bootcamps, Corporate Workshops, and Speaking & Consulting for businesses ready to adopt AI without the overwhelm.

Sources
[1] Simon Willison. Gemini 3.5 Flash: pricing and analysis. 2026.
[2] Tech Times. Google Ships Gemini 3.5 Flash, an Agent Model That Costs 3x More Per Token. 2026.
[3] CNBC. Google debuts new AI models and personal AI agents. 2026.
[4] MarkTechPost. Google Introduces Gemini 3.5 Flash at I/O 2026. 2026.
[5] Trending Topics. Google Launches Gemini 3.5 Flash With Higher Prices but No Generational Leap. 2026.
[6] HubSpot. 2026 State of Marketing Report. 2026.
