Salesforce just made “AI labor” real for finance teams.
In its fiscal 2026 Q4 results (reported February 25, 2026), Salesforce highlighted two things side by side: Agentforce ARR of $800M (up 169% YoY) and a new consumption-style output metric called Agentic Work Units (AWUs), noting 2.4B AWUs delivered and about 19T tokens processed all-time. (investor.salesforce.com) That combination is not a random investor slide. It is Salesforce signaling a bigger shift in how CRMs will be bought, budgeted, and governed: from paying for access (seats) to paying for work (usage or outcomes).
TL;DR
- Agentic work units are Salesforce’s attempt to measure “AI labor” as completed tasks, not tokens or seats. (salesforce.com)
- Buyers should treat AWUs like a billing and governance layer, not a KPI. If you optimize for “more AWUs,” you will create vanity usage.
- The practical move: budget AI like a workforce. Define the job-to-be-done, estimate volume, set quality thresholds, then map unit cost to spend.
- Ask vendors what counts as a unit, how reruns are billed, and how you audit failures. CIO-level skepticism is warranted if definitions are fuzzy. (cio.com)
- SMB and mid-market teams can adopt agents without enterprise sprawl by centralizing AI work inside the CRM, enforcing approval points, and tracking ROI at the workflow level.
What Salesforce means by “Agentic Work Units” (and why it is not just branding)
Salesforce’s own definition is straightforward: an Agentic Work Unit (AWU) is one discrete task accomplished by an AI agent. Salesforce positions AWUs as the “moment where raw intelligence is converted into real work,” and explicitly frames AWUs as a way to move beyond token counting. (salesforce.com)
Two important nuances from Salesforce’s framing:
- AWUs are a platform-level output counter, spanning Agentforce and even Slack AI, not just one product surface. (salesforce.com)
- Salesforce is tracking both AWUs and tokens because tokens are an infrastructure cost proxy, while AWUs are intended to be a value proxy. (salesforce.com)
That pairing is the tell. Salesforce is trying to create a metric that finance, procurement, and ops can use to answer:
“How much work did our AI workforce actually do, and what did it cost?”
The buyer risk: AWUs can become “engagement metrics in a new costume”
CIO.com’s critique is fair: a shiny metric does not automatically equal business value. The hard part is what counts as a work unit, how consistently it is measured, and where it is exposed for audit. (cio.com)
So treat agentic work units the same way you treat:
- “emails sent” in outbound,
- “calls made” in SDR teams,
- “tickets touched” in support.
They are activity measures. They can be useful, but they are also easy to game.
The bigger shift: AI is moving CRM pricing from seats to labor
Historically, CRMs monetized “humans using software.” Seats made sense because marginal cost per user was low.
Agentic AI changes the economics:
- Each agent run triggers compute, retrieval, orchestration, tool calls, and verification loops.
- Marginal costs are real, variable, and non-linear.
Salesforce has already been pushing in this direction with Agentforce consumption pricing, including:
- $2 per conversation pricing (legacy model for some customer-facing use cases). (salesforce.com)
- Flex Credits priced at $500 per 100,000 credits, with Salesforce describing 20 credits per action (about $0.10 per action) in its May 15, 2025 announcement. (salesforce.com)
- “Digital Wallet” style tracking and multiple buying models (pre-purchase, PayGo, pre-commit). (salesforce.com)
This is the pattern: billing units are converging on “actions” and “work.” AWUs are the narrative wrapper that makes this legible to buyers and investors.
Expect hybrid pricing, not a clean break
Even Salesforce has publicly acknowledged that many customers still prefer predictable pricing, which pulls agentic pricing back toward seat-like constructs in some contexts. (techradar.com)
Reality will likely be:
- Seats for humans
- Usage for agents
- Commit discounts for predictability
- Guardrails to prevent runaway spend
That is exactly why you need an internal budgeting method that works regardless of which vendor’s unit wins.
What “AI labor” measurement actually means inside a CRM
If you strip away the branding, “AI labor” measurement means you are instrumenting work the way operations teams instrument humans:
Inputs
- prompts, context, tokens, tool calls
Outputs
- researched accounts, enriched records, sent follow-ups, updated fields, scheduled meetings, created tasks
Quality
- accuracy, deliverability compliance, data completeness, hallucination rate, escalation rate
Cost
- per action, per conversation, per workflow, plus human review time
The key insight for buyers
You do not want “cheap AWUs.”
You want low total cost per acceptable outcome.
That means your ROI model must include:
- the AI’s variable cost, and
- the human time needed to supervise, approve, and fix.
A simple budgeting framework for AI labor inside your CRM (that finance will actually accept)
Here is a lightweight framework you can run in a spreadsheet. It works whether your vendor bills by agentic work units, actions, credits, conversations, or something else.
Step 1: Define the job-to-be-done (JTBD) in CRM terms
Do not budget “an agent.” Budget a job that currently consumes selling time.
Start with a short list of CRM-native jobs, for example:
- Research
- summarize company, recent news, triggers
- identify stakeholders and likely objections
- Enrichment
- fill missing fields (industry, size, tech stack)
- normalize titles, dedupe accounts
- Follow-ups
- write personalized emails and sequences
- create next-step tasks based on meeting notes
- Pipeline updates
- update stage, close date, MEDDICC fields
- detect risk signals and recommend actions
If you want these workflows to run inside Chronic Digital, map them to platform capabilities like:
- Lead Enrichment for data completion and normalization
- AI Lead Scoring to prioritize where agents should spend time
- AI Email Writer for controlled personalization at scale
- Sales Pipeline to connect “AI labor” to forecast impact
- ICP Builder to constrain agent work to the right accounts
Step 2: Estimate volume (per week and per month)
Forecast the work the same way you forecast SDR capacity.
Example (mid-market B2B SaaS):
- 6 SDRs
- 800 new leads/week
- 250 accounts/week need enrichment
- 1,200 follow-up emails/week
- 300 opportunities/month need stage hygiene
Now translate that into “units” based on the vendor’s meter:
- If “one enrichment update” is 1 unit, you have 250 units/week.
- If “one follow-up email draft + CRM update” is 2 units, you have 2,400 units/week.
Do not worry about being perfect. You are building a cost envelope.
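The translation above can be sketched in a few lines of Python. The meter rates here are hypothetical placeholders, not any vendor's actual definitions; replace them with whatever your vendor puts in writing.

```python
# Sketch: translate a weekly CRM workload into billable "units" under a
# vendor's meter. All rates below are hypothetical assumptions.

WEEKLY_WORKLOAD = {
    "enrichment_update": 250,   # accounts enriched per week
    "follow_up_email": 1200,    # email drafts + CRM updates per week
    "research_brief": 800,      # new leads researched per week
}

# Hypothetical meter: billable units consumed per item of each job type.
UNITS_PER_ITEM = {
    "enrichment_update": 1,  # one field-update pass = 1 unit
    "follow_up_email": 2,    # draft + CRM write = 2 units
    "research_brief": 3,     # multi-step research = 3 units (assumed)
}

def weekly_units(workload: dict, meter: dict) -> dict:
    """Multiply item counts by the vendor's per-item unit rate."""
    return {job: count * meter[job] for job, count in workload.items()}

totals = weekly_units(WEEKLY_WORKLOAD, UNITS_PER_ITEM)
print(totals)                # per-job unit volumes
print(sum(totals.values()))  # the weekly cost envelope, in units
```

Swapping in a different meter (say, "update 5 fields" billed as five units) is a one-line change, which is the point: the envelope survives vendor-specific unit definitions.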
Step 3: Set quality thresholds (so you do not buy vanity usage)
This is where most teams fail. They fund usage, not outcomes.
Attach minimum acceptance criteria per job:
Research quality thresholds
- Must cite at least 2 reliable sources, or must only use approved sources
- Must produce 3 account-specific talking points, not generic fluff
Enrichment quality thresholds
- Required fields completeness increases from 60% to 90%
- Error rate below 2% on sampled records (define “error”)
Follow-up quality thresholds
- Must pass deliverability rules and brand tone
- Must include 1 specific personalization token from enrichment
- Must include 1 clear CTA mapped to stage
Pipeline update thresholds
- No stage change without a logged reason
- Required fields populated before stage advance
Quality thresholds are your defense against:
“We generated 50,000 agentic work units, therefore success.”
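Thresholds only defend you if they are enforced mechanically. Here is a minimal sketch of an acceptance gate for follow-up drafts; the field names and rules are illustrative, not a vendor schema, and you would adapt them to your own thresholds.

```python
# Sketch: encode quality thresholds as an automated acceptance gate.
# Field names ("personalization_tokens", "cta_stage", etc.) are
# hypothetical; map them to whatever your CRM actually records.

def accept_follow_up(draft: dict) -> tuple:
    """Return (accepted, failure_reasons) for a follow-up email draft."""
    failures = []
    if not draft.get("personalization_tokens"):
        failures.append("missing personalization token from enrichment")
    if not draft.get("cta_stage"):
        failures.append("no clear CTA mapped to pipeline stage")
    if not draft.get("passed_deliverability"):
        failures.append("failed deliverability or brand-tone rules")
    return (len(failures) == 0, failures)

ok, reasons = accept_follow_up({
    "personalization_tokens": ["recent funding round"],
    "cta_stage": "discovery",
    "passed_deliverability": True,
})
```

Drafts that fail the gate go to human review or rerun, and the pass rate becomes the "acceptance rate" you need for the cost math in Step 4.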
Step 4: Map to cost (AI + human supervision)
Now compute all-in cost per job:
All-in cost per acceptable outcome =
- AI variable cost (credits, actions, agentic work units)
- plus rerun cost (how often the AI fails and needs a second billed pass)
- plus human review time ((minutes per item ÷ 60) × loaded hourly rate)
- all divided by the number of outputs that actually pass your quality thresholds
Salesforce’s own public pricing examples for Agentforce show the direction: consumption models like Flex Credits translate actions into dollars, with Salesforce’s announcement describing roughly $0.10 per action. (salesforce.com)
Even if your vendor is not Salesforce, your budgeting math should look like labor math:
- cost per unit
- throughput per week
- acceptance rate
- rework rate
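The labor math above fits in one function. The numbers in the example are assumptions carried over from the volume forecast earlier (1,200 follow-ups/week, 2 units each) plus placeholder rates; swap in your vendor's unit price and your team's review times.

```python
# Sketch: all-in cost per acceptable outcome. Every rate here is a
# hypothetical placeholder, not a quoted price.

def cost_per_accepted(
    items: int,              # items attempted in the period
    units_per_item: float,   # billable units per attempt
    price_per_unit: float,   # $ per unit (e.g. ~$0.10 per action)
    review_minutes: float,   # human review minutes per item
    hourly_rate: float,      # loaded hourly rate of the reviewer
    rerun_rate: float,       # fraction of items needing a second pass
    acceptance_rate: float,  # fraction of outputs passing the quality bar
) -> float:
    attempts = items * (1 + rerun_rate)          # reruns are billed again
    ai_cost = attempts * units_per_item * price_per_unit
    human_cost = items * (review_minutes / 60) * hourly_rate
    accepted = items * acceptance_rate
    return (ai_cost + human_cost) / accepted

# 1,200 follow-ups/week, 2 units each at $0.10/unit, 1.5 min review
# at a $60/hr loaded rate, 10% reruns, 85% acceptance.
print(round(cost_per_accepted(1200, 2, 0.10, 1.5, 60, 0.10, 0.85), 3))
```

Notice that in this example the human review time, not the AI usage, dominates the cost. That is typical, and it is why "cheap AWUs" is the wrong optimization target.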
How to evaluate ROI without getting trapped in “vanity usage”
1) Use a “good AWU” definition
Define a “good unit” internally as:
a unit that passed quality thresholds and reduced human time.
Track:
- AWUs produced
- AWUs accepted
- AWUs escalated to human
- AWUs rerun
Your goal is not to maximize AWUs. Your goal is to maximize accepted AWUs per dollar and time saved per dollar.
2) Tie AI labor to pipeline math, not activity
For revenue teams, acceptable north-star metrics include:
- meetings booked per 100 enriched accounts
- reply rate lift on AI-personalized follow-ups
- stage conversion rate change after better CRM hygiene
- reduction in cycle time from faster next-step execution
If you want a practical way to operationalize this, the playbook format in AI Sales Agent KPIs helps separate value metrics from “AI busywork” metrics.
3) Budget for “failure modes” up front
AI agents fail in predictable ways:
- missing context
- incorrect enrichment
- wrong stakeholder
- overconfident recommendations
- deliverability issues if outbound is not governed
If you have not built your data foundation, you will pay for rework. This is why CRMs that standardize objects and fields before turning on agents tend to get better unit economics. Use the checklist approach in AI-Ready CRM Data Model to reduce reruns and escalations.
Buyer checklist: what to demand before you accept agentic work units as a billing layer
Use this as a procurement and implementation checklist, regardless of vendor.
1) What counts as a billable unit?
Ask for examples in writing:
- Is “draft an email” one unit or multiple?
- Is “update 5 fields” one unit or five?
- Does a failed attempt still count?
- Does a “tool invocation” count even if the downstream system errors?
Salesforce defines an AWU as a “discrete task accomplished,” but buyers should still demand clarity on edge cases and partial completions. (salesforce.com)
2) Auditability and traceability
You need logs that a finance team can reconcile:
- timestamp
- actor (agent name/version)
- input context sources
- tools invoked
- outputs written to CRM
- approval status
- unit consumption and cost
If you cannot audit it, you cannot budget it.
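As a concrete (and assumed, not vendor-defined) shape, the fields above map to a record like the following, which a finance team can sum and reconcile against an invoice.

```python
# Sketch: the minimum log record a finance team can reconcile against a
# vendor invoice. Field names are illustrative, not any vendor's schema.

from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class AgentWorkRecord:
    agent: str           # actor: agent name/version
    job: str             # job-to-be-done, e.g. "enrichment"
    inputs: list         # input context sources consumed
    tools: list          # tools invoked
    outputs: list        # outputs written to CRM
    units: float         # billable unit consumption
    cost_usd: float      # vendor-reported cost for this unit of work
    approval_status: str = "pending"  # pending / approved / rejected
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

def reconcile(records: list, invoiced_units: float):
    """Compare logged unit consumption against the vendor's invoice."""
    logged = sum(r.units for r in records)
    return logged == invoiced_units, logged
```

If a vendor cannot export something equivalent to this record per billed unit, treat that as a procurement red flag.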
3) Rerun costs (the silent budget killer)
Ask:
- If you rerun the same job with updated context, do you pay again?
- If the agent loops for verification, who pays for the loop?
- Can you cap retries per job?
4) Failure handling and stop rules
Require explicit policies:
- what counts as failure
- escalation path
- auto-stop thresholds (for example, “stop after 2 failed attempts”)
- quarantine rules for risky changes (bulk edits, stage changes, email sends)
If you want a concrete SOP structure for this, adapt the guardrails format in Autonomous SDR Agent SOP.
5) Human approval points (where you enforce judgment)
Not everything should be autonomous.
Common approval points in B2B sales:
- first-touch outbound message
- adding a new stakeholder to a sequence
- changing opportunity stage
- modifying close date and forecast category
- sending pricing or legal terms
You can still get leverage: the agent does 80% of the work, humans approve the 20% that carries risk.
6) “Digital wallet” controls, budgets, and alerts
If your vendor offers wallet-like tracking (Salesforce does for Agentforce usage), require:
- per-team budgets
- alert thresholds (80%, 100%, 120%)
- rate limiting
- anomaly detection (spikes, loops, runaway automations)
Salesforce explicitly markets Digital Wallet tracking for Agentforce consumption. (salesforce.com)
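Even if the vendor's wallet lacks these controls, you can enforce the alert thresholds yourself against exported usage data. A minimal sketch, with the 80/100/120% thresholds from the checklist above:

```python
# Sketch: wallet-style budget alerts at 80%, 100%, and 120% of budget.
# Wiring this to a real vendor usage API is out of scope here.

ALERT_THRESHOLDS = (0.80, 1.00, 1.20)

def wallet_alerts(spent: float, budget: float,
                  thresholds=ALERT_THRESHOLDS) -> list:
    """Return the alert levels a team's period-to-date spend has crossed."""
    ratio = spent / budget
    return [t for t in thresholds if ratio >= t]
```

For example, `wallet_alerts(850, 1000)` flags the 80% warning, while `wallet_alerts(1250, 1000)` trips all three, which is your cue to look for loops or runaway automations.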
How SMB and mid-market teams adopt AI agents without enterprise sprawl
Enterprise sprawl happens when:
- every team spins up agents differently,
- data definitions are inconsistent,
- no one owns cost controls,
- “usage” becomes the goal.
SMB and mid-market teams can win by being more disciplined.
The anti-sprawl operating model (simple and effective)
1) Centralize AI work inside the CRM
- Avoid agents operating across five disconnected tools.
- If the CRM is the system of record, keep the unit economics measurable there.
2) Start with 2 workflows that are high-volume and low-risk
- enrichment and routing
- follow-up drafting with approvals
3) Use ICP constraints to prevent wasted AI labor
- If an account is outside ICP, the agent should not spend time researching it.
- This is where ICP Builder style workflows pay for themselves.
4) Prioritize with scoring before you automate
- If your agent is doing “work” on low-intent leads, you are buying busywork.
- Connect automation triggers to AI Lead Scoring outputs so your agentic work units concentrate on pipeline.
5) Make the pipeline the scoreboard
- Run AI work where you can observe lift: stage conversion, cycle time, win rate.
- Tie agent actions to opportunity outcomes in your Sales Pipeline.
Practical comparison note (so you do not overbuy)
Many teams will evaluate Salesforce against other stacks for outbound and CRM workflow execution. If you are doing that, keep the evaluation grounded in budgeting and governance, not feature checklists:
- Chronic Digital vs Salesforce for agent governance, unit economics visibility, and CRM-first execution: Chronic Digital vs Salesforce
- If your current stack leans heavily on lead list building and outbound motion: Chronic Digital vs Apollo
- If your team runs on “classic pipeline CRM” patterns: Chronic Digital vs HubSpot or Chronic Digital vs Pipedrive
Why this matters now (the market context behind the metric)
Salesforce is not introducing AWUs in a vacuum. SaaS pricing more broadly is moving into a hybrid era, with consumption components rising as AI introduces variable cost structures. Flexera has explicitly described the shift from seats to consumption and highlights growing adoption of hybrid models in the market. (flexera.com)
Separately, SaaS spend pressure is rising, which makes CFO scrutiny of “AI line items” inevitable. Zylo’s 2025 SaaS Management Index reports average SaaS spend of $4,830 per employee, up 21.9% YoY, with ongoing license waste. (zylo.com)
Against that backdrop, AWUs are Salesforce giving finance a story:
“We are not charging you for novelty. We are charging you for labor.”
Your job as a buyer is to ensure you are paying for useful labor.
FAQ
What are agentic work units?
Agentic work units are a task-based metric that represents discrete work completed by an AI agent. Salesforce defines an Agentic Work Unit (AWU) as “one discrete task accomplished by an AI agent,” intended to measure real work rather than token consumption. (salesforce.com)
Are AWUs the same as tokens?
No. Tokens measure text processed by a model, which correlates more directly with compute cost. AWUs aim to measure completed tasks, which is closer to business output. Salesforce is tracking both AWUs and tokens because they describe different layers: infrastructure cost vs operational work delivered. (salesforce.com)
How should I budget for AI agents inside a CRM?
Budget by workflow, not by “agent.” Define the job-to-be-done (research, enrichment, follow-ups, pipeline updates), estimate monthly volume, set quality thresholds, then calculate all-in cost including AI usage plus human review and reruns.
What questions should procurement ask about billable units?
Ask what counts as a unit, whether failures and reruns are billed, what audit logs exist, how usage caps and alerts work, and which actions require human approval. CIO-level skepticism is reasonable when the vendor cannot clearly define or consistently verify a “work unit.” (cio.com)
How do I avoid vanity usage when measuring AI labor?
Track “accepted units,” not total units. A good unit is one that passes quality thresholds and reduces human time. Tie AI labor to pipeline outcomes (conversion rates, cycle time, meetings booked), not to activity totals like AWUs generated.
Can SMB teams use agentic AI without creating tool sprawl?
Yes, if they centralize workflows in the CRM, start with 1-2 high-volume jobs, enforce approval points for high-risk actions, and use scoring and ICP constraints to prevent the agent from working low-value records. This keeps unit costs predictable and outcomes measurable.
Put AWUs on a budget: start with two workflows and a hard acceptance bar
If Salesforce’s Agentic Work Units are the signal, the shift is clear: AI inside the CRM is becoming labor, and labor gets budgeted and governed.
This week, do three things:
- Pick two CRM jobs (example: enrichment + follow-up drafting).
- Write an acceptance bar (what “good output” means, and what triggers escalation).
- Forecast volume and cap spend, then review weekly using accepted outputs, reruns, and pipeline impact.
That is how you benefit from agentic work units without letting “AI usage” become the new seat bloat.