Buying an “AI CRM” in 2026 is easy. Buying one that keeps clean data, avoids deliverability landmines, and proves what it did is the hard part. Demos will not save you. A vendor can click through a pretty copilot screen while your pipeline quietly rots in the background.
This post is the buyer's weapon. It is an AI CRM features checklist built as 27 questions. Categorized. With pass/fail red flags. With "prove it" prompts vendors must answer in writing.
TL;DR
- If it cannot keep data clean automatically, your AI is just hallucinating faster.
- If it can take actions but cannot show approvals and audit trails, it will eventually do something dumb.
- If it sends email without throttling, stop rules, and authentication checks, it will torch your domain.
- If scoring is a black box, it is not scoring. It is vibes.
- Demand written answers, sample logs, and exportable evidence. No demo theater.
How to use this AI CRM features checklist (no demos required)
Send this as a doc to every shortlisted vendor. Give them 72 hours.
Rules
- Written answers only. “We can show you on a call” means “it breaks in production.”
- Evidence beats opinions. Screenshots of logs, sample exports, redacted customer audit trails.
- Pass/fail first. Do not negotiate on basics like authentication, audit logs, and writeback.
Scoring
- Pass = meets requirement and shows evidence.
- Conditional = meets requirement with limits, manual work, or extra tools.
- Fail = “on roadmap,” “via Zapier,” “our partners,” or “we recommend.”
Category 1: Data hygiene automation (because garbage data kills AI)
If your CRM can’t maintain clean, consistent objects, “AI” just means “wrong answers at scale.”
1) What data quality rules are enforced automatically, not suggested?
What you want
- Required fields, type validation, format checks, and dedupe rules that run continuously.
- Rules apply at ingest and on updates.
Pass
- Field-level validation, required fields by pipeline stage, and automated cleanup workflows.
Fail red flags
- “Users just need to be disciplined.”
- “We have a dashboard for duplicates.”
Prove it (in writing)
- Share a list of enforceable rules and where they run (ingest, nightly, real time).
- Provide a sample “rule hit” log export.
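If you want a mental model for what "enforceable rules" means, here is a minimal sketch. The field names, stages, and rules are hypothetical, not any vendor's schema; the point is that validation runs automatically and returns machine-readable rule hits you can log and export:

```python
import re

# Hypothetical ingest-time rules: format checks plus required fields per stage.
RULES = {
    "email": lambda v: bool(re.match(r"^[^@\s]+@[^@\s]+\.[^@\s]+$", v or "")),
    "stage": lambda v: v in {"new", "qualified", "meeting", "won", "lost"},
}
REQUIRED_BY_STAGE = {"qualified": ["email", "company"]}

def validate(record: dict) -> list[str]:
    """Return rule-hit messages; an empty list means the record passes."""
    hits = []
    for field, check in RULES.items():
        if field in record and not check(record[field]):
            hits.append(f"invalid:{field}")
    for field in REQUIRED_BY_STAGE.get(record.get("stage", ""), []):
        if not record.get(field):
            hits.append(f"missing:{field}")
    return hits
```

A vendor that passes this category can show you the real version of that rule-hit log, per record, per run.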
2) How do you dedupe across people, accounts, and domains?
What you want
- Fuzzy matching, domain normalization, and merge suggestions with conflict handling.
Pass
- You can define merge precedence rules and keep a merge audit trail.
Fail red flags
- Dedupe only on email address.
- No merge audit trail.
Prove it
- Provide a redacted before/after merge record and the audit entry that explains it.
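To make "fuzzy matching plus domain normalization" concrete, here is a toy sketch using Python's standard library. The threshold and field names are illustrative assumptions, not a production matcher:

```python
from difflib import SequenceMatcher

def normalize_domain(email_or_domain: str) -> str:
    """Strip the mailbox and a leading www. so acme.com == www.acme.com."""
    domain = email_or_domain.split("@")[-1].lower().strip()
    return domain.removeprefix("www.")

def is_probable_duplicate(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Same normalized domain plus fuzzy-similar names => merge candidate."""
    if normalize_domain(a["email"]) != normalize_domain(b["email"]):
        return False
    similarity = SequenceMatcher(None, a["name"].lower(), b["name"].lower()).ratio()
    return similarity >= threshold
```

The real question for the vendor is what happens after the match: merge precedence, conflict handling, and the audit entry.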
3) Do you prevent “ghost fields” and inconsistent picklists across teams?
What you want
- Controlled schemas.
- Picklist governance. No random “Enterprise-ish” values.
Pass
- Central schema management and controlled vocabularies with change history.
Fail red flags
- Every rep can create fields.
- No field change history.
Prove it
- Export schema with creation dates, owner, and change log.
4) Can the system auto-fix enrichment drift (role changes, job hops, new domains)?
What you want
- Continuous refresh of critical fields.
- Alerts when a contact bounces or changes title.
Pass
- Automated refresh schedules and triggers based on engagement and bounce signals.
Fail red flags
- “Run enrichment again if you need it.”
- Refresh costs extra per record with no rules.
Prove it
- Show refresh cadence options and a sample “job change detected” event.
If you care about enrichment that stays current, map this to your buying criteria for Lead Enrichment.
Category 2: Agentic actions (the AI does work, not just narrates work)
“AI notes” are cute. You need autonomous actions that move pipeline.
5) Which actions can the agent take end-to-end inside the CRM?
Ask for a checklist. Minimum viable agentic CRM should do:
- Create and complete tasks
- Update fields and stages
- Create contacts/accounts
- Draft and send emails
- Pause or stop sequences
- Book meetings
Pass
- Actions run with guardrails and write back to records.
Fail red flags
- Agent can only draft text.
- Agent lives in a sidebar and never touches the database.
Prove it
- Provide a list of actions and the exact objects they can modify.
6) Can the agent run multi-step playbooks with conditions?
Example playbook:
- Enrich lead
- Score fit + intent
- If score > 80, send sequence A
- If bounce, stop and flag record
- If reply positive, create meeting task and update stage
Pass
- Conditional logic, stop rules, and state tracking.
Fail red flags
- “You can build that with workflows” but no agent state.
Prove it
- Show a redacted execution trace of a playbook run.
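The example playbook above can be sketched as a tiny runner, just to show what "conditional logic, stop rules, and state tracking" looks like in code. Event names and the score threshold are assumptions; a real agent would persist this state and emit an execution trace:

```python
# Hypothetical playbook runner: stop rules win over everything else.
def run_playbook(lead: dict, events: list[str]) -> list[str]:
    """Trace one run of the example playbook; returns the action log."""
    trace = ["enrich", "score"]
    if lead["score"] > 80:
        trace.append("send:sequence_A")
    for event in events:
        if event == "bounce":
            trace.append("stop:flag_record")
            break  # stop rule: nothing runs after a bounce
        if event == "reply_positive":
            trace.append("create:meeting_task")
            trace.append("update:stage")
    return trace
```

The returned trace is exactly the artifact you should demand as evidence: one line per action, in order.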
7) Does every agent action write back with provenance?
You need: who/what changed a field, when, and why.
Pass
- Field history includes agent identity, policy, and source signals.
Fail red flags
- “We log it somewhere” but you cannot export it.
- No per-field change history.
Prove it
- Export of field history for one record showing at least 10 edits.
8) Can the agent operate on your ICP automatically, not manually selected lists?
If the agent depends on hand-curated lists, it is not autonomous. It is an intern with autocomplete.
Pass
- ICP definition drives targeting and actions continuously.
Fail red flags
- ICP exists only as tags and filters.
- No ICP-driven lead sourcing.
Prove it
- Provide a screenshot or export of the ICP definition and how it feeds acquisition.
For an ICP-first workflow, anchor your evaluation against an actual ICP Builder.
Category 3: Scoring and prioritization (fit plus intent, not vibes)
Scoring decides who gets contacted. It also decides who gets ignored. So it needs evidence.
9) Do you separate fit scoring from intent scoring?
Fit = firmographics, technographics, role match.
Intent = buying signals, behavior, timing.
Pass
- Two scores. Two explanations. One combined priority.
Fail red flags
- One magic number.
- “Our model figures it out.”
Prove it
- Provide scoring features list and a sample explanation for 5 leads.
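"Two scores, two explanations, one combined priority" is easy to check against a sketch like this. The signals and weights are invented for illustration; what matters is that fit and intent are computed separately and the explanation cites actual inputs:

```python
# Hypothetical dual-score model: weights are illustrative, not a vendor's.
FIT_WEIGHTS = {"industry_match": 30, "size_match": 25, "role_match": 25}
INTENT_WEIGHTS = {"pricing_page_visit": 40, "hiring_signal": 20}

def score(signals: set[str]) -> dict:
    """Return separate fit and intent scores plus a citable 'why' list."""
    fit = sum(w for k, w in FIT_WEIGHTS.items() if k in signals)
    intent = sum(w for k, w in INTENT_WEIGHTS.items() if k in signals)
    why = sorted(signals & (FIT_WEIGHTS.keys() | INTENT_WEIGHTS.keys()))
    return {"fit": fit, "intent": intent, "priority": fit + intent, "why": why}
```

A "one magic number" vendor cannot produce the `why` list. That is the tell.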
If you want the dual model done right, benchmark against AI Lead Scoring.
10) What are your top 10 scoring inputs, and can I edit weights?
You do not need full transparency into model internals. You do need control over your go-to-market reality.
Pass
- Editable weights or rules. Override layers. Segment-specific scoring.
Fail red flags
- No controls.
- “Talk to support if you want changes.”
Prove it
- Show weight controls and the change history of scoring configuration.
11) Can you explain “why this lead, why now” in one paragraph?
This must be readable by an operator. Not a data scientist.
Pass
- A human-readable reason backed by data points.
Fail red flags
- “AI thinks this is a good fit.”
- Explanation does not cite any inputs.
Prove it
- Provide 10 example explanations and the underlying signals.
12) Can scoring trigger actions safely?
Scoring is pointless if it does not route attention and outbound.
Pass
- Thresholds trigger sequences, routing, tasks, or escalation with approvals.
Fail red flags
- Scoring lives in a report.
- No ability to act on scoring.
Prove it
- Show an automation rule that triggers on score plus the resulting writeback.
Category 4: Deliverability and sending controls (your domain is not a toy)
If a CRM sends email, it owns deliverability. Full stop.
Google and Yahoo’s bulk sender rules require authentication (SPF, DKIM, DMARC) and one-click unsubscribe for bulk mailers, plus spam complaint thresholds. Treat this as table stakes, not “email marketing stuff.” See: Google and Yahoo requirement breakdowns from credible deliverability vendors like Valimail and Klaviyo, plus explainers such as BuzzStream’s checklist.
- Valimail compliance checklist PDF
- Klaviyo overview of Google and Yahoo sender requirements
- BuzzStream explainer
Microsoft has also pushed bulk sender requirements for Outlook.com and related domains, including authentication and unsubscribe expectations.
13) Do you enforce SPF, DKIM, and DMARC checks before sending?
Pass
- The system blocks sending if auth is missing or misaligned.
Fail red flags
- “We recommend you set it up.”
- No preflight checks.
Prove it
- Provide a screenshot of the preflight gating or a sample “send blocked” event.
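The gating logic itself is simple, which is why "we recommend you set it up" is a fail. Here is a sketch of a preflight check over already-fetched DNS TXT records (actual DNS resolution, DKIM selector discovery, and DMARC alignment checks are out of scope; `selector` is a placeholder):

```python
# Hypothetical preflight gate over pre-fetched DNS TXT records.
def preflight(txt_records: dict[str, list[str]], domain: str) -> list[str]:
    """Return blocking reasons; an empty list means the send may proceed."""
    blocks = []
    if not any(r.startswith("v=spf1") for r in txt_records.get(domain, [])):
        blocks.append("spf_missing")
    dkim_host = f"selector._domainkey.{domain}"  # placeholder selector
    if not any(r.startswith("v=DKIM1") for r in txt_records.get(dkim_host, [])):
        blocks.append("dkim_missing")
    if not any(r.startswith("v=DMARC1") for r in txt_records.get(f"_dmarc.{domain}", [])):
        blocks.append("dmarc_missing")
    return blocks
```

The vendor's version should run before every send and produce the "send blocked" events you asked for above.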
14) Do you support one-click unsubscribe correctly (header + behavior)?
One-click unsubscribe is not “we put an unsubscribe link in the footer.” Providers look for the correct headers and behavior.
Pass
- Supports List-Unsubscribe and one-click where required for bulk mail.
Fail red flags
- “We include unsubscribe text.”
- Unsubscribe processed manually.
Prove it
- Provide raw message headers from a real send showing List-Unsubscribe.
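For reference, the headers providers look for (per RFC 2369 and RFC 8058) look like this. The addresses and URL below are placeholders; the point is that one-click unsubscribe is a header pair, not footer text:

```python
from email.message import EmailMessage

# Sketch of a compliant bulk message; all addresses/URLs are placeholders.
msg = EmailMessage()
msg["From"] = "rep@example.com"
msg["To"] = "lead@example.net"
msg["Subject"] = "Quick question"
# RFC 2369 list-unsubscribe options: an HTTPS endpoint and a mailto fallback.
msg["List-Unsubscribe"] = "<https://example.com/unsub?id=123>, <mailto:unsub@example.com>"
# RFC 8058 one-click marker: the endpoint must accept a POST with no UI steps.
msg["List-Unsubscribe-Post"] = "List-Unsubscribe=One-Click"
msg.set_content("Plain-text body here.")
```

When you ask for raw headers from a real send, these two lines are what you are looking for.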
15) Do you throttle sending per mailbox, domain, and segment?
You need controls like:
- Daily caps per mailbox
- Ramp schedules for new domains
- Per-provider throttling (Gmail, Outlook, Yahoo)
- Reply-rate aware pacing
Pass
- Fine-grained throttles and ramp rules.
Fail red flags
- One global send limit.
- “We send as fast as you want.”
Prove it
- Share throttle settings page and an export of sends per mailbox per day.
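As a baseline for what "fine-grained" means, here is a minimal throttle sketch: a daily cap per mailbox plus a ramp schedule for new domains. The caps and ramp numbers are invented; real systems also throttle per receiving provider and adjust on reply rates:

```python
from collections import defaultdict
from datetime import date

# Hypothetical throttle: illustrative caps, not recommended values.
DAILY_CAP = 30
RAMP = {0: 5, 1: 10, 2: 20}  # domain age in weeks -> max sends per day

class Throttle:
    def __init__(self, domain_age_weeks: int):
        self.cap = min(DAILY_CAP, RAMP.get(domain_age_weeks, DAILY_CAP))
        self.sent = defaultdict(int)  # (mailbox, day) -> sends so far

    def allow(self, mailbox: str) -> bool:
        """Count the send if it is under the cap; otherwise refuse it."""
        key = (mailbox, date.today())
        if self.sent[key] >= self.cap:
            return False
        self.sent[key] += 1
        return True
```

A single global send limit cannot express even this much, which is why it is a red flag.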
16) Do you have stop rules that actually stop?
Minimum stop rules:
- Bounce hard: stop.
- Spam complaint: stop.
- Unsubscribe: stop.
- Negative reply: stop.
- Auto-reply: pause or route.
Pass
- Stop rules at event level with immediate enforcement.
Fail red flags
- Negative replies still get followed up.
- Stops depend on manual tags.
Prove it
- Show a redacted event timeline where a bounce stops a sequence.
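"Stop rules that actually stop" reduces to a small state machine: stops are terminal and enforced at the event level, not via manual tags. A sketch, with event names as assumptions:

```python
# Hypothetical stop-rule enforcement: the first stop event is terminal.
STOP_EVENTS = {"hard_bounce", "spam_complaint", "unsubscribe", "negative_reply"}
PAUSE_EVENTS = {"auto_reply"}

def next_state(current: str, event: str) -> str:
    """Transition a sequence's state on an inbound event."""
    if current != "active":
        return current            # stopped/paused never silently resumes
    if event in STOP_EVENTS:
        return "stopped"
    if event in PAUSE_EVENTS:
        return "paused"
    return "active"
```

The evidence you want from a vendor is the event timeline version of this: bounce at 10:02, sequence stopped at 10:02, zero sends after.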
If you want a full operator-level standard for this, tie it back to “engagement-first outbound” concepts like throttling and stop rules. Chronic’s point of view lives here: 2026 Deliverability: The Engagement-First Outbound System.
Category 5: Governance (permissions, approvals, audit trails)
Agentic systems without governance are just fast mistakes with a confident tone.
Standards bodies have been blunt about governance and risk management for AI. NIST’s AI Risk Management Framework defines functions like GOVERN, MAP, MEASURE, and MANAGE, and pushes documentation and accountability as core mechanics, not “nice to have.”
ISO also released ISO/IEC 42001:2023, an AI management systems standard that formalizes governance expectations.
17) Can you restrict what the agent can do by role, object, and field?
Pass
- Permissions at field-level and action-level.
- Separate permissions for read, write, send, export.
Fail red flags
- Admin-only controls.
- All-or-nothing agent permissions.
Prove it
- Provide a permission matrix export.
18) Do you support approvals for risky actions?
Examples:
- Sending new sequences to a new segment
- Editing pipeline stages
- Bulk enrichment
- Auto-creating opportunities
Pass
- Approvals built-in with queues and timeouts.
Fail red flags
- “Just review it afterward.”
- Approvals only via Slack.
Prove it
- Show an approval workflow and an audit event that captures approval.
19) Do you maintain immutable audit logs for agent actions and human actions?
Pass
- Append-only audit logs.
- Exportable.
- Filterable by actor, object, time, action type.
Fail red flags
- Logs expire fast.
- No export.
Prove it
- Export 30 days of audit logs from a sandbox.
For deeper governance mechanics, align your questions with a control-plane mindset: The Agentic CRM Control Plane: Permissions, Approvals, and Audit Trails.
20) Can you run the agent in “suggest only” mode, then graduate to “auto”?
Pass
- Mode controls per workflow.
- Easy rollback.
Fail red flags
- It is either fully manual or fully autonomous.
Prove it
- Provide workflow settings showing suggestion vs auto modes.
Category 6: Reporting and attribution (prove pipeline, not activity)
If your CRM cannot prove what created pipeline, you will eventually cut the wrong program and keep the wrong one.
21) Do you provide multi-touch attribution for outbound and inbound together?
Pass
- Tracks touches across email, calls, meetings, and web conversions.
- Connects to opportunity and revenue.
Fail red flags
- Attribution based only on "last touch."
- Outbound sequences tracked separately from CRM pipeline.
Prove it
- Provide a sample attribution report with definitions.
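To sanity-check a vendor's attribution math, it helps to know what even the simplest multi-touch model looks like. Here is a linear split (equal credit per touch), purely as an illustration; real models weight by position, recency, or fitted contribution:

```python
# Hypothetical linear multi-touch split: equal credit across all touches.
def attribute(touches: list[str], revenue: float) -> dict[str, float]:
    """Split revenue evenly across touches, summed per channel."""
    credit = revenue / len(touches)
    out: dict[str, float] = {}
    for channel in touches:
        out[channel] = out.get(channel, 0.0) + credit
    return out
```

Whatever model the vendor uses, the report must state it in the definitions, which is why the "prove it" above asks for them.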
22) Can you answer: “What did the agent do last week that created meetings?”
Pass
- Agent activity report tied to outcomes: replies, meetings booked, opportunities created.
Fail red flags
- Activity metrics only (sends, opens).
- No outcome linking.
Prove it
- Share a report export with columns: lead, action, timestamp, outcome.
If you want a metric spine that operators actually use, connect this to an outbound measurement stack: The Outbound ROI Stack for 2026: 6 Metrics Your CRM Must Own.
23) Do you report deliverability health inside the CRM?
Minimum:
- Bounce rates
- Spam complaint rates
- Unsubscribe rates
- Inbox placement proxies
- Domain health indicators
Pass
- Alerting and thresholds.
Fail red flags
- “That’s your email tool’s job.”
- No spam complaint tracking.
Prove it
- Provide deliverability dashboard screenshots and alert configuration.
24) Can you track pipeline impact by segment, ICP, and signal?
This is where “AI” becomes practical. You need to know which signals actually convert.
Pass
- Segment reports by ICP filters and intent signals.
Fail red flags
- No way to slice outcomes by enrichment or scoring inputs.
Prove it
- Provide an example report: “Hiring signal leads vs baseline.”
Category 7: Integrations and writeback (no data islands)
If your CRM cannot write back cleanly to your systems, you will rebuild the truth in spreadsheets. Again.
25) Which integrations are native, and which are glue code?
Ask specifically about:
- Email and calendar
- Data warehouses
- Enrichment providers
- Ad platforms
- Calling
- Customer data platforms
Pass
- Native integrations for core systems.
- Clear API coverage.
Fail red flags
- “We integrate with everything” but it is all Zapier.
- No writeback, only sync in.
Prove it
- Provide an integration list with directionality: read, write, bi-directional.
26) Do integrations support writeback with conflict handling?
Two systems will disagree. The question is whether the CRM handles it like an adult.
Pass
- Field precedence rules and conflict logs.
Fail red flags
- Last write wins, silently.
- No conflict visibility.
Prove it
- Provide a sample conflict event and resolution policy.
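"Field precedence rules and conflict logs" can be sketched in a few lines. The source names and precedence order below are assumptions; the essential behavior is that the loser is logged, never silently discarded:

```python
# Hypothetical precedence-based merge: earlier sources win, losers are logged.
PRECEDENCE = ["crm", "enrichment", "warehouse"]  # most trusted first

def resolve(field: str, values: dict[str, str], log: list[str]) -> str:
    """values maps source -> proposed value; returns the winning value."""
    ranked = sorted(values, key=PRECEDENCE.index)
    winner = ranked[0]
    for loser in ranked[1:]:
        if values[loser] != values[winner]:
            log.append(f"conflict:{field}:{loser}={values[loser]} lost to {winner}")
    return values[winner]
```

"Last write wins, silently" is the opposite of this: no precedence, no log, no way to audit why a field changed.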
27) Can I export everything, including agent logs, scoring history, and field changes?
If you cannot export it, you do not own it.
Pass
- Bulk export or API for:
- Records
- Field history
- Scoring inputs and outputs
- Agent execution traces
- Audit logs
Fail red flags
- Export only “basic CRM fields.”
- No way to extract decisioning history.
Prove it
- Provide API docs or a sample export schema.
Pass/fail dealbreakers (print this and tape it to your monitor)
If any of these fail, do not buy.
Hard fails
- No exportable audit trail for agent actions.
- No deliverability controls beyond “send limit.”
- No SPF/DKIM/DMARC preflight checks, or only guidance that is not enforceable.
- No clear separation of fit vs intent scoring.
- No writeback to core objects; the agent lives in a chat window.
Soft fails that turn into hard fails
- “On the roadmap” for approvals.
- “We can build it for you” for stop rules.
- “Just use another tool” for reporting.
Vendor response template (copy/paste)
Use this to force clear answers.
For each question
- Answer: (1-3 paragraphs)
- Pass/Conditional/Fail: (pick one)
- Limits: (rate limits, objects, roles, extra fees)
- Evidence: (screenshots, exports, docs)
- Implementation time: (days, not quarters)
- Owner: (product, support, solutions)
If they cannot fill this in, they cannot run your revenue system.
Where Chronic fits (one line, no circus)
Chronic runs end-to-end outbound until the meeting is booked. It finds leads, enriches them, scores fit + intent, writes and sends emails, and books meetings. Pipeline on autopilot. Start your evaluation from the capabilities that actually move pipeline.
If you are comparing stacks, keep it clean:
- Salesforce buyers usually want governance, then regret the tool sprawl. Here: Chronic vs Salesforce
- HubSpot buyers want all-in-one, then hit per-seat pricing and limits. Here: Chronic vs HubSpot
- Apollo buyers want data and sending, then still bolt on five tools. Here: Chronic vs Apollo
- Attio buyers want a modern CRM, then ask where the outbound engine is. Here: Chronic vs Attio
FAQ
What’s the difference between an AI copilot and an agentic CRM?
A copilot suggests. An agent executes. An agentic CRM takes actions like updating fields, launching sequences, and routing leads, and it writes back with logs and governance. If it cannot act, it is not agentic. If it can act but cannot prove what it did, it is dangerous.
What is an “AI CRM features checklist” and why does it beat demos?
An AI CRM features checklist is a written evaluation of capabilities that matter in production: data hygiene, autonomous actions, scoring, deliverability controls, governance, reporting, integrations, and writeback. Demos show the happy path. Checklists expose the failure path, which is where real costs live.
Which deliverability requirements should an AI CRM meet in 2026?
At minimum, it should enforce or gate sending based on SPF, DKIM, and DMARC alignment, and support one-click unsubscribe to meet bulk mail expectations from major mailbox providers. Google and Yahoo enforcement began in 2024 for bulk senders, and Microsoft rolled out bulk sender requirements in 2025. Start with the vendor explainers and checklists here:
- https://use.valimail.com/rs/936-SWF-978/images/checklist_google_and_yahoo_compliance_2024.pdf
- https://www.klaviyo.com/marketing-resources/2024-google-yahoo-sender-requirements
- https://www.inboxally.com/docs/compliance-industry-updates/microsoft-outlook-changes-for-bulk-senders-in-2025/
How do I validate lead scoring without trusting a black box?
Require fit and intent separation. Require “why this lead, why now” explanations. Require editable weights or override rules. Then ask for 10 real examples with the signals that drove the score. If they cannot show that, the score is not operational.
What governance features are non-negotiable for AI agents in a CRM?
Three things: permissions, approvals, audit trails. Permissions must be role, object, and field level. Approvals must cover risky actions like bulk sends and pipeline changes. Audit trails must be exportable and show actor, timestamp, and change details. Use NIST AI RMF as a governance reference point: https://nvlpubs.nist.gov/nistpubs/ai/nist.ai.100-1.pdf
Can I buy an AI CRM if my data is a mess?
Yes, but only if the product automates hygiene. If the vendor says “import your CSV and clean it later,” expect scoring garbage, broken routing, and inaccurate attribution. Your first buying filter should be Category 1 in this checklist.
Send the checklist. Demand receipts. Buy the one that can prove it.
Email every vendor the 27 questions. Tell them you want written answers and evidence. No demos. No vibes. The winner is the platform that can:
- keep data clean without heroics,
- take autonomous actions with stop rules,
- protect deliverability by default,
- and prove every change with audit logs.
Everything else is theater.