What is Agentforce 360?

Agentforce 360 is the agent layer of the Salesforce Platform: it lets you build AI agents that don't just chat — they do work. An agent can look up an order, file a case, screen a job candidate or update a record, using your org's real data and automation, 24/7. This guide is the pillar: every section stands alone, cross-links run everywhere, and the deep-dive articles on our blog cluster around it.

If you're new to this: think of the difference between a chatbot and a capable new teammate. A classic chatbot follows a rigid decision tree you drew in advance. An Agentforce agent is given a goal, a set of tools (actions), and rules (instructions and guardrails) — then it figures out the steps itself, using a large language model to understand the request and the Atlas Reasoning Engine to plan what to do.

How a request actually flows

1. Utterance — the user says: "Where's my order 8842?"
2. Subagent selection — the reasoning engine matches the request to a subagent (e.g. Order Support). Terminology note: Salesforce recently renamed topics to "subagents" — same concept, no functional change; you'll see both words in docs and in this guide.
3. Plan — the engine looks at the subagent's available actions and builds a plan: look up the order → fetch shipping events → summarize. Planning is iterative: after each action it re-evaluates until the request is fulfilled (or it asks the user for more information).
4. Actions execute — Flows, Apex, prompt templates or API calls run against real org data.
5. Response — the LLM composes the answer, passed through the Einstein Trust Layer (data masking, toxicity checks, audit trail, zero data retention with external LLM providers).
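
The five steps above can be pictured as a small loop. What follows is a toy sketch in Python, not Salesforce's implementation: the keyword-based `select_subagent`, the `SUBAGENTS` registry and both stub actions are hypothetical stand-ins for what the LLM and the Atlas Reasoning Engine actually do.

```python
from typing import Dict, Optional

# Hypothetical stub actions; in a real org these would be Flows, Apex or API calls.
def look_up_order(state: Dict[str, str]) -> None:
    state["order_status"] = "Shipped"

def fetch_shipping_events(state: Dict[str, str]) -> None:
    state["last_event"] = "Out for delivery"

# Hypothetical registry: subagent name -> routing keywords and ordered actions.
SUBAGENTS: Dict[str, Dict[str, list]] = {
    "Order Support": {
        "keywords": ["order", "delivery", "shipping"],
        "actions": [look_up_order, fetch_shipping_events],
    },
}

def select_subagent(utterance: str) -> Optional[str]:
    """Step 2: match the request to a subagent (keyword match stands in for the LLM)."""
    text = utterance.lower()
    for name, spec in SUBAGENTS.items():
        if any(word in text for word in spec["keywords"]):
            return name
    return None

def handle(utterance: str) -> str:
    subagent = select_subagent(utterance)
    if subagent is None:
        return "Sorry, I can't help with that."
    state: Dict[str, str] = {}
    # Steps 3-4: run one action at a time and re-check the state after each.
    for action in SUBAGENTS[subagent]["actions"]:
        action(state)
        if "order_status" in state and "last_event" in state:
            break  # request fulfilled, stop planning
    # Step 5: compose the response (a real agent passes this through the Trust Layer).
    return f"Your order is {state['order_status']}: {state['last_event']}."
```

Calling `handle("Where's my order 8842?")` walks all five steps; an utterance with no matching keyword never reaches an action.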

The developer toolbox (official)

Per the official Get Started hub, these are the tools you'll meet in this guide:

Tool	What it's for	Where in this guide
Agentforce Builder	Low-code studio: create, configure, test agents	Lab 1
Agent actions	Flows, Apex invocables, Apex REST, AuraEnabled methods, named queries, prompt templates	Lab 2
Agent Script	Scripting language mixing natural language with programmatic control	Lab 3
Models API + Prompt Builder	Direct LLM access (Apex/REST) and grounded prompt templates	Models & Prompts
Agent API	REST endpoint to chat with agents headlessly	Agent API
Agentforce DX, Python SDK, Mobile SDKs	Pro-code build/test via CLI & VS Code, programmatic creation, iOS/Android chat	DX & SDKs
Testing Center / Testing API	Bulk-test agent behavior via UI, CLI or code	Testing
Multi-Agent Orchestration	Orchestrator + connected subagents working as a team	Multi-Agent
Data 360 + Intelligent Context	Unified data, zero-copy, RAG grounding for agents	Data & Grounding
MCP	Open protocol connecting agents to external tools, governed centrally	MCP

How this guide is organized

Series	Start at	Cluster deep-dives on the blog
Foundations	Core Concepts → Architecture → Trust Layer	Agentforce 360 explained
Builder & Agent Script	Agentforce Builder → Lab 3	Agent Script guardrails
Multi-Agent Orchestration	Agents as a Team	More cluster posts landing here soon
Data & Grounding	Data 360, Context & RAG	RAG knowledge agent build
MCP & Integrations	MCP → Agent API	MCP explained · Headless agents · DX MCP server
Developer corner	Models & Prompts → Lab 2	Models API from Apex · Prompt Builder hands-on
Quality & trust	Testing → Observability	More cluster posts landing here soon
Channels & rollout	Voice, Slack & Portals → Scenarios	AI Projects Lab

Sources: Agentforce Developer Guide — Get Started · Get Started with Agents

Core Concepts

Five building blocks cover almost everything in Agentforce. Learn these and the rest of the platform snaps into place.

Concept	What it is	Everyday analogy
Agent	The deployable assistant with a role, company context, channels (chat, Slack, voice, API) and a set of subagents.	The employee
Subagent (topic)	A job-to-be-done grouping: classification description, scope, instructions and the actions it may use.	A responsibility on the job description
Instructions	Natural-language rules inside a subagent: "Always verify the contact before discussing order details."	The training manual
Action	A tool the agent can invoke — Flow, Apex invocable, Apex REST, AuraEnabled method, named query, prompt template, API, or (beta) an MCP tool.	The systems the employee can use
Atlas Reasoning Engine	The planner that picks subagents, sequences actions and decides when to ask follow-up questions; configurable via hybrid reasoning.	The employee's judgment

Variables — the concept everyone skips (don't)

Agents also have variables: typed state that survives across turns and subagents. Two kinds matter (you'll use both in Lab 3):

Linked variables — bound to session context, e.g. @MessagingSession.Id or @MessagingEndUser.ContactId. The platform fills them; your agent reads them.
Mutable variables — the agent's own memory, e.g. currentInterviewStep = "Permission", updated with the built-in @utils.setVariables action. This is how you build state machines that an LLM can't wander out of.
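
To make the distinction concrete, here is a minimal Python sketch of the two kinds of variable. It is only an analogy: the `AgentVariables` class, its `set_variables` method and the placeholder contact id are hypothetical, and the real platform manages this state for you.

```python
from dataclasses import dataclass, field
from types import MappingProxyType
from typing import Dict, Mapping

@dataclass
class AgentVariables:
    # Linked variables: filled by the platform, read-only for the agent.
    linked: Mapping[str, str]
    # Mutable variables: the agent's own memory, declared with defaults.
    mutable: Dict[str, str] = field(default_factory=dict)

    def set_variables(self, **updates: str) -> None:
        """Stand-in for @utils.setVariables: only declared mutable names may change."""
        for name, value in updates.items():
            if name not in self.mutable:
                raise KeyError(f"unknown mutable variable: {name}")
            self.mutable[name] = value

session = AgentVariables(
    linked=MappingProxyType({"ContactId": "003-hypothetical-id"}),  # placeholder id
    mutable={"currentInterviewStep": "Permission"},
)
session.set_variables(currentInterviewStep="Eligibility")
```

The read-only mapping mirrors the idea that the agent reads linked variables but never writes them, while every change to its own memory goes through one explicit update call.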

Agent types & security model

Service agents — customer-facing (web chat, WhatsApp, voice). Run as a designated agent user, so you control exactly what data they can reach.
Employee agents — internal helpers (often in Slack). They inherit the current user's permissions — an employee can never see more through the agent than they could themselves.
Headless agents — any agent invoked via the Agent API from your own app or backend.

Security in one line: employee agents = user's permissions; service agents = assigned agent user's permissions; every LLM call goes through the Einstein Trust Layer. Lab 2 shows the concrete permission setup this implies.

Set Up Your Org

You can do every lab in this guide for free: Developer Edition orgs include Agentforce and Data 360 (formerly Data Cloud).

Get an org

Sign up at developer.salesforce.com/signup and choose the option that includes Agentforce, or use a sandbox of a licensed org.

Enable Einstein

Setup → Einstein Setup → turn on Einstein. This activates the generative AI platform behind agents, prompt templates and the Models API.

Enable Agentforce

Setup → Agentforce Agents → toggle on. You now have the agent list and Agentforce Builder.

Assign permissions

Builders need the Agentforce admin permissions; a service agent also needs its own agent user whose profile/permission sets define what the agent can touch (you'll configure this hands-on in Lab 2).

Open Agentforce Builder

Create an agent and you land in Builder: subagents and actions on the left, live test conversation on the right — you can also describe the agent in natural language and let Builder scaffold it.

Admins: you don't need code for Lab 1. Developers: keep your org handy — Labs 2 and 3 build on the same setup.

Lab 1 — Your First Agent, No Code

Level: Beginner / Admin · ~45 minutes · Tools: Agentforce Builder + Flow

The scenario: Falcon Retail's #1 case type is "where is my order?". We'll build a service agent that answers it from real data — the same starter pattern retailers use in production to deflect their highest-volume cases.

Create the agent

In Agentforce Builder, create a new service agent. Role: "You help customers of Falcon Retail with order questions." Set the tone to friendly-professional. Resist adding more jobs — one narrow job is the fastest path to a working agent.

Add the subagent (topic)

Label:                Order Status
Classification desc:  Handle questions about the status, location or
                      delivery date of a customer's existing order.
Scope:                Only discuss orders belonging to the verified customer.
Instructions:
  - Always verify the customer's identity before sharing order details.
  - If the order number is missing, ask for it — never guess.
  - If the order is delayed, apologize and offer to create a case.
  - Never promise delivery dates that aren't in the data.

Write instructions like rules for a new hire: explicit, testable, and clear about what the agent must never do. The classification description is what routing runs on — keep it distinct from any other subagent.

Build the Flow the agent will call

Create an autolaunched flow named Get Order Status:

Input variable: orderNumber (Text, available for input).
Get Records: Order where OrderNumber = {!orderNumber}, first record only.
Output variables: orderStatus (Text), expectedDelivery (Text), orderFound (Boolean) — set via Assignment; when no record is found set orderFound = false instead of failing.

The golden rule of agent-friendly automation: return misses as data (orderFound=false), not as errors. The agent can then apologize and ask again instead of breaking its plan.
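
The same rule, sketched in Python for readers who think in code. This is an illustration of the pattern rather than anything Flow generates: the `ORDERS` table, its sample values and the `OrderStatusResult` type are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical in-memory table standing in for the Order object.
ORDERS: Dict[str, Dict[str, str]] = {
    "8842": {"status": "Shipped", "expected_delivery": "2025-01-15"},
}

@dataclass
class OrderStatusResult:
    order_found: bool
    order_status: Optional[str] = None
    expected_delivery: Optional[str] = None

def get_order_status(order_number: str) -> OrderStatusResult:
    """Return a miss as data (order_found=False) instead of raising."""
    record = ORDERS.get(order_number.strip())
    if record is None:
        return OrderStatusResult(order_found=False)
    return OrderStatusResult(True, record["status"], record["expected_delivery"])
```

A caller can branch on `order_found` and keep the conversation going, which is exactly what the agent does with the Flow's Boolean output.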

Attach the Flow as an action

In the subagent, add an action referencing your flow. Now the part beginners skip and experts obsess over — the descriptions. The reasoning engine literally reads these to decide when and how to use your tool:

Action label:        Get Order Status
Action description:  Looks up one order by its order number and returns
                     its current status and expected delivery date.
Input orderNumber:   The customer's order number, e.g. 8842. Required.
Output orderStatus:  The order's current status (e.g. Shipped, Processing).
Output expectedDelivery: The expected delivery date stored on the order.
Output orderFound:   False when no order matches the number.

Test like you mean it

You type	Pass criteria
"Where's order 8842?"	Calls the flow, answers from real data, no invented details
"my stuff hasn't arrived"	Asks for the order number instead of guessing
"order 99999" (doesn't exist)	Politely says it can't find it, offers to re-check the number
"what's your refund policy?"	Declines / hands off — out of this subagent's scope

Checkpoint: agent → subagent → instructions → action → test. That loop is 80% of Agentforce. Everything after this lab is a more powerful version of one of those five things.

Lab 2 — Custom Apex Action

Level: Developer · ~60 minutes · Tools: Apex + Agentforce Builder

The scenario (from Salesforce's official example): a furniture retailer's customers ask compound questions like "What chairs do you have under $400 that are white or grey?" — category + colors + price in one utterance. Flow struggles with multi-criteria search; Apex doesn't. We'll build the custom inventory-search action the official docs describe, end to end including the security setup most tutorials forget.

Create the invocable Apex class

Setup → Apex Classes → New. This implementation follows the official example's inputs (category, color, max price) and the best practices from the Agentforce Apex actions guide; instead of raw Product2 records it returns a readable summary built from standard-pricebook PricebookEntry rows:

public with sharing class ProductInventorySearch {

    @InvocableMethod(
        label='Search Furniture Inventory'
        description='Searches the product inventory for furniture matching a category, an optional list of colors, and a maximum price. Returns the matching products with their prices.'
    )
    public static List<SearchResult> search(List<SearchInput> inputs) {
        List<SearchResult> results = new List<SearchResult>();

        for (SearchInput input : inputs) {
            SearchResult res = new SearchResult();
            try {
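                // Read the inputs; every value reaches SOQL as a bind variable
                String category = input.category;
                Decimal maxPrice = input.maxPrice;
                // Split on commas and trim each entry so multi-word colors survive
                List<String> colors = new List<String>();
                if (String.isNotBlank(input.colors)) {
                    for (String rawColor : input.colors.split(',')) {
                        if (String.isNotBlank(rawColor)) {
                            colors.add(rawColor.trim());
                        }
                    }
                }

                // SOQL cannot evaluate :colors.isEmpty() as a condition, so branch
                // in Apex and apply the color filter only when colors were given
                List<PricebookEntry> entries;
                if (colors.isEmpty()) {
                    entries = [
                        SELECT Product2.Name, Product2.Family,
                               Product2.Color__c, UnitPrice
                        FROM PricebookEntry
                        WHERE Pricebook2.IsStandard = true
                          AND Product2.IsActive = true
                          AND Product2.Family = :category
                          AND UnitPrice <= :maxPrice
                        WITH USER_MODE
                        ORDER BY UnitPrice ASC
                        LIMIT 20
                    ];
                } else {
                    entries = [
                        SELECT Product2.Name, Product2.Family,
                               Product2.Color__c, UnitPrice
                        FROM PricebookEntry
                        WHERE Pricebook2.IsStandard = true
                          AND Product2.IsActive = true
                          AND Product2.Family = :category
                          AND Product2.Color__c IN :colors
                          AND UnitPrice <= :maxPrice
                        WITH USER_MODE
                        ORDER BY UnitPrice ASC
                        LIMIT 20
                    ];
                }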

                if (entries.isEmpty()) {
                    res.found = false;
                    res.summary = 'No products match those criteria.';
                } else {
                    res.found = true;
                    List<String> lines = new List<String>();
                    for (PricebookEntry pe : entries) {
                        lines.add(pe.Product2.Name + ' (' + pe.Product2.Color__c +
                                  ') — $' + pe.UnitPrice);
                    }
                    res.summary = String.join(lines, '; ');
                }
            } catch (Exception ex) {
                // Errors go back as data so the agent can recover gracefully
                res.found = false;
                res.summary = 'Search failed: ' + ex.getMessage();
            }
            results.add(res);
        }
        return results;
    }

    public class SearchInput {
        @InvocableVariable(required=true label='Category'
            description='Product family to search, e.g. Chairs, Tables, Sofas')
        public String category;
        @InvocableVariable(label='Colors'
            description='Optional comma-separated list of acceptable colors, e.g. "White, Grey"')
        public String colors;
        @InvocableVariable(required=true label='Maximum Price'
            description='Highest acceptable unit price in USD')
        public Decimal maxPrice;
    }

    public class SearchResult {
        @InvocableVariable(label='Found' description='Whether any products matched')
        public Boolean found;
        @InvocableVariable(label='Results Summary'
            description='Readable list of matching products with colors and prices')
        public String summary;
    }
}

Wire it to the agent

Create a subagent named Search Inventory. The official example is deliberately forceful in its instructions — copy that energy:

Instructions:
  You MUST call the "Search Furniture Inventory" action for every
  product-related question. Never answer product questions from memory.
  If the customer gives no price limit, ask for their budget.

Add the action with Reference Action Type = Apex, category Invocable Method, and select your class.

The security setup everyone forgets

Give the agent user read access to Product2, PricebookEntry and Pricebook2 (permission set or profile).
Add the class to the agent's Einstein Agent profile Apex class access (Setup → Profiles → enabled Apex classes).
Because the class runs with sharing + WITH USER_MODE, whatever the agent user can't see, the agent can't leak. That's the design, not an accident.

Test with the official utterance

Ask: "What chairs do you have under $400 that are white or grey?" — watch the planner extract category=Chairs, colors=White,Grey, maxPrice=400 from one sentence and call your method. Then try to break it: no budget given, misspelled color, a category you don't sell.
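
Before you start breaking things, it helps to write down what the action is expected to return for each edge case. The Python sketch below mirrors the intended matching rules (category equality, price ceiling, optional color list) against a hypothetical four-row catalog; the names `parse_colors`, `search` and `CATALOG` are illustrative only, and the comparison here is case-sensitive for simplicity.

```python
from typing import Dict, List

def parse_colors(raw: str) -> List[str]:
    """Split a comma-separated color string, trimming blanks and empty entries."""
    return [part.strip() for part in (raw or "").split(",") if part.strip()]

# Hypothetical catalog rows: (name, family, color, unit price in USD).
CATALOG = [
    ("Oslo Chair", "Chairs", "White", 350.0),
    ("Bergen Chair", "Chairs", "Light Grey", 380.0),
    ("Lund Chair", "Chairs", "Black", 300.0),
    ("Malmo Table", "Tables", "White", 390.0),
]

def search(category: str, colors: str, max_price: float) -> Dict[str, object]:
    wanted = parse_colors(colors)
    hits = [
        row for row in CATALOG
        if row[1] == category
        and row[3] <= max_price
        and (not wanted or row[2] in wanted)  # no colors given means no color filter
    ]
    hits.sort(key=lambda row: row[3])  # cheapest first
    if not hits:
        return {"found": False, "summary": "No products match those criteria."}
    summary = "; ".join(
        f"{name} ({color}): ${price:.2f}" for name, _, color, price in hits
    )
    return {"found": True, "summary": summary}
```

Running it with a variant spelling such as "Gray", or with a category that is not in the catalog, returns `found` as False, which is the behavior to confirm in the Builder test panel.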

Sources: Official example: Custom Apex action for inventory search · Best practices for Agentforce Apex actions · Get Started with Actions (also covers Apex REST, AuraEnabled, and Named Query action types)

Lab 3 — Agent Script, For Real

Level: Developer / Architect · ~90 minutes · Tools: Agent Script + Agentforce DX or Builder

Agent Script is the language for building agents with hybrid reasoning: natural-language instructions (after a |) for conversation, deterministic expressions (after a ->) for business rules. The code below is taken from Salesforce's official InterviewAgent sample — a candidate-screening agent that must ask questions in a fixed order no matter how the conversation meanders. Real syntax, not pseudocode.

Anatomy of a .agent file

system:
    instructions: "You are an interview screening agent for Wonderland that evaluates candidates and routes outcomes."
    messages:
        welcome:|
            Hi, I'm an AI assistant. How can I help you?
        error: "Sorry, it looks like something has gone wrong."

config:
    developer_name: "InterviewAgent"
    agent_label: "InterviewAgent"
    description: "An interview screening agent that evaluates candidates and passes eligible applicants to human resources."

variables:
    ContactId: linked string
        source: @MessagingEndUser.ContactId
        description: "The verified contact for this messaging session"
    currentInterviewStep: mutable string = "Permission"
        description: "Tracks the current interview step across topics"

system: — the agent's persona and canned messages.
config: — identity. Gotcha from the docs: every agent needs a unique developer_name, or deployments collide.
variables: — linked variables bind to session context (note the source:); mutable variables are agent memory with defaults.

The router: deterministic first, LLM second

start_agent topic_selector:
    label: "Topic Selector"
    description: "Welcome the user and determine the appropriate topic"

    reasoning:
        instructions: ->
            if @variables.currentInterviewStep == "Permission":
                transition to @topic.permission
            if @variables.currentInterviewStep == "Eligibility":
                transition to @topic.eligibility
            if @variables.currentInterviewStep == "Salary":
                transition to @topic.salary
            if @variables.currentInterviewStep == "End":
                transition to @topic.end_interview

            | If no step matches, welcome the user and determine
              the appropriate topic based on their input.

        actions:
            go_to_permission: @utils.transition to @topic.permission
                description: "Transition to confirm the candidate's right to work."
            go_to_eligibility: @utils.transition to @topic.eligibility
                description: "Transition to confirm minimum eligibility."

Read that carefully — it's the whole philosophy in one block. The -> conditions run programmatically: if the step variable says "Salary", the agent goes to the salary topic, full stop — no LLM negotiation. Only when no rule fires does the | natural-language instruction hand control to the model.
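
If Agent Script is new to you, the same control flow may be easier to see in a general-purpose language. This Python sketch only illustrates the rules-first, model-second idea; `STEP_TO_TOPIC`, `route` and `fake_llm` are hypothetical names, and the fallback callable stands in for the model.

```python
from typing import Callable, Dict, List, Optional

# Deterministic rules: step value -> topic name, mirroring the -> conditions.
STEP_TO_TOPIC: Dict[str, str] = {
    "Permission": "permission",
    "Eligibility": "eligibility",
    "Salary": "salary",
    "End": "end_interview",
}

def route(current_step: Optional[str], llm_fallback: Callable[[], str]) -> str:
    """Return the next topic: rules first, model only when no rule fires."""
    topic = STEP_TO_TOPIC.get(current_step or "")
    if topic is not None:
        return topic          # the model is never consulted on this path
    return llm_fallback()     # stands in for the | natural-language branch

calls: List[str] = []

def fake_llm() -> str:
    # Hypothetical stand-in that records when the model was asked.
    calls.append("asked")
    return "permission"
```

With the step set to "Salary", `route` returns the salary topic and `calls` stays empty; only an unknown step reaches `fake_llm`.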

A step topic: LLM freedom inside hard rails

topic permission:
    label: "Permission"
    description: "Confirm the candidate has the legal right to work."

    reasoning:
        instructions: ->
            | Confirm whether the candidate has the legal right to work.
              Ask a clear yes or no question and allow the candidate to
              provide context if needed.
              If the candidate is eligible, call
              {!@actions.setCurrentInterviewStep} with
              currentInterviewStep set to "Eligibility".
              If the candidate is NOT eligible, call
              {!@actions.setCurrentInterviewStep} with
              currentInterviewStep set to "End".

        actions:
            setCurrentInterviewStep: @utils.setVariables
                description: "Set the CurrentInterviewStep variable"
                with currentInterviewStep = ...

The LLM converses freely inside the topic — rephrasing, handling context, accepting "yeah I can work here" as a yes. But the only exits are through @utils.setVariables, and the router's hard conditions decide what that means. The candidate cannot sweet-talk their way to the salary question.

The two patterns to steal

Step-variable state machine (this sample): a mutable variable + a deterministic router = enforced sequence with natural conversation at each step. Use for interviews, KYC, guided troubleshooting, multi-step approvals.
Router + specialist subagents (official Customer Support example): a thin start_agent exposing transition tools to an Identity Verification subagent (request email → send code → validate code) and an Order Management subagent. Verification state lives in variables, so order data is unreachable until identity is confirmed.
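
The second pattern comes down to a flag that only a deterministic check can flip. A minimal Python sketch of that gate follows, assuming a hypothetical `SupportSession` class, a made-up one-time code and canned order data; it models the idea, not the official example's code.

```python
from typing import Dict

class SupportSession:
    """Toy model of verification state held in variables, not in chat history."""

    def __init__(self, expected_code: str) -> None:
        self._expected_code = expected_code  # hypothetical one-time code
        self.variables: Dict[str, bool] = {"isVerified": False}

    def validate_code(self, code: str) -> bool:
        # Only an exact match flips the flag; persuasion in the chat cannot.
        self.variables["isVerified"] = (code == self._expected_code)
        return self.variables["isVerified"]

    def get_order(self, order_number: str) -> str:
        if not self.variables["isVerified"]:
            return "Please verify your identity first."
        return f"Order {order_number}: Shipped"  # hypothetical data
```

Because `get_order` checks the variable rather than anything the user said, order data stays out of reach until `validate_code` has succeeded.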

Field notes from the docs: unique developer_name per agent; if the deploy errors on the last line of your script, append a blank line or comment (known quirk). Get runnable examples from the Agent Script Recipes sample app.

Sources: Agent Script — Agentforce Developer Guide · Official example: subagent sequencing with variables (InterviewAgent) · Official example: customer support with identity verification

Models API & Prompt Builder

Sometimes you don't need a whole agent — you need one AI capability inside your own logic. Two tools cover that:

Models API — raw LLM access, trust included

Text generation, chat and embeddings against models from OpenAI, Anthropic and Google — via Apex, REST, LWC or Flow — always through the Einstein Trust Layer. You can also bring your own LLM (Azure OpenAI, Vertex, Bedrock) or plug custom models in via the LLM Open Connector, and experiment in the Playground before shipping.

// Apex (aiplatform namespace)
aiplatform.ModelsAPI.createGenerations_Request request =
    new aiplatform.ModelsAPI.createGenerations_Request();
request.modelName = 'sfdc_ai__DefaultOpenAIGPT4OmniMini';

aiplatform.ModelsAPI_GenerationRequest body =
    new aiplatform.ModelsAPI_GenerationRequest();
// groundingData is a String you assembled earlier, e.g. fields from an Account query
body.prompt = 'Summarize this account in two sentences: ' + groundingData;
request.body = body;

aiplatform.ModelsAPI.createGenerations_Response response =
    new aiplatform.ModelsAPI().createGenerations(request);
System.debug(response.Code200.generation.generatedText);

// REST equivalents:
// POST https://api.salesforce.com/einstein/platform/v1/models/{model}/generations
// POST https://api.salesforce.com/einstein/platform/v1/models/{model}/chat-generations
// POST https://api.salesforce.com/einstein/platform/v1/models/{model}/embeddings
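
For callers outside Apex, the generations endpoint can be reached from any HTTP client. The Python sketch below only builds the request and does not send it. Treat the details as assumptions to check against the official REST reference: the single-field `prompt` body, the bearer-token header and the placeholder token are illustrative, and the live API may require additional headers that are omitted here.

```python
import json
import urllib.request
from urllib.parse import quote

BASE_URL = "https://api.salesforce.com/einstein/platform/v1/models"

def build_generation_request(
    model: str, prompt: str, access_token: str
) -> urllib.request.Request:
    """Build (but do not send) a POST to the generations endpoint."""
    url = f"{BASE_URL}/{quote(model, safe='')}/generations"
    body = json.dumps({"prompt": prompt}).encode("utf-8")  # assumed body shape
    return urllib.request.Request(
        url,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {access_token}",
            "Content-Type": "application/json",
        },
    )

req = build_generation_request(
    "sfdc_ai__DefaultOpenAIGPT4OmniMini",
    "Summarize this account.",
    "PLACEHOLDER_TOKEN",  # hypothetical; obtain a real token via OAuth
)
```

Passing `req` to `urllib.request.urlopen` would perform the call; keeping construction separate makes the request easy to inspect and unit-test first.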

Prompt Builder — reusable grounded prompts

Prompt templates merge real CRM data into prompts via merge fields — record fields, related lists, flows and Apex as data providers — and run from Flow, Apex, LWC, record pages or as agent actions. The workflow is always: create → ground → test against real records in the preview → activate.

Template type	Built for
Sales Email	Personalized outreach grounded in contact/opportunity data
Field Generation	Filling one record field with AI (summary fields, descriptions) from a record page button
Record Summary	Summarizing a record and its related data for Agentforce to present
Flex	Everything else: you define the inputs (up to five at the time of writing) and use it anywhere

Model choice is configurable per template — Salesforce-managed models (OpenAI, Anthropic, Google) or your own via BYOLLM — so admins can switch models without touching the prompt. Rule of thumb: Prompt Builder when admins should own the prompt text; Models API when the prompt is an implementation detail of your code.

Hands-on builds using both live in our AI Projects Lab: an AI case summarizer (Prompt Builder + Flow) and an account-brief generator (Models API in Apex).

Sources: Models and Prompts — Get Started · Access Models API with Apex

Agent API — Headless Agents

The Agent API is a REST interface to any agent: start a session, exchange messages (synchronously or streamed over SSE), end the session. Your agent's subagents, actions and guardrails all still apply — only the front end is yours. One agent can serve your portal, a WhatsApp bot and a nightly automation.

# 1. Start a session
POST https://api.salesforce.com/einstein/ai-agent/v1/agents/{agentId}/sessions

# 2. Exchange messages (sync, or SSE streaming)
POST https://api.salesforce.com/einstein/ai-agent/v1/sessions/{sessionId}/messages

# 3. End the session
DELETE https://api.salesforce.com/einstein/ai-agent/v1/sessions/{sessionId}

Auth: External Client App, client-credentials flow, scopes including chatbot_api and sfap_api; mint tokens against your My Domain.
Not supported for the "Agentforce (Default)" agent type — use a custom agent, activated.
Full setup with cURL examples and a real portal integration: our Headless Agentforce walkthrough.
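
As a rough orientation, the token request and the three session calls can be sketched in Python. Nothing is sent here, and request bodies for the session calls are deliberately left out because their exact shape should come from the official reference. The helper names, the example My Domain host and the credentials are hypothetical; the token path is the standard Salesforce OAuth endpoint.

```python
import urllib.request
from typing import List, Tuple
from urllib.parse import urlencode

API_BASE = "https://api.salesforce.com/einstein/ai-agent/v1"

def build_token_request(
    my_domain: str, client_id: str, client_secret: str
) -> urllib.request.Request:
    """Client-credentials token request against the org's My Domain."""
    form = urlencode({
        "grant_type": "client_credentials",
        "client_id": client_id,
        "client_secret": client_secret,
    }).encode("utf-8")
    return urllib.request.Request(
        f"https://{my_domain}/services/oauth2/token", data=form, method="POST"
    )

def session_lifecycle(agent_id: str, session_id: str) -> List[Tuple[str, str]]:
    """The start, message and end calls as (method, url) pairs."""
    return [
        ("POST", f"{API_BASE}/agents/{agent_id}/sessions"),
        ("POST", f"{API_BASE}/sessions/{session_id}/messages"),
        ("DELETE", f"{API_BASE}/sessions/{session_id}"),
    ]

token_req = build_token_request("example.my.salesforce.com", "my-id", "my-secret")
```

In a real integration the session id comes back from the first call, so the second and third URLs are built only after the session has started.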

Source: Agent API — Get Started

Agentforce DX & SDKs

Tool	What you do with it
Agentforce DX	Build and test agents pro-code style: Salesforce CLI + VS Code, agent metadata in Git, scripted tests in CI/CD — agents become just another artifact in your ALM.
Agentforce Python SDK	Create and manage agents programmatically from Python (official GitHub repo) — useful for generating agents from templates at scale.
Mobile SDKs	Agentforce Mobile SDK and Enhanced In-App Chat SDK put your agent inside native iOS/Android apps.
MCP client	Agents consuming external tools over the Model Context Protocol — now covered in depth in its own section.

For AI-assisted development (your coding assistant querying your org, running tests), see the open-source DX MCP server — hands-on in this Lab article.

Testing & Deployment

Agents are probabilistic — so testing isn't a phase, it's the job. Salesforce gives you four official layers:

Builder test panel — interactive: watch which subagent matched and which actions fired for each utterance, with the plan visible.
Agentforce Testing Center — bulk-run suites of test utterances against expected subagents/actions/responses; your regression net for instruction changes.
Testing API — the same evaluations, callable from code — wire agent tests into CI like unit tests.
Agentforce DX — CLI-driven test runs plus agent metadata in version control.

Deployment reality check: agents are metadata — deploy with your normal ALM. Treat instruction text like code: review it in PRs. A one-line instruction edit can change behavior as much as a code change; your Testing Center suite is what catches it.

Build your test suite from reality: pull the top 50 real utterances from case descriptions and chat logs, write the expected subagent + action for each, and run the suite before every release. Teams that do this ship agents that survive contact with customers.
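
The shape of such a suite is simple enough to sketch. The Python below is a conceptual model, not the Testing Center format or the Testing API: `UtteranceCase`, `run_suite` and the keyword-based `stub_agent` are hypothetical, and a real run would send each utterance to your agent.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

@dataclass(frozen=True)
class UtteranceCase:
    utterance: str
    expected_subagent: str
    expected_action: str

# An agent under test is modeled as: utterance -> (subagent, action).
AgentFn = Callable[[str], Tuple[str, str]]

def run_suite(agent: AgentFn, cases: List[UtteranceCase]) -> List[str]:
    """Return one failure message per mismatching case (empty list = all pass)."""
    failures = []
    for case in cases:
        subagent, action = agent(case.utterance)
        if (subagent, action) != (case.expected_subagent, case.expected_action):
            failures.append(
                f"{case.utterance!r}: expected "
                f"{case.expected_subagent}/{case.expected_action}, "
                f"got {subagent}/{action}"
            )
    return failures

def stub_agent(utterance: str) -> Tuple[str, str]:
    # Hypothetical stand-in; a real run would call your deployed agent.
    if "order" in utterance.lower():
        return ("Order Status", "Get Order Status")
    return ("Off Topic", "None")

CASES = [
    UtteranceCase("Where's order 8842?", "Order Status", "Get Order Status"),
    UtteranceCase("track my package", "Order Status", "Get Order Status"),
]
```

Here the second case fails because the stub routes on the word "order" alone, which is the kind of gap a suite built from real utterances tends to expose.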

Real-World Scenarios

Four patterns from actual implementations, at increasing depth:

Admin level

Case deflection for a subscription business

A streaming service gets thousands of "how do I change my plan?" cases. An admin builds a service agent with two subagents — Plan Questions (grounded in knowledge articles via a Data Library) and Plan Changes (one Flow action that updates the subscription after confirmation). No code. FAQ cases get answered instantly; humans get the edge cases, with transcripts attached.

Admin → Developer

HR screening bot (the InterviewAgent pattern)

A staffing agency screens hundreds of nursing candidates weekly. Lab 3's step-variable state machine runs the exact sequence compliance requires — work authorization → certification → availability → salary — while the LLM keeps each exchange human. Passing candidates route to HR with a structured summary; the sequence cannot be gamed because transitions are deterministic.

Developer level

Field-service dispatcher assistant

A utilities company builds an employee agent in Slack: "who's the nearest available tech with gas certification?" An Apex action queries Service Territory + skills data; a second books the appointment. Because it's an employee agent, it runs with the dispatcher's own permissions — no data-leak surface. Dispatchers stop juggling three screens per assignment.

Architect level

Cross-system returns desk: MCP + Agent Script + headless

A retailer's returns agent checks the order (Salesforce), the warehouse (SAP via MuleSoft MCP server) and the courier's API — and runs inside the company's existing React portal via the Agent API. Agent Script enforces refund eligibility deterministically; MCP makes the external tools swappable without touching the agent; identity verification gates everything, Customer-Support-example style. This is the full modern stack in one build.

Best Practices

Instructions & subagents

One subagent = one job. Overloaded subagents confuse classification; split them.
Write instructions as testable rules ("Always X before Y"), not vibes ("be helpful").
State the negative space explicitly: what the agent must refuse or hand off.
Keep classification descriptions distinct across subagents — routing runs on them.
Be forceful where it matters: Lab 2's inventory instructions, following the official example, say "You MUST call the "Search Furniture Inventory" action for every product-related question".

Actions & security

Labels and descriptions are your API contract with the reasoning engine — write them like documentation for a junior developer, and keep Apex annotations consistent with action config.
with sharing + WITH USER_MODE + minimal field exposure, always. Then do the agent-user permission setup from Lab 2.
Return errors and misses as data, not exceptions — agents recover from data, they break on exceptions.
Deterministic sequences belong in Agent Script conditions or composite actions — never in hopeful prose.

State & rollout

Anything that must be remembered reliably goes in a variable, not in conversation context — context gets compressed; variables don't.
Ship narrow, measure, expand. One great subagent beats five mediocre ones.
Build the Testing Center suite before go-live, from real logs; review escalation transcripts weekly — they're your backlog.

Learning Path by Role

You are…	Do this, in order	You're ready when…
Admin / beginner	Core Concepts → Architecture → Trust Layer → Set Up → Lab 1 → Testing (Builder panel + Testing Center) → the case-deflection scenario	You can ship a one-subagent service agent with a Flow action and a real test suite.
Developer	Everything above → Lab 2 → Models & Prompts → Lab 3 → Builder pro-code view → Agent API → the dispatcher scenario → AI Projects Lab builds	You can build custom actions with correct security, call LLMs from Apex, and read/write Agent Script.
Architect	Everything above → Multi-Agent Orchestration → Data & Grounding → MCP → Observability → the returns-desk scenario → design review: variables-as-state, identity gating, registry governance, evaluation suites in CI	You can defend where determinism lives (script), where flexibility lives (LLM), and how every integration is governed.

Keep going: Agentforce 360 explained · MCP integration · Headless Agent API · official Agentforce examples library.

Architecture: Atlas, Agent Anatomy & Editions

Everything an agent does passes through the same pipeline. Knowing the pipeline is what separates "I configured an agent" from "I can explain why it did that" — the question every architect gets asked in week two.

The Stack, Layer by Layer

Layer	Job	Where you shape it
Channels	Web chat, Slack, voice, WhatsApp, Experience Cloud, Agent API	Channels
Atlas Reasoning Engine	Interprets the utterance, picks a subagent, plans and sequences actions, decides when to ask follow-ups; supports hybrid reasoning (deterministic rules + LLM judgment)	Instructions, classification descriptions, Agent Script
Subagents & actions	The agent's jobs and tools	Core Concepts, Lab 2
Trust Layer	Masking, guardrails, audit on every LLM round-trip	Trust Layer
Data foundation	CRM data, Data 360, Intelligent Context / RAG retrieval	Data & Grounding
Models	The LLMs behind reasoning and generation — Salesforce-hosted or your own (BYOLLM)	Models & Prompts

Standard Agent Types Compared

Agent type	Audience	Typical jobs	Runs as
Service Agent	Customers	Case deflection, order status, returns, appointment booking	Designated agent user (you scope its permissions)
Sales Agent — SDR	Prospects	Lead engagement, qualification, meeting booking	Designated agent user
Sales Agent — Coach	Sales reps	Role-play practice, pitch feedback grounded in your CRM	The rep (employee pattern)
Employee Agent	Your staff	IT/HR helpdesk, data lookups, internal workflows — often in Slack	The current user (inherits their permissions)

Standard types ship with prebuilt subagents and actions you extend; fully custom agents start blank. Either way the anatomy is identical — which is why the labs transfer to every type.

Editions & Consumption in One Minute

Agentforce pricing is consumption-based: usage draws down flex credits instead of relying on classic per-seat licensing, so cost scales with conversations and actions. Treat that as a design input, not just a billing detail.
Design consequence: wasteful agents cost real money. Tight entry classification, grounded answers (fewer retries), and deterministic Agent Script paths all reduce consumption — see Observability for monitoring it.
Developer Edition orgs include enough to run every lab in this guide for free; check the official pricing pages for current commercial terms rather than any blog (including this one).
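
To make the design consequence concrete, here is a back-of-envelope sketch. Every number in it (conversation volume, actions per conversation, credits per action) is a hypothetical placeholder, not Salesforce pricing, and the simple per-action model is an assumption; substitute figures from the official pricing pages.

```python
# Back-of-envelope consumption estimate.
# All numbers are hypothetical placeholders, not Salesforce pricing.

def monthly_credits(conversations: int, actions_per_conversation: float,
                    credits_per_action: float) -> float:
    """Credits drawn down per month under a simple per-action model (assumed)."""
    return conversations * actions_per_conversation * credits_per_action

baseline = monthly_credits(10_000, 5.0, 20.0)   # agent that retries and wanders
tightened = monthly_credits(10_000, 3.0, 20.0)  # tighter classification, fewer retries
savings = baseline - tightened

print(f"baseline={baseline:.0f} tightened={tightened:.0f} saved={savings:.0f}")
```

The point is the shape of the arithmetic rather than the values: trimming two unnecessary actions per conversation cuts this hypothetical bill by 40%.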

Sources: Agentforce Developer Guide · Agentforce (product)

The Einstein Trust Layer

Every generative call in Agentforce — agent reasoning, prompt templates, the Models API — runs through the Trust Layer. It's the reason "we can't send customer data to an LLM" stopped being a blocker; learn what each stage actually does so you can answer security review questions precisely instead of hand-waving.

What Happens to Every Prompt

Secure data retrieval: grounding data is fetched under the running user's (or agent user's) permissions — FLS and sharing apply before anything reaches a model.
Data masking: PII (names, emails, card numbers and other configurable patterns) is replaced with placeholders before the prompt leaves the trust boundary, and re-inserted in the response after it returns.
Zero data retention: agreements with external model providers mean prompts and responses aren't stored or used for training by the provider.
Toxicity & safety scanning: responses are scored for harmful content before the user sees them.
Audit trail: prompts, responses, masking events and safety scores are captured — feedback and audit data can flow into Data 360 for reporting, which is how compliance teams get evidence instead of assurances.
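
The mask-then-re-insert step is easier to explain in a security review with a toy version in hand. The sketch below is only an illustration of the idea: the regex patterns, the placeholder format and the function names are assumptions, and the real Trust Layer detectors are configured on the platform and will differ.

```python
import re

# Hypothetical patterns for illustration only; the platform's real
# detectors and placeholder format are not shown here.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "CARD": re.compile(r"\b\d(?:[ -]?\d){12,15}\b"),
}

def mask(text: str) -> tuple[str, dict[str, str]]:
    """Replace each match with a numbered placeholder; return text and mapping."""
    mapping: dict[str, str] = {}
    for label, pattern in PATTERNS.items():
        def substitute(match: re.Match, label: str = label) -> str:
            placeholder = f"<{label}_{len(mapping) + 1}>"
            mapping[placeholder] = match.group(0)
            return placeholder
        text = pattern.sub(substitute, text)
    return text, mapping

def unmask(text: str, mapping: dict[str, str]) -> str:
    """Re-insert the original values into the model's response."""
    for placeholder, original in mapping.items():
        text = text.replace(placeholder, original)
    return text

prompt = "Refund ana@example.com on card 4111 1111 1111 1111."
masked, mapping = mask(prompt)

# Simulated model reply that refers to the first placeholder it was given.
reply = f"Refund issued to {next(iter(mapping))}."

print(masked)
print(unmask(reply, mapping))
```

The model only ever sees the placeholders; the mapping stays inside the trust boundary and is applied to the response on the way back.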

Guardrails Are Layered, Not One Switch

Layer	Mechanism	You control it in
Identity & access	Agent user permissions, employee agents inheriting user access	Permission sets (Lab 2)
Behavioral	Instructions, refusal rules, escalation paths	Subagent instructions (Best Practices)
Deterministic	Hard-coded sequences and conditions the LLM can't override	Agent Script
Platform	Masking, toxicity scanning, audit	Trust Layer settings in Setup
Human	Approvals and handoffs for high-stakes moves	Escalation design (Multi-Agent)

Review-meeting one-liner: "Data is permission-filtered at retrieval, masked before the model, never retained by providers, scanned on the way back, and audited end to end; and the agent can only act through actions we explicitly granted."

Sources: Einstein Trust Layer (Trailhead) · Agentforce Developer Guide

Agentforce Builder: Three Ways to Build One Agent

Agentforce Builder is the studio where agents are created — and its defining feature is that three build modes edit the same agent, so admins and developers finally work on one artifact instead of throwing configs over a wall.

The Three Modes

Mode	How it works	Best for
Conversational build	Describe the agent in natural language ("a returns agent that verifies identity, checks eligibility, creates an RMA") — Builder scaffolds subagents, instructions and action stubs	First drafts, prototypes, admins
Low-code canvas	Visual editing of subagents, instructions, actions and variables, with the live test panel beside it	Day-to-day iteration
Pro-code view	The same agent as Agent Script — diffable, reviewable, versionable text	Developers, code review, deterministic logic

The workflow that works in practice: scaffold conversationally, refine on the canvas, then open the pro-code view to lock the parts that must never improvise. Changes in one mode appear in the others — it's one definition.

Migrating from the Legacy Builder

The original Agent Builder is being retired and its new-agent entry point is being removed: build everything new in Agentforce Builder, and migrate existing agents now rather than during a forced window.
Migration is mostly mechanical (agents open in the new Builder), but treat it as a release: re-run your Testing Center suite before and after, because reasoning-engine behavior is what you're regression-testing, not markup.
While migrating, standardize terminology in your team docs too: topics are now subagents — same concept, new name, and both appear in older documentation.

Versioning & Moving Between Orgs

Agents are metadata — retrieve them with Agentforce DX, keep them in Git, deploy sandbox → production like any other artifact (DX & SDKs).
Version deliberately: a new agent version is your rollback unit. Never hot-edit the live version's instructions on a Friday.
Instruction text is code. Review it in pull requests; a one-line wording change can shift behavior as much as an Apex change.
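
If instruction text is code, it should show up in review as a diff. The sketch below uses Python's standard difflib to compare two versions of a subagent's instructions; the instruction wording and the file labels are hypothetical, and in practice your Git host produces this view for you.

```python
import difflib

# Two hypothetical versions of a subagent's instructions.
old = [
    "Verify identity before discussing an order.",
    "Offer a refund if the item is damaged.",
]
new = [
    "Verify identity before discussing an order.",
    "Offer a refund only if the item is damaged and within 30 days.",
]

diff = list(difflib.unified_diff(
    old, new, fromfile="v1/instructions", tofile="v2/instructions", lineterm=""
))
print("\n".join(diff))

# Keep only the changed lines, skipping the "---" / "+++" file headers.
changed = [
    line for line in diff
    if line.startswith(("+", "-")) and not line.startswith(("+++", "---"))
]
```

One changed line, yet the refund policy the agent applies is now different, which is exactly the kind of change a reviewer should sign off on.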

Sources: Get Started with Agents · Agent Script

Multi-Agent Orchestration: Agents as a Team

The newest architectural shift: instead of one agent that knows everything, an orchestrator agent routes work to connected subagents — specialist agents wired together, each with its own instructions, actions and permissions. It's the org chart pattern: a front desk that knows exactly who handles what.

How Routing Actually Works

You connect existing agents to an orchestrator as subagents. Each connected subagent carries a description — what it does and when it should step in.
The orchestrator's reasoning engine reads those descriptions (plus each subagent's instructions and actions) to pick the right specialist per request — the description is the routing contract, so write it with the same care as an Apex action description.
Shared context travels with the conversation: when the Sales agent hands off to Service, the customer doesn't repeat themselves — across channels too.

Multi-Agent vs One Mega-Agent

Signal	Stay single-agent	Go multi-agent
Scope	One domain (support for one product line)	Distinct domains (sales + service + IT) with different owners
Permissions	One data-access profile fits everything	Domains need different data access — separate agent users per specialist is cleaner and safer
Teams	One team maintains it	Different teams ship independently, versioning their own specialist
Classification	Subagents within one agent route fine	Routing keeps misfiring because the agent's jobs are too diverse

Start single-agent. Multi-agent earns its complexity when ownership, permissions or scale demand it — not before. (The same judgment call as microservices, and the same failure mode when made too early.)

Governance & Debugging a Team of Agents

Permissions per specialist: each subagent runs with its own scoped access; the orchestrator doesn't need (and shouldn't have) union-of-everything permissions.
Escalation paths: define per specialist where "hand to a human" goes — one shared queue recreates the call-center transfer nightmare agents were meant to end.
Human-in-the-loop: high-stakes actions (refund above threshold, data deletion) gate on approval regardless of which specialist proposes them.
Debugging: when a conversation goes wrong, the first question is now "which agent had it, and why was it routed there?" — session tracing (Observability) shows the handoffs; misroutes almost always trace back to overlapping or vague subagent descriptions.
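
Because misroutes so often come from overlapping descriptions, a crude lint can catch the worst offenders before they ship. This sketch compares keyword sets with Jaccard similarity. It is a lexical proxy only (the orchestrator reasons with an LLM, not keyword matching), and the subagent names, descriptions, stopword list and threshold are all hypothetical.

```python
import itertools
import re

# Hypothetical subagent descriptions; names and wording are illustrative.
DESCRIPTIONS = {
    "Order Support": "Handles order status, shipping and delivery questions",
    "Returns": "Handles returns, refunds and exchange eligibility",
    "Shipping Help": "Handles shipping status and delivery questions",
}
STOPWORDS = {"handles", "and", "the", "a", "of", "questions"}

def keywords(text: str) -> set[str]:
    """Lowercased words in the description, minus filler words."""
    return set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS

def jaccard(a: set[str], b: set[str]) -> float:
    """Shared keywords divided by all keywords across both sets."""
    return len(a & b) / len(a | b) if a | b else 0.0

def overlapping(descriptions: dict[str, str], threshold: float = 0.5):
    """Yield pairs of subagents whose descriptions share most keywords."""
    for (n1, d1), (n2, d2) in itertools.combinations(descriptions.items(), 2):
        score = jaccard(keywords(d1), keywords(d2))
        if score >= threshold:
            yield n1, n2, round(score, 2)

flagged = list(overlapping(DESCRIPTIONS))
print(flagged)
```

Here "Order Support" and "Shipping Help" are flagged: a reader (or a reasoning engine) has no clear rule for which one owns a shipping question, so one description needs to be narrowed or the two merged.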

Sources: Multi-Agent Orchestration (official) · Agentforce Developer Guide

Data 360, Intelligent Context & RAG

A brilliant model with bad data makes confident mistakes — the most dangerous kind. Grounding is how agents answer from your truth instead of the model's training data, and it's where most real-world agent quality is won or lost.

The Grounding Stack

Piece	What it does	Know it as
Data 360 (formerly Data Cloud)	The unified data foundation: ingested + zero-copy data, identity resolution into unified profiles, calculated insights	The single source agents draw from
Zero-copy	Query data where it lives (warehouse, lake) instead of duplicating it into Salesforce	Why "move all data first" is no longer the prerequisite
Intelligent Context	Grounding agents in unstructured content — docs, PDFs, knowledge articles — prepared for retrieval	The agent's library card
Search indexes & retrievers	Content is chunked, embedded into vectors, indexed; a retriever finds the most relevant chunks per question	The RAG machinery
Data Libraries	The admin-friendly wrapper: point at knowledge/files and get retrieval without building indexes by hand	Where Lab 1-style agents get their answers

How a RAG Answer Happens

1. The question is embedded into a vector (a numeric meaning-fingerprint).
2. The retriever finds the closest content chunks in the search index — semantic match, not keyword match.
3. Those chunks are injected into the prompt as grounding, under Trust Layer controls.
4. The model answers from the chunks — and can cite them, which is your hallucination defense (Observability).
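
The four steps above can be sketched end to end in a few lines. Note the big simplification: the "embedding" here is a bag-of-words count vector, which is lexical, whereas a real retriever uses a learned embedding model, and that is what makes the match semantic. The chunks, function names and prompt wording are all hypothetical.

```python
import math
import re
from collections import Counter

# Toy "embedding": a bag-of-words count vector. Real retrievers use learned
# dense embeddings; this stand-in only illustrates the retrieval flow.
def embed(text: str) -> Counter:
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

CHUNKS = [  # hypothetical knowledge-article chunks
    "Returns are accepted within 30 days of delivery with a receipt.",
    "Standard shipping takes 3 to 5 business days.",
    "Gift cards cannot be refunded or exchanged for cash.",
]

def retrieve(question: str, k: int = 1) -> list[str]:
    """Steps 1 and 2: embed the question, return the k closest chunks."""
    q = embed(question)
    return sorted(CHUNKS, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]

def build_prompt(question: str) -> str:
    """Step 3: inject the retrieved chunks into the prompt as grounding."""
    grounding = "\n".join(f"- {c}" for c in retrieve(question))
    return f"Answer only from these sources:\n{grounding}\n\nQuestion: {question}"

print(build_prompt("How many days does shipping take?"))
```

Step 4 is the model call itself, which is left out here; the prompt it receives already contains the only material it is allowed to answer from.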

Full hands-on build — Data Library, retriever behavior, testing groundedness: our RAG knowledge agent walkthrough.

The Data-Quality Checklist Before Any Rollout

Coverage: do knowledge articles actually answer the top 50 real questions? (Pull them from case data — don't guess.)
Freshness: who owns updating content the agent quotes? An agent confidently reciting last year's policy is worse than no agent.
Chunkability: long walls of text retrieve poorly — restructure articles with clear headings and one answer per section.
Contradictions: when two articles disagree, the retriever can surface either one, so the agent's answer depends on which chunk happens to rank higher. Deduplicate before go-live.
Permissions: retrieval respects access controls — verify the agent user can read what it's supposed to and nothing it isn't.
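
The chunkability item is the easiest one to picture in code. This sketch splits an article into one chunk per heading, so each retrievable unit carries a single answer. It assumes headings are lines that start with "## ", which is a hypothetical authoring convention, and it says nothing about how the platform chunks content internally.

```python
# Split an article into one chunk per heading so each chunk carries one answer.
# Assumes headings are lines starting with "## " (a hypothetical convention).
ARTICLE = """## Return window
Items can be returned within 30 days.
## Refund timing
Refunds post within 5 business days.
"""

def chunk_by_heading(text: str) -> list[dict[str, str]]:
    """Return a list of {"heading", "body"} dicts, one per heading."""
    chunks: list[dict[str, str]] = []
    for line in text.splitlines():
        if line.startswith("## "):
            chunks.append({"heading": line[3:].strip(), "body": ""})
        elif chunks and line.strip():
            # Append body text to the most recent heading's chunk.
            chunks[-1]["body"] = (chunks[-1]["body"] + " " + line.strip()).strip()
    return chunks

chunks = chunk_by_heading(ARTICLE)
for chunk in chunks:
    print(chunk["heading"], "->", chunk["body"])
```

An article written as one long wall of text would come out of this as a single oversized chunk, which is the retrieval problem the checklist item warns about.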

Watching Usage

Generative usage is queryable on-platform (e.g. the AiAgentGenerativeAiUsage object), which means standard reports and dashboards can answer "which agents, which subagents, how much, trending how?" — the raw material for the cost and quality monitoring covered in Observability.

Sources: Agentforce Developer Guide · Agentforce Guide — Context Engineering (official)

MCP: Connecting Agents to External Tools

The Model Context Protocol is the USB standard of the agent era: an open protocol that lets any agent use tools from any compliant server. For Salesforce work it answers the question that used to require custom integration every time — "how does my agent safely call things that live outside the org?"

The Pieces

Piece	Role
MCP server	Exposes tools ("check shipment", "query warehouse") with typed inputs/outputs — built by you, a vendor (MuleSoft, Tableau) or the community
MCP client (in Agentforce)	Lets agents discover and call those tools as actions
Registry & trusted gateway	Central admin control over which servers and tools agents may use, with policies — the governance layer that makes MCP enterprise-safe
Hosted MCP servers	Salesforce-hosted server support — note the auth requirement: hosted MCP uses an External Client App (connected apps aren't supported), consistent with the platform's newer auth direction (see the Integration guide)

Security: The Part Architects Get Grilled On

Tool poisoning: a malicious or compromised MCP server can lie in its tool descriptions (a tool described as "this tool formats text" may actually exfiltrate data). Mitigations: only registry-approved servers, review tool descriptions like code, least-privilege credentials per server.
Scope the gateway: policies decide which agents can reach which tools — a customer-facing agent should not see internal admin tooling, even if one server exposes both.
Treat tool output as untrusted input: ground and validate before acting on it — the same discipline as any external API, plus prompt-injection awareness (tool output goes into the model's context).
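
The first two points boil down to a deny-by-default check, sketched below. The server names, agent names, tool names and data structures are all hypothetical; they do not reflect the actual Agentforce registry or gateway schema, only the policy logic you would expect it to enforce.

```python
# Hypothetical registry and policy data; names are illustrative and do not
# reflect the actual Agentforce registry schema.
APPROVED_SERVERS = {"shipping-mcp", "warehouse-mcp"}

AGENT_POLICY = {
    "customer-service-agent": {("shipping-mcp", "check_shipment")},
    "ops-agent": {
        ("shipping-mcp", "check_shipment"),
        ("warehouse-mcp", "query_stock"),
    },
}

def may_call(agent: str, server: str, tool: str) -> bool:
    """Deny by default: the server must be approved AND the tool granted."""
    if server not in APPROVED_SERVERS:
        return False
    return (server, tool) in AGENT_POLICY.get(agent, set())

print(may_call("customer-service-agent", "shipping-mcp", "check_shipment"))
print(may_call("customer-service-agent", "warehouse-mcp", "query_stock"))
```

Grants are per agent and per tool, not per server, which is what keeps a customer-facing agent away from internal tooling even when one approved server exposes both.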

Hands-On Paths

Consume: concepts and setup in our MCP + Agentforce deep dive.
Build for development: wire the open-source DX MCP server into your coding assistant — agentic DevOps hands-on.
The full pattern in context: the returns-desk scenario in Real-World Scenarios shows MCP, Agent Script and the Agent API working together.

Sources: Agentforce Developer Guide · Agentforce (official)

Channels: Voice, Slack, Messaging & Portals

One agent definition, many front doors. Channel choice changes the design constraints — latency tolerance, message length, identity — more than the agent itself.

The Channel Matrix

Channel	Reality check	Design notes
Web chat / in-app	The default; richest UI (buttons, carousels)	Start here; everything else inherits the tested agent
Voice	Real-time speech in/out — latency and turn-taking dominate UX	Short responses, explicit confirmations for actions, design for interruptions; transcription quality improves continuously but test with real accents and noise
Slack	The natural home of employee agents (IT/HR/Sales apps)	Employee permission model shines here — the agent answers with exactly what that user may see
Messaging (WhatsApp / SMS / email)	Asynchronous, two-way, high volume	Idempotent actions matter (people double-send); keep sessions and identity verification tight
Experience Cloud	Self-service portals with an embedded agent — including help-center patterns where the agent fronts the knowledge base	Guest access needs the strictest permission review of any channel; portals can also be toggled for AI-readability (GEO) so external assistants can parse your content
Agent API	Your own UI, any platform	See Agent API — you own auth, rendering and session lifecycle

Employee vs Customer Channel Design

Identity: customers must be verified in-conversation (Lab 3's pattern); employees arrive authenticated.
Tone and risk: a wrong answer to an employee wastes minutes; a wrong answer to a customer creates a case, a refund or a headline. Ground customer agents harder and gate more actions.
Multilingual: language support is configured per agent and channel — test grounded answers in each target language, not just the greeting.

Sources: Agentforce Developer Guide · Agentforce (official)

Observability: When Agents Fail Differently

Classic software fails loudly — exceptions, 500s, red dashboards. Agents add a second failure class: semantic failure, where everything ran "successfully" and the answer was still wrong, unhelpful or off-policy. Production readiness means instrumenting for both.

Two Failure Classes, Two Toolkits

	Technical failure	Semantic failure
Looks like	Action exception, timeout, permission error	Confident wrong answer, wrong subagent, tone off-policy, unnecessary escalation
Caught by	Logs, exception handling, Apex patterns you already know	Evaluation suites, scorers, transcript review, groundedness checks
Fixed in	Code and config	Instructions, descriptions, grounding data, Agent Script determinism

The Observability Stack

Session tracing: per-conversation traces show the chosen subagent, the plan, each action's inputs/outputs and the handoffs (crucial for multi-agent routing questions).
Agent Analytics: built-in dashboards over dozens of metrics — deflection, resolution, escalation rate, CSAT signals, latency — your week-one health report.
Custom Scorers (beta): define your KPIs ("did it offer the discount policy correctly?") and score conversations against them automatically instead of eyeballing transcripts.
Usage & cost: consumption analytics plus queryable usage data (see Data & Grounding) — set budget alerts before finance does it for you.

Building an Evaluation Suite (the Golden Dataset)

1. Collect 50–200 real utterances from cases, chats and call notes — including the ugly, ambiguous ones.
2. For each, record the expected subagent, expected action(s), and what a good answer must contain (and must not).
3. Run it in Testing Center / the Testing API on every instruction change — regression testing for behavior (Testing).
4. Add every production failure to the suite. Your golden dataset is a ratchet: agents only get better against it.
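
A golden dataset is just structured expectations plus a checker. The sketch below shows one possible shape. The `fake_agent` function is a stand-in for whatever harness you use to call the agent (it is not the Testing Center or Testing API), and the class names, field names and sample cases are assumptions.

```python
from dataclasses import dataclass

@dataclass
class GoldenCase:
    utterance: str
    expected_subagent: str
    must_contain: list[str]
    must_not_contain: list[str]

@dataclass
class AgentResult:
    subagent: str
    answer: str

def fake_agent(utterance: str) -> AgentResult:
    """Stand-in for a real agent call (hypothetical; wire in your own harness)."""
    if "order" in utterance.lower():
        return AgentResult("Order Support", "Order 8842 shipped on Monday.")
    return AgentResult("General", "I don't have that information.")

def evaluate(case: GoldenCase, result: AgentResult) -> list[str]:
    """Return a list of failure reasons; an empty list means the case passed."""
    failures = []
    if result.subagent != case.expected_subagent:
        failures.append(
            f"routed to {result.subagent}, expected {case.expected_subagent}"
        )
    answer = result.answer.lower()
    failures += [f"missing '{p}'" for p in case.must_contain if p.lower() not in answer]
    failures += [f"forbidden '{p}'" for p in case.must_not_contain if p.lower() in answer]
    return failures

SUITE = [
    GoldenCase("Where's my order 8842?", "Order Support", ["8842"], ["refund"]),
    GoldenCase("Can I return these shoes?", "Returns", ["return"], []),
]

report = {c.utterance: evaluate(c, fake_agent(c.utterance)) for c in SUITE}
for utterance, failures in report.items():
    print("PASS" if not failures else "FAIL", utterance, failures)
```

The second case fails on both routing and content, and that is the useful outcome: each failure reason points at what to fix (a description, an instruction, or grounding data) before the change ships.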

Hallucination Management in Practice

Ground first: answers should come from retrieved content, not model memory (Data & Grounding).
Cite: surfacing the source article turns "trust me" into "see for yourself" — and makes wrong answers debuggable.
Design refusal: an explicit "I don't have that information — routing you to a person" beats an eloquent guess. Write refusal rules into instructions and test them like features.
Gate the irreversible: confidence thresholds and human approval on actions you can't undo.
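
"Gate the irreversible" can be expressed as a small decision function that sits in front of action execution. This is a sketch under stated assumptions: the confidence score is a hypothetical input (for example, a groundedness score from your own scorer), and the thresholds and action names are placeholders to tune to your policy.

```python
from enum import Enum

class Decision(Enum):
    EXECUTE = "execute"
    NEEDS_APPROVAL = "needs_approval"
    REFUSE = "refuse"

# Hypothetical thresholds and action names; tune them to your own policy.
MIN_CONFIDENCE = 0.8
REFUND_APPROVAL_LIMIT = 100.0
ALWAYS_APPROVE = {"delete_record"}

def gate(action: str, confidence: float, amount: float = 0.0) -> Decision:
    """Decide whether a proposed action runs, waits for a human, or is refused."""
    if confidence < MIN_CONFIDENCE:
        return Decision.REFUSE
    if action in ALWAYS_APPROVE:
        return Decision.NEEDS_APPROVAL
    if action == "issue_refund" and amount > REFUND_APPROVAL_LIMIT:
        return Decision.NEEDS_APPROVAL
    return Decision.EXECUTE

print(gate("issue_refund", 0.95, 50.0))
print(gate("issue_refund", 0.95, 250.0))
print(gate("lookup_order", 0.50))
```

Keeping the gate deterministic means the model can propose an irreversible action but cannot talk its way past the threshold.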

Go-live checklist, condensed: evaluation suite green on real utterances · session tracing reviewed for a pilot cohort · escalation paths tested end-to-end · refusal behavior verified · permission model re-audited · cost dashboard live with alerts. Then ship narrow and expand.

Sources: Agentforce Developer Guide · Agentforce (official)