The speed of innovation in the world of artificial intelligence (AI), and specifically generative AI, is continuing at a breakneck pace. With the technical sophistication that's available now, the industry is rapidly evolving from assistive conversational automation to role-based automation that augments the workforce. For AI to reach human-level performance, it's vital to understand what makes humans most effective at completing jobs: agency. Humans can take in data, reason across possible paths forward, and take action. Equipping AI with this kind of agency requires an extremely high level of intelligence and decision-making.
At Salesforce, we’ve tapped into the latest advancements in large language models (LLMs) and reasoning techniques to launch Agentforce. Agentforce is a suite of out-of-the-box AI agents — autonomous, proactive applications designed to execute specialized tasks — and a set of tools to build and customize them. These autonomous AI agents can think, reason, plan, and orchestrate at a high level of sophistication. Agentforce represents a quantum leap in AI automation for customer service, sales, marketing, commerce, and more.
This article sheds light on the innovations that culminated in the Atlas Reasoning Engine — the brain of Agentforce — which orchestrates actions intelligently and autonomously to bring an enterprise-grade agentic solution to companies.
The evolution from Einstein Copilot to Agentforce
Earlier this year, we released Einstein Copilot, which has now evolved into an Agentforce Agent for customer relationship management (CRM). Einstein Copilot was a generative AI-powered conversational assistant that derived its intelligence from a mechanism called Chain-of-Thought (CoT) reasoning. In this mechanism, the AI system mimics human-style decision-making by generating a plan containing a sequence of steps to attain a goal.
With CoT-based reasoning, Einstein Copilot could co-create and co-work in the flow of work. This made it quite advanced compared to traditional bots, but it fell short of truly mimicking human-like intelligence. It generated a plan containing a sequence of actions in response to a task, and then executed those actions one by one. If the plan was incorrect, however, it had no way to ask the user to redirect it. This led to an AI experience that was not adaptive: users could not provide new and useful information as the conversation progressed.
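To make that limitation concrete, here is a minimal, illustrative sketch of a "plan once, then execute" loop in the CoT style described above. It is not Einstein Copilot's implementation; the `call_llm` placeholder, the action names, and the JSON plan format are all assumptions for the example.

```python
# Minimal sketch of CoT-style "plan once, then execute" orchestration.
# call_llm, the actions, and the JSON plan format are illustrative assumptions.
import json

ACTIONS = {
    "look_up_order": lambda order_id: {"order_id": order_id, "status": "shipped"},
    "send_email":    lambda to, body: {"sent_to": to},
}

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion client that returns JSON text."""
    raise NotImplementedError("plug in your model client here")

def run_cot_assistant(user_goal: str):
    plan_prompt = (
        "Think step by step, then output a JSON list of steps, each with "
        f"'action' and 'args', using only these actions: {list(ACTIONS)}.\n"
        f"Goal: {user_goal}"
    )
    plan = json.loads(call_llm(plan_prompt))   # the whole plan is fixed up front
    results = []
    for step in plan:                          # steps run one by one; there is no
        fn = ACTIONS[step["action"]]           # chance to re-plan or ask the user
        results.append(fn(**step["args"]))     # to redirect if the plan is wrong
    return results
```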
As we put Einstein Copilot through rigorous testing with thousands of sellers from our own sales organization (Org 62), some patterns emerged:
- The natural-language conversational experience of Copilot was much better than traditional bots, as expected, but it was not yet achieving the holy grail of being truly human-like. It needed to be more conversational.
- Copilot did an excellent job fulfilling user goals with the actions it was configured with, but it couldn't handle follow-up inquiries about information that already existed in the conversation. It needed to use context better to respond to a wider range of user queries.
- As we added more actions to automate more use cases, Copilot’s performance started to degrade, both in terms of latency (how long it took to respond) and response quality. It needed to scale effectively to thousands of use cases and applications that could benefit from it.
We set out to find a solution to these problems, and that led to the birth of Agentforce.
Agentforce: A big leap in reasoning
Agentforce is the industry's first enterprise-grade conversational automation solution that can make proactive, intelligent decisions at scale with little or no human intervention. Several advancements make that possible.
- Orchestration based on ReAct prompting vs. CoT. Extensive experimentation and testing showed that Reasoning and Acting (ReAct)-style prompting yielded much better results than the CoT technique. In the ReAct mechanism, the system goes through a loop of reason, act, and observe until the user's goal is fulfilled. This looping approach lets the system take new information into account and ask clarifying questions or request confirmations so that the goal is fulfilled as precisely as possible. It also leads to a much more fluid, natural conversational experience. (A minimal sketch of this loop, combined with the topic routing described next, follows this list.)
- Topic classification. We introduced a new concept called topics, which maps to a user intent, or job to be done. When a user input comes in, it gets mapped to a topic, which contains the relevant set of instructions, business policies, and actions to fulfill that request. This mechanism helps define the scope of the task and the corresponding solution space for the LLM, allowing the system to scale effortlessly. Natural language instructions embedded in topics provide additional guidance and guardrails for the LLM. So, if we need some actions to be executed in a certain sequence, that can be a natural-language topic instruction. If there are business policies like “free returns up to 30 days,” they can be specified as an instruction and passed to the LLM, so it can take them into account and inform the user interaction accordingly. These concepts allow agents to scale to thousands of actions safely and securely.
- LLM-generated responses. Previously, we restricted the system to respond only with action outputs, which severely constrained its ability to answer based on the information available in the conversation. By letting the LLM respond using the context in the conversation history, we've enabled a far richer conversational experience. Now, users can ask for clarifications and pose follow-up questions about prior outputs, leading to a higher overall goal-fulfillment rate.
- Thoughts/reasoning. Prompting LLMs to share their thoughts or explain why they selected certain actions significantly reduces hallucinations. It also provides visibility into the LLM's behavior, so admins and developers can fine-tune the agent to align with their needs. Reasoning is available in the Agent Builder canvas by default, and users can also prompt the agent with follow-up questions so it can explain its reasoning. This not only reduces hallucinations but also helps build trust.
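As a rough illustration of how these pieces fit together, the sketch below routes a user request to a topic and then runs a reason/act/observe loop over that topic's instructions and actions. It is a simplified stand-in rather than the Atlas Reasoning Engine itself; the topic definitions, prompts, and JSON reply format are assumptions made for the example.

```python
# Minimal sketch of topic routing plus a ReAct-style reason/act/observe loop.
# Topic definitions, prompts, and the JSON reply format are illustrative
# assumptions, not the actual Atlas Reasoning Engine interfaces.
import json

TOPICS = {
    "order_management": {
        "instructions": ["Free returns up to 30 days.", "Confirm before starting a return."],
        "actions": {
            "look_up_order": lambda order_id: {"order_id": order_id, "status": "delivered"},
            "start_return":  lambda order_id: {"return_started": order_id},
        },
    },
}

def call_llm(prompt: str) -> str:
    """Placeholder for any chat-completion client."""
    raise NotImplementedError("plug in your model client here")

def classify_topic(user_input: str) -> str:
    # Scope the task: map the utterance to one topic so the LLM only sees the
    # instructions and actions relevant to that job-to-be-done.
    return call_llm(f"Pick one topic from {list(TOPICS)} for: {user_input!r}").strip()

def react_loop(user_input: str, max_turns: int = 8) -> str:
    topic = TOPICS[classify_topic(user_input)]
    transcript = [f"User: {user_input}"]
    for _ in range(max_turns):
        step = json.loads(call_llm(
            f"Instructions: {topic['instructions']}\n"
            f"Available actions: {list(topic['actions'])}\n"
            'Reply with JSON like {"thought": "...", "type": "act" | "ask" | "respond", ...}\n'
            + "\n".join(transcript)
        ))
        if step["type"] == "act":                   # act, then observe the result
            result = topic["actions"][step["action"]](**step.get("args", {}))
            transcript.append(f"Observation: {result}")
        else:                                       # ask a clarifying question, or
            return step["text"]                     # respond once the goal is met
    return "Sorry, I wasn't able to complete that request."
```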
Additional Agentforce characteristics
Outside of the Atlas Reasoning Engine, Agentforce has several other noteworthy characteristics that set it apart.
- Proactive action. User inputs are one way to trigger agents. But Agentforce agents can also be triggered by data operations on CRM or business processes and rules, such as a case status update, an email received by a brand, or a meeting starting in five minutes. These mechanisms give agents a level of proactiveness that makes them useful and deployable in various dynamic business environments, expanding their utility to both the front and back office.
- Dynamic information retrieval. Most business use cases involve retrieving information or taking action. One of the most prevalent mechanisms to feed static information to agents is through grounding. However, it’s the ability of agents to tap into dynamic information that unlocks a vast potential of use cases and applications.
Agentforce supports several mechanisms to tap into dynamic information. The first is retrieval-augmented generation (RAG). By using semantic search on structured and unstructured data in Data Cloud through RAG, agents can retrieve relevant information from external data sources and databases. (A minimal retrieval sketch follows this list.)
Second, with the introduction of generic information-retrieval tools like web search and knowledge Q&A, we've compounded the agent's ability to handle complex tasks. Imagine researching a company or a product online using web search, combining that with internal knowledge about the company's rules and policies, and then taking an action in the form of an email summary to a contact. Combining data from multiple sources lets the agent handle business tasks much more effectively and efficiently.
Finally, agents can be deployed in Flows, APIs, and Apex classes. This way, all the contextual information in a workflow, along with information about various scenarios, can be passed to the agent, eliminating the need to build custom solutions and handle each scenario separately. All of these mechanisms for tapping into dynamic information let agents understand their environment better, which increases their interactivity multifold.
- Transfer to human agent. AI is nondeterministic, and it can hallucinate. That's why we pioneered the robust Einstein Trust Layer, which provides toxicity detection, zero-data-retention contracts, prompt-injection defense, and several other safeguards. We have baked rules into our system prompts to keep LLMs from digressing and hallucinating. But despite all these mechanisms, LLMs are still not 100% accurate. For critical business scenarios where the tolerance for error is zero, seamlessly transferring to a human is essential, and Agentforce supports it natively. Agentforce treats "transfer to human agent" as yet another action, which allows a conversation to be safely and seamlessly handed to a human in any desired business scenario.
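To make that concrete, here is a minimal, illustrative sketch of escalation registered as just another entry in the agent's action registry; the function and registry names are assumptions, not Agentforce APIs.

```python
# Minimal sketch: "transfer to human agent" registered alongside business actions,
# so the same reasoning loop can invoke it whenever policy or the user requires it.
# Names are illustrative, not Agentforce APIs.

def transfer_to_human(conversation_id: str, reason: str) -> dict:
    # In a real deployment this would route the transcript and context to a
    # human agent queue; here it simply records the hand-off.
    return {"transferred": conversation_id, "reason": reason}

ACTIONS = {
    "look_up_order":     lambda order_id: {"order_id": order_id, "status": "delayed"},
    "issue_refund":      lambda order_id: {"refunded": order_id},
    "transfer_to_human": transfer_to_human,   # escalation is just another action
}
```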
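And for the dynamic retrieval described earlier in this list, the sketch below shows the basic RAG pattern: embed the question, find the most similar knowledge chunks, and ground the LLM's answer in them. The `embed` and `call_llm` placeholders and the tiny in-memory index are stand-ins for a real vector store, such as semantic search over Data Cloud.

```python
# Minimal sketch of RAG-style grounding with an in-memory index.
# embed() and call_llm() are placeholders; a production setup would use a
# vector store (for example, semantic search over Data Cloud) instead.
import numpy as np

def embed(text: str) -> np.ndarray:
    """Placeholder for an embedding model returning normalized vectors."""
    raise NotImplementedError("plug in your embedding model here")

def call_llm(prompt: str) -> str:
    raise NotImplementedError("plug in your model client here")

def build_index(chunks: list[str]) -> list[tuple[str, np.ndarray]]:
    return [(chunk, embed(chunk)) for chunk in chunks]

def retrieve(index, query: str, k: int = 3) -> list[str]:
    q = embed(query)
    # Dot product works as cosine similarity when embeddings are normalized.
    scored = sorted(index, key=lambda item: -float(np.dot(q, item[1])))
    return [chunk for chunk, _ in scored[:k]]

def grounded_answer(index, question: str) -> str:
    context = "\n".join(retrieve(index, question))
    return call_llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
```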
What’s next for Agentforce
Despite being in its nascent stage, Agentforce is a game-changer for our customers. Customers like Wiley and Saks Fifth Avenue are seeing an exponential impact on their business KPIs with Agentforce Service Agent. As the wheel of innovation continues to turn rapidly at Salesforce Research and across the industry, we're moving quickly to tap into new advances and make agents even more robust and intelligent. Some of the advancements that customers can expect in the near future include:
- A testing and evaluation framework for agents. Bringing a complex agentic system like Agentforce to enterprises requires a Herculean amount of testing and validation. So, we've developed a robust evaluation framework to test action outcomes, inputs, outputs, planning accuracy, topic classification, and planner state. We've been using this framework to optimize agents on metrics like accuracy, latency, cost to serve, and trust. Unlike the majority of generally available frameworks and benchmarks, which primarily evaluate an LLM's performance on tasks like math, science, and general-knowledge proficiency, our evaluation framework is geared specifically toward CRM business use cases. We've also published the world's first LLM benchmark for CRM, and we're currently working on making our agent evaluation framework available to customers and partners.
- Multi-intent support. This is a cornerstone of replicating human-like conversation. Quite a few day-to-day requests consist of multiple unrelated goals, such as "update my order and find me a shirt in size M," "update the case status and email a summary of troubleshooting steps to the customer," and "book a flight and reserve a hotel." Combining the natural-language comprehension capabilities of LLMs, large-context-window support, and our own concepts like topics, we're continuing to experiment to create a reliable, scalable, and secure solution for our customers.
- Multimodal support. While the majority of digital interactions are text-based, voice- and vision-based interactions increase the richness of experiences several-fold because they represent the most natural way humans interact. In fact, with advancements like simultaneous processing of multimodal inputs, faster response times, large context windows, and sophisticated reasoning capabilities, the multimodal AI market is projected to grow by about 36% by 2031. Several enterprise use cases can benefit from multimodal support right away:
- Voice use cases. Replacing interactive voice response (IVR) with generative AI-powered voice support, employee coaching, training, and onboarding.
- Vision use cases. Product search and comparison, user-interface (web, mobile) browsing, troubleshooting, and issue resolution for field service.
- Multi-agent support. Agent-to-agent interactions represent one of the most transformative business developments of our time. Given their ability to simultaneously retrieve, compile, and process information, multi-agent systems can dramatically reduce processing times for long, complex workflows that currently proceed sequentially because of human-to-human hand-offs. Digital agents can be inserted into these workflows for repeatable data-processing tasks, and they can also help the humans involved in these processes be more efficient.
We’re already introducing this kind of multi-agentic paradigm in the sales process, where an agent can function as a sales development rep to nurture pipeline, or as a sales coach to advise a sales rep on how to best negotiate a deal. Specialized agents can also handle other aspects of the sales process like lead qualification, proposal preparation, and post-sale follow-up. In the same vein, a service workflow can consist of agents that troubleshoot, follow up, and assign tickets, as well as agents that respond to customer queries and help human reps.
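As a rough illustration of the hand-off pattern (not Agentforce's actual orchestration), the sketch below passes a lead through a pipeline of specialized agents, each enriching shared state before handing off to the next; the agent roles and fields are hypothetical.

```python
# Minimal sketch of specialized agents handing off work along a sales workflow.
# The roles, fields, and sequential dispatch are illustrative assumptions.

def sdr_agent(lead: dict) -> dict:
    # Qualify the lead and nurture pipeline.
    return {**lead, "qualified": lead.get("budget", 0) > 10_000}

def proposal_agent(lead: dict) -> dict:
    # Prepare a proposal for qualified leads.
    return {**lead, "proposal": f"Draft proposal for {lead['name']}"}

def sales_coach_agent(lead: dict) -> dict:
    # Advise the human rep on how to negotiate the deal.
    return {**lead, "coaching": "Lead with the ROI story in the negotiation."}

PIPELINE = [sdr_agent, proposal_agent, sales_coach_agent]

def run_pipeline(lead: dict) -> dict:
    for agent in PIPELINE:      # each agent enriches the shared state and
        lead = agent(lead)      # hands off to the next specialist
    return lead

print(run_pipeline({"name": "Acme", "budget": 50_000}))
```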
Get ready for the third wave of AI
Agentforce represents the third wave of AI, after predictive AI and copilots. Using Agentforce, customers can build agents that don’t just respond to conversational prompts to take action, but also anticipate, plan, and reason with minimal help. Agents can automate entire workflows or processes, make decisions, and adapt to new information, all without human intervention. At the same time, these agents can ensure a seamless hand-off to human employees, facilitating collaboration across every line of business. Powered by the Atlas Reasoning Engine, these agents can be deployed in just a few clicks to augment and transform any business function or team.
Are you ready for AI?
Take our free assessment to see if your company is ready to take the next step with AI. You’ll receive customized recommendations to help you effectively implement AI within your organization.