Salesforce Unveils First LLM Benchmark For Enhanced CRM Performance

Salesforce has introduced the world's first LLM benchmark for CRM, intended to help businesses evaluate the growing number of large language models (LLMs) available for use in their customer relationship management (CRM) systems. The benchmark provides an evaluation framework that measures LLM performance across four key metrics: accuracy, cost, speed, and trust and safety. It is designed specifically around common sales and service use cases, including prospecting, lead nurturing, sales opportunities, and service case summaries.

Silvio Savarese, EVP & Chief Scientist at Salesforce AI Research, emphasized the importance of finding the right balance between performance, accuracy, responsibility, and cost to unlock the full potential of generative AI for driving business growth. He stated that Salesforce's new LLM Benchmark for CRM is a significant step forward in assessing AI strategies within the industry. The benchmark not only provides clarity on next-generation AI deployment but also accelerates time to value for CRM-specific use cases.

New LLM Benchmark for CRM by Salesforce

Existing LLM benchmarks have focused primarily on academic and consumer use cases. They lack business relevance and adequate evaluation by human experts, and they do not address considerations that matter to businesses, such as accuracy, speed, cost, and trust. As a result, CRM customers have had no reliable way to gauge the effectiveness of generative AI-powered CRM solutions or to make informed decisions about them.

Developed by Salesforce AI Research, the benchmark stands out by using real-world CRM data and incorporating expert human evaluations by practitioners. This enables businesses to make more strategic decisions about incorporating generative AI into their CRM systems. The benchmark focuses on four key metrics:

1. Accuracy: This metric includes subcategories such as factuality, completeness, conciseness, and instruction-following. Accurate predictions and recommendations are valuable for teams across the organization, enabling them to take actions that improve customer experience. Techniques like prompt engineering and fine-tuning can be used to improve accuracy.

2. Cost: Each model's estimated operational cost, which varies by CRM use case, is rated high, medium, or low based on percentiles. Customers can compare the cost-effectiveness of different LLMs to ensure the model they choose fits their budget and resource allocation strategy.

3. Speed: This metric assesses the LLM's responsiveness and efficiency in processing and delivering information. Faster response times enhance the user experience, reduce wait times for customers, and enable sales and service teams to address inquiries and issues promptly.

4. Trust and Safety: This metric measures the LLM's capability to protect sensitive customer data, comply with data privacy regulations, secure information, and avoid bias and toxicity in CRM use cases. By assessing the reliability of LLMs for CRM, this benchmark provides organizations with transparency regarding trust and safety.
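
The percentile-based cost rating described above can be illustrated with a short sketch. This is not Salesforce's method: the article does not say which percentiles separate the tiers, so the tercile cut points, the function name cost_tiers, and the model names and cost figures below are all assumptions made for illustration.

```python
from statistics import quantiles


def cost_tiers(costs: dict[str, float]) -> dict[str, str]:
    """Label each model's cost as low, medium, or high.

    Assumption: tiers are split at the 33rd and 67th percentiles
    (terciles). The benchmark's real cut points are not published
    in the article.
    """
    # quantiles(..., n=3) returns the two cut points that divide
    # the observed costs into three equal-probability groups.
    low_cut, high_cut = quantiles(list(costs.values()), n=3, method="inclusive")

    tiers = {}
    for model, cost in costs.items():
        if cost <= low_cut:
            tiers[model] = "low"
        elif cost <= high_cut:
            tiers[model] = "medium"
        else:
            tiers[model] = "high"
    return tiers


# Hypothetical per-use-case cost estimates (arbitrary units).
example_costs = {
    "model_a": 0.5,
    "model_b": 2.0,
    "model_c": 8.0,
    "model_d": 1.0,
    "model_e": 4.0,
    "model_f": 16.0,
}

tiers = cost_tiers(example_costs)
for model, tier in sorted(tiers.items()):
    print(f"{model}: {tier}")
```

Because the rating is relative, a model's tier depends on which other models are in the comparison set: adding a much more expensive model can move an existing one from high to medium without its actual cost changing.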

Organizations can use this benchmark to compare LLMs, identify the best solution, and make more informed decisions that drive customer success and propel their business forward. With Salesforce's Einstein Platform, customers have the option to choose from existing LLMs or bring their own models to meet their unique business needs. By selecting models for their CRM use cases using the benchmark, businesses can deploy more effective and efficient generative AI solutions.

Clara Shih, CEO of Salesforce AI, highlighted that businesses are looking to utilize AI to drive growth, cut costs, and deliver personalized customer experiences. She stated that customers have been seeking a purpose-built way to evaluate and select from the increasing number of AI models available. Salesforce's LLM benchmark for CRM aims to address this need by providing a comprehensive and dynamically evolving framework that empowers companies to make informed decisions, considering accuracy, cost, speed, and trust.
