Salesforce Unveils the World’s First LLM Benchmark for CRM

By News Desk | Last updated Jun 20, 2024

Salesforce, a global leader in CRM solutions, has launched the world’s first large language model (LLM) benchmark designed specifically for customer relationship management (CRM) systems. The benchmark gives businesses a way to compare the growing number of available LLMs and make better-informed decisions about their AI strategy.

The new benchmark is an all-encompassing evaluation framework that assesses LLM performance based on four critical metrics: accuracy, cost, speed, and trust and safety. It is tailored to scrutinize common sales and service scenarios such as prospecting, lead nurturing, and summarizing sales opportunities and service cases. A public leaderboard will also be available, guiding professionals in selecting the most suitable LLM for their CRM requirements. Salesforce plans to regularly update the benchmark with new use cases and fine-tuned LLMs.

“As AI evolves, enterprise leaders emphasize the importance of balancing performance, accuracy, responsibility, and cost to fully harness generative AI’s potential for business growth,” stated Silvio Savarese, EVP & Chief Scientist at Salesforce AI Research. “Our new LLM Benchmark for CRM is a groundbreaking advancement, providing clarity on next-generation AI deployment and accelerating time to value for CRM-specific applications. We are committed to continuously enhancing this benchmark to keep pace with technological advancements, ensuring it remains relevant and valuable.”

Significance of the Benchmark:

Traditional LLM benchmarks have focused on academic and consumer applications, offering limited relevance to business contexts. They often lack expert human evaluation and do not cover the full set of criteria that matter to enterprises: accuracy, speed, cost, and trust. These gaps have left CRM customers without a reliable way to assess the effectiveness of generative AI-driven CRM solutions. The new Salesforce benchmark addresses these deficiencies, enabling businesses to make strategic AI decisions based on robust and relevant data.

Benchmark Features:

– Accuracy: Evaluates factuality, completeness, conciseness, and adherence to instructions, ensuring that model predictions and recommendations are valuable and actionable.
– Cost: Rates each LLM as high, medium, or low cost based on percentiles, allowing customers to compare the cost-effectiveness of different LLMs.
– Speed: Measures the responsiveness and efficiency of LLMs, both of which affect user experience and operational efficiency.
– Trust and Safety: Assesses the LLM’s ability to protect sensitive data, comply with privacy regulations, and avoid bias and toxicity, ensuring secure and reliable CRM operations.
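
To illustrate how percentile-based cost tiers of this kind can work, here is a minimal Python sketch. It is not Salesforce's method: the announcement does not say which percentiles separate the tiers, so the 33% and 66% cut points, the function names, and the model names and costs below are all hypothetical.

```python
def percentile_rank(value, population):
    """Fraction of the population whose cost is at or below `value` (0.0 to 1.0)."""
    return sum(1 for cost in population if cost <= value) / len(population)


def cost_tier(value, population, low_cut=0.33, high_cut=0.66):
    """Map a cost to "low", "medium", or "high" by its percentile rank.

    The cut points are assumptions chosen for illustration only.
    """
    rank = percentile_rank(value, population)
    if rank <= low_cut:
        return "low"
    if rank <= high_cut:
        return "medium"
    return "high"


# Hypothetical per-request costs for five made-up models.
costs = {
    "model_a": 0.2,
    "model_b": 0.5,
    "model_c": 1.0,
    "model_d": 3.0,
    "model_e": 8.0,
}

population = list(costs.values())
tiers = {name: cost_tier(cost, population) for name, cost in costs.items()}

for name, tier in tiers.items():
    print(f"{name}: {tier}")
```

With these made-up numbers, the cheapest model lands in the low tier, the next two in medium, and the two most expensive in high. Because the tiers are relative, adding or removing a model from the comparison set can move the others between tiers.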

The benchmark, developed by Salesforce AI Research, leverages real-world CRM data and expert human evaluations, enabling businesses to integrate generative AI into their CRM systems strategically.

Einstein 1 Platform Integration:

Salesforce’s Einstein 1 Platform allows customers to select from existing LLMs or bring their own models, ensuring flexibility to meet unique business needs. By utilizing the benchmark, businesses can deploy more effective and efficient generative AI solutions tailored to their specific CRM use cases.

“Businesses aim to leverage AI to drive growth, reduce costs, and deliver personalized customer experiences,” said Clara Shih, CEO of Salesforce AI. “Our customers need a purpose-built tool to evaluate and select from the myriad of new AI models. We are excited to introduce the world’s first LLM benchmark for CRM, providing a dynamic, comprehensive framework that helps companies make informed decisions by balancing accuracy, cost, speed, and trust.”

SOURCE: Salesforce