Baidu Says Ernie 3.5 Outdid ChatGPT and GPT-4 in Key Metrics

On Jun 27, 2023 at 11:40 am UTC by Ibukun Ogundare · 3 mins read

Results from the AGIEval and C-Eval tests show that Ernie 3.5 scored higher than other large models, including ChatGPT and GPT-4.

Competition in the AI market is heating up as China’s Baidu said its AI model Ernie 3.5 beats OpenAI’s popular ChatGPT and GPT-4 on key tests. The Chinese internet company unveiled Ernie Bot at an event in March. At the time of the announcement, CEO Robin Li said that the new product was imperfect and would continue to improve as people used it and gave feedback. Within an hour of revealing Ernie Bot, Baidu stated that about 30,000 corporate clients had joined the waitlist to access the chatbot.

Baidu has been publicly testing Ernie Bot since the March unveiling. The chatbot is built on Ernie, the Chinese search engine’s foundational AI model, which is trained on extensive data. ChatGPT, which Baidu said Ernie 3.5 outperformed, is based on OpenAI’s GPT-3.5 model. Baidu added that its AI model also beats GPT-4, OpenAI’s latest and more advanced model, and noted that Ernie 3.5 performed better than OpenAI’s products in Chinese language tests.

Baidu Claims Ernie 3.5 Is Better than ChatGPT in Multiple Key Areas

The Chinese company made the claim while citing a report by China Science Daily. According to the report, a “few-shot evaluation” shows that Ernie 3.5 outperformed ChatGPT on multiple test sets. The three evaluation benchmarks are AGIEval, C-Eval, and MMLU. Microsoft Research developed the AGIEval benchmark to examine a model’s performance on “human-oriented” standardized tests. It focuses on 20 official, public, and distinct qualifying exams, such as the SAT in the US and college entrance examinations in China. Others include bar exams, the American GMAT, the GRE, and so on. In addition, the University of California, Berkeley, Columbia University, the University of Illinois at Urbana-Champaign, and the University of Chicago jointly released MMLU. This large-scale multi-task language understanding test measures a model’s interdisciplinary professional ability in English. It covers different academic areas, including the social sciences, the humanities, science, technology, engineering and mathematics (STEM), and more.

Furthermore, C-Eval is a Chinese-language evaluation for foundation models containing 13,948 multiple-choice questions covering 52 subjects. The benchmark was created and released jointly by Tsinghua University, the University of Edinburgh, and Shanghai Jiao Tong University.

The results of the AGIEval and C-Eval tests show that Ernie 3.5 achieved higher scores than other large models, including ChatGPT and GPT-4. On AGIEval, Ernie 3.5 scored 64.37 points to take first place, ahead of GPT-4’s 56.96 points and ChatGPT’s 40.27 points. On the Chinese C-Eval evaluation, Ernie 3.5 again scored the highest at 71.93 points, while GPT-4 got 68.57 points and ChatGPT 51.70 points. In addition, Baidu cited further results that it said showed Ernie 3.5 has “outstanding Chinese ability” and outperformed ChatGPT and GPT-4.
