MLCommons Releases New Benchmarks for AI Chips and Systems

Artificial intelligence (AI) benchmarking group MLCommons has released a new set of tests and results that measure how quickly top-of-the-line hardware can run AI applications. The release gives the AI community valuable insight into the performance of AI chips and systems.

The new benchmarks measure how quickly AI chips and systems can generate responses from powerful AI models loaded with data. In essence, they gauge how fast an AI application can deliver answers to user queries. One of the new benchmarks measures the speed of question-and-answer scenarios for large language models and is built on Llama 2, a model developed by Meta Platforms with 70 billion parameters.
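
To make the idea concrete, here is a minimal sketch of what such a measurement looks like: time how long a system takes to answer a batch of queries and report queries per second. This is an illustration only, not MLCommons' actual MLPerf harness; `generate_answer` is a hypothetical stand-in for a real model call on the hardware under test.

```python
import time

def generate_answer(prompt: str) -> str:
    # Hypothetical stand-in for a real model call. In an actual benchmark
    # run, this would invoke a large language model (e.g., a 70B-parameter
    # model) on the hardware being tested.
    time.sleep(0.05)  # simulated inference latency
    return "MLCommons develops AI benchmarks."

prompts = ["What does MLCommons measure?"] * 20

start = time.perf_counter()
answers = [generate_answer(p) for p in prompts]
elapsed = time.perf_counter() - start

print(f"Answered {len(prompts)} queries in {elapsed:.2f}s "
      f"({len(prompts) / elapsed:.1f} queries/second)")
```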

In addition to the Llama 2-based test, MLCommons added a second benchmark to its MLPerf suite: a text-to-image generator based on Stability AI's Stable Diffusion XL model. Together, the benchmarks showcase AI hardware's ability to generate responses and create images promptly.

The results highlight the dominance of servers powered by Nvidia's H100 chips, with top-performing systems built by industry heavyweights including Alphabet's Google, Supermicro, and Nvidia itself. These servers emerged as clear winners on raw performance. Several server builders also submitted designs based on Nvidia's less powerful L40S chip, underscoring the widespread adoption of Nvidia's technology.

However, MLCommons recognizes that raw performance is not the only factor that matters when deploying AI applications. The power consumption of advanced AI chips is a significant challenge, and MLCommons maintains separate benchmarks dedicated to measuring it. On that front, companies such as Qualcomm, with its energy-efficient AI chips, and Intel, with its Gaudi2 accelerator chips, have made noteworthy showings.
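
As a rough illustration of why power matters as a separate axis, a throughput score can be related to measured power draw to yield an efficiency figure. The sketch below uses made-up numbers for illustration, not actual MLPerf submission data.

```python
# Illustrative only: throughput and power are reported separately;
# dividing the two gives a simple efficiency figure. All numbers here
# are invented for the example, not real benchmark results.
queries_per_second = 480.0  # hypothetical throughput result
average_watts = 700.0       # hypothetical sustained power draw

print(f"{queries_per_second / average_watts:.3f} queries/second per watt")
```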

When asked about the results, MLCommons officials described them as "solid." The benchmarks give AI companies and researchers a common yardstick for evaluating and comparing hardware, and the data they produce can help drive further advances in AI technology.

MLCommons' release of these performance benchmarks is a significant step toward unlocking the full potential of AI. It shows what AI hardware can deliver in fast, efficient responses while also accounting for power consumption. As AI continues to advance, these benchmarks will play a vital role in shaping how AI is developed and deployed.

Written By

Jiri Bílek

In the vast realm of AI and U.N. directives, Jiri crafts tales that bridge tech divides. With every word, he champions a world where machines serve all, harmoniously.