Groq to Provide Access to World’s Fastest AI Inference Engine
Groq, the leader in real-time AI inference, recently announced its participation in the National Artificial Intelligence Research Resource (NAIRR) Pilot. The Pilot, a U.S. National Science Foundation-led program, marks the first step toward creating a shared national research infrastructure that connects U.S. researchers and educators to responsible and trustworthy AI research resources. In collaboration with 13 federal agencies and 25 private sector, nonprofit, and philanthropic organizations, Groq is powering the next phase of responsible AI research, discovery, and innovation by providing access to its LPU Inference Engine – which the company describes as the only solution delivering real-time AI inference today – via GroqCloud.
“Groq was founded, in part, to end the ‘haves and have-nots’ in AI,” said Groq Public Sector President Aileen Black. “Lack of access to necessary resources should never prevent a researcher from succeeding at the next Operation Warp Speed. It is an honor to provide the next generation of AI innovators with the real-time inference needed to run text-based applications and other AI workloads at scale.”
With GroqCloud, researchers can leverage leading open-source Large Language Models (LLMs) from providers such as Meta, Google, and Mistral, which have built top-performing AI models according to industry benchmarks and human evaluation. The LPU Inference Engine makes it easy to conduct research, as well as test and deploy new generative AI applications and other AI workloads, because it delivers 10x the speed while consuming just 1/10th the energy of comparable GPU-based inference systems. Researchers can also access Groq technology via the Argonne Leadership Computing Facility (ALCF), which includes a GroqRack compute cluster providing an extensible accelerator network of 9 GroqNode servers with a rotational multi-node network topology.
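As a rough illustration of what access looks like in practice, the sketch below constructs a single-turn chat-completion request of the kind GroqCloud's OpenAI-compatible API accepts. The model name, endpoint path, and temperature value here are illustrative assumptions, not taken from this announcement; consult the GroqCloud documentation for current model identifiers and authentication details.

```python
import json

# Illustrative GroqCloud endpoint (OpenAI-compatible chat completions).
# Assumption for this sketch; verify against current GroqCloud docs.
API_URL = "https://api.groq.com/openai/v1/chat/completions"


def build_request(prompt: str, model: str = "llama3-8b-8192") -> dict:
    """Construct the JSON payload for a single-turn chat completion.

    The model name is a placeholder; available models vary over time.
    """
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        # Low temperature for more reproducible research runs (assumed default).
        "temperature": 0.2,
    }


payload = build_request("Summarize the NAIRR Pilot in one sentence.")
print(json.dumps(payload, indent=2))
```

The payload would then be POSTed to the endpoint with an `Authorization: Bearer <API key>` header using any HTTP client; the sketch stops short of the network call so it stays runnable offline.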
“Inference plays an increasingly pivotal role in the AI ecosystem of computing, data, software, and platforms researchers require to advance innovation and scientific discovery in a responsible manner for the country. Groq’s contribution to the NAIRR Pilot will enable researchers to access leading models, helping to realize their boldest research visions,” said Katie Antypas, director of the Office of Advanced Cyberinfrastructure at the U.S. National Science Foundation.
The LPU Inference Engine is a new processing system developed to handle computationally intensive applications with sequential components, such as LLMs, audio, control systems, network observability, and more. While CPUs and GPUs excel at tasks related to data input and model training, they struggle to execute at-scale inference for real-time workloads that demand ultra-low latency. Sub-optimal latency, throughput, and power consumption, along with the sequential nature of such workloads, adversely impact their effectiveness. Groq addressed these limitations when designing the LPU to ensure repeatable, ultra-low latency without hindering performance.