AI chip challenger Groq eyes APAC expansion

Groq’s novel chip architecture, which speeds up AI inferencing, has attracted a fast-growing developer base as the company plans its first datacentre in the Asia-Pacific region

US startup Groq has been challenging semiconductor powerhouses Nvidia and AMD in the artificial intelligence (AI) chip race with a novel chip architecture that can perform inferencing tasks at breakneck speeds.

Unlike graphics processing units (GPUs), which have become the industry standard for both AI training and inferencing, Groq’s chips – dubbed language processing units (LPUs) – are built specifically for inference, the process of running a trained AI model to get a response.

But while GPUs rely on external memory to store the AI model, making them prone to memory-bandwidth bottlenecks, Groq’s system distributes AI models across thousands of chips, storing them directly in faster on-chip memory for lower latency and higher energy efficiency.
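The speed gap follows from simple arithmetic: when a model generates a response one token at a time, roughly all of its weights must be read from memory for each token, so decode speed is bounded by memory bandwidth. The Python sketch below illustrates this rule of thumb; the model size, byte width and bandwidth figures are illustrative assumptions, not vendor specifications.

```python
# Rule-of-thumb decode speed: batch-1 token generation is memory-bound,
# so tokens/sec <= aggregate memory bandwidth / bytes read per token.
# All figures below are illustrative assumptions, not vendor specs.

MODEL_PARAMS = 70e9      # assumed 70-billion-parameter model
BYTES_PER_PARAM = 2      # FP16/BF16 weights

def max_tokens_per_sec(aggregate_bandwidth: float) -> float:
    """Upper bound on decode speed, ignoring compute and interconnect."""
    bytes_per_token = MODEL_PARAMS * BYTES_PER_PARAM  # ~140 GB per token
    return aggregate_bandwidth / bytes_per_token

# GPU-style node: weights in external memory, e.g. 8 devices x ~3 TB/s (assumed)
print(f"External-memory node: ~{max_tokens_per_sec(8 * 3e12):,.0f} tokens/s")

# LPU-style system: weights sharded across many chips' on-chip SRAM,
# giving a far larger aggregate bandwidth (assumed figure)
print(f"On-chip-memory system: ~{max_tokens_per_sec(500e12):,.0f} tokens/s")
```

Real throughput also depends on batching, quantisation and interconnect, but the direction of the comparison is why Groq shards models across chips rather than paging weights in from external memory.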

Groq CEO Jonathan Ross, a former Google engineer who worked on the cloud provider’s tensor processing unit (TPU), likened the company’s chip architecture to a ballet performance.

“It’s like a ballet choreography, where everyone just knows where to be,” Ross told Computer Weekly during a trip to Singapore. “Our chips know when to accept the data,” he said.

GPU systems, on the other hand, are unpredictable, added Ross, comparing them to showing up for an unscheduled meeting and having to wait.

Groq’s focus on inference workloads was deliberate. “We weren’t going to make customers’ lives better by working on training, which is too expensive, uses too much energy, and requires too much effort to get the software stack to work, but inference was not a solved problem,” he said, adding that it is also a much bigger problem, with 10 to 20 times more compute deployed for inferencing than for training.

During a demonstration, Ross showed how an AI model running on the company’s LPUs generated a complex, multi-day travel itinerary in the blink of an eye from a voice prompt, responding to a flurry of edits and requests almost instantaneously.

“We’re dramatically faster than GPUs,” he said. “Normally, people have to put a lot of work into streaming the audio. We actually generate the entire audio and then start playing. That’s how fast it is.”

Groq’s compiler was designed before the chip, allowing the company to onboard some of the largest AI models much faster than its rivals. Ross said a trillion-parameter model released on a Friday was up and running in production by the following Monday. “We actually spent less time compiling it than downloading it,” he quipped.

And, with large AI models running faster and more efficiently, Ross believes there will be less need to use smaller models or fine-tune a large model, which could be quickly superseded by a newer foundation model that performs just as well. “It has become largely about getting one of the larger models to run as inexpensively and as fast as possible, which is what we do,” he said.

Groq’s work to speed up inferencing has attracted a developer following in the Asia-Pacific (APAC) region: Singapore alone accounts for 45,000 registered developers on Groq’s platform, while India has the second-highest number of developers globally.

The company, which was founded in 2016, also claimed, citing a recent industry survey, that usage of its application programming interface (API) grew from 17% to 36% over the past year, surpassing Amazon’s.
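For those developers, the platform is typically reached through GroqCloud’s OpenAI-compatible API. A minimal sketch using Groq’s official Python SDK follows; the model ID is illustrative and the snippet assumes a GROQ_API_KEY environment variable is set.

```python
# Minimal GroqCloud chat completion using the official `groq` SDK
# (pip install groq). Assumes GROQ_API_KEY is set in the environment;
# the model ID below is illustrative and may change with Groq's catalogue.
from groq import Groq

client = Groq()  # picks up GROQ_API_KEY automatically

completion = client.chat.completions.create(
    model="llama-3.3-70b-versatile",  # illustrative model ID
    messages=[
        {"role": "user", "content": "Draft a three-day Singapore itinerary."}
    ],
)
print(completion.choices[0].message.content)
```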

Read more about AI in APAC

  • Singtel’s AI Studio, part of its CUBΣ network-as-a-service offering, aims to simplify AI development and deployment while addressing enterprise concerns around security and data sovereignty.
  • By unifying data on the Databricks platform, Vietnam’s Techcombank has built AI capabilities to deliver hyper-personalised offers to 15 million customers and expand its footprint beyond its traditional affluent base.
  • Researchers at the National University of Singapore have created a wearable device that combines a camera with conversational AI powered by Meta’s Llama models to give sight to the visually impaired.
  • Australian IT spending is set to grow by 8.9% in 2026, driven by growing investments in AI, datacentre systems and cloud, according to Gartner.

To meet the rising demand for its chips and the GroqCloud service, the startup has expanded from one datacentre to 13 in the past 12 months. It is now planning to open its first datacentre in APAC later this year.

While datacentre operators continue to grapple with the power and cooling demands of AI workloads, Groq’s air-cooled chips can be deployed in older facilities that lack the liquid cooling required by power-hungry GPUs.

“In Finland, Microsoft was moving out of an Equinix datacentre that’s air-cooled because it’s useless to them, and as they moved out, we moved in,” said Ross. “There’s an enormous amount of air-cooled datacentre space that others can’t use because GPUs have to be liquid-cooled.”

The company also lets organisations with data sovereignty and security requirements deploy its chips on-premises through partnerships with datacentre hardware suppliers. On-premises customers get their own custom region and can also use GroqCloud if they require more capacity than they have bought, he said.

Groq’s growth has been fuelled by strong financial backing from key investors. In September 2025, it raised $750m in a Series E funding round led by Disruptive, with participation from major investors like BlackRock, Samsung and Cisco, valuing the firm at $6.9bn.
