LLM Inference Optimization

AI inference crisis: Google engineers on why network latency and memory trump compute

Researchers propose low-latency topologies and processing-in-network as memory and interconnect bottlenecks threaten ...

Semiconductor Engineering

HW-SW Co-Designed System With 3 Core Optimization Pathways For Long-Context Agentic LLM Inference (Cambridge, ICL)

A new technical paper titled “Combating the Memory Walls: Optimization Pathways for Long-Context Agentic LLM Inference” was published by researchers at University of Cambridge, Imperial College London ...

The Next Platform

Optimizing AI Inference Is As Vital As Building AI Training Beasts

The history of computing teaches us that software always and necessarily lags hardware, and unfortunately that lag can stretch for many years when it comes to wringing the best performance out of iron ...

NextBigFuture

Defeating Nondeterminism in LLM Inference by Thinking Machines

A research article by Horace He and the Thinking Machines Lab (founded by former OpenAI CTO Mira Murati) addresses a long-standing issue in large language models (LLMs). Even with greedy decoding by setting ...

Search Engine Land

LLM optimization in 2026: Tracking, visibility, and what’s next for AI discovery

Marketing, technology, and business leaders today are asking an important question: how do you optimize for large language models (LLMs) like ChatGPT, Gemini, and Claude? LLM optimization is taking ...

Semiconductor Engineering

Detailed Study of Performance Modeling For LLM Implementations At Scale (imec)

A new technical paper titled “System-performance and cost modeling of Large Language Model training and inference” was published by researchers at imec. “Large language models (LLMs), based on ...

Tech Times

AI at the Edge: LLM on NVIDIA Jetson

Demand for AI solutions is rising, and with it the need for edge AI, which is emerging as a key focus in applied machine learning. The launch of LLM on NVIDIA Jetson has become a true ...

Forbes

Nvidia Sweeps AI Benchmarks While AMD Misses The Boat. Again.

Nvidia did not submit results for Blackwell either, as it wasn’t ready when results had to be submitted, but still won the race with the Hopper GPU by up to 4X. Too bad for AMD, as they probably have ...
