GraphDB Inference Engine

12d

MLCommons Releases New MLPerf Inference v6.0 Benchmark Results

Today, MLCommons® announced new results for its industry-standard MLPerf® Inference v6.0 benchmark suite. This release includes several important advances that ensure the benchmark suite tests ...

17d

IndexCache, a new sparse attention optimizer, delivers 1.82x faster inference on long-context AI models

Researchers at Tsinghua University and Z.ai built IndexCache to eliminate redundant computation in sparse attention models ...

SiliconANGLE

Nvidia GTC 2026: Jensen Huang’s Groq ‘Mellanox moment’ and the inference land grab

Ahead of Nvidia Corp.’s GTC 2026 this week, we reiterate our thesis that the center of gravity in artificial intelligence is shifting from “How fast can you train?” to “How well can you serve?” ...

The Next Platform

Nvidia Software Pushes MLPerf Inference Benchmarks To New Highs

For years, co-founder and chief executive officer Jensen Huang and other higher-ups at Nvidia have been banging on the ...

Autoblog

These Are Some of the Most Reliable Engines Ever Built

Looking for bullet-proof reliability? Then these are some of the most robust gas engines built over the past four decades. Many modern engines still face reliability issues despite 140 years of ...

Wall Street Journal

What Is Inference? Explaining the Massive New Shift in AI Computing

A significant shift is under way in artificial intelligence, and it has huge implications for technology companies big and small. For the past half-decade, most of the focus in AI has been on training ...

EDN

The truth about AI inference costs: Why cost-per-token isn’t what it seems

To understand what's really happening, we need to look at the full system, specifically total cost of ownership of an AI ...

11d

Intel Arc Pro B70 Delivers Up to 80% AI Inference Boost in MLPerf v6.0 Benchmarks

Intel Arc Pro B70 delivers up to 80% faster AI inference in MLPerf v6.0 benchmarks, with strong GPU and CPU performance gains ...

Hosted on MSN

MetalRT brings the first unified AI inference engine to Apple Silicon

Artificial intelligence is rapidly moving beyond cloud servers and into the devices people use every day. Laptops, smartphones and edge systems now have enough computing power to run sophisticated ...

TechCrunch

Startup Gimlet Labs is solving the AI inference bottleneck in a surprisingly elegant way

Stanford adjunct professor and successfully exited founder Zain Asgar just raised an $80 million Series A for a startup that solves the AI inference bottleneck problem in an astute way. The round was ...

The Next Web

NeuReality taps former Google AI director to steer its inference operating system into the market

When Jensen Huang told 30,000 attendees at GTC last week that the future data centre is a “token factory,” he was describing a world that a small Israeli startup has been quietly building toward for ...
