From Bits to Rounds: Unifying Information-Theoretic Limits and Real-World Efficiency in Diffusion Language Models
Abstract
Discrete Diffusion Language Models (DLMs) have recently emerged as promising alternatives to autoregressive Large Language Models (LLMs), offering comparable accuracy with faster inference. Their efficiency, however, relies on parallel decoding strategies that typically commit only high-confidence tokens at each iteration. We show that such confidence-based decoding faces a fundamental information-theoretic bottleneck. Motivated by this limitation, we propose a training-free decoding strategy that circumvents the bottleneck, enabling more efficient parallel generation without sacrificing quality.
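To make the decoding scheme the abstract refers to concrete, below is a minimal sketch of confidence-based parallel decoding for a masked diffusion model. It is illustrative only: the denoiser is a toy stand-in for a trained model, and all names (toy_denoiser, CONF_THRESHOLD, MASK_ID, the vocabulary size) are hypothetical. It depicts the baseline strategy being analyzed, not the training-free method proposed in the talk.

```python
# Illustrative sketch of confidence-based parallel decoding for a masked
# diffusion language model. toy_denoiser stands in for a trained model;
# all constants here are hypothetical, not taken from the talk.
import numpy as np

VOCAB_SIZE = 32          # toy vocabulary
MASK_ID = -1             # sentinel for still-masked positions
CONF_THRESHOLD = 0.9     # commit a token only if the model is this confident
SEQ_LEN = 16
rng = np.random.default_rng(0)


def toy_denoiser(tokens: np.ndarray) -> np.ndarray:
    """Stand-in for a trained denoiser: returns a (seq_len, vocab) distribution
    for every position. A real DLM would condition on the partially unmasked
    sequence; here we just produce random, peaked distributions."""
    logits = rng.normal(size=(len(tokens), VOCAB_SIZE)) * 3.0
    exp = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return exp / exp.sum(axis=-1, keepdims=True)


def confidence_parallel_decode(seq_len: int = SEQ_LEN):
    """Iteratively fill masked positions, committing in parallel every position
    whose top-1 probability exceeds the threshold. Returns the decoded
    sequence and the number of decoding rounds used."""
    tokens = np.full(seq_len, MASK_ID, dtype=np.int64)
    rounds = 0
    while (tokens == MASK_ID).any():
        rounds += 1
        probs = toy_denoiser(tokens)
        top_ids = probs.argmax(axis=-1)
        top_conf = probs.max(axis=-1)
        masked = tokens == MASK_ID
        commit = masked & (top_conf >= CONF_THRESHOLD)
        if not commit.any():
            # Fallback: commit the single most confident masked position so
            # the loop always terminates.
            idx = np.flatnonzero(masked)[np.argmax(top_conf[masked])]
            commit[idx] = True
        tokens[commit] = top_ids[commit]
    return tokens, rounds


if __name__ == "__main__":
    decoded, n_rounds = confidence_parallel_decode()
    print(f"decoded {len(decoded)} tokens in {n_rounds} rounds")
```

The number of rounds this loop needs depends on how many positions clear the confidence threshold per iteration, which is the quantity the talk's information-theoretic analysis bounds.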
Bio
Jiantao Jiao is an Assistant Professor of Electrical Engineering and Computer Sciences and Statistics at UC Berkeley. His work spans information theory, machine learning, and foundation models. He co-founded Nexusflow.AI and served as its CEO, leading the development of foundation models and agentic systems until the company was acquired by NVIDIA. He is now a Director of Research and Distinguished Scientist at NVIDIA, where he focuses on advancing Artificial Superintelligence. His research connects academia and industry, with support from the NSF, OpenAI, Meta, Google, and other partners.