IBM researchers, in collaboration with ETH Zürich, have unveiled Analog Foundation Models (AFMs), a new class of AI models designed to bridge large language models (LLMs) and Analog In-Memory Computing (AIMC) hardware. AIMC promises ultra-efficient AI by performing matrix-vector multiplications directly inside memory arrays, bypassing the traditional von Neumann bottleneck and enabling high-throughput, low-power inference on edge devices.
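To see why in-memory compute is attractive yet error-prone, here is a rough sketch of a single matrix-vector multiply with weight perturbation and ADC rounding; the noise scale, ADC width, and the analog_mvm helper are illustrative assumptions for demonstration, not IBM's hardware model.

```python
import numpy as np

# A minimal sketch (not IBM's actual hardware model) of why analog MVM is noisy:
# an AIMC tile computes y = W @ x in place, but the programmed weights and the
# ADC readout both deviate from their ideal values. The noise scale and ADC
# width below are illustrative parameters, not measured device characteristics.

def analog_mvm(W, x, rng, weight_noise=0.02, adc_bits=8):
    # Device variability: each programmed conductance misses its target slightly.
    W_noisy = W + rng.normal(0.0, weight_noise * np.abs(W).max(), W.shape)
    y = W_noisy @ x
    # ADC quantization: the analog result is read out at limited precision.
    step = np.abs(y).max() / (2 ** (adc_bits - 1) - 1)
    return np.round(y / step) * step

rng = np.random.default_rng(0)
W = rng.standard_normal((256, 256)) / 16
x = rng.standard_normal(256)
y_ideal = W @ x
rel_err = np.linalg.norm(analog_mvm(W, x, rng) - y_ideal) / np.linalg.norm(y_ideal)
print(f"relative MVM error from analog nonidealities: {rel_err:.3f}")
```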
AIMC adoption has historically been limited by analog noise arising from device variability, DAC/ADC quantization, and runtime fluctuations, which degrades model accuracy, particularly for billion-parameter LLMs. AFMs address this challenge through hardware-aware training, including noise injection, iterative weight clipping, learned input/output quantization, and distillation from pre-trained LLMs using synthetic data. These techniques allow models such as Phi-3-mini-4k-instruct and Llama-3.2-1B-Instruct to maintain performance comparable to 4-bit/8-bit quantized digital baselines.
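The sketch below illustrates two of those ideas, noise injection during the forward pass and iterative weight clipping after each optimizer step, on a toy PyTorch linear layer; the NoisyClippedLinear class, the noise level, and the clipping threshold are assumptions made for illustration, not the authors' training recipe.

```python
import torch
import torch.nn as nn

# A minimal hardware-aware training sketch on a single linear layer; the
# weight_noise level and clipping threshold are illustrative, not from the paper.

class NoisyClippedLinear(nn.Linear):
    def __init__(self, in_features, out_features, weight_noise=0.02):
        super().__init__(in_features, out_features)
        self.weight_noise = weight_noise

    def forward(self, x):
        if self.training:
            # Noise injection: perturb the weights on every forward pass so the
            # model learns to tolerate analog device variability at inference.
            std = self.weight_noise * self.weight.abs().max().detach()
            w = self.weight + torch.randn_like(self.weight) * std
        else:
            w = self.weight
        return nn.functional.linear(x, w, self.bias)

    def clip_weights(self, num_std=2.5):
        # Iterative weight clipping: after each update, bound the weights so
        # they fit the limited conductance range of an analog device.
        with torch.no_grad():
            bound = (num_std * self.weight.std()).item()
            self.weight.clamp_(-bound, bound)

layer = NoisyClippedLinear(64, 64)
opt = torch.optim.SGD(layer.parameters(), lr=1e-2)
x, target = torch.randn(8, 64), torch.randn(8, 64)
for _ in range(10):
    loss = nn.functional.mse_loss(layer(x), target)
    opt.zero_grad()
    loss.backward()
    opt.step()
    layer.clip_weights()
```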
Notably, AFMs also perform well on low-precision digital hardware, making them versatile across AIMC and conventional inference platforms. Their accuracy further scales with increased inference-time compute, where they outperform models trained with standard quantization-aware training on reasoning benchmarks.
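As a rough picture of that low-precision digital setting, the snippet below applies simple round-to-nearest weight quantization at example bit widths; the quantize_weights helper and the chosen bit widths are assumptions for illustration, not the exact configurations evaluated in the work.

```python
import numpy as np

# Illustrative symmetric, per-tensor round-to-nearest weight quantization, the
# kind of low-precision digital setting the article says AFMs transfer to.

def quantize_weights(W, bits=4):
    qmax = 2 ** (bits - 1) - 1
    scale = np.abs(W).max() / qmax
    # Quantize to signed integers, then dequantize back to floats.
    return np.clip(np.round(W / scale), -qmax, qmax) * scale

rng = np.random.default_rng(0)
W = rng.standard_normal((128, 128)) / 12
for bits in (8, 4):
    err = np.linalg.norm(quantize_weights(W, bits) - W) / np.linalg.norm(W)
    print(f"{bits}-bit round-to-nearest relative weight error: {err:.4f}")
```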
This milestone demonstrates that large LLMs can run efficiently on compact, energy-saving hardware, opening the door for edge deployment of foundation models. AFMs mark a significant step toward practical analog AI and energy-efficient, high-performance computing.