Perceptual Image Codec: What Matters in Practical Learned Image Compression

The Apple Machine Learning Research team has unveiled PICO, or Perceptual Image Codec, a novel learned image compression system designed to prioritize human visual perception and on-device performance. This project represents a comprehensive effort to optimize image compression for real-world practicality, contrasting with many academic-only learned codecs.

PICO is described as the first learned codec optimized directly for the human visual system, aiming for better perceived quality.
Its development involved an extensive study of modeling choices and a search over millions of configurations to jointly optimize for perceptual quality and execution speed.
Subjective user studies indicate significant bitrate savings: PICO is 2.3 to 3 times more bitrate-efficient than existing codecs such as AV1, AV2, VVC, ECM, and JPEG-AI.
It also outperforms other learned codecs, offering 20-40% bitrate savings.
Benchmarks on an iPhone 17 Pro Max show it encoding 12MP images in 230ms and decoding them in 150ms, faster than many ML codecs running on V100 GPUs.
Unlike many learned compression methods, PICO is designed to be robust across platforms.
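
To put the headline numbers above on a common scale, here is a minimal arithmetic sketch. It rests on two assumptions that are not stated in the summary: that "N times more efficient" means reaching the same perceived quality at 1/N of the baseline bitrate, and that 12MP means 12 million pixels. The function names are hypothetical and are not part of PICO.

```python
def savings_from_efficiency(factor: float) -> float:
    """Fraction of bitrate saved if the codec needs 1/factor of the baseline bitrate.

    Assumption: "factor times more efficient" means equal quality at 1/factor the bits.
    """
    return 1.0 - 1.0 / factor


def efficiency_from_savings(savings: float) -> float:
    """Inverse conversion: a fractional bitrate saving expressed as an efficiency factor."""
    return 1.0 / (1.0 - savings)


def megapixels_per_second(megapixels: float, milliseconds: float) -> float:
    """Throughput implied by processing `megapixels` in `milliseconds`."""
    return megapixels / (milliseconds / 1000.0)


if __name__ == "__main__":
    # Reported range against AV1, AV2, VVC, ECM, and JPEG-AI.
    for factor in (2.3, 3.0):
        print(f"{factor}x more efficient -> {savings_from_efficiency(factor):.1%} bitrate saved")

    # Reported range against other learned codecs.
    for savings in (0.20, 0.40):
        print(f"{savings:.0%} bitrate saved -> {efficiency_from_savings(savings):.2f}x more efficient")

    # Reported on-device timings for a 12MP image.
    print(f"encode: {megapixels_per_second(12, 230):.1f} MP/s")
    print(f"decode: {megapixels_per_second(12, 150):.1f} MP/s")
```

Under these assumptions, the 2.3 to 3 times figure corresponds to roughly 57% to 67% fewer bits, while the 20-40% savings over other learned codecs corresponds to a factor of about 1.25 to 1.67. The two claims are therefore on quite different scales even though both are phrased as bitrate savings.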

PICO represents a significant step forward in practical image compression, combining state-of-the-art machine learning with a strong emphasis on user experience and real-world device performance.

The Lowdown