Perceptual Image Codec: What Matters in Practical Learned Image Compression
Apple's Machine Learning Research introduces PICO, a new 'Perceptual Image Codec' that leverages learned compression to significantly reduce image sizes while maintaining high visual quality. This innovative approach promises 2.3-3x bitrate savings over current standards like AV1 and VVC, alongside impressive on-device performance. Its focus on practical, human-centric visual optimization makes it a compelling advancement in image technology, resonating with HN's interest in efficient, real-world applications of machine learning.
The Lowdown
The Apple Machine Learning Research team has unveiled PICO, or Perceptual Image Codec, a novel learned image compression system designed to prioritize human visual perception and on-device performance. This project represents a comprehensive effort to optimize image compression for real-world practicality, contrasting with many academic-only learned codecs.
- PICO is highlighted as the first learned codec optimized directly for the human visual system, aiming for better perceived quality.
- Its development involved an extensive study of modeling choices and a search over millions of configurations to jointly optimize for perceptual quality and execution speed.
- Subjective user studies indicate significant bitrate savings: 2.3-3x over established codecs such as AV1, AV2, VVC, and ECM, as well as the learned JPEG-AI standard.
- It also outperforms other learned codecs, offering 20-40% bitrate savings.
- Performance benchmarks on an iPhone 17 Pro Max demonstrate rapid processing, encoding 12MP images in 230ms and decoding in 150ms, surpassing many ML codecs running on V100 GPUs.
- Unlike many learned compression methods, PICO is designed with cross-platform robustness.
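The figures above mix two conventions ("2.3-3x" versus "20-40%") and quote raw latencies, so a short conversion helps when comparing them. This is a minimal sketch, not anything from the PICO work itself. It assumes that an "Nx bitrate saving" means the same perceived quality at 1/N of the baseline bits, and that "12MP" means exactly 12 million pixels. The function names are made up for illustration.

```python
def percent_saving_from_factor(factor: float) -> float:
    """Percent of bits saved if a codec needs 1/factor of the baseline bitrate (assumed reading of 'Nx')."""
    return (1.0 - 1.0 / factor) * 100.0


def factor_from_percent_saving(percent: float) -> float:
    """Equivalent 'Nx' factor for a given percent bitrate saving."""
    return 1.0 / (1.0 - percent / 100.0)


def throughput_mp_per_s(megapixels: float, latency_ms: float) -> float:
    """Megapixels processed per second for a given per-image latency."""
    return megapixels / (latency_ms / 1000.0)


if __name__ == "__main__":
    # Reported range versus AV1, AV2, VVC, ECM, and JPEG-AI
    for factor in (2.3, 3.0):
        print(f"{factor}x -> {percent_saving_from_factor(factor):.1f}% fewer bits")

    # Reported range versus other learned codecs
    for percent in (20.0, 40.0):
        print(f"{percent:.0f}% saving -> {factor_from_percent_saving(percent):.2f}x")

    # Reported iPhone timings for a 12MP image (12 million pixels assumed)
    print(f"encode: {throughput_mp_per_s(12.0, 230.0):.1f} MP/s")
    print(f"decode: {throughput_mp_per_s(12.0, 150.0):.1f} MP/s")
```

Under those assumptions, 2.3-3x corresponds to roughly 56.5-66.7% fewer bits, 20-40% corresponds to about 1.25-1.67x, and the quoted latencies work out to about 52 MP/s for encoding and 80 MP/s for decoding.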
PICO presents a significant step forward in practical image compression, combining state-of-the-art machine learning with a strong emphasis on user experience and real-world device performance.