Indexing a year of video locally on a 2021 MacBook with Gemma4-31B (50GB swap)

A 2021 M1 Max MacBook Pro, leaning on 50GB of swap, indexed a year of video footage locally with a 31B Gemma model. The post walks through a local-first pipeline for managing a personal video archive and shows how far consumer hardware can go with large language models. HN commenters welcomed the practical use of AI on a local machine without cloud dependencies.

First Seen: May 21, 3:00 PM

Last Seen: May 21, 7:00 PM

The Lowdown

NJ, who splits his time between managing a safari lodge in Maasai Mara and software development in Silicon Valley, found himself with an ever-growing, unindexed archive of video footage. Despite his proficiency with AI tools in his professional life, his personal video content remained unmanageable. This led him to tackle the problem of video indexing head-on, aiming to create a searchable, local archive.

Initially, NJ considered commercial AI video editors but found them unsuitable due to concerns about generative AI compromising brand authenticity and their failure to address the fundamental issue of an unlabeled archive. He identified that the crucial first step was to build a comprehensive index.
He designed a local-first indexing pipeline, favoring plain-text sidecar files next to each video clip over a central database. This ensures data portability and resilience.
The pipeline integrates various tools: ffprobe for metadata, exiftool for GPS, Nominatim for reverse-geocoding, ffmpeg for frame extraction, WhisperX for transcription and diarization, and insightface for facial recognition.
For the core vision processing, he used a multi-tiered approach: Claude CLI for zero marginal cost, Anthropic API for speed, and critically, a local Gemma 4 31B Q4 model running via LM Studio for bulk processing.
The vision model processes extracted frames, transcripts, and folder context to generate YAML frontmatter and a prose description, capturing extensive details like lighting, time of day, color palette, people count, and potential use cases.
A major revelation was how well his 2021 M1 Max MacBook Pro held up. Although "five years old" by the story's internal timestamp, it ran the 31B Gemma model through the indexing job while using more than 50GB of swap memory, which points to the longevity and capability of Apple Silicon.
NJ shared valuable lessons from overcoming four significant bugs, emphasizing defensive coding for fast-moving AI libraries, vigilance against silent API failures, strict schema design over union types, and careful consideration of culling criteria for video memories versus professional portfolios.
He concluded that structured prompts with enum constraints are vital for preventing model confabulation, local 31B models can largely match cloud services for bulk tasks, and the true "missing layer" in current AI video editors is a robust, underlying index.
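The sidecar and enum-constraint ideas above can be sketched roughly as follows. This is a minimal illustration, not NJ's actual code: the allowed values in `ALLOWED`, the `.md` sidecar extension, and every function name here are assumptions made for the example. The post names fields such as lighting, time of day, and people count, but does not list their permitted values.

```python
from pathlib import Path

# Hypothetical enum vocabularies. The post names these fields but not
# their allowed values, so the sets below are illustrative only.
ALLOWED = {
    "lighting": {"natural", "artificial", "mixed", "low_light"},
    "time_of_day": {"dawn", "day", "dusk", "night", "unknown"},
}
KNOWN_FIELDS = set(ALLOWED) | {"people_count"}


def validate(record: dict) -> dict:
    """Return a cleaned copy of record, or raise ValueError.

    Unknown keys and out-of-vocabulary enum values are rejected rather
    than passed through, so a confabulated label fails loudly.
    """
    extra = set(record) - KNOWN_FIELDS
    if extra:
        raise ValueError(f"unexpected fields: {sorted(extra)}")
    for field, allowed in ALLOWED.items():
        value = record.get(field)
        if value not in allowed:
            raise ValueError(f"{field}={value!r} not in {sorted(allowed)}")
    people = record.get("people_count")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(people, bool) or not isinstance(people, int) or people < 0:
        raise ValueError("people_count must be a non-negative integer")
    return {field: record[field] for field in KNOWN_FIELDS}


def sidecar_path(video: Path) -> Path:
    """Place the sidecar next to the clip: clip.mp4 -> clip.mp4.md."""
    return video.with_name(video.name + ".md")


def render_sidecar(record: dict, description: str) -> str:
    """Render validated scalar fields as YAML frontmatter, then prose.

    Only safe for the flat enum/int fields produced by validate();
    arbitrary strings would need proper YAML quoting.
    """
    lines = ["---"]
    for key in sorted(record):
        lines.append(f"{key}: {record[key]}")
    lines += ["---", "", description.strip()]
    return "\n".join(lines) + "\n"


def write_sidecar(video: Path, record: dict, description: str) -> Path:
    """Validate the model output and write it beside the video file."""
    target = sidecar_path(video)
    text = render_sidecar(validate(record), description)
    target.write_text(text, encoding="utf-8")
    return target
```

Calling `write_sidecar(Path("clip.mp4"), {"lighting": "natural", "time_of_day": "dusk", "people_count": 2}, "Two guides by the river.")` would write `clip.mp4.md` beside the clip. Because the index lives in plain files next to the footage, moving or copying a folder carries its metadata along, which is the portability argument made for sidecars over a central database.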

NJ is now proceeding to build the editing layer, leveraging the newly created index with Claude Code and DaVinci Resolve. He views this as a "tooling problem" that he can solve, turning a long-standing personal frustration into an opportunity to apply his technical skills. He also opens the door to sharing his "skill" (code) for public use and collaboration.

The Gossip

Local LLM Performance

Users were impressed by the feasibility of running large language models like Gemma 4 31B on consumer-grade hardware, specifically M-series MacBooks and even older machines like a 2015 ThinkPad. Discussion revolved around real-world performance, tokens-per-second benchmarks for MoE models, and the general excitement about using local LLMs for batch processing without cloud dependencies. The author confirmed his own surprise at his M1 Max's capabilities.

Authenticity and Everyday AI

The author's strong stance against using generative AI video for a travel brand ("no place on a real travel brand") sparked a minor debate. One commenter challenged this, pointing to the prevalence of potentially "fake" content from Airbnb hosts. The author clarified his position, acknowledging the tension between maintaining genuine representation for a high-end brand and the time-saving benefits of AI.

Code Request Cascade

A user immediately inquired about accessing the "skill" (code) mentioned in the post, noticing that the provided path was a local one. The author promptly acknowledged the oversight and committed to sharing the actual code files, indicating strong community interest in replicating or adapting the described solution.