Gemini 3 Deep Think drew me a good SVG of a pelican riding a bicycle
Gemini 3 Deep Think, Google's latest AI model, has impressed Hacker News by acing Simon Willison's long-standing "pelican riding a bicycle" SVG challenge. The model's detailed and whimsical output for this seemingly absurd prompt has sparked conversations about the rapid progress of AI-generated graphics and the value of niche benchmarks. Commenters debated whether the result reflects genuine capability or whether the benchmark itself has been compromised by targeted training.
The Lowdown
Simon Willison, a prominent voice in the AI community known for his unique benchmarks, recently showcased Gemini 3 Deep Think's impressive capabilities by challenging it with his signature "pelican riding a bicycle" SVG prompt.
- Gemini 3 Deep Think is Google's new AI model, which the company claims is built to push intelligence frontiers and solve modern challenges in science, research, and engineering.
- Willison first tested the model with his basic prompt, "Generate an SVG of a pelican riding a bicycle," and described the result as "really good."
- He then escalated the challenge, requesting a "California brown pelican riding a bicycle" with highly specific details, including anatomical features, bike components, and environmental context.
- The model produced detailed, contextually rich SVG images for both prompts, surpassing previous attempts and setting a new bar for the benchmark.
- Willison has even created an FAQ addressing the popular question of whether AI labs are specifically training models for pelicans riding bicycles, highlighting the benchmark's notoriety.
The demonstration underscores significant progress in AI's ability to interpret complex, whimsical prompts and render them as high-quality, intricate vector graphics.
The Gossip
Benchmark Brouhaha: Goodhart's Law vs. Litmus Test
The discussion often circled back to the core question of whether Simon Willison's "pelican riding a bicycle" prompt remains a fair benchmark for AI's general capabilities. Many commenters invoked Goodhart's Law, suggesting that once an indicator becomes a target (i.e., AI models are explicitly trained on it), it ceases to be a good indicator. Others, including Willison himself, argued that even if it has become a target, it still serves as a useful litmus test of how well AI companies are tackling specific, complex visual generation challenges, or at least demonstrates the rapid advancement across models.
SVG Sensation: Impressive Visuals & General Capabilities
Beyond the specific pelican benchmark, many commenters were broadly impressed by Gemini 3 Deep Think's SVG generation quality and the general advancements in AI's ability to create complex vector graphics. Users shared anecdotes of trying their own quirky prompts with various models and finding similarly excellent results for arbitrary concepts. This suggested that the strong pelican result might be less about targeted training and more about a significant leap in AI's overall visual and descriptive understanding. Some saw this as making AI-generated icons and graphics a viable alternative to traditional resources.