How to automate Instagram engagements with computer vision (and get banned)
This post details a clever technical approach to automating Instagram engagement using computer vision, directly interacting with the UI rather than relying on unstable DOM selectors or private APIs. Despite the sophisticated method, the author candidly admits their account was still banned within days. The discussion on Hacker News explores the cat-and-mouse game between bot developers and platform security, as well as the broader implications for social media usage.
The Lowdown
The author presents an intriguing method for automating Instagram engagements by using computer vision to interact directly with the user interface, bypassing the platform's attempts to thwart traditional automation via obfuscated HTML and constantly changing DOM structures. The core idea is that while the underlying markup keeps changing, the interface rendered for humans stays consistent enough to be targeted programmatically.
- The Problem: Instagram's dynamic HTML makes traditional DOM-based automation unreliable due to frequently changing class names and structure.
- The Solution: Instead of querying the DOM, the author uses computer vision to visually identify UI elements like the 'heart' icon. This involves taking screenshots, identifying key landmarks (like the triple-dots menu and action bar), and using these landmarks to define a small, targeted search region for the heart icon.
- Refinement: A 'sliding window' approach finds potential hearts within this region, followed by a vertical alignment filter to eliminate false positives, as all actual hearts appear in a consistent vertical column.
- The Outcome: The system successfully detects heart icons and clicks them. However, despite implementing various obfuscation techniques like natural cursor movement and randomized timings, the account was banned by Instagram within days.
- Conclusion: The experiment showed that computer vision can interact with any screen-rendered UI, but Instagram's bot detection still won out for this use case.
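The refinement step described in the list above can be illustrated with a minimal sketch. It is not the author's code: it assumes a grayscale screenshot region already cropped from the landmarks, represented as a plain list of pixel rows, and it swaps in a simple sum-of-absolute-differences comparison for whatever matching the original uses. The names `sliding_window_matches`, `filter_vertical_column`, `max_diff`, and `tolerance` are hypothetical.

```python
from statistics import median


def sliding_window_matches(image, template, max_diff):
    """Return (x, y) top-left corners where the window differs from the
    template by at most max_diff (sum of absolute pixel differences)."""
    img_h, img_w = len(image), len(image[0])
    tpl_h, tpl_w = len(template), len(template[0])
    hits = []
    for y in range(img_h - tpl_h + 1):
        for x in range(img_w - tpl_w + 1):
            diff = sum(
                abs(image[y + dy][x + dx] - template[dy][dx])
                for dy in range(tpl_h)
                for dx in range(tpl_w)
            )
            if diff <= max_diff:
                hits.append((x, y))
    return hits


def filter_vertical_column(hits, tolerance):
    """Keep only hits whose x coordinate lies within `tolerance` pixels of
    the median x, assuming genuine icons share one vertical column."""
    if not hits:
        return []
    column_x = median(x for x, _ in hits)
    return [(x, y) for x, y in hits if abs(x - column_x) <= tolerance]


# Synthetic 12x10 "search region": background pixels are 0, icon pixels are 9.
image = [[0] * 12 for _ in range(10)]
template = [[9, 9], [9, 9]]
# Two vertically aligned icons plus one stray look-alike (made-up positions).
for x, y in [(3, 1), (3, 5), (8, 8)]:
    for dy in range(2):
        for dx in range(2):
            image[y + dy][x + dx] = 9

hits = sliding_window_matches(image, template, max_diff=0)
hearts = filter_vertical_column(hits, tolerance=1)
print(hits)
print(hearts)
```

On this toy input the sliding window reports all three candidates, and the alignment filter drops the stray one at x = 8 because it sits far from the median column. Using the median rather than the mean is a design choice of this sketch: a single outlier then cannot drag the reference column away from the real icons.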
The Gossip
Banning Bots & Bot Busters
Commenters quickly noted the inevitability of bans when automating on platforms like Instagram. The author's candid admission of being banned resonated, highlighting the ongoing arms race between bot developers and platform security teams who invest heavily in detection. Some suggested focusing on content generation rather than engagement automation to avoid such issues.
API vs. Visual Automation Acumen
A common point of discussion was the technical choice of computer vision over API-based automation. While the author's method elegantly sidesteps UI instability, some commenters questioned whether capturing HTTP API calls would be a simpler and more appropriate approach, especially given Instagram's stated efforts to protect its backend.
Pixels and Precision Problems
One commenter raised a minor technical critique regarding the author's claim about 'typical' screen resolutions, suggesting that 4K isn't universally standard for Instagram browsing. They also proposed that resizing the browser window to a smaller resolution could reduce the search space for computer vision, making the process more efficient.
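A rough calculation shows why a smaller window helps. The sketch below counts how many positions a sliding window can occupy in a frame, assuming a naive scan of the whole frame at a one-pixel stride. The 24-pixel icon size and the 1280x720 window are illustrative assumptions, not figures from the post or the thread; 3840x2160 is the standard 4K UHD resolution.

```python
def window_positions(frame_w, frame_h, tpl_w, tpl_h):
    """Number of top-left positions a tpl_w x tpl_h window can take in the frame."""
    if tpl_w > frame_w or tpl_h > frame_h:
        return 0
    return (frame_w - tpl_w + 1) * (frame_h - tpl_h + 1)


ICON = 24  # hypothetical icon size in pixels

full_4k = window_positions(3840, 2160, ICON, ICON)  # 4K UHD frame
small = window_positions(1280, 720, ICON, ICON)     # assumed smaller browser window

print(full_4k, small, round(full_4k / small, 1))
```

Under these assumptions the 4K frame has about 8.2 million candidate positions against roughly 0.88 million for the smaller window, a reduction of around 9x before any landmark-based cropping is applied.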
Platform Paradoxes & People Problems
A significant theme emerged around the broader implications of platforms' anti-bot measures, which often create a 'user hostile' environment even for legitimate users. Commenters lamented the difficulty of creating new accounts and the constant struggle against automated systems, leading some to question the overall value and user-friendliness of such social networks.