How many of the 170k English words do you know?

VocabOwl offers a 'scientifically stratified' 100-question challenge to gauge your English vocabulary, promising to tell you how many of the 170k words you know. While a fun diversion, the Hacker News crowd quickly picked apart its methodology, user experience, and the dubious 'scientific' claims, often finding ways to game the test for inflated scores. It's a classic HN takedown of an ambitious but flawed online tool.

Score: 106
Comments: 192
Time on Front Page: 14h
First Seen: Jun 19, 3:00 PM
Last Seen: Jun 20, 8:00 AM

The Lowdown

VocabOwl presents itself as a novel web-based tool designed to 'scientifically' estimate an individual's English vocabulary size. Inspired by a podcast and claiming to use 'stratified sampling' with 'Gemini 3 Flash AI', it guides users through a 100-question multiple-choice quiz.

The core premise involves:

Goal: To determine how many of the 171,476 English words a user 'actually knows'.
Methodology: A 100-question challenge divided into five difficulty bands (Core Basics, Intermediate, Advanced, Expert, Grandmaster), with word selection based on frequency of use.
User Interaction: Participants select the correct definition from four options for each word.
Outcome: An estimated total word count and a personalized 'verdict' message.
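To make the stratified idea concrete, here is a minimal Python sketch of how a banded quiz could be scaled up to a total word count. It is not VocabOwl's actual code: the band sizes and the even split of 20 questions per band are assumptions chosen only so the bands sum to 171,476, and the function name is hypothetical.

```python
# Hypothetical band sizes (assumed, not published by VocabOwl); they sum to 171,476.
BAND_SIZES = {
    "Core Basics": 3_000,
    "Intermediate": 7_000,
    "Advanced": 20_000,
    "Expert": 50_000,
    "Grandmaster": 91_476,
}

QUESTIONS_PER_BAND = 20  # assumption: 100 questions split evenly over 5 bands


def estimate_vocabulary(correct_per_band, band_sizes=BAND_SIZES,
                        asked_per_band=QUESTIONS_PER_BAND):
    """Scale each band's hit rate by the band's size and sum the results."""
    total = 0.0
    for band, size in band_sizes.items():
        hit_rate = correct_per_band[band] / asked_per_band
        total += size * hit_rate
    return round(total)


# Example: a test-taker who is strong on common words and weak on rare ones.
example_scores = {
    "Core Basics": 20,
    "Intermediate": 19,
    "Advanced": 15,
    "Expert": 8,
    "Grandmaster": 3,
}
print(estimate_vocabulary(example_scores))
```

Because the rarest band is by far the largest under these assumed sizes, a few lucky guesses there move the estimate far more than the same number of answers in the easy bands.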

While conceived as an engaging self-assessment, the tool's execution and the validity of its 'scientific' claims were met with considerable skepticism and analytical critique from the Hacker News community.

The Gossip

Flawed Formats and Funky Findings

Many commenters quickly identified patterns in the multiple-choice format that made guessing easy, which called the test's accuracy into question. Common observations were that the correct answer was often the longest option, and that distractors could be eliminated because they were direct opposites of one another or plainly unrelated to the word. Users, including non-native speakers, reported surprisingly high scores, leading to suspicions that the definitions and alternatives were AI-generated and neither challenging enough nor well curated.
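The guessing problem can be put in numbers with the standard correction for guessing on multiple-choice tests. The sketch below is an illustration, not something VocabOwl is known to apply; the function name is hypothetical, and it assumes blind guessing among four options.

```python
def correct_for_guessing(num_correct, num_asked, num_options=4):
    """Estimate the share of words actually known, assuming unknown
    words are answered by blind guessing among num_options choices."""
    chance = 1 / num_options
    observed = num_correct / num_asked
    # observed = known + (1 - known) * chance, solved for known
    known = (observed - chance) / (1 - chance)
    return max(0.0, known)  # scores at or below chance imply nothing known


# 8 of 20 correct looks like 40%, but is closer to 20% once guessing is removed.
print(correct_for_guessing(8, 20))
```

If cues such as "the longest option is usually right" push the guess rate above 25%, the real inflation is larger than this correction removes.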

User Unfriendliness and UI Unsettlement

A prevalent complaint revolved around the test's user experience (UX), particularly the number of clicks required per question. Users expressed frustration over having to click to select an answer, then click 'check', and then 'continue' to advance, making the 100-question test a tedious slog. Numerous suggestions were made for streamlining the process, such as keyboard shortcuts, instant progression upon selection, and eliminating redundant confirmation steps, alongside mentions of minor UI jitters.

Linguistic Lapses and Lexical Layers

The discussion delved into the nature of English vocabulary itself and the test's word choices. Many non-native speakers, particularly those who know French or Latin, found that recognizing root words made 'advanced' words unexpectedly easy to decipher, which contributed to inflated scores. Commenters also debated the inclusion of highly obscure or fictional words and pointed to definitions they considered 'slightly off' or incomplete, both of which cast doubt on the test's ability to measure vocabulary knowledge accurately.

Sampling Scrutiny and Statistical Sophistication

A more technical thread questioned the 'scientific' basis of the test's stratified sampling. Commenters proposed alternative statistical methods, such as adaptive sampling or Item Response Theory (IRT), arguing that these would assess vocabulary more efficiently and accurately. Skepticism was voiced about the underlying assumptions regarding how word knowledge is distributed (e.g., Zipfian vs. Gaussian) and about the overall calibration, with some comparing the tool unfavorably to more robust, academically backed vocabulary tests.
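For readers unfamiliar with IRT, the following Python sketch shows the general shape of what commenters were suggesting. It is a simplified illustration under stated assumptions, not a design from the thread: the item difficulties are hypothetical, discrimination is fixed at 1, the guessing floor is set to 0.25 for four options, and ability is estimated by a coarse grid search.

```python
import math


def p_correct(theta, difficulty, guess=0.25):
    """Chance of a correct answer for ability theta on an item of the given
    difficulty, with a guessing floor (3PL form, discrimination fixed at 1)."""
    return guess + (1 - guess) / (1 + math.exp(-(theta - difficulty)))


def estimate_ability(responses, guess=0.25):
    """Maximum-likelihood ability over a grid from -4.0 to 4.0.
    responses is a list of (difficulty, answered_correctly) pairs."""
    def log_likelihood(theta):
        total = 0.0
        for difficulty, correct in responses:
            p = p_correct(theta, difficulty, guess)
            total += math.log(p if correct else 1 - p)
        return total

    grid = [step / 10 for step in range(-40, 41)]
    return max(grid, key=log_likelihood)


def next_item(theta, remaining_difficulties):
    """Adaptive step: ask the item whose difficulty is closest to theta."""
    return min(remaining_difficulties, key=lambda b: abs(b - theta))


# Hypothetical responses: right on easier items, wrong on the hardest.
answers = [(-2, True), (-1, True), (0, True), (1, True), (2, False)]
theta = estimate_ability(answers)
print(theta, next_item(theta, [-1.5, 0.5, 1.5, 3.0]))
```

The appeal of this approach is that each question is chosen near the test-taker's current estimate, so fewer questions are spent on words that are obviously known or obviously unknown.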