Posted by abnry 5 days ago
Some of the words chosen are rather absurd/inappropriate: breviary (which I got wrong but felt like a vaguely religious word) was characterized as intermediate but I think it's much more obscure and less obvious than that; Hippopotomonstrosesquippedaliophobia was used as a word (I got that wrong as well) - any type of 'phobia' word is really the sort of thing a fourth grader opens up a page in the dictionary and points out, not a word that is used... ever; metamorphosis and kinetic were labeled expert, which I don't agree with (what elementary schooler doesn't learn about the metamorphosis of a caterpillar into a butterfly? what high schooler doesn't learn about kinetic energy?).
Most words were reasonably well defined in a way that most people would understand or recognize. A few words had poor definitions: lethargy ("the state of being lethargic" - obvious); complacent ("smug satisfaction with oneself" - I disagree that complacency is intrinsically smug); magnanimous ("generous toward a rival" - I disagree that a rival must be involved); gauche ("socially awkward" - this is sort of close but the given definition completely misses the idea of being tactless).
They call it scientific and give a hand-wavy formula, but they don't explain how words are stratified in the first place. If stratified sampling is a formally recognized method of doing this, it would be nice to have a link to a real reference. I think I know a lot of words, but I am skeptical of the estimate this app provided (north of 75k).
Breviary: this was, to me, known and not uncommon. It's widely known to Catholics, but also, if you have an interest in medieval art or books, you'd likely know it too. It was one of the main types of books before the invention of the printing press. Think of an image from an illuminated manuscript, 50% chance it's from one.
Hippopotomonstrosesquippedaliophobia: it's not that you're expected to know the whole word, but they're looking for you to recognize components of it and infer the meaning from that. I knew sesquipedalian (sometimes jokingly used in "long word" contexts) so that was easy; but phobia is also easily identifiable, and hippo, from the latin root, I knew was not as obvious as the animal, but probably something like "large" (clue: the Hippodrome). So you could, even knowing only "phobia" and being able to guess "hippo", have a good basis for your choice.
Complacent and gauche: have heard both these uses, I think that's straightforwardly correct. If this was a dictionary that would, at worst, be the 2nd or 3rd definition. No complaints.
Source: I used to place in spelling bees and could've been a contender but I didn't have the discipline to study the dictionary for hours on the weekends, which is the next level.
In the last batch there were a few words that I was vaguely confident of but a lot more of them seemed like "stunt" words I would never see because every time they'd need defining so why bother.
Also, I was assuming it was picking from a huge set, but it seems everybody was shown the same words. So while it's supposedly a "sample", any bias, even if unintended, shows up in the results. If you wanted to be scientific, perhaps you'd do this for 1000 words and then sample 100 questions from that for each participant, or something.
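A minimal sketch of that per-participant sampling idea, assuming a pool of 1000 placeholder entries; the helper name `sampleWithoutReplacement` and the `word0`..`word999` entries are made up for illustration and have nothing to do with how the site actually works.

```javascript
// Hypothetical helper: draw k distinct items from a pool using a
// partial Fisher-Yates shuffle. `rng` can be swapped for a seeded generator.
function sampleWithoutReplacement(pool, k, rng = Math.random) {
  if (k > pool.length) throw new RangeError("k exceeds pool size");
  const items = pool.slice(); // copy so the shared pool is not mutated
  for (let i = 0; i < k; i++) {
    // Pick a random index in [i, items.length) and swap it into position i.
    const j = i + Math.floor(rng() * (items.length - i));
    [items[i], items[j]] = [items[j], items[i]];
  }
  return items.slice(0, k);
}

// Assumed pool: 1000 placeholder entries standing in for real quiz words.
const wordPool = Array.from({ length: 1000 }, (_, i) => `word${i}`);
// Each participant would get their own independent draw of 100 questions.
const participantQuestions = sampleWithoutReplacement(wordPool, 100);
```

With a different draw per participant, a badly chosen word only affects the people who happen to get it, rather than skewing everyone's result the same way.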
I don't think I got "breviary" when I tried. Maybe it's using a decision tree, but everyone's ending up on the same branch by getting most of the words right?
I'll remark that "if you have interest in [some particular academic pursuit], you'd likely know it" is a pretty decent description of the sort of word that shows up in "grandmaster" tier.
(I have joked that, living in Japan, my English is getting worse faster than my Japanese is getting better, but breviary might well be a concrete example.)
See NGRAMs: https://books.google.com/ngrams/graph?content=Breviary%2CHip...
Except "hippo-" is from Greek and means "horse".
Well... hippos is Greek for horse, and hippopotamus is a "river horse". Same for hippodrome, a course for horses. And hypo- (also Greek, not Latin) means "under" or "below", not "large", as seen in e.g. hypoglycemia.
And iirc “gauche” had more than just “socially awkward” in the correct answer but speeding through it again I didn’t get gauche as a word. That said, something gauche, to me, has always been something glaringly “not ok” in a social sense so again, that tracks. Oxford Language defines it as
> lacking ease or grace; unsophisticated and socially awkward.
Which is closer to the quiz’s definition and again, tracks with my internal thinking of the word’s use.
> Hippopotomonstrosesquippedaliophobia
Was just plain fun - as soon as I saw the “fear of long words” I was like of course that’s it
*I mistakenly put “Merriam Webster” the first time around - while MW doesn’t include the word smug itself, the 1.b definition is simply “self-satisfied”
It really could do with a summary showing the answers you made and corrections for what you got wrong.
I agree that it doesn't seem 'smug', but weirdly both dictionary dot com and Wiktionary give 'smug' as a synonym or part of the definition.
But they also analyze 'smug' as equivalent to self-satisfied or self-complacent, so maybe that's the word whose meaning is not as expected.
(I would think of "smug" less as "self-" anything - it implies a relation, it's more like exulting in a superior situation one has over someone. And 'complacent' is at base being content with one's situation, but often with the negative implication that one should be acting to make things better instead)
I think some of them were flawed. I can't remember what it was now, but for one word two of the meanings were kind of appropriate and I chose the wrong one, and I think there were 2-3 words I didn't know but guessed from the components in the words. At least one I also guessed that way, but got the complete opposite meaning!
I like this kind of test, but for me, the first 2 sections (which I aced) were kind of redundant. Maybe they needed to stratify it more or do it more dynamically, e.g. maybe do half the layer 1 questions, and if you get all them correct, move on to half the layer 2 questions. If you get one wrong, you get the rest of the layer 2 questions, and maybe if you get more than a certain number of those wrong you also have to go back and do the rest of layer 1. If you ace the first half of layer 2 as well as layer 1, maybe you jump straight into layer 3, etc...
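A rough sketch of the "half a layer at a time" idea, under simplifying assumptions: ask the first half of each layer, skip the rest if it was aced, otherwise ask the remainder. The "go back and finish layer 1" rule is left out, and `runLayeredQuiz` and the `knows` callback are hypothetical names.

```javascript
// layers: array of arrays of words, easiest layer first.
// knows: callback returning true if the test-taker answers the word correctly.
// Returns the list of words that actually get asked.
function runLayeredQuiz(layers, knows) {
  const asked = [];
  for (const layer of layers) {
    const half = Math.ceil(layer.length / 2);
    let missed = 0;
    // Always ask the first half of the layer.
    for (const word of layer.slice(0, half)) {
      asked.push(word);
      if (!knows(word)) missed++;
    }
    // Aced the first half: treat the layer as known and move on.
    if (missed === 0) continue;
    // Otherwise ask the remaining questions in this layer too.
    for (const word of layer.slice(half)) asked.push(word);
  }
  return asked;
}

// Toy example: two layers of four words; the test-taker misses only "e".
const toyLayers = [["a", "b", "c", "d"], ["e", "f", "g", "h"]];
const askedWords = runLayeredQuiz(toyLayers, (word) => word !== "e");
```

In the toy run the easy layer costs two questions instead of four, which is the saving being asked for.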
95% of Americans.
I can assure you that just about every American that has made it through middle school has been taught about kinetic energy. Let alone high school.
Perhaps just because it suits my learning style, I find learning is actually easier if I attempt to work something out or guess it, and then am corrected when wrong, because then I have a memory to anchor it on. If I skip that part and just try to learn some facts, very little is retained. One consequence of this is that I prefer science / logic based subjects to things like history or geography (as in places, etc, not the science parts) where it's just a bunch of arbitrary facts that you can't just guess or work out for yourself.
Have they retained that knowledge beyond the test at the end of the semester?
Anecdotal observations would imply that they have indeed been taught it, and indeed have failed to retain the concept.
I have no rigorous data regarding either; but the generally poor outcomes which appear as result of a lack of retention of scientific, math, socio-economic, and anthropological instruction do seem self evident both from within and outside of the US, in headlines and actions, writ large and for all to see.
Is the problem the use of teaching methods which focus on short-term memorization rather than conceptual comprehension? Is it the lack of support for instructors? Is it a lack of focus in the student body? Is it some or all of the above in varying degree? Or something else entirely?
^_^ hah what a great word, first time seeing it.
Another one I came across recently - “sloptimization”
I agree there were too many clicks per word; it took me too long to finish. But I also found it too easy to guess the few words I did not know.
I got ~1/3. That is a very generous estimate even for the "recall" case (recognizing a word), and it is obviously false for the "generate" case (using a word in speech), where I guess my vocabulary is likely ~1/90 of all English words.
Hippopotamus does mean river horse and I was caught out by that (note the o instead of a in ...poto...). I think that word is really a joke - lol - a bit like floccinausilihilipilification, which I wont bother looking up the speling 4.
A lot of them, because being an anti-intellectual is 'cool'
I've seen other systems like this calibrate far more quickly by assigning a sort of score and confidence behind the scenes. Confidence starts out low and increases over time - correct/incorrect answers rapidly adjust score at the beginning, then things settle down.
In practice this means you get a sequence of increasingly uncommon words initially, until you get one wrong, then you drop back to something easier until you start getting things right again, and eventually circle around words at your level.
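One simple way to get that behaviour is a staircase whose step shrinks after every answer. This is only a sketch: it is not how any particular system does it, and the 0-100 difficulty scale, the parameter values and the name `createEstimator` are all assumptions.

```javascript
// Hypothetical estimator: `level` is the current guess at the test-taker's
// ability on an assumed 0-100 difficulty scale. The step size plays the role
// of (inverse) confidence: big early moves, small ones later.
function createEstimator({ start = 50, step = 25, decay = 0.8, minStep = 1 } = {}) {
  let level = start;
  let currentStep = step;
  return {
    record(correct) {
      // Move up after a correct answer, down after a wrong one.
      level += correct ? currentStep : -currentStep;
      level = Math.min(100, Math.max(0, level));
      // Shrink the step so later answers move the estimate less.
      currentStep = Math.max(minStep, currentStep * decay);
    },
    get level() { return level; },
    get step() { return currentStep; },
  };
}

// Simulated test-taker who knows every word at difficulty 70 or below.
const estimator = createEstimator();
const trueLevel = 70;
for (let i = 0; i < 30; i++) {
  const difficulty = estimator.level; // next word is drawn at the current estimate
  estimator.record(difficulty <= trueLevel);
}
```

Keeping a floor on the step (`minStep`) means a single misclick late in the quiz nudges the estimate rather than wrecking it.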
Also - too many clicks per word. It's low stakes, just let me click the definition once and I'll live if I misclick (or add an undo button).
This, and accept that people will have incorrect input and build it into the confidence. Even the smartest person in the world sometimes makes clerical errors, or has the wrong neuron fire at the wrong moment.
Zenzizenzizenzic for example.
Some at Level 4 were definitely a lot more obscure than those.
Oh come on! Like you really knew what "Hippopotomonstrosesquippedaliophobia" is?
say what you like about antidisestablishmentarianism; at least it's an ethos
Speaking of things that stick... arachibutyrophobia is the fear of getting peanut butter stuck to the roof of your mouth. (I admit I had to look that one up, as it's not nearly as memorable, though I knew the word existed).
I too can say it and I'm very English...ish. LlanPG is a tourist attraction and a great example of an amateur advertising idea smashing it!
It's hard to disestablish a religion. Too many people believe. In Russia, the Russian Orthodox Church came back after Communism went down. Now Putin uses it to reinforce his rule.
That’s the schoolyard version of the story. In reality dissatisfaction with the church hierarchy had been simmering for some time, both in England and in Europe. Henry wouldn’t have gotten away with the split if it hadn’t enjoyed widespread support from the general public, the political class and the aristocracy.
They’re also too far away. I’m on a laptop and I have to keep moving the cursor up and down just to confirm. Give each option a letter or number and let me press it to choose the answer¹.
¹ There is (was?) some service for forms which does that and it works quite well. I think it was Typeform, but I just opened the website to check and—of course—it’s now just plastered with mentions of AI so I lost interest in verifying.
I'm guessing it's testing our susceptibility to machine-generated compliments
What is?
> I'm guessing it's testing our susceptibility to machine-generated compliments
I fail to see the point. For one, the compliments aren’t particularly good or interesting; for another, I didn’t even read them (I just went back to check after your comment), I simply clicked when seeing green.
well the point would be to see how susceptible you are to that. They're figuring out where your cost vs reward tipping point is.
Anyway, if they were running metrics on that they just became useless because I automated responding to it a bunch of times.
I would suggest a bias in this test towards reading. More than a couple are words I know but rarely see in print. But maybe I'm too much a fan of British TV, so I hear many of their words without seeing them written down.
I got tired after 8 words, looked at how many I'm supposed to know and gave up.
It'd be improved with statistical analysis; just progressively get harder and try to guess. If you wanted to gamify, you could update the stats after each answer.
F.e. Frugal - Economical with money or goods
I don’t think frugal means economical it means rather over the top …
Yeah I don’t know how to define it properly but I don’t need to learn new words if they don’t even teach the right meaning
AI slop
There were a couple of definitions I did think were a bit off, e.g. 'zenith' and 'nihilism'. And one word where two answers seemed valid but I forget which.
Sometimes it gives one of several possible meanings but that's a valid choice.
In general I think it's a fun quiz - agreed with others though that the word selection brackets aren't ideal. It spends a lot of time on everyday vocabulary, then jumps straight into long words that someone made up one day as a joke.
The words I find most interesting are those that convey some subtle nuance, or describe some very specific thing - tools for old crafts, uncommon but genuinely used adjectives and the like. Very few of those appear.
I had frugal stored as more than just economical.
Thanks for your comments :/
Being frugal just means allocating scarce resources in a way that provides most utility and value.
(context: native English speaker, big reader, huge nerd, perfect SAT score)
I got all 100 correct on the first try without looking anything up! Confusingly, that only resulted in a "SCIENTIFIC ESTIMATE" that I know 85,000/~170,000 words?
Their "How is this calculated" page that appears at the end explains their error:
> According to the Oxford English Dictionary (Second Edition), there are approximately 171,476 words in current use.
> We use Stratified Sampling. Instead of testing random words, we divide the language into 5 distinct difficulty bands based on frequency of use:
> 1. Core Basics ~3,000 words
> 2. Intermediate ~7,000 words
> 3. Advanced ~10,000 words
> 4. Expert ~25,000 words
> 5. The Obscure ~40,000+ words
> If you answer 2 out of 3 'Intermediate' questions correctly, we estimate you know roughly 66% of the 7,000 words in that band.
> Total Score = Σ (Accuracy in Band × Band Size)
Their strata add up to 85,000, not ~170k, so even a perfect score comes out at only about 50%.
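To check the arithmetic, here is the quoted formula written out with the band sizes from their page. The function name `estimateVocabulary` is made up, and this is just my reading of "Total Score = Σ (Accuracy in Band × Band Size)", not their actual code.

```javascript
// Band sizes as quoted from the "How is this calculated" page.
const bands = [
  { name: "Core Basics", size: 3000 },
  { name: "Intermediate", size: 7000 },
  { name: "Advanced", size: 10000 },
  { name: "Expert", size: 25000 },
  { name: "The Obscure", size: 40000 },
];

// accuracies[i] is the fraction answered correctly in band i (0 to 1).
function estimateVocabulary(accuracies) {
  return bands.reduce((total, band, i) => total + accuracies[i] * band.size, 0);
}

// A perfect run: 100% accuracy in every band.
const perfectScore = estimateVocabulary([1, 1, 1, 1, 1]);
// Compare with the 171,476 "words in current use" figure they cite.
const shareOfDictionary = perfectScore / 171476;
console.log(perfectScore, shareOfDictionary.toFixed(3));
```

So the ceiling of the estimate is fixed by the band sizes, and no answer pattern can report more than about half of the dictionary figure they quote.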
They're also using a pretty limited and perhaps non-difficulty-representative subset of the language.
Cute, but wrong on many counts.
There were many words I didn’t know though.
As usually happens in this kind of "check your vocabulary" test in English, being Greek gives you an advantage at the higher levels ;-)
I attribute most of my success in life to reading early and often. Bartending in college rounded out the social skills (for me) but those two skills have carried me further than I anticipated, coming from a poor background.
Have you found the same to be true?
Even if you're an introvert, working for a couple months at Olive Garden when you're 19 helps you to smile and be polite when 80% of the customers are mouth breathing idiots. Turns out they aren't all mouth breathers and those para social skills come into play later during your career.
I highly support kids of all origins working in service for a bit. Ain't a class thing, but is very helpful in getting used to the breadth and depth of people.
There are few professions where it's not unusual to have an hour+ conversation about literally any topic, and then potentially do it again the next day with the same person about a different topic. More similar to a therapist than customer service.
But the choice of "advanced words" seems a bit odd. Obscure isn't that obscure.
Sure there are some speciality words, but most of these words are just the stuff you're gonna hear on radio4 in normal conversation
I suppose they evaluate difficulty based on origin of the word. If you already know German or Spanish you may have a head start when learning English, but on a different subset of it.
edit: also, native English (well, American) speaker
A lot of prestigious and scholarly vocabulary in English has come in through Latin and Greek (at various points in the history of English!), so you can learn that vocabulary or make it more memorable or more transparent either by studying Latin and Greek as languages, or just by studying some of their common morphemes (e.g. there are lists of Latin and Greek roots that may be given to medical or life sciences students to help them learn to recognize the meaning of terminology coined from these languages, even without speaking the languages).
But I think it's actually unrepresentative of the English language as a whole if we're literally thinking about vocabulary size rather than historical prestige of some part of the vocabulary. For example, foreign foods like "nori", "pandan", "dolma", "vichyssoise"[1], or "berbere" are often used as English words and would probably appear in large English dictionaries nowadays. None of that was tested in this quiz. I saw one foreign political term which I guessed at, and one or two German loanwords which I knew (I've also studied German), and almost everything else was Latin or Greek origins!
[1] apparently coined by a French-speaking American based on French roots?
I suggest skipping the submit button and just showing it's correct when pressing and moving on after a sec or so. Having to click on submit twice really breaks the flow.
Also in all the words I tried I noticed out of the 4 options one is the correct one, another is the opposite of the correct one, and the other 2 are random stuff. You can basically skip any option whose antonym isn't present as well.
A tangent: writing distractors for multiple choice questions is hard. From the exams I know (excluding those whose nature precludes it, such as based on calculation or rote memorization) the only one that does this brutally well is LEK (the Polish medical graduate exam). It's nigh impossible to vibe guess it at more than random chance for someone outside the field.
For all its shortcomings, this was part of the fun, deducing the likely correct answer when you see a word for the first time.
I don't understand how they rank words though, some extremely common words like xenophobia were ranked as high as much more obscure ones.
xylo- = wood; -logy = study
Indeed from M-W: "a branch of dendrology dealing with the gross and the minute structure of wood"
Flibbertigibbet appears in some of the Little House on the Prairie (Laura and Mary) books, if I remember right.
And I've also read Gulliver's Travels which is where Brobdingnagian comes from. Brobdingnag was a land of giants. Pretty sure I've seen the word used elsewhere though.
MARGARETTA: How do you find a word that means Maria?
BERTHE: A flibbertigibbet!
SOPHIA: A will-o'-the-wisp!
MARGARETTA: A clown!
In case of online quiz you can have a "competition" between distractors:
1. start by having many more distractors than needed and pick randomly
2. for each measure the probability of it getting clicked (clicks/times it's shown)
3. show the most frequently clicked distractors more often
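The three steps above could be sketched roughly like this. The smoothing of the click rate (so unseen distractors don't start at 0/0) is my own assumption, as are all the names; treat it as an illustration rather than a tested design.

```javascript
// Hypothetical pool of candidate distractors for one question.
function createDistractorPool(texts) {
  const stats = texts.map((text) => ({ text, shown: 0, clicked: 0 }));
  // Smoothed click rate: (clicks + 1) / (shown + 2), so a new distractor starts at 0.5.
  const rate = (d) => (d.clicked + 1) / (d.shown + 2);
  return {
    // Weighted sampling without replacement: higher click rate, shown more often.
    pick(count, rng = Math.random) {
      const candidates = stats.slice();
      const chosen = [];
      while (chosen.length < count && candidates.length > 0) {
        const total = candidates.reduce((sum, d) => sum + rate(d), 0);
        let r = rng() * total;
        let idx = 0;
        while (idx < candidates.length - 1 && r >= rate(candidates[idx])) {
          r -= rate(candidates[idx]);
          idx++;
        }
        const [picked] = candidates.splice(idx, 1);
        picked.shown++;
        chosen.push(picked);
      }
      return chosen;
    },
    // Call when a test-taker wrongly clicks this distractor.
    recordClick(distractor) { distractor.clicked++; },
    rateOf: rate,
  };
}

// Toy example with five placeholder distractors, three shown per question.
const distractorPool = createDistractorPool(["d1", "d2", "d3", "d4", "d5"]);
const shownDistractors = distractorPool.pick(3);
distractorPool.recordClick(shownDistractors[0]);
```

Over many test-takers the distractors nobody falls for get shown less and less, without anyone having to hand-tune them.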
Users get drops of information and can stop whenever they feel. I stopped well before 100.
Having an answer counted as incorrect, just because I've accidentally touched the screen of the phone? I would absolutely hate that.
The sample of words is also heavily biased towards concepts relating to words, speech, speakers, and/or persuasion. They are likely generated by an LLM which is primed on the task of choosing words, and ends up choosing words related to "words".
For context, I'm an L2 speaker, linguistic nerd, and I use English mostly in academic/professional settings. I got 75,400 by a combination of the tactics above; in reality it might be closer to 10-15k.
The design is also painfully similar to Duolingo if anyone can spot that.
I had to look up the English word (lumbago), but German has the colorful “Hexenschuss” (witch shot). I suspect most people above a certain age can relate to there being a word for this in most languages.
Yeah. Clocked it from the landing page.
I got credit for a few that I would have happily just missed.
I did the full 100. Guessing isn't even a 1/4 chance: in the harder ones, when one description is significantly longer than the others, it's the correct one. Even outside that, 2 of the choices are usually some object, which I think is never the correct answer.
I'd also say the toughness should be mixed up a little. The last 30 or so became a slog
Cool idea though!
I managed a paltry 90/100. Some of those words require a classical education and probably a British one at that. I studied Latin at two posh schools and have O level English Language and Literature (that's two qualis at age 16).
I'm pretty well read and know exactly who Sandi and Stephen are. Ironically Sandi is Danish but notably erudite (that turned up for me) and navigates her way around English with remarkable aplomb.
My shorter OED contains 163,000 words (compared to the 600,000 words of the longer).
According to this site I know 71,000 words... Let's test that against the OED. I should have about a 43% chance of knowing a word picked at random.
In my totally scientific test (ha) I chose 50 words at random from the OED and discovered I knew 29 of them, for a score of 58%, which is more than two sigma from 43%, thus disproving the hypothesis.
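For anyone who wants to redo the sigma calculation, a quick sketch using the normal approximation to the binomial (that approximation is an assumption on my part; the numbers are the ones given above):

```javascript
const claimedKnown = 71000;    // the site's estimate
const dictionarySize = 163000; // Shorter OED word count, as stated above
const pKnown = claimedKnown / dictionarySize; // expected hit rate, about 0.436

const sampleSize = 50;  // words picked at random
const knownInSample = 29; // words actually known

// Under the hypothesis, hits ~ Binomial(sampleSize, pKnown).
const expectedHits = sampleSize * pKnown;
const sigma = Math.sqrt(sampleSize * pKnown * (1 - pKnown));
// How many standard deviations the observed count sits above the expectation.
const zScore = (knownInSample - expectedHits) / sigma;
console.log(expectedHits.toFixed(1), sigma.toFixed(2), zScore.toFixed(2));
```

It comes out a little over 2 sigma, so the claim holds, though with only 50 words it is not by a wide margin.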
I forgot what that was now, but it was a fun experiment.
Otherwise the most common vocab size would be equal to one.
(The median English speaker almost certainly knows several thousand words, or word stems to avoid duplication. But the number who know all words in the tail is exceptionally small.)
Your method of sampling could be improved further, unfortunately at the expense of ease of use. If the dictionary was sorted according to difficulty, then you could use stratified sampling.
I comment on the related aspects here.
You are correct. I tested that hypothesis about a dozen times and it seems that if you always pick the longest you’ll get it right somewhere in the high 70s to mid 80s. For anyone interested in testing for themselves, open the website to the first question then run this in the console (not going to spend time optimising it, it works well enough for the purpose):
// Counts answered questions so the loop stops after 100.
let loopCount = 0
const loop = setInterval(() => {
// Click whichever of the first four buttons (the answer options) has the longest text.
Array.from(document.querySelectorAll("button")).slice(0, 4).reduce((long, curr) => curr.textContent.length > long.textContent.length ? curr : long).click()
// Click the last button on the page twice, 100 ms apart (presumably submit, then next).
setTimeout(() => Array.from(document.querySelectorAll("button")).at(-1).click(), 100)
setTimeout(() => Array.from(document.querySelectorAll("button")).at(-1).click(), 200)
loopCount++
if (loopCount === 100) clearInterval(loop)
}, 500)
If one option is longer, you choose that; if two are, you choose the one that would be more useful to have a word assigned to it.
Core Basics 19/20
Intermediate 17/20
Advanced 19/20
Expert 14/20
Grandmaster 12/20
I guess, it's not too bad for a non-native speaker.
Minor feedback:
1. The correct answer for "Lethargic" is "Affected by lethargy". I think, definitions should not use words that share common root with the defined word, because:
a. it makes guessing too easy
b. it basically becomes a circular definition which is meaningless
2. Options almost always include 1 correct answer, 1 direct opposite and 2 completely random. Once you learn to recognise it, you can easily rule out 2 random options and have a 50/50 guess.
It only pushed my score up to 65k.
As if the author used an LLM to generate wrong definitions for each word instead of actually mixing in definitions of other words.
For me, most of the more complex words were adjectives, with a few nouns. And in many cases you can just see that 2/4 or 3/4 of the definitions are not for an adjective.
Yes, exactly like this.