How many of the 170k English words do you know?

Posted by abnry 6 days ago

(vocabowl-870366514258.us-west1.run.app)

501 points | 554 comments | page 7

SideburnsOfDoom 4 days ago

I would argue that none of "Weltschmerz", "Gaekwar", "Ucalegon" and "Houghmagandy" are really English words, although I already knew 2 of them.

And one of them prevented me from a perfect score, when I guessed wrong.

On the second run-through there was significant overlap. Maybe 30 or 40% of the words were from the first run-through.

dreis_sw 5 days ago

I found a big problem with this - I noticed that the longest answer is very often the correct one, which kinda ruined the game. Even though I didn't want it to, it started affecting my decision-making. Luckily, I only noticed this around question 85, though those are really the tricky ones.

The good news for the project is that I think you can easily tweak the LLM to generate better alternatives.

I got 89/100, which extrapolates to 72,700. As a non-native speaker, I'm quite happy with that.

earthpyy 5 days ago

Yeah, it happened to me too. Once I noticed the pattern, I went straight for the longest one, and it was correct about 90% of the time!

jstanley 6 days ago

Cool idea, am working through.

It's annoying that you need to click 3 times per question, and the buttons are in 2 different places.

Maybe would be better to just let me click the answer I want and then instantly show me the next question?

Also who is Sandi?

rhdunn 6 days ago

Sandi Toksvig, the current host of the BBC program QI (Quite Interesting), previously hosted by Stephen Fry. She's also been on a number of other BBC TV and radio shows.

gilleain 6 days ago

I suspect Sandi Toksvig, one of the hosts of QI. One of the 'success' messages is "quite interesting!".

No offence meant to anyone, but the whole exercise feels very QI: superficial 'understanding' of a large range of things (for example words) without much of a connection between these words.

dsenkus 5 days ago

I'm sure everyone's scores would be a lot lower if we had to describe each word instead of selecting between silly/smart sounding definitions. As was mentioned before, it needs an "I don't know" button, otherwise it's too easy to guess.

This approach could also work for getting more accurate results:

1. Show word without any definitions

2. User clicks "I know" or "I don't know"

3. If user clicked "I know", show actual definition of word

4. User selects "I was correct" or "I was not correct"
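
The four steps above could be sketched roughly as follows. This is only an illustration, not how the app works: `Card`, `self_assessed_quiz` and the `ask` callback are hypothetical names, and `ask` is assumed to return True for "I know" / "I was correct" and False otherwise.

```python
from dataclasses import dataclass


@dataclass
class Card:
    word: str
    definition: str


def self_assessed_quiz(cards, ask):
    """Run the self-report flow; ask(prompt) returns True for yes, False for no."""
    known = 0
    for card in cards:
        # Steps 1-2: show the word alone and ask whether the user knows it.
        if not ask(f"Do you know '{card.word}'?"):
            continue  # "I don't know": no definition is shown, nothing is counted
        # Steps 3-4: reveal the definition and let the user grade themselves.
        if ask(f"'{card.word}' means: {card.definition}. Were you correct?"):
            known += 1
    return known


# Scripted answers stand in for button clicks in this example.
demo_cards = [Card("weltschmerz", "world-weariness"), Card("ucalegon", "a neighbour whose house is on fire")]
scripted = iter([True, True, False])
demo_known = self_assessed_quiz(demo_cards, lambda prompt: next(scripted))
```

The obvious trade-off is that step 4 relies on the user being honest, so this swaps guessing bias for self-report bias.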

dtagames 6 days ago

This was fun! The progression seems logical.

I scored 71,000.

slices 6 days ago

75k here but a few of the later ones were lucky guesses.

cubano 6 days ago

Yes...exactly the same here although the guesses often had some grounding in the root of the word.

dtagames 6 days ago

Don't give away all our secrets, lol! Truth be told, I bet a lot of English speakers rely on this system to deal with uncommon words all the time.

luka598 3 days ago

The whole thing has one giant problem: the correct answer is usually the longest one.

On my first try I noticed it about halfway through the quiz and got ~65k. On my second try I selected only the longest answer and got ~78k.

alentred 6 days ago

Good fun! At first I was scared of having to answer 100 questions, but when the words got more sophisticated it turned out to be more engaging. Also, the result is good for self-esteem! :) Many thanks to the author!

I wonder if the test is calibrated to account for the fact that some answers are just lucky guesses? I am not a native English speaker, but I speak 3 languages overall and have basic notions of Latin, and I have to admit it helped a lot in "deciphering" a few words that I didn't know at all. And in at least 2 cases I just guessed correctly.
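
For what it's worth, one textbook way to calibrate for guessing is the classic correction-for-guessing formula. Whether the app does anything like this is unknown. The sketch below assumes four options per question and that unknown words are guessed uniformly at random (educated guesses from Latin roots break that assumption); `correct_for_guessing` is a made-up name.

```python
def correct_for_guessing(num_correct, num_questions, num_options=4):
    """Estimate the fraction of words actually known, discounting blind guesses.

    Assumes every unknown word is guessed uniformly among num_options choices,
    so a raw score equal to the chance rate maps to an estimate of 0.
    """
    chance = 1 / num_options
    observed = num_correct / num_questions
    # Rescale so chance-level performance gives 0 and a perfect score gives 1.
    return max(0.0, (observed - chance) / (1 - chance))


# Hypothetical raw score of 80 out of 100 on a four-option quiz.
estimated_known_fraction = correct_for_guessing(80, 100)
```

With these assumptions a raw 80/100 shrinks to roughly 73% of words actually known.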

vhayda 6 days ago

The longest answer choice is correct 80%+ of the time, when it should be closer to 25%. I was able to breeze through unfamiliar words just by picking the longest option every time…
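
If the question bank were available, a check like this would catch the bias. It is a minimal sketch: the `(options, correct_index)` data shape and the sample questions are invented for illustration, not taken from the app.

```python
def longest_option_hit_rate(questions):
    """Fraction of questions whose correct answer is also the longest option.

    questions: list of (options, correct_index) pairs. Ties in length go to
    the earliest option, because max() returns the first maximum it finds.
    """
    hits = 0
    for options, correct_index in questions:
        longest_index = max(range(len(options)), key=lambda i: len(options[i]))
        if longest_index == correct_index:
            hits += 1
    return hits / len(questions)


# Invented sample data: only the first question has the longest option correct.
sample_questions = [
    (["a hat", "a feeling of world-weariness", "a fish", "a dance"], 1),
    (["red", "blue", "a pale shade of green", "pink"], 0),
]
sample_rate = longest_option_hit_rate(sample_questions)
```

For an unbiased four-option bank this rate should hover around 0.25; anything near 0.8 means length alone gives the answer away.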

fcatalan 6 days ago

71050, not bad for a non-native speaker I guess. I missed 9/100.

But to be honest many that might catch out a native speaker are just the Spanish/French/Latin word, so it was too easy in a way.

mppm 4 days ago

This is cute, but here is the result I got by always clicking the longest answer, or the first one if two seemed equal:

Scientific Estimate: 71,650 words

"Unbelievable. Are you actually Stephen Fry in disguise?"

Core Basics: 16/20

Intermediate: 15/20

Advanced: 19/20

Expert: 18/20

Grandmaster: 16/20

This is significant beyond this particular app, because biases like this are found all over the place in popular LLM benchmarks.
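
To put the tier scores above in perspective, here is a quick tally. The 25% chance rate assumes four options per question and blind guessing, which is my assumption rather than something the app states; `binomial_tail` is just a helper written for this example.

```python
from math import comb

# Per-tier scores reported above, each out of 20 questions.
tier_scores = {
    "Core Basics": 16,
    "Intermediate": 15,
    "Advanced": 19,
    "Expert": 18,
    "Grandmaster": 16,
}
QUESTIONS_PER_TIER = 20
CHANCE_RATE = 0.25  # assumption: four options, uniform random guessing

total_correct = sum(tier_scores.values())
total_questions = QUESTIONS_PER_TIER * len(tier_scores)
accuracy = total_correct / total_questions


def binomial_tail(n, k, p):
    """Probability of at least k successes in n independent trials with rate p."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Chance of doing at least this well by blind guessing alone.
p_by_luck = binomial_tail(total_questions, total_correct, CHANCE_RATE)
print(f"{total_correct}/{total_questions} correct, accuracy {accuracy:.2f}")
```

That is 84 out of 100 without reading a single word, and under the guessing assumption the probability of that happening by luck is vanishingly small.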
