Posted by ph4evers 4/1/2025

Show HN: Duolingo-style exercises but with real-world content like the news (app.fluentsubs.com)
I've been working on a little side project that combines Duolingo-like listening comprehension exercises with real content.

Every video is transcribed to get much better transcripts than the closed captions. I filter for high-quality transcripts, and afterwards an LLM selects only plausible segments for the exercises. This seems to work well for quality control and seems to be reliable enough for these short exercises.
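
The two-stage filter described above (transcript quality first, then a plausibility check) could be sketched roughly like this. It is only an illustration, not the actual FluentSubs pipeline: the `Segment` class, the `confidence` score, the `0.9` threshold, and the `looks_plausible` heuristic (a stand-in for the LLM call) are all assumptions, and the sample data is made up.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Segment:
    text: str
    confidence: float  # hypothetical 0..1 transcript-quality score


def looks_plausible(segment: Segment) -> bool:
    """Hypothetical stand-in for the LLM check: a short, complete-looking sentence."""
    words = segment.text.split()
    ends_cleanly = segment.text.rstrip().endswith((".", "?", "!"))
    return 3 <= len(words) <= 15 and ends_cleanly


def select_exercise_segments(
    segments: List[Segment],
    min_confidence: float = 0.9,  # assumed threshold, tune per language
    judge: Callable[[Segment], bool] = looks_plausible,
) -> List[Segment]:
    # Stage 1: drop segments whose transcript quality is too low.
    confident = [s for s in segments if s.confidence >= min_confidence]
    # Stage 2: keep only segments the judge accepts as exercise material.
    return [s for s in confident if judge(s)]


# Made-up sample data for illustration.
segments = [
    Segment("Wir sind heute wieder auf der Straße unterwegs.", 0.97),
    Segment("äh also", 0.95),  # confident but not a usable sentence
    Segment("Das Wetter bleibt am Wochenende sonnig.", 0.62),  # low confidence
]
selected = select_exercise_segments(segments)
```

Passing the judge in as a function keeps the cheap confidence filter in front of the expensive step, so a real LLM call would only run on segments that already passed stage 1.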

Would love your thoughts!

472 points | 184 comments
madduci 4/1/2025|
Amazing, this is really awesome
bomewish 4/2/2025||
Initial reactions.

1) Let us keep the right sidebar permanently out, and DON'T grey out the rest of the screen. I want to be able to click on target language words and immediately see them. Like, you've given us the translated sentence, but I can't see which word is which;

2) Colour _the same words_ in both languages when doing mouseover;

3) Or just highlight BOTH as we're listening [but note issue below!];

4) Make the keyboard use a bit more intuitive - i.e. left/right obviously means "go back or forward in the video/audio", but now I have to CLICK on the yt video again to get that behavior. It should be auto so I don't have to do that. Similarly, I want to click on a word to know its meaning, but then go back to space->pause behavior. Rn clicking a word breaks that. Just adds friction to users.

5) Consider yt-dlp to save the videos so if we are studying one, and yt pulls it, we can keep using. Maybe for the roadmap.

6) Consider allowing us to add words to vocab -- and which vocab -- directly from mouseover [without clogging up UI - not sure there]. Right now it's a bit convoluted [right sidebar, which again should be permanent and integrated, not greying out the main screen - but even if that was fixed, that's a lot of mouse movement]

6) Handle idiomatic language issues better. You'll probably need another LLM pass/method for this, but IT'S a BIG ONE! Languages don't map 1:1 obviously, so for example this one:

https://app.fluentsubs.com/stream?v=cm8mnqrqe084ervb0mi6a4sa...

"genommen" was translated as "taken" <- means nothing.

I dump into 4o and it explains

In the phrase „genau genommen“, the word „genommen“ is part of a fixed idiomatic expression and doesn't translate literally as "taken."

„Genau genommen“ means "strictly speaking" or "to be precise."

So the full sentence:

„Wir sind heute wieder auf der Straße unterwegs, genau genommen auf dem Flohmarkt…“

translates to:

"Today we're out on the street again — strictly speaking, at the flea market…"

It’s specifying or narrowing down what “on the street” means in this context.

**

So you'll need to pull out these idiomatic phrases and then make sure they can be analysed as a single unit, so to speak. Learners are gonna have to be acquainted with those, and now the workflow is obviously broken.

Basically just get a model to bundle them and then in the sidebar on the right that has like "drill into X" you've got the PHRASE as a unit of analysis.
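
One way to picture the "phrase as a unit of analysis" idea is a greedy longest-match pass over the tokens, given a list of known multi-word expressions. This is a minimal sketch under assumptions: `bundle_phrases` and `IDIOMS` are hypothetical names, and in practice the phrase list would come from a model or a dictionary rather than being hard-coded.

```python
from typing import List, Set, Tuple


def bundle_phrases(tokens: List[str], phrases: Set[Tuple[str, ...]]) -> List[str]:
    """Merge known multi-word phrases into single units (greedy, longest match first)."""
    max_len = max((len(p) for p in phrases), default=1)
    units: List[str] = []
    i = 0
    while i < len(tokens):
        # Try the longest possible phrase starting at position i, down to 2 words.
        for n in range(min(max_len, len(tokens) - i), 1, -1):
            candidate = tuple(t.lower() for t in tokens[i:i + n])
            if candidate in phrases:
                units.append(" ".join(tokens[i:i + n]))
                i += n
                break
        else:
            # No phrase starts here, so the single word is its own unit.
            units.append(tokens[i])
            i += 1
    return units


# Hypothetical phrase list; only the example from this comment is included.
IDIOMS = {("genau", "genommen")}

tokens = (
    "Wir sind heute wieder auf der Straße unterwegs "
    "genau genommen auf dem Flohmarkt"
).split()
units = bundle_phrases(tokens, IDIOMS)
```

With the units in hand, the sidebar lookup can translate "genau genommen" as a whole instead of translating "genommen" on its own.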

ph4evers 4/3/2025|
Thanks for all the feedback! Really appreciated!

1. Makes sense!

2./3. That's a bit hard, but like point 6 I think it is possible to map certain parts.

4. Makes sense.

5. I put it on the roadmap but I think it is not so much of a priority now. I want to have an offline mode at a certain point (as well as a dedicated app).

6. Yes, this is hard and expensive. But I think that I should have a high quality section with proper quality control. I have some ideas to quickly create lessons as a teacher, but right now I'm mainly firefighting stability and quality.

Thanks again for the extensive write-up

bomewish 4/3/2025||
For 2/3 — isn’t it just another api call to get the mappings [solving 6 as a side effect] then somehow wiring it up to the frontend like you already do?

The sidebar greying out the foreground now and not able to stay locked REALLY breaks flow. Fixing that slightly mitigates.

It’s amazing tho and I’ll subscribe soon enough.

MichaelGlass 4/1/2025||
love it! Just wanted to share my support.
anon1094 4/1/2025||
I tried the Japanese version. I like that you are using real Japanese language YouTube videos. You can see the kanji on the videos though so it kind of defeated the point. Hide the video? Great idea though and very fun too.
ph4evers 4/1/2025|
Thanks for trying! You can hide the video by clicking the video button.
littlekey 4/1/2025||
I tried the Polish and it told me sorry, no news today. ¯\_(ツ)_/¯
ph4evers 4/4/2025|
Sorry, they should now be there
iamkonstantin 4/1/2025||
Did you hand-pick the videos? My first one was some Elon Musk conspiracy dumpster and the second one some church “morality” thing… I think it’s a good example of what not to do with LLMs.

Also, your page needs to disclose any content filtered by or generated by a model.

ph4evers 4/1/2025|
No, I let the LLM filter on "non-war and non-politics", but I don't have a ton of content available (yet), so it might have picked something that was not great. Which language did you try?
dataengineer56 4/1/2025||
The English icon has the Union Jack flag rather than the US flag, so it automatically elevates the service above Duolingo for me.
pjc50 4/1/2025||
English (Traditional) vs English (Simplified)
elric 4/1/2025|||
That meme is such a load of hogwash. In many ways, US English is closer to "traditional" than UK English. They've both diverged somewhat from what they were in the 17th century. Neither form has been "simplified" in any way.

As for the Union Jack: the UK has at least 3 rather different languages (English, Gaelic, Welsh), possibly a few more depending on how you count the different kinds of Gaelic.

Using a country flag to represent a language has always struck me as being silly. Only rarely do they map 1-to-1.

pjc50 4/1/2025||
It's entirely a joke based on the two different versions of "Chinese" offered on most websites, it's not really meant to be taken seriously. But I've heard that there's an island in New England somewhere whose local accent is closest to Elizabethan English.
npongratz 4/1/2025|||
Tangier Island off of Virginia, in the Chesapeake:

https://www.bbc.com/travel/article/20180206-the-tiny-us-isla...

Also, for what it's worth:

> Some people have characterised Tangier’s way of speaking as ‘Elizabethan’ or ‘Restoration’ English, but that’s nonsense. Languages aren’t static and the Tangier dialect has changed a lot because of its isolation. It’s a distinct creation of its own," Shores said.

csh0 4/1/2025||||
Perhaps you’re thinking of Ocracoke, North Carolina[0]

[0]https://www.bbc.com/travel/article/20190623-the-us-island-th...

watwut 4/1/2025|||
Yeah, but there is a real difference between simplified and traditional Chinese characters. Traditional are more ornamental/complicated while simplified are ... simplified/minimalist.
BalinKing 4/1/2025|||
Honest question, what's the meaning behind this joke? Is it just referencing the fact that American English drops "u" in the spelling of e.g. "color"?
pjc50 4/1/2025||
It's primarily a reference to various language selection dropdowns offering "Chinese (Traditional)" (which is used in Taiwan) and "Chinese (Simplified)" (which is used on the Chinese mainland). That difference arises from Mao-era simplification of many of the most common hanzi characters to make them easier to write or distinguish.

Mixed with, yes, the variant spellings and word choices (e.g. chips/crisps/biscuits) that make it apparent to British English readers when something is American.

BalinKing 4/1/2025||
I think my confusion is more from the implication that variant spellings imply "simplification"—even at a glance, simplified and traditional hanzi differ greatly in complexity, whereas I don't see how "chips" is any simpler than "crisps", even as a joke....

EDIT: Of course, it doesn’t matter one bit in the grand scheme of things—feel free to ignore my pedantry over a silly joke :-)

JimDabell 4/1/2025|||
This really isn’t a positive point. Flags represent nations, not languages, and it can be quite offensive to equate the two.

To use your example, there are plenty of Irish people who speak English but would resent being forced to identify with the Union Flag.

For another example that is very relevant today, there are plenty of Russian-speaking Ukrainians who hate Russia. Using the Russian flag to represent them would at best be distasteful.

coldpie 4/1/2025||
That's actually a really good point that seems obvious, but I hadn't considered before. I wonder what a better solution is. ISO language codes[1], I guess?

[1] https://en.wikipedia.org/wiki/List_of_ISO_639_language_codes
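
As a rough illustration of the ISO-code suggestion, a language picker could be keyed by ISO 639-1 codes and labeled with each language's own name instead of a flag. The `LANGUAGES` table and `picker_label` helper are hypothetical, and the set of languages is just an example, not the app's actual list.

```python
# ISO 639-1 code -> the language's name in that language (endonym).
LANGUAGES = {
    "en": "English",
    "de": "Deutsch",
    "fr": "Français",
    "ja": "日本語",
    "pl": "Polski",
}


def picker_label(code: str) -> str:
    """Build a flag-free label such as 'Deutsch (de)'."""
    return f"{LANGUAGES[code]} ({code})"


# Labels in code order, ready for a dropdown.
labels = [picker_label(code) for code in sorted(LANGUAGES)]
```

This sidesteps the nation-versus-language question entirely, at the cost of being slower to spot at a glance than a flag.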

nkrisc 4/1/2025|||
That’s the problem with conflating nations and language.

For example, the very first English video I got was a South African English accent.

dotancohen 4/1/2025||
It works to a first approximation.

Of the five languages I have configured in KDE, three of them are country-specific. So I use the flag indicator, which is far quicker for me to locate and identify out of the corner of my eye than would be a text label (which would require using the retina and thus more time and attention).

nkrisc 4/2/2025||
Sure, fine for personal uses. I mean broadly and generally.

As for English, the United States has far and away the largest number of native English speakers.

Not that I think the stars and stripes has any more right to represent “English” as a concept than the Union Jack. If you’re going on origin, why not the flag of England instead?

dotancohen 4/2/2025||

  > If you’re going on origin, why not the flag of England instead?
I actually really like that idea. The US and UK flags seem to represent more culture than language.
nkrisc 4/3/2025||
I mostly meant that facetiously as now we're entering the linguistic quagmire of trying to pin down an exact origin for a language, and furthermore (depending on your chosen definition of "English") the language itself predates the current flag of England, so even that is open to debate regarding its appropriateness.

The moral is: don't try to draw boxes around languages.

All that said, I do understand why someone would want to use flags as shorthand for language. It's wrong, but it's useful.

hoseyor 4/1/2025||
Rather ironic, considering that it’s a flag to indicate personal union of ownership of subjects and lands by the Scottish king who inherited the subjects and lands of England, but you prefer it to be the icon for the language of the state of England, a country in which its own language is more or less indecipherable in many places due to accents, dialects, and degeneration and creolization.

You would be far more likely to understand any given English speaking person in the USA than in England. It should really be called American at this point.

mavus 4/1/2025||
> accents, dialects, and degeneration and creolization.

There are just as many accents and dialects of English in the Americas as there are in Britain. Even your term "creolization" comes from Louisiana. It's a matter of perspective and something that all language learners will have to face: the difference between 'standard' English/Spanish/German and regional variations, both within its originating country and from abroad.
facile3232 4/1/2025||
Will give it a shot if you add Mandarin/simplified or swahili.
latexr 4/1/2025|
Clearly I’m in the minority, but I found the idea awful. The execution on the exercises is good—I especially like that you mix similarly sounding words—but my first thought as soon as I read your description was that the news are a terrible, worrying choice which could be misused to push a specific agenda on learners. Lo and behold, first thing I try is Musk’s dad calling people bums. Learners shouldn’t be subjected to polarised opinions at the same time they are trying to internalise words. Use instead some neutral science channels like Veritasium, CGP Grey, or Vsauce.
ph4evers 4/1/2025||
I agree that channels should be checked more carefully. I initially added two news channels per language. But some of those are pretty horrible (like Skynews). I'll put some more time in adjusting the channels. For example, TF1 is really great for French. It can be colored but it has many non-political or more local items.

I picked news channels because they often have short, well-spoken videos.

pjc50 4/1/2025||
This is probably better even if you just select non-US news channels showing non-US news items. NHK 24 kind of thing.