Posted by LourensT 1 day ago
> You will regret using this data. You will regret using this API.
> It serves data from individual arrivals boards, which all spell stations differently.
> It describes train status in free text that varies between stations. “Approaching Barnet”, “Near Waterloo”, “Heading to Bank”, “Departing Southgate”, “Leaving Hampstead”, etc.
I'm not sure what you expected from an organisation that still offers nothing but SMS-based MFA to its "customers", and that was massively disrupted a few months ago by a 17-year-old in a cyber incident which seemed to paralyse the entire organisation...
A common sign of a bad API (including this one) is when it presents data in an overly human-centric way rather than something more computer-friendly.
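To make the "human-centric" point concrete, here is a minimal Python sketch of what a client ends up writing to turn the free-text statuses quoted above into something structured. The `TrainStatus` type, the `parse_status` function and the three state names are hypothetical, not part of the TfL API, and the grouping of phrases (treating "Near" like "Approaching", for instance) is my own assumption.

```python
import re
from typing import NamedTuple, Optional


class TrainStatus(NamedTuple):
    state: str    # one of "approaching", "en_route", "departing" (hypothetical vocabulary)
    station: str  # station name exactly as it appeared in the free text


# Assumed mapping from the phrasings quoted in the thread to a fixed vocabulary.
_PHRASES = {
    "approaching": "approaching",
    "near": "approaching",
    "heading to": "en_route",
    "departing": "departing",
    "leaving": "departing",
}

# Longest phrases first so multi-word phrases are tried before shorter ones.
_PATTERN = re.compile(
    r"^\s*("
    + "|".join(re.escape(p) for p in sorted(_PHRASES, key=len, reverse=True))
    + r")\s+(.+?)\s*$",
    re.IGNORECASE,
)


def parse_status(text: str) -> Optional[TrainStatus]:
    """Return a structured status, or None if the phrasing is not recognised."""
    match = _PATTERN.match(text)
    if match is None:
        return None
    return TrainStatus(_PHRASES[match.group(1).lower()], match.group(2))
```

Anything the table doesn't cover comes back as `None`, which is the real problem: every new phrasing a station invents means another entry a human has to add.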
For a human it's really easy to see that "Regents Park" and "Regent's Park" almost certainly refer to the same station, but a computer can't know that unless a human goes out of their way to tell it.
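As a rough illustration of "telling it", the sketch below builds a comparison key by lowercasing and stripping apostrophes and other punctuation. `station_key` is a hypothetical helper of mine, not something the API provides, and the normalisation rules are assumptions that only cover spelling variants of this kind.

```python
import re


def station_key(name: str) -> str:
    """Build a comparison key: case-insensitive, apostrophes and punctuation ignored."""
    text = name.casefold()
    # Drop straight and typographic apostrophes so "regent's" becomes "regents".
    text = re.sub(r"['\u2019]", "", text)
    # Collapse every other run of non-alphanumeric characters into one space.
    text = re.sub(r"[^a-z0-9]+", " ", text)
    return text.strip()


if __name__ == "__main__":
    print(station_key("Regents Park") == station_key("Regent's Park"))
```

This only catches punctuation and case differences. A shortened name like "Kings Cross" still won't match "King's Cross St. Pancras", so you'd need a hand-maintained alias table on top, which is exactly the human effort described above.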
You could argue the TfL API is perfectly fine for its intended use-case of updating the arrival screens (which are meant for humans), but it's generally better to design APIs to grow for future use-cases you haven't thought of yet. Changing an API tends to be hard once it's being used in the real world.
For example: the older TfL stations have LED matrix displays, which are very limited in how much text they can show at once. The newer stations have big TV screens instead, which can show a lot of information. It wouldn't surprise me if this is the underlying reason behind some of the inconsistencies, especially ones like "Kings Cross" vs "King's Cross St. Pancras". I'd bet the longer names with punctuation correspond to arrival displays in the newer stations.
It's like having a printer with preset documents it can print. You set it on your desk, and others can click a button to have the chosen preset sheet come out. You can get creative by hiding some buttons, making some buttons require the user's fingerprint before printing, or having the printout change every minute, etc.
But the API printer sits on someone's server and prints objects, or organized data, and sends them back to whatever made the call, or request, to the API.
People use web browsers to hit websites, but when code hits a URL, it's typically just called an API. A website is technically an API too.
An API, or Application Programming Interface, allows you to interact with software using pre-defined agreements, or contracts.
Think of an API as a set of legal contracts. I use this analogy when explaining it to lawyers.
If I give you $5 and say "give me an Apple", you will give me an Apple, because the predefined contract says that I receive an Apple.
If I end up receiving Broccoli, then what we have here is a bug. Or, in other words, the contract has been broken.
Now apply this to other domains in commerce, e.g. I give the ID of an item in a store, and I get back the name of the item, its price, and whether it's in stock.
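In code, the store example might look like the Python sketch below. Everything in it (the `Item` type, the `get_item` function, the `"A1"` ID and the catalogue contents) is made up for illustration; the point is only that the return type and the error behaviour are the contract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Item:
    name: str
    price_cents: int  # price in cents, to avoid floating-point money
    in_stock: bool


# Hypothetical catalogue standing in for the store's database.
_CATALOGUE = {
    "A1": Item(name="Apple", price_cents=500, in_stock=True),
}


def get_item(item_id: str) -> Item:
    """Contract: a known ID returns an Item; an unknown ID raises LookupError."""
    try:
        return _CATALOGUE[item_id]
    except KeyError:
        raise LookupError(f"no item with id {item_id!r}") from None


if __name__ == "__main__":
    item = get_item("A1")
    # If this ever printed Broccoli, the contract would be broken: a bug.
    print(item.name, item.price_cents, item.in_stock)
```

A caller can rely on always getting a name, a price and a stock flag back, without knowing anything about how the store keeps its records.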
I echo the sentiments on the TfL API. I've built the same Tube Tracker app over and over for more than 10 years[1] as my go-to for learning new tools[2] or testing changes to frameworks[3], and I'm not sure the API has ever improved. A chap called Chris Applegate wrote extensively about his battles more than a decade ago[4]. Did they ever add the stations between Latimer Road and Goldhawk Road on the Hammersmith & City/Circle line?
[1]: https://www.matthinchliffe.dev/2014/03/05/building-robust-we...
[2]: https://svelte-tube-tracker.vercel.app/
[3]: https://github.com/i-like-robots/react-through-time/pulls
[4]: https://web.archive.org/web/20150620042340/http://www.qwghlm...
The detailing of things like how trains "overlap" each other is incredible.
I particularly like the "Tube Tongues" metric: the second-most commonly spoken language after English among residents near each tube station. It paints a real picture of a diverse London:
https://misc.oomap.co.uk/tubecreature.com/#/tongues/current/...
But the district labels are a bit too in the way right now, and in any case it would be nice to see the stations.
It doesn't, at least not for most lines. TfL's data is notoriously inconsistent, with multiple backends used for different purposes. For most lines, the dot matrix indicators are fed by the signalling system and timetables (more modern signalling systems are timetable-aware). Meanwhile, the online API relies on estimates from TfL's TrackerNet.