LinkedIn checks for 2953 browser extensions

Posted by mdp 6 hours ago

LinkedIn checks for 2953 browser extensions (github.com)

289 points | 142 comments

cbsks 5 hours ago|

Looks like Firefox is immune.

This works by looking for web accessible resources that are provided by the extensions. For Chrome, these are available in a webpage via the URL chrome-extension://[PACKAGE ID]/[PATH] https://developer.chrome.com/docs/extensions/reference/manif...

On Firefox, web accessible resources are available at "moz-extension://<extension-UUID>/myfile.png". Here <extension-UUID> is not your extension's ID. This ID is randomly generated for every browser instance. This prevents websites from fingerprinting a browser by examining the extensions it has installed. https://developer.mozilla.org/en-US/docs/Mozilla/Add-ons/Web...
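
A minimal sketch of the probing technique described above, assuming a page simply tries to fetch one known file per extension and treats a successful load as "installed". The names `buildProbeUrl` and `probeExtensions` are hypothetical, and the injectable `fetchImpl` parameter is only there so the sketch can run outside a browser. The sample ID and file are the ones quoted from fingerprint.js elsewhere in this thread. This is an illustration, not LinkedIn's actual code.

```javascript
// Build the URL of a web accessible resource for a Chrome extension.
function buildProbeUrl(id, file) {
  return `chrome-extension://${id}/${file}`;
}

// Try each candidate. A fetch that resolves suggests the extension is present;
// a fetch that rejects (not installed, or resource not exposed) counts as absent.
async function probeExtensions(candidates, fetchImpl = globalThis.fetch) {
  const checks = candidates.map(async ({ id, file }) => {
    try {
      await fetchImpl(buildProbeUrl(id, file));
      return id;
    } catch {
      return null;
    }
  });
  return (await Promise.all(checks)).filter((id) => id !== null);
}

// Example candidate, using the ID and file quoted from fingerprint.js in this thread.
const candidates = [
  { id: "aacbpggdjcblgnmgjgpkpddliddineni", file: "sidebar.html" },
];
```

On Firefox the same loop would need the per-instance UUID in a moz-extension:// URL, which the page has no way of knowing in advance.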

rchaud 4 hours ago||

And they said that using a browser with sub-5% market share would cause us to miss out on the latest and greatest in web technology!

userbinator 52 minutes ago|||

The latest and greatest is not great for you, but for them.

dana321 4 hours ago|||

chrome was made by ex-firefox devs, chrome is still not as good!

awesome_dude 4 hours ago||

This is probably a naive question, but...

Doesn't the idea of swapping extension specific IDs to your browser specific extension IDs mean that instead of your browser being identifiable, you become identifiable?

I mean, it goes from "Oh they have X, Y , and Z installed" to "Oh, it's jim bob, only he has that unique set of IDs for extensions"

triceratops 4 hours ago|||

It's not a naive question. This comment says it's not possible to do that: https://news.ycombinator.com/item?id=46905213

awesome_dude 4 hours ago||

Oh, it's (re)randomised upon each restart, whew, thanks for the heads up

edit: er, I think that that also suggests that I need to restart firefox more often...

tech234a 3 hours ago|||

The webpage would have to scan the entire UUID space to create this fingerprint, which seems unlikely.

throwaway808081 3 hours ago|||

Just have a database of UUIDs. Seems pretty trivial to generate and sort as it's only 16 bytes each.

pshirshov 1 hour ago|||

That's actually a bright idea! Have you ever thought about applying for VC funds?

Once you deliver that, you can also think about a database of natural numbers!

voussoir 57 minutes ago||||

Relevant: https://news.ycombinator.com/item?id=42342382

Dylan16807 1 hour ago||||

"Just" have a database, and then what? I can set up a database of all UUIDs very easily, but I don't think it's helpful.

dullcrisp 3 hours ago||||

https://libraryofbabel.info/

stirfish 3 hours ago|||

lol

Let's go a step further and just iterate through them on the client. I plan on having this phone well past the heat death of the universe, so this is guaranteed to finish on my hardware.

  function* uuidIterator() {
    // 16 bytes = 128 bits, starting from all zeros
    const bytes = new Uint8Array(16);
    while (true) {
      yield formatUUID(bytes);

      // Increment the byte array as one big-endian counter
      let carry = 1;
      for (let i = 15; i >= 0 && carry; i--) {
        const sum = bytes[i] + carry;
        bytes[i] = sum & 0xff;
        carry = sum > 0xff ? 1 : 0;
      }

      // Carry out of the top byte: every value has been produced
      if (carry) return;
    }
  }

  // Render 16 bytes as 8-4-4-4-12 hex groups
  function formatUUID(b) {
    const hex = [...b].map(x => x.toString(16).padStart(2, "0"));
    return (
      hex.slice(0, 4).join("") + "-" +
      hex.slice(4, 6).join("") + "-" +
      hex.slice(6, 8).join("") + "-" +
      hex.slice(8, 10).join("") + "-" +
      hex.slice(10, 16).join("")
    );
  }

This is free. Feel free to use it in production.

b00ty4breakfast 2 hours ago||

Free space heater

jorvi 2 hours ago|||

Doing it on restart makes the mitigation de facto useless. How often do you have 10, 20, 30d (or even longer) desktop uptime these days? And no one is regularly restarting their core applications when their desktop is still up.

Enjoy the fingerprinting.

tristan957 2 hours ago|||

I restart my browser basically every day.

cyanydeez 2 hours ago||

yeah I close out everything as a mental block against anything I'm working on.

I think there's a subset of people that offload memory to their browsers and that's kinda scary given how these fingerprint things work.

eek2121 2 hours ago|||

Umm, I restart my PC about once a week for security and driver updates.

If you don't, you have a lot more to worry about beyond fingerprinting...

Oh and I'm on LINUX (CachyOS) mind you.

b112 4 hours ago||||

Maybe, but how long are the extension ids? And if they are random, how long to scan a trillion random alphanumeric ids, to find matches?

I presume the extension knows its own ID when it wants to access resources of its own. But random JavaScript on a page doesn't.

maples37 4 hours ago|||

The extension IDs are UUIDs/GUIDs, so 128 bits of entropy. No site is going to be able to successfully scan that full range.

Sophira 48 minutes ago|||

And just in case the magnitude of that isn't obvious to people, that means there are 340,282,366,920,938,463,463,374,607,431,768,211,456 total possible UUIDs. Good luck.
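
To make the scale concrete, here is a small sketch that computes the figure above exactly and estimates how long a brute-force scan would take. It takes the 128-bit figure from the comments above at face value. The probe rate is an assumption picked to be generous, and `withCommas` and `yearsToScan` are hypothetical helper names.

```javascript
// Total number of 128-bit values, computed exactly with BigInt.
const totalIds = 2n ** 128n;

// Insert thousands separators into a BigInt's decimal digits.
function withCommas(n) {
  return n.toString().replace(/\B(?=(\d{3})+(?!\d))/g, ",");
}

// Years needed to try every value at a given rate (integer division, rounds down).
function yearsToScan(total, probesPerSecond) {
  const secondsPerYear = 31536000n; // 365 days
  return total / (probesPerSecond * secondsPerYear);
}

// ASSUMPTION: one billion probes per second, far beyond what a web page could do.
const assumedRate = 1000000000n;

console.log(withCommas(totalIds));
console.log(`${withCommas(yearsToScan(totalIds, assumedRate))} years at the assumed rate`);
```

Even at that assumed rate the scan would take more than 10^21 years, so randomized IDs cannot realistically be found by enumeration.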

b112 4 hours ago|||

ChatGPT told me it can be done though.

It won't disclose how, as it says it has had several users report it. And that it expects 50% of the bounty, and will use it for GPU upgrades.

calvinmorrison 43 minutes ago|||

Yes, that's how browser fingerprinting works, and it is impossible to defeat because there are just too many variations in monitors (relevant for fonts), simple things like user agent, etc.

rdoherty 5 hours ago||

Skimming the list, looks like most extensions are for scraping or automating LinkedIn usage. Not surprising as there's money to be made with LinkedIn data. Scraping was a problem when I worked there, the abuse teams built some reasonably sophisticated detection & prevention, and it was a constant battle.

cxr 5 hours ago||

In order to create the data source that LinkedIn's extension-fingerprinting relies on to work, someone (at LinkedIn*?) almost certainly violated the Chrome Web Store TOS—by (perversely*) scraping it.

* if LinkedIn didn't get it from an existing data source

RHSman2 9 minutes ago|||

I had the pleasure of scraping LinkedIn for a client. Great fun.

winddude 4 hours ago|||

a problem for linkedin != "a problem". The real problem for people is the back room data brokering linkedin and others do.

bryanrasmussen 5 hours ago|||

from the code doesn't look like they do anything if they have a match, they just save all the results to a csv for fingerprinting?

cxr 4 hours ago||

"The code" here you're referring to (fetch_extension_names.js[1]) isn't and doesn't claim to be LinkedIn's fingerprinting code. It's a scraper that the researcher behind this repo wrote themselves in order to create the CSV of the data that they're publishing here.

LinkedIn's fingerprinting code, as the README explains, is found in fingerprint.js[2], which embeds a big JSON literal with the IDs of the extensions it probes for. (Sickeningly enough, this data starts about two-thirds of the way through the file* and isn't the culprit behind the bulk of its 2.15 MB size…)

* On line 34394; the one starting:

    const r = [{
                id: "aacbpggdjcblgnmgjgpkpddliddineni",
                file: "sidebar.html"

1. <https://github.com/mdp/linkedin-extension-fingerprinting/blo...>

2. <https://github.com/mdp/linkedin-extension-fingerprinting/blo...>

hsbauauvhabzb 5 hours ago|||

Won't someone think of poor little LinkedIn, a subsidiary of one of the largest data brokers in the world?

charcircuit 5 hours ago|||

Why frame what you are trying to say like that? Businesses of all sizes deserve the ability to protect their businesses from abuse.

jmward01 5 hours ago|||

Do they respect my data? Why do they get to track me across sites when I clearly don't want them to, but someone can't scrape their data when they don't want them to? Why should big companies get the pass but individuals not? They clearly consider internet traffic fair game and are invasive and abusive about it, so it is not only fair to be invasive and abusive back, it is self defense at this point.

hsbauauvhabzb 5 hours ago|||

They don’t need to track your web browser when they’re owned by Microsoft, because they track every action at a lower level.

0x1ch 3 hours ago|||

Weird, I don't use Windows as an OS but have linkedin. I'd believe the concern and disregard of Linkedin's concern is fair game.

missingdays 4 hours ago|||

What lower level? Microsoft owns internet?

zelphirkalt 4 hours ago||

The operating system. For example see the Windows 11 screenshot debacle/scandal.

Dylan16807 1 hour ago||

Are you talking about Recall, which got such huge negative press they delayed it a year and added a clear opt-in? And never sent anything off the device itself?

If anyone has evidence of constant tracking and reporting then please share it.

john-h-k 3 hours ago|||

Because you signed up to a set of terms and conditions saying LinkedIn can use your data in this way

hsbauauvhabzb 1 hour ago|||

No one likes paying taxes but they still do it. They could just not work and not have money and therefore not need to pay tax.

echelon 2 hours ago|||

I didn't want the web to turn into monolithic platforms. I abhor this status quo.

You cannot function without these enterprises, but that doesn't mean they're ideal or even ethical.

Microsoft wins because of network effects. It's impossible to compete. So I think it should be allowed to assail their monopoly here by any means. It's maximally fair for consumers and for free markets.

Ideally capitalism remains cutthroat and impossible to grow into undislodgeable titans.

Even more ideally, this would become a distributed protocol rather than a privately owned and guarded database.

ronsor 5 hours ago||||

I think they framed it this way because they don't consider scraping abuse (to be fair, neither do I, as long as it doesn't overload the site). Botting accounts for spam is clear abuse, however, so that's fair game.

hsbauauvhabzb 5 hours ago||

No, I consider all data collection and scraping egregious. From that perspective, LinkedIn is hypocritical when Microsoft discloses every filesystem search I do locally to bing.

dylan604 4 hours ago||

Are you not scraping a site with your eyeballs when you view a site?

RockRobotRock 3 hours ago||||

When they scrape, it’s innovation. When you scrape, it’s a felony.

nitwit005 5 hours ago||||

I'm sure there are issues with fake accounts for scraping, but the core issue is that LinkedIn considers the data valuable. LinkedIn wants to be able to sell the data, or access to it at least, and the scrapers undermine that.

They could stop all the scraping by providing a downloadable data bundle like Wikipedia.

sidrag22 33 minutes ago|||

Thinking more about it, I don't think it's a terrible thing that they prevent scraping. Their listings are already suffering from being flooded with garbage applications and having to sift through tons of noise. Allowing scraping would just amplify that and make the platform almost entirely worthless.

I "scrape" LinkedIn in a roundabout way for personal use, and really what I've found is that I should just maybe not bother at all. I can't get through the noise even when I'm applying at places that heavily match my skillset, and I just get automated rejection emails.

compiler-guy 4 hours ago|||

LLMs scrape Wikipedia all the time, or at least attempt to.

The data bundle doesn't help that at all.

nitwit005 1 hour ago||

That's true, the normal scraping would still happen, but it would eliminate this side business of trying to re-sell LinkedIn's data.

sellmesoap 5 hours ago||||

We enjoy the fruits of an LLM or two from time to time, derived from hoards of ill-gotten data. LinkedIn has the resources to attempt to block scraping, but even at the resource scale of LI I doubt the effort is effective.

charcircuit 5 hours ago||

I am not denying that scraping is useful. If it wasn't people wouldn't do it. But if the site rules say you aren't allowed to scrape, then I don't think people should be hostile towards the people enforcing the rules.

ronsor 5 hours ago||

Well, they can try to enforce the rules; that's perfectly fair. At the same time, there are many methods of "trying" which I would not consider valid or acceptable ones. "Enforcing the rules" does not give a carte blanche right to snoop and do "whatever's necessary." Sony tried that with their CD rootkits and got multiple lawsuits.

b112 4 hours ago||||

Yes, until it becomes abusive and malignly affects innocents.

mistrial9 1 hour ago||||

this exchange -- obvious critical / perhaps insurrection speech versus a stable voice of business economics -- should be within the purview of an orderly and predictable legal environment. BUT things moved quickly in the phone battles. Some people say that the legal system has never caught up to the data brokering, and in fact the surveillance state grew by leaps and bounds.

So, reasonable people may disagree. This is a fine place to mention it .. what if individual profiles built at LinkedIn are being combined with illegitimate and even directly illegal surveillance data and sold daily? Everyone stand up and salute when LinkedIn walks in the room? there has to be legal and direct ways to deal with change, and enforcement to complete an orderly and predictable economic marketplace.

cyanydeez 2 hours ago||||

the abuse>using the information they publish to the public

schmidtleonard 5 hours ago|||

The big social media businesses deserve a Teddy Roosevelt character swooping in and busting their trusts, forcing them to play ball with others even if it destroys their moats. Boo hoo! Good riddance. World's tiniest violin.

This is a popular position across the aisle. Here's hoping the next guy can't be bought, or at least asks for more than a $400M tacky gold ballroom!

xp84 5 hours ago|||

I mean, regardless of who they are or even if you don’t like what LinkedIn does themselves with the data people have given them, the random third parties with the extensions don’t additionally deserve to just grab all that data too, do they?

mathfailure 5 hours ago|||

Surely they do! The data is in the public internets, aren't they?

ronsor 5 hours ago||

They'd put Widevine or PlayReady DRM on the website if they could, I'm sure.

bigfishrunning 4 hours ago||

why can't they?

josephg 5 hours ago||||

Eh. I worked at a company which made an extension which scraped LinkedIn. We provided a service to recruiters, who would start a hiring process by putting candidates into our system.

The recruiters all had LinkedIn paid accounts, and could access all of this data on the web. We made a browser extension so they wouldn’t need to do any manual data entry. Recruiters loved the extension because it saved them time.

I think it was a legitimate use. We were making LinkedIn more useful to some of their actual customers (recruiters) by adding a somewhat cursed API integration via a Chrome extension. Forcing recruiters to copy and paste didn’t help anyone. Our extension only grabbed content on the page the recruiter had open. It was purely read-only and scoped by the user.

RHSman2 2 minutes ago|||

I started there but it felt like a dodgy way (as it could be seen to be illegal). We then just went all official and went through Google search APIs with LinkedIn as the target. Worked a treat and was cheaper than recruiter!!!

So when you pay the highest scraper, it’s ok! Same data, different manner.

xp84 2 hours ago|||

Doesn't sound like your operation was particularly questionable, but I can imagine there must be some of those 3,000 extensions where the data flow isn't just "DOM -> End User" but more of a "DOM -> Cloud Server -> ??? -> Profit!" with perhaps a little detour where the end user gets some value too as a hook to justify the extension's existence.

hsbauauvhabzb 4 hours ago||||

I say the same thing about my start menu sending every action I perform to bing.

Banditoz 9 minutes ago||

LinkedIn has been employing a lot of strange dark patterns recently:

* Overriding scroll speed on Firefox Web. Not sure why.

* Opening a profile on mobile web, then pressing back to go to the last page, takes me to the LinkedIn homepage every time.

* One of their analytics URLs is a randomly generated path on www.linkedin.com, supposedly to make it harder to block. Regex rules in uBlock Origin are sufficient to stop this.

Anyone know why they could be doing this?

bastard_op 4 hours ago||

Chrome is the new IE6. Google set themselves up to be the next Microsoft and is "ad friendly" in all the creepy ways because that's what Google IS: an ad company. All they've contributed to security is diminishing the capability of adblockers and letting malware do bad things to you as consumers.

userbinator 49 minutes ago||

Chrome has become much worse than IE6. Microsoft was not in the business of tracking users and selling ads back then.

0xbadcafebee 4 hours ago|||

He who controls the Ads, controls the Internet.

themafia 4 hours ago|||

> Google set themselves up to be the next Microsoft

Google became a monopoly. All monopolies do this.

cyanydeez 2 hours ago||

there's a step before that. Google is a pure capitalist enterprise>pure capitalism goes to monopoly>all monopolies do this.

brianpbeau 3 hours ago||

Imagine being the nerd that is still using Chrome in the YOL 2026.

jmyeet 2 minutes ago||

I started using Chrome at version 2 I think. It still had the 3D logo. It was such a breath of fresh air and the big innovation was running one process per tab. Firefox existed but the entire browser could (and did) hang. And IE was... well, IE.

I did have a relatively early beef with Chrome though, which was that I couldn't completely opt out of Flash. As in, I didn't even want it installed. This turned out to be an issue because Flash turned out to be one of the earliest vectors for so-called "zombie cookies".

Fingerprinting in general has been a longstanding problem and has become more and more advanced.

Add to this that Google is, first and foremost, an advertising business and they've become increasingly hostile to ad-blocking tech for obvious reasons.

Basically what I'm getting at is something I couldn't have imagined a decade ago: I think I really have to switch away from Chrome to something that takes privacy and security seriously so that LinkedIn can't do things like this. And I increasingly don't trust Google to do that.

I actually have more trust in Apple because they have historically been user-focused eg blocking Meta's third party cookies. But obviously Safari isn't an option because it's not cross-platform.

I'm not sure I trust the current state of Mozilla. What's the alternative? Brave? Is Opera still a thing? I honestly don't know.

What I really want is a cross-platform browser written in Rust that black-holes ads out of the box. Why Rust? Memory safety. I simply don't trust a large C/C++ codebase to never have buffer overruns. Memory safety has become too important.

I don't want my browser to provide information on what extensions I'm using to a site and that shouldn't be a thing I have to ask for or turn on in any way.

minkeymaniac 5 hours ago||

I can confirm. Open up LinkedIn, hit F12, and watch the error count keep going up and up and up.

Screenshots found here https://x.com/DenisGobo/status/2018334684879438150

9021007 5 hours ago|

xcancel link: https://xcancel.com/DenisGobo/status/2018334684879438150

shouldnt_be 5 hours ago||

I wrote an article about it a couple of months ago. I also explain why, how and a way to prevent it.

https://javascript.plainenglish.io/the-extensions-you-use-ar...

jmholla 4 hours ago|

To clarify, you talk about why it's possible, not why LinkedIn is doing it, right? Or did I miss something in your article?

avastel 4 hours ago||

I wrote a blog post recently about the technique used by LinkedIn to do extension probing, as well as other ways to do it with fewer side effects.

https://blog.castle.io/detecting-browser-extensions-for-bot-...

pests 3 hours ago|

Nice write up, definitely exactly this.

bitbasher 2 hours ago||

The list of extensions being scanned for is pretty clear and obvious. What is really interesting to me are the extensions _not_ being scanned for that should be.

The big one that comes to mind is "Contact Out" which is scan-able, but LinkedIn seems to pretend like it doesn't exist? Smells like a deal happened behind the scenes...

https://chromewebstore.google.com/detail/email-finder-by-con...

cxr 2 hours ago|

That extension cannot be fingerprinted by its web-accessible resources. It doesn't declare any in its manifest.
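
As a rough illustration of that point, the sketch below checks whether a parsed manifest exposes anything a page could probe by URL. The function names and both sample manifests are hypothetical. It assumes the two common shapes of the `web_accessible_resources` key: a list of strings in Manifest V2, and a list of objects with a `resources` array in Manifest V3.

```javascript
// Return the resource patterns a manifest exposes to web pages.
// Manifest V2 lists plain strings; Manifest V3 lists objects with a `resources` array.
function exposedResources(manifest) {
  const entries = manifest.web_accessible_resources ?? [];
  return entries.flatMap((entry) =>
    typeof entry === "string" ? [entry] : entry.resources ?? []
  );
}

// An extension with nothing exposed gives a URL probe nothing to load.
function isProbeable(manifest) {
  return exposedResources(manifest).length > 0;
}

// Hypothetical manifests for illustration only.
const silent = { manifest_version: 3, name: "example-silent" };
const exposed = {
  manifest_version: 3,
  name: "example-exposed",
  web_accessible_resources: [
    { resources: ["sidebar.html"], matches: ["<all_urls>"] },
  ],
};

console.log(isProbeable(silent));  // false
console.log(isProbeable(exposed)); // true
```

This only covers URL probing. An extension that declares no such resources can still be detectable in other ways, for example through changes it makes to the page.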

bitbasher 3 hours ago|

Looks like this has been known since 2019.

https://www.nymeria.io/blog/linkedins-war-on-email-finder-ex...
