That is to say, if we all have the "same browser", there will still be people who are "not the same"; there will be divergences, and it will be a mess.
A very simple solution is polymorphic fingerprinting: the fingerprinters still get a fingerprint, a good-looking one, it's just different each time. Even better, think of when the Russians poisoned ammunition stockpiles in Vietnam with bad bullets. You can't make them all bad or they'll toss the whole batch; I think they arrived at about 1 in 10. The idea is to make fingerprinting as ugly as possible, up to and no further than the point where they are forced to cook up something even more evil.
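The polymorphic idea can be sketched roughly like this (a toy model; the attribute pools are invented for illustration, where a real implementation would draw them from actual usage statistics so every fake fingerprint still looks mainstream):

```python
import random

# Hypothetical pools of common, plausible values. Each session presents
# a fingerprint drawn from these, so it looks like one real user among
# many -- just a different one every time.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) Chrome/120.0",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) Safari/605.1.15",
    "Mozilla/5.0 (X11; Linux x86_64; rv:121.0) Firefox/121.0",
]
SCREENS = [(1920, 1080), (2560, 1440), (1366, 768)]
TIMEZONES = ["UTC", "America/New_York", "Europe/Berlin"]

def fresh_fingerprint():
    """Return a plausible-looking fingerprint, different each call."""
    return {
        "user_agent": random.choice(USER_AGENTS),
        "screen": random.choice(SCREENS),
        "timezone": random.choice(TIMEZONES),
        "hardware_concurrency": random.choice([4, 8, 16]),
    }

print(fresh_fingerprint())
```

Each value on its own is unremarkable (that's the "good looking" part); only the combination changes between sessions, which is what makes the collected fingerprints worthless for linking visits.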
Some will say that we should forbid web content from using GPUs. But that would push apps and their users to native platforms that don't even try to minimize fingerprinting at a technical level, instead either relying on legal enforcement by monopoly gatekeepers (with all the problems that implies), or providing no protection against fingerprinting at all.
Conflating "websites" with "apps" is what led to this mess in the first place. You can't just "try to minimize fingerprinting"; you either prevent it or you don't. And by the time you expose a GPU to the web, you're well inside the latter category.
IMO the only solution to the tracking epidemic is making boundaries between the two clear. Just like some random blog can't read my GPS location without asking first, it shouldn't be given access to other tracking vectors without user consent either.
Plainly false. You can minimize the number of bits of entropy in the fingerprint even in situations where a couple of bits are unavoidable, and you can mitigate fingerprinting methods by detection and/or blocking. Browsers do this today.
The web is crucial as the only free platform for distributing software to a huge chunk of consumer devices. Apple would love to strengthen their iOS app distribution monopoly by forbidding sophisticated web apps. That's why they have dragged their feet implementing more advanced web standards and limited their capabilities when they do implement them (for example making fullscreen mode unusable for games).
A single API may just yield a couple of bits, but it adds up when there are hundreds of APIs, with new ones introduced every week. And you don't need that many bits to uniquely identify someone.
But sure, leaking a few bits here and there may well be unavoidable when two of the three major browser vendors are ad companies and preventing it isn't a priority. (See the saga about Google and 3rd-party cookies.)
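To see how quickly a few bits add up, here's a back-of-the-envelope sketch (the per-API surprisal values are made up for illustration): if each API leaks a roughly independent handful of bits, the bits sum, and about 33 bits is already enough to single out one person among 8 billion.

```python
import math

# Hypothetical surprisal per attribute, in bits: -log2 of the share of
# users sharing that attribute value. Illustrative numbers only.
attributes = {
    "user_agent": 2.5,
    "screen_resolution": 3.0,
    "timezone": 3.5,
    "installed_fonts": 6.0,
    "canvas_hash": 8.0,
    "webgl_renderer": 5.0,
}

# If the attributes are independent, their bits combine additively.
total_bits = sum(attributes.values())
print(f"combined entropy: {total_bits:.1f} bits")

# Bits needed to uniquely identify one of ~8 billion people:
needed = math.log2(8e9)
print(f"needed for global uniqueness: {needed:.1f} bits")
```

Six unremarkable APIs already get within a few bits of global uniqueness; hundreds of APIs make the sum trivial to reach.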
> and you can mitigate fingerprinting methods by detection and/or blocking. Browsers do this today.
You can mitigate a finite set of fingerprinting methods that you know of. It becomes exponentially harder with every new tracking vector that is enabled by default, especially when the expectation is that things Just Work.
(For example, blocking canvas readout breaks canvas-based image resizing on lots of websites that use the first result from Stack Overflow.)
> The web is crucial as the only free platform for distributing software to a huge chunk of consumer devices. Apple would love to strengthen their iOS app distribution monopoly by forbidding sophisticated web apps. That's why they have dragged their feet implementing more advanced web standards and limited their capabilities when they do implement them (for example making fullscreen mode unusable for games).
Respectfully, I don't see a pressing need to solve the issue of "you don't own Apple devices you pay for" by stuffing every possible API under the sun into the browser.
Besides, I'm not advocating against sophisticated web apps; I just wish browsers applied the principle of least privilege when adding features ripe for abuse. e.g. maybe I would allow GPU access for a web-based 3D game whose developer I trust, but not some random blog that will use it to either fingerprint me or run a cryptominer.
This is a pet peeve of mine. I haven't seen a sane take on this anywhere. Getting rid of 3rd party cookies to prevent tracking has been a priority for Google for many years. Everyone thinks they haven't done it because they hate privacy or something; nothing could be further from the truth. They have been blocked on disabling 3rd party cookies because of antitrust concerns coming from other ad companies who object to being blocked from tracking users.
It's not because Google "hates privacy", it's because Google operates to generate profit, and it does so from targeted advertising.
[0]: https://www.eff.org/deeplinks/2021/03/googles-floc-terrible-...
Google themselves never needed FLoC for their own ads business. Their search and video ad businesses don't need 3rd party tracking to be successful. Google has the most first party data; users literally tell Google their intent directly by typing it into the search box. Advertising on 3rd party sites is a small minority of Google's revenue, and the part of that attributable to cross site tracking is even smaller.
But Google had to provide something to replace cookie tracking for the other ad companies that don't have the first party data Google has. Those ad companies rely on 3rd party cookies to compete with Google. If Google blocked 3rd party cookies in Chrome with no replacement they would instantly be sued for leveraging their browser market share to kill their competition in the ads market, and they would lose big.
My user agent and canvas size and gpu capabilities should be unique on every single request.
(Within reason, of course)
Imagine the data starting to pour in as every client presents itself as a 68000-era, HTML5-only browser with no React rubbish or JavaScript support. We might end up with a simpler, faster internet again. Plus a lot of big tech and ad companies scratching their heads.
There is no utility for the fingerprinter in getting your unique fingerprint when that fingerprint won't ever be registered again.
If you use Firefox, look up the CanvasBlocker extension. It can even make the fingerprint appear consistent when first party domain is the same.
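The per-domain consistency trick can be sketched like this (a toy model, not CanvasBlocker's actual code): derive the noise deterministically from a local secret plus the first-party domain, so one site always sees the same fake canvas readout while two different sites cannot correlate theirs.

```python
import hashlib
import random

SESSION_SECRET = b"rotate-me-per-session"  # hypothetical per-install secret

def canvas_noise(first_party_domain: str, pixels: list[int]) -> list[int]:
    """Perturb canvas channel values deterministically per first-party domain."""
    # Seed a PRNG from secret + domain: same domain -> same noise pattern,
    # different domain -> unrelated pattern.
    seed = hashlib.sha256(SESSION_SECRET + first_party_domain.encode()).digest()
    rng = random.Random(seed)
    # Flip the low bit of about half the channel values: visually invisible
    # (each channel changes by at most 1) but enough to change any canvas hash.
    return [p ^ 1 if rng.random() < 0.5 else p for p in pixels]

px = list(range(256))
a = canvas_noise("example.com", px)
b = canvas_noise("example.com", px)
c = canvas_noise("other.org", px)
print(a == b)  # True: same first-party domain, same fake readout
print(a == c)  # False: different domain, uncorrelated noise
```

Rotating `SESSION_SECRET` (per session, per install, or on demand) controls how long the fake identity persists.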
> Our tests indicate that you have strong protection against Web tracking.
Which sounds like a good thing. Then it follows with:
> Your browser has a nearly-unique fingerprint
This sounds like a bad thing to me because uniqueness makes me more identifiable. Do I interpret it correctly? What's the value I should look for?
I tried Safari, Firefox and Brave on it. Only Brave failed this, every time.
Went to look on Google and found several threads about people complaining that Brave's anti-fingerprinting doesn't work very well. For instance https://community.brave.com/t/fingerprinting-protection-no-l...
Yet when I open it again in another window, it says 100% uniqueness, but it's a different signature than previously.
I think it's saying it can't track me, because the signature would have to stay the same for tracking to work, yet I'm using regular Firefox, nothing fancy.
Is that technique really effective, or am I just wrong about the signature not being the same?
You can try to be so generic that the attributes are meaningless; the meatspace version of this would be everyone wearing a Guy Fawkes mask so they all have the same face and you can't tell the difference between individuals. Or, you can wear a new generic face every single day so that nothing you did yesterday connects to you: a bland, ephemeral identity.
Tor uses the former method (or tries to): everyone is up to something, but you can't tell them apart because they all have the same face/browser attributes. Firefox's fingerprinting resistance is the second method: normally-identifying values are fuzzed repeatedly, so that while each signature is identifiable, you won't be using it long enough for anyone to connect them to each other. Both strategies have their merits.
Each fingerprint component can be used to correlate the others, including TLS channel options, which generally last for the entire session the website is open, or a network header that remains unchanged behind NAT.
Given tens of thousands of potentially fingerprintable elements, you only need 5 relatively unique elements to reach 1 in 1 million; each element thereafter multiplies that uniqueness.
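The arithmetic behind that claim, assuming each element is independent and shared by roughly 1 in 16 people (an illustrative prevalence, not measured data):

```python
# If each of 5 independent attributes is shared by only ~1 in 16 people,
# their combination narrows you down to about one in a million:
prevalence = 1 / 16
combinations = (1 / prevalence) ** 5
print(int(combinations))  # 1048576 -- just over a million

# Each further attribute multiplies, rather than adds to, the uniqueness:
print(int((1 / prevalence) ** 6))  # 16777216 -- one in ~16.8 million
```

This is why the defense has to suppress or randomize nearly all the elements: leaving even a handful of stable, moderately rare ones intact is enough to re-identify someone.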
On the other hand, there is also the fact that things like spoofing web APIs only goes so far. There are other fingerprinting techniques, such as measuring properties of the GPU, which might still uniquely identify your machine (how many people have such-and-such GPU in your ZIP code?).
https://support.mozilla.org/en-US/kb/firefox-protection-agai...
It depends on what you want to win. There are two types of fingerprinting:
- Browser fingerprinting (what you see here): making sure that your Chrome on Windows behaves like every other Chrome on Windows and isn't really a bot pretending to be Chrome. This results in you being treated like a real user and getting fewer CAPTCHAs.
- User specific fingerprinting: Determining that your browser is unique among all the browsers the website has seen so that you can be tracked without cookies.
The latter is obviously bad. Some people would argue the former is bad too, but it is a LOT of work to make every browser behave like every other browser across operating systems, for little privacy benefit.
You explain that their browser has all kinds of little, subtle leaks of information about what software they're using, what operating system they're using, whether it's up to date, what hardware they're running, whether they're in a public space or an office or a home, which city they're in, what ISP they use, how they've configured their monitor and screen, what settings they set in their browser, what language they use at home, etc etc
You explain that you can collect all this information without them knowing you were doing it, without them really being able to stop you if they wanted to, and that you can collate it into an identifier that lets you know every time they visit your site even if they don't tell you themselves in some way, and with no way to ask you to stop.
And you explain that you do this for them, to make their experience of your site better for them, and harder for them to accidentally break.
How do you think they'd respond?
To be clear, I'm not asking this as some rhetorical trick. There absolutely are users who wouldn't care in the least, and who might even see you as really clever for doing it.
But that's how you can know if it's bad or not. If you think your users would be creeped out or otherwise troubled by it, or might feel like you've invaded their privacy or their right to control their own experience in their own browser, then you already know it's bad. If you think they wouldn't mind, then -- and only then -- maybe it's not.
My local barber knows me when I walk in. He knows what I look like, what I wear, what I usually order.
He uses this to make my experience better. He saves me from having to tell him what I want, he knows what seat I like to sit in, and so on.
I don't have to tell him I'm coming in. He can figure it out by looking at me walking in the door.
You can recognize a writer by his style.
What GP is trying to say is that it's OK for people to use pattern matching, but it's immoral if they use machines to do the pattern matching.
The persona you present is very different from an amalgamation of clues that were never meant to disclose information and are not you.
But this is easy to solve. Instead of rationalizing, call up a customer and try it.
It becomes tracking once you say “I have an ID in a cookie, and I’m going to look up the settings for that ID in my own giant DB”.
What you’re suggesting - using fingerprinting - is the worst option. It’s neither reliable nor robust, it implicitly requires tracking (you have to record the fingerprint<=>settings DB and look it up), and a user cannot opt out of it nor trivially change state at will.
There is fundamentally no legitimate reason to ever use fingerprinting over the actual explicit mechanisms for persistent storage.
But somehow it's immoral for average Joe to track not people but browsers.
I worked briefly for an ad company that not only did its own fingerprinting but bought a lot of fingerprinting data, along with some other types of info: country, age category, sex, income category.
The lecture used to shock the students from the economics department.
e.g. For me it shows a new unique fingerprint each refresh.
https://spec.torproject.org/padding-spec/connection-level-pa...
Since most of those are unlikely to actually happen (yet) with the usual dragnet ad surveillance, just using hardened Firefox (Arkenfox/LibreWolf/Mullvad Browser) with a VPN, or just Tor Browser, is sufficient.
My privacy plug-ins:
Blend in and spoof most popular properties
BP Privacy block all font and glyph detection
Browser plugs fingerprint privacy randomizer
CanvasBlocker
ClearURLs
Cookie AutoDelete
Decentraleyes
NoScript
Privacy Badger
Temporary Containers
uBlock Origin
Maybe like a captcha.
To hide a signal in noise, there must be noise.
https://github.com/arkenfox/user.js/wiki/4.1-Extensions#-don...
I highly recommend reading this article, it is still WIP btw: https://github.com/privacyguides/privacyguides.org/blob/e81b...
Seems like it can detect my browser, device, and OS. I kind of assumed it could do that anyways. What security concerns do I face if someone finds that information?
Another classic example of this is behavioral uniqueness. Maybe 10 people got coffee and a donut at the corner store at 8 AM, but how many of them also went to work at ABC Corp that day and also got a pepperoni roll at the pizzeria for dinner at night? Probably just one person did that.
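The intersection logic behind behavioral uniqueness is just set arithmetic (names invented for illustration):

```python
# Ten people got coffee at the corner store at 8 AM; several work at
# ABC Corp; several got pizza that night. Only one person is in all three.
coffee_8am = {"ana", "ben", "cho", "dee", "eli", "fay", "gus", "hal", "ivy", "jo"}
abc_corp = {"ana", "dee", "kim", "lou"}
pizzeria = {"ana", "ben", "kim", "max"}

suspects = coffee_8am & abc_corp & pizzeria
print(suspects)  # {'ana'} -- behavior alone singled out one person
```

No single observation identifies anyone; each intersection shrinks the candidate pool, exactly like each extra fingerprint attribute does.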
Information is inherently more valuable when no one else knows it. Just like a Ponzi scheme, they need to forever collect more invasive information to reap the same benefit over time.
JavaScript running on the visitor's endpoint is not costly at all (the customer pays for it). Bulk purchases of anonymized data are also quite common, and easily correlated back to the original profile (person) despite the anonymization.
A bulk purchase covering a 1-2 month period in a metropolitan area (50k+ people) would get you all the anonymized location data for every single person in the region for about $1200: devices, travel, work, home, patterns (what restaurants you go to, what your likely demographic is, what you do every day). That is 2.4 cents per person at 50k, with the per-person price going down the larger the metropolitan population.
There's an entire data processing pipeline devoted to this in a sub-niche of IT called Master Data Management.
The development of Chrome was motivated by the last-mile click data; GiS collects way more than you think as well, and it's enabled by default on all Android devices. Even if you never connect a device, remote sensing networks may offer a connection on the unregulated bands in a mesh network like Amazon Sidewalk, and devices with radios often beacon semi-regularly.
Large companies share signal data as well, and there are other sharing agreements where only a token effort is made at anonymization, but the correlations remain, allowing deduction of the original profiles. All they need is enough points in common, which is not that high a bar.
The business is in selling memberships for access to this data, without a warrant ever being needed. You perform a lookup on the data and can use it pretty much however you want, no restrictions (within the law). That is literally the product they are selling... people.
Some day try splurging and buy access to view your Accurint profile. I almost guarantee you'll be shocked. They also don't keep this data well guarded, as evidenced by the continuous rolling release of data-breach announcements. You think they don't import info posted from a data breach to back-check their existing records? This is big data we're talking about.
Papers? Is this your normal way home from work comrade? How long does it normally take you to get home? Big brother is watching you.
It is the main contributor to this mess because (a) they allow long-lived first party cookies and (b) they carelessly add every random API without any thought about the privacy implications.
Analogy: we need safer road design and better enforcement of antisocial driving behavior and stricter penalties for hurting people with your vehicle or putting them at risk of harm.
b) It can identify the user personally because many web sites use pixels where we link the fingerprinted user to an email address and then send both to Meta, Google, Reddit etc. And since browsers like Chrome allow long lived first party cookies this works because users remain signed in for over a year.
More realistically: you won't know by what creepy process they chose to show you an advertisement. We can't imagine it.
By linking information together it gets increasingly unique. They don't need to know your name; it uses a bridge-building strategy where related data gets backfilled, and dossiers get re-targeted to new devices on the fly based on these unique signatures, proximity, and too many other ways to count.
Some SMART street lights for example record and send back voice data to Qualcomm for processing. The advertised signature matching for this is shot-spotter, but it can be done for any audio signature server side or pushed out to the nodes in the dumb remote sensor networks for potential realtime tracking (1984). Every Tesla that catches you while you are out in public registers you in its data which is sent to a centralized system capable of tracking your every move over time, just like ALPR cameras. Roving sensor networks track everything you do, everywhere you go, what your interests are, your history...
This can include your related and semi-related device nodes, and equipment, phone, car, anything with a microprocessor and a connected network.
Your devices' overnight location (home, where you sleep), your location and travel data (behavioral pattern matching), the phone data needed to set up taps using SS7. All very illegal, but only punishable retroactively when they are caught in the act, just like decrypting certain radio bands.
In conjunction with this metadata, it can be used to unmask and de-anonymize publicly purchasable location data. Who you work for, what you are working on, etc.
From there, it can glean extremely personal insights. If you visited an ER, an abortion clinic, a doctor. Based on the vendors it can further correlate the type of services, or the fact that you might have cancer, be pregnant, have a non-public health condition, often before you yourself know.
It allows the creation of a dossier of you as a person: where you go, your habits, all the information needed to surveil you, blackmail you, coerce you, all to the highest bidder, which will be someone who took umbrage at something you did, or someone looking to vet you, who decides you didn't meet their expectations after reading the report and is biased against you.
This information then can be used to discriminate against you without your knowledge or perception, there is no opting out. The information available allows believable lies to be fabricated where you are considered guilty without trial or basis, effectively bearing false witness.
When you deviate from patterns found, it will be used against you to justify further discrimination, or heightened risk increasing harassment, loss of opportunities, etc, all unlawfully.
Demotions at work or passed up for promotions, or firings based on unfounded accusations (cancel culture), or the mere presence in the same location (proximity).
Guilty until proven innocent for the wildest thing any crazy person might think up; but the data is collected, and who is to say it's false when it is "just data" (neutral), even as it supports false narratives.
These harms are what privacy protects you from. Without privacy you are treated as a slave who can never change from what's written; inherently, this thinking promotes the narrative that people don't ever change.
Coincidences in life happen, extremely unlikely things happen, but this information will always be considered proof of something else, in the worst light. Guilt by association, proximity, etc, in other words violation of your fundamental human rights, and you have no agency to change it. What comes along with it is mental coercion and torture, turtles all the way down until you break; all from making some 'inconsequential' decision somewhere about your privacy.
You piss someone off, rub them the wrong way by calling out bad behavior, or they just fixate on you, and you don't give them a second thought until you end up dead in your living room, shot by police, because they SWATted you, or they leave other breadcrumbs that these systems treat as trusted and indicative truth (when they are fabricated). AFAIK there is a presumption in law that electronic devices are operating correctly unless you can prove otherwise (which you most often never can, given limited specs and other issues).
It is these types of security concerns that are inherent in any data collection.
Visibility of information is the first thing any adversary needs to have a successful attack on you. They can do so fast or slow. Slow involves increasing harassment, pruning your social network, making communications unreliable, torturing you and isolating you until you break; and everyone eventually breaks. Disadvantaging you, forever forward.
Geico is already using this information to justify higher rates for most members. If you own a hybrid car (and these are being mandated in the future to slow climate change), you have regenerative braking. Geico classified anything that isn't regenerative braking as hard braking, which indicates reckless driving. If you hard-braked, you were a reckless driver and had to pay higher rates. They did this using your LexisNexis report, which was not public until a class action lawsuit against them, years in the making. Your car manufacturer, through the telematics data link, may have sent information to these companies without your knowledge or agency.
They charged higher rates to people who owned hybrids and avoided accidents, while simultaneously incentivizing those same accidents, since drivers avoid hard braking to avoid higher rates. It's circular.
There are so many public examples of the collected information being used to harm you, and the collection not being properly disclosed or there being no agency to say no.
An example of this is where data brokers share data with their competitors, and any records you had removed would return at the next sync, because deletions happened at one broker but not everywhere at once. The data repopulates through the shuffle of isolated database merges.
Data breaches are effectively encouraged, because once the data is out there you can't punish them after a certain period of time. Strangers can insert themselves into your life without you knowing.
There was an interesting recent project where a person used AI facial recognition in conjunction with smart glasses to pull public dossiers on strangers and pretend to be people those targets had met in the past; this was demonstrated at a subway stop. Chance meeting... you give someone enough non-public information and they believe your plausible story. Can't seem to find the project now, but there was a YouTube video about it.
Master Data Management is the area that touches these systems the most in IT.
Privacy is the right to not be blackmailed, coerced, or generally speaking at the mercy of malevolent people seeking you harm directly or indirectly.
https://www.qualcomm.com/news/onq/2021/04/how-juganus-smart-...
The Qualcomm smart streetlights have been around since 2016.
Do you suppose you have an expectation of privacy if it's just two people on an empty public street? If you are tagged, just like the whales, deer, and other wildlife, are you an animal or a human? Food for thought.
A lot of the web works surprisingly well still, and you can turn on just what you need when you need it, placing your most visited sites on an allow list, but still denying a lot of third party things on those sites.
The internet is a joy with JS disabled virtually everywhere. And all the canvas fingerprinting, WebRTC leaks, font fingerprinting, supercookies, etc. are all defeated by simply not running JavaScript.
I am not some no-js evangelist or javascript hater or anything, but a huge amount of the web really does work fine (sometimes better even!) without js enabled by default. I don't think it has to be strictly either-or.
Guess which of these two categories I spend most of my time on...
Chrome, on the other hand, will give me two identical values.
So I guess Apple is doing something ...