Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files

Posted by bearsyankees 12/3/2025

Reverse engineering a $1B Legal AI tool exposed 100k+ confidential files (alexschapiro.com)

821 points | 288 comments | page 3

aperture147 12/4/2025|

Hey I think I've just found a new marketing stunt for a new vibe-coding platform:

"Worried your vibe-coded app is about to be broadcast on the internet’s biggest billboard? Chill. ACME AI now wraps it in “NSA-grade” security armor."

I never thought there would be multiple billion-dollar AI features that fix all the monkey-patching problems no one saw coming from the older billion-dollar AI features that fix all the monkey-patching problems no one saw coming from...

6thbit 12/5/2025||

Man. Hopefully their remediation steps included a full audit of their Box account.

One could only imagine that if OP wasn't the first to discover it, people could've generated tons of shared links for all kinds of folders, for instance, which would remain active even if they invalidated the API token.

mattfrommars 12/3/2025||

This might be off topic, since we are on the topic of an AI tool and on Hacker News.

I've been pondering for a long time how one builds a startup in a domain they are not familiar with, when they just have this urge to 'carve out a piece of the pie' in this space. For the longest time, I had this dream of starting or building an 'AI Legal Tech Company' -- the big issue is, I don't work in the legal space at all. I did some cold outreach on law-firm-related forums, which did not get any traction.

I later searched around and came across the term 'case management software'. From what I know, this is what Clio fundamentally is, and it makes millions if not billions.

This was about one and a half to two years ago, and since then I stopped thinking about it because of this belief I have: "how can I do a startup in legal when I don't work in this domain?" But when I look around, I see people who start companies in totally unrelated industries, from starting a 'dental tech' company to, if I'm not mistaken, the founder of Hugging Face, who doesn't seem to have a PhD in AI/ML and yet founded Hugging Face.

Given all that, how does one start a company in an unrelated domain? Say I want to start another case management system or attempt to clone Filevine: do I first read up on what case management software is, or do I cold-contact potential law firms who would partner up to build a SaaS from scratch? The other school of thought goes, "find customers before you have a product, to validate what you want to build". How does this realistically work?

Apologies for the scattered thoughts...

airstrike 12/3/2025||

I think if you have no domain expertise or unique insight it will be quite hard to find a real pain point to solve, deliver a winning solution, and have the ability to sell it.

Not impossible, but very hard. And starting a company is hard enough as it is.

So 9/10 times the answer will be to partner with someone who understands the space and pain point, preferably one who has lived it, or find an easier problem to solve.

joshvm 12/3/2025||

I would also split the concerns:

1. Compliance with relevant standards: HIPAA, GDPR, ISO, military, legal, etc. Realistically you're going to outsource this or hire someone who knows how to build it, and then you're going to pay an agency to confirm that you're compliant. You also need to consider whether the incumbent solution is a trust-based solution, like the old "nobody gets fired for buying Intel".

2. Domain expertise is always easier if you have a domain expert. Big companies also outsource market research. They'll go to a firm like GLG, pay for some expert's time or commission a survey.

It seems like table stakes to do some basic research on your own to see what software (or solutions) exist and why everyone uses them, and why competitors failed. That should cost you nothing but time, and maybe expense if you buy some software. In a lot of fields even browsing some forums or Reddit is enough. The difference is if you have a working product that's generic enough to be useful to other domains, but you're not sure. Then you might be able to arrange some sort of quid pro quo like a trial where the partner gets to keep some output/analysis, and you get some real-world testing and feedback.

strgcmc 12/3/2025|||

I think it comes down to having some insight about the customer need and how you would solve it. Having prior experience in the same domain is helpful, but it is neither a guarantee of nor a blocker to having a customer insight (lots of people might work in a domain but have no idea how to improve it; alternatively an outsider might see something that the "domain experts" have been overlooking).

I just randomly happened to read the story of some surgeons asking a Formula 1 team to help improve their surgical processes, with spectacular results in the long term... The F1 team had zero medical background, but they assessed the surgical processes and found huge issues with communication and lack of clarity, people reaching over each other to get to tools, or too many people jumping to fix something like a hose coming loose (when you just need 1 person to do that 1 thing). F1 teams were very good at designing hyper-efficient and reliable processes to get complex pit stops done extremely quickly, and the surgeons benefited a lot from those process engineering insights, even though it had nothing specifically to do with medical/surgical domain knowledge.

Reference: https://www.thetimes.com/sport/formula-one/article/professor...

Anyways, back to your main question -- I find that it helps to start small... Are you someone who is good at using analogies to explain concepts in one domain, to a layperson outside that domain? Or even better, to use analogies that would help a domain expert from domain A, to instantly recognize an analogous situation or opportunity in domain B (of which they are not an expert)? I personally have found a lot of benefit, from both being naturally curious about learning/teaching through analogies, finding the act of making analogies to be a fun hobby just because, and also honing it professionally to help me be useful in cross-domain contexts. I think you don't need to blow this up in your head as some big grand mystery with some big secret cheat code to unlock how to be a founder in a domain you're not familiar with -- I think you can start very small, and just practice making analogies with your friends or peers, see if you can find fun ways of explaining things across domains with them (either you explain to them with an analogy, or they explain something to you and you try to analogize it from your POV).

jimbokun 12/3/2025||

One approach is to partner with someone who is an expert in that space.

corry 12/3/2025||

"Companies often have a demo environment that is open" - huh?

And... Margolis allowed this open demo environment to connect to their ENTIRE Box drive of millions of super sensitive documents?

HUH???!

Before you get to the terrible security practices of the vendor, you have to place a massive amount of blame on the IT team of Margolis for allowing the above.

No amount of AI hype excuses that kind of professional misjudgement.

me_again 12/3/2025|

I don't think we have enough information to conclude exactly what happened. But my read is the researcher was looking for demo.filevine.com and found margolis.filevine.com instead. The implication is that many other customers may have been vulnerable in the same way.

corry 12/4/2025||

Ah, I see now that I read too quickly - the "open demo environment" was clearly referencing the idea that the vendor (Filevine) would have a live demo, NOT that each client wanted an open playground demo account that is linked to a subset of their data (which would be utterly insane).

stanfordkid 12/3/2025||

I mean... in what world would you send a customer's private root key to a web browsing client? Like even if the user was authenticated, why would they need this? This sort of secret shouldn't even be in an environment variable or database but stored with encryption at rest. There could easily have been a proxy service between the client and Box if the purpose is to search or download files. It's very bad, even for a prototype... this researcher deserves a bounty!

1vuio0pswjnm7 12/4/2025||

"(after looking through minified code, which SUCKS to do)"

Would there be a "pretty printer" or some other "unminifier" for this task?

If not, then is minification effectively a form of obfuscation?

gu5 12/4/2025|

https://webcrack.netlify.app/ was made exactly for this, it's a really useful tool
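
On the obfuscation question, a toy illustration (not how webcrack works; `naive_pretty` is a made-up helper that ignores string literals, regexes, and `for(;;)` headers): layout is easy to restore, but the shortened identifier names are not recoverable, which is the part that still makes minified code painful to read.

```python
def naive_pretty(js: str, indent: str = "  ") -> str:
    """Re-indent minified JS by splitting on braces and semicolons (deliberately naive)."""
    lines: list[str] = []
    depth = 0
    current = ""

    def flush() -> None:
        # Emit the pending statement, if any, at the current nesting depth.
        nonlocal current
        if current.strip():
            lines.append(indent * depth + current.strip())
        current = ""

    for ch in js:
        if ch == "{":
            current += " {"
            flush()
            depth += 1
        elif ch == "}":
            flush()
            depth -= 1
            lines.append(indent * depth + "}")
        elif ch == ";":
            current += ";"
            flush()
        else:
            current += ch
    flush()
    return "\n".join(lines)


print(naive_pretty("function a(b,c){var d=b+c;return d}"))
```

The output is readable in shape, but `a`, `b`, `c`, and `d` stay as they are; whatever the original names were, they were discarded at build time.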

testemailfordg2 12/4/2025||

When the liability of a corporation and its owners is limited, does it benefit their business to be diligent in every step, or to have a mentality of move fast and break things?

lupire 12/3/2025||

Who is Margolis, and are they happy that OP publicly announced accessing all their confidential files?

Clever work by OP. Surely there is an automated probing tool that has already hacked this product?

dghlsakjg 12/4/2025|

> Who is Margolis, and are they happy that OP publicly announced accessing all their confidential files?

Google tells me they are a NY law firm specializing in Real Estate and Immigration law. There are other firms with Margolis in the name too. Kinda doesn't matter; see below.

I doubt that they are thrilled to have their name involved in this, but that is covered by the US constitution's protections on free press.

richwater 12/3/2025||

Of course there will be no accountability or punishment.

satya71 12/4/2025|

I had a look at some of Filevine's output, and I can say I'm not surprised. This is not an organization that prizes engineering at all.
