p(data | stats) = p(stats | data) * p(data) / p(stats),
and p(data) is only high for a "blob / cloud" of points, so when there's some correlation, the observed stats tell you that you most likely have a blob with some degree of correlation.
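A minimal sketch of that argument, with made-up numbers: suppose only two candidate shapes exist, a "blob" and a "dinosaur", and both reproduce the observed stats equally well. The priors and likelihoods below are assumptions for illustration, not estimates of anything.

```python
def posterior(prior, likelihood):
    """Apply Bayes' rule over a finite set of candidate shapes."""
    # p(stats) = sum over shapes of p(stats | shape) * p(shape)
    evidence = sum(prior[h] * likelihood[h] for h in prior)
    return {h: prior[h] * likelihood[h] / evidence for h in prior}

# Hypothetical numbers, chosen only to illustrate the reasoning.
prior = {"blob": 0.99, "dinosaur": 0.01}     # p(data)
likelihood = {"blob": 1.0, "dinosaur": 1.0}  # p(stats | data): both fit the stats

post = posterior(prior, likelihood)
print(post)
```

Because the likelihoods are equal, the stats cannot tell the shapes apart and the posterior simply reproduces the prior, which is why the blob stays the best guess.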
We have just spent the five years since COVID appeared arguing about statistics, with tons of bad analysis of very complicated data fuelling political rage to this day.
The US health secretary is currently using data with "strong structure" to deny vaccines and to falsely pin everything from cancer to autism on convenient targets.
* https://en.wikipedia.org/wiki/G._E._M._Anscombe
:)
Linear correlation is just one pattern the data can have.
Unfortunately, many social science publications have reviewers who know only the basics and can't judge or accept statistically valid analysis that is outside their competence. Fit a line or nothing.
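To make the "linear correlation is just one pattern" point concrete, here is a small sketch with toy data (my own, not from any real study): y is completely determined by x, yet the Pearson coefficient comes out to zero because the relationship is symmetric rather than linear. The helper `pearson_r` is written out by hand just to keep the example self-contained.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient, computed from its definition."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Toy data: a perfect parabola, symmetric around x = 0.
xs = [-3, -2, -1, 0, 1, 2, 3]
ys = [x * x for x in xs]

r = pearson_r(xs, ys)
print(r)  # zero linear correlation despite perfect dependence
```

A reviewer who only looks for a fitted line would conclude there is no relationship here at all.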
This will require improvements to vision models, RL frameworks, etc., but it will be interesting to see how much it can broaden current abilities.