Posted by sebg 10/28/2024
https://github.com/JuliaSurv and https://github.com/JuliaStats/Survival.jl
It's especially fun when you start measuring survival time in things like dollars instead of years, because then you get a less biased estimate of customer lifetime value – something many organisations misestimate.
When I first learned survival analysis, my professor had me construct life tables, and only then did I learn KM (Kaplan-Meier). You can often do quite a bit with discrete-time tables, so if you have data:
ID TimeRange Outcome
A 4 1
B 3 0
You can then explode the data into the form:
ID Time Outcome
A 1 0
A 2 0
A 3 0
A 4 1
B 1 0
B 2 0
B 3 0
If you group this table by Time and count the events (numerator) and the rows at risk (denominator), you have what you need to calculate the life table and the discrete version of the KM plot. Understanding that also allows you to use more typical binary regression or machine learning models; you then calculate the cumulative hazard from the predictions afterwards: https://andrewpwheeler.com/2020/09/26/discrete-time-survival....
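A minimal sketch of the explode-and-group steps, using only the two toy rows above. The function names (`explode`, `life_table`) are my own, not from any library, and I'm assuming Outcome 1 means the event happened and 0 means the subject was censored. The per-period hazard is events divided by rows at risk, and the running product of (1 - hazard) gives the discrete survival curve; the same running product should work if the hazards come from a binary model's predictions instead.

```python
from collections import defaultdict

# Toy data from the post: (ID, TimeRange, Outcome).
# Assumption: Outcome 1 = event observed, 0 = censored.
subjects = [("A", 4, 1), ("B", 3, 0)]

def explode(rows):
    """Expand one row per subject into one row per subject per period."""
    out = []
    for sid, duration, outcome in rows:
        for t in range(1, duration + 1):
            # The event flag appears only in the subject's final period.
            out.append((sid, t, outcome if t == duration else 0))
    return out

def life_table(person_period):
    """Group by period and return (time, at_risk, events, hazard, survival)."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _sid, t, flag in person_period:
        at_risk[t] += 1   # denominator: rows still under observation
        events[t] += flag # numerator: events in this period
    table = []
    survival = 1.0
    for t in sorted(at_risk):
        hazard = events[t] / at_risk[t]
        survival *= 1.0 - hazard  # discrete KM / life-table product
        table.append((t, at_risk[t], events[t], hazard, survival))
    return table

long_rows = explode(subjects)
table = life_table(long_rows)
for row in table:
    print(row)
```

With real data you would do the grouping in a dataframe library, but the arithmetic is the same.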
While methods like linear or polynomial regression can be used to predict continuous variables like the number of patient deaths over time, survival analysis gives us more context and allows us to make inferences about the survival of individuals and populations as a whole, in addition to forecasting illness.
It does this in a few ways, but firstly by giving us a richer language for discussing survival data. This is where concepts like "censoring" come into play. Note that censoring here does not refer to the colloquial use of the word, but rather to the statistical definition [1].
Survival analysis in practice includes fitting KM curves, Cox-PH models, etc. to the underlying censored survival data. It's very possible that new and more robust methods have come out since I last touched the subject around 2015. Modern NNs may blow these models out of the water in terms of pure predictive power, but from a population-modeling perspective survival analysis would likely still be very useful.