Posted by sebg 10/28/2024

Survival Analysis Part I: Basic concepts and first analyses (2003)(www.nature.com)
38 points | 9 comments
openrisk 10/29/2024|
There is a nice Python library for survival analysis [1] and of course lots of R packages [2]

[1] https://lifelines.readthedocs.io/en/latest/

[2] https://cran.r-project.org/web/views/Survival.html

thetwentyone 10/29/2024||
For Julia users:

https://github.com/JuliaSurv and https://github.com/JuliaStats/Survival.jl

vindex10 10/29/2024|||
There is also the statsmodels package in Python, which implements parts of R's survival package:

https://www.statsmodels.org/stable/duration.html

laichzeit0 10/29/2024||
https://autonlab.org/auton-survival is also pretty good and is built on PyTorch.
kqr 10/29/2024||
I've used survival analysis for customer retention: https://entropicthoughts.com/survival-analysis-for-customer-...

It's especially fun when you start measuring survival time in things like dollars instead of years, because then we get a less biased estimation of customer lifetime value – something many organisations misestimate.

apwheele 10/29/2024||
I agree that understanding KM is a very good place to start with survival analysis. In many of the KM examples I have from my business, the censoring is due to certain events taking a long time to resolve (auditing healthcare claims).

When I first learned survival analysis, my professor had me construct life-tables, and only then did I learn KM. You can often do quite a bit with discrete time tables, so if you have data:

    ID TimeRange Outcome
     A   4          1
     B   3          0
You can then explode the data into the form:

    ID Time Outcome
     A   1     0
     A   2     0
     A   3     0
     A   4     1
     B   1     0
     B   2     0
     B   3     0
If you group this table by Time and get the numerator/denominator (events and number at risk in each period), that is what you need to calculate the life-table, and the discrete version of the KM plot.
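
A minimal sketch of that explode-then-group step in plain Python, using the two-row toy data above. The function names `explode` and `life_table` are made up for illustration, and the survival column is simply the running product of (1 - hazard):

```python
from collections import defaultdict

# Toy data from the comment above: (ID, TimeRange, Outcome).
subjects = [("A", 4, 1), ("B", 3, 0)]

def explode(rows):
    """Expand one row per subject into one row per subject per period."""
    out = []
    for sid, duration, outcome in rows:
        for t in range(1, duration + 1):
            # The event can only be recorded in the subject's final period.
            out.append((sid, t, outcome if t == duration else 0))
    return out

def life_table(person_periods):
    """Group by period: number at risk, events, hazard, running survival."""
    at_risk = defaultdict(int)
    events = defaultdict(int)
    for _, t, outcome in person_periods:
        at_risk[t] += 1        # denominator
        events[t] += outcome   # numerator
    table, survival = [], 1.0
    for t in sorted(at_risk):
        hazard = events[t] / at_risk[t]
        survival *= 1.0 - hazard
        table.append((t, at_risk[t], events[t], hazard, survival))
    return table

person_periods = explode(subjects)
table = life_table(person_periods)
for row in table:
    print(row)
```

B drops out of the risk set after period 3 without ever contributing an event, which is how the censored row is handled here.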

Understanding that also allows you to use more typical binary regression or machine learning models, and then you just calculate the cumulative hazard from the predictions afterwards, https://andrewpwheeler.com/2020/09/26/discrete-time-survival....
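
As a rough illustration of that last step (not taken from the linked post): given per-period hazard predictions from any binary model, the survival curve is the running product of (1 - hazard). The hazard values below are hypothetical, and taking the cumulative hazard as -log S(t) is one common convention that is assumed here:

```python
import math

def survival_from_hazards(hazards):
    """Turn per-period hazards h_1..h_T into S(t) = prod over k <= t of (1 - h_k)."""
    curve, survival = [], 1.0
    for h in hazards:
        survival *= 1.0 - h
        curve.append(survival)
    return curve

def cumulative_hazard(curve):
    """Cumulative hazard H(t) = -log S(t); infinite once S(t) reaches zero."""
    return [-math.log(s) if s > 0 else math.inf for s in curve]

# Hypothetical predicted hazards for one subject over three periods,
# e.g. the output of a logistic regression fit on the exploded table.
predicted_hazards = [0.10, 0.20, 0.50]
curve = survival_from_hazards(predicted_hazards)
cum_hazard = cumulative_hazard(curve)
print(curve)
print(cum_hazard)
```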

0xrafu 10/29/2024||
I see lots of medical papers using HR without addressing the PH assumption at all. RMST and other alternatives are rare...
xiaodai 10/29/2024||
why?
fearthetelomere 10/29/2024|
Survival analysis is often also called time-to-event analysis or failure analysis in engineering. In a general sense, it's used to model time until a "death" event, whether that could be divorce, machine part failure, or actually a person's death.

While methods like linear or polynomial regression can be used to predict continuous variables like number of patient deaths over time, survival analysis gives us more context and allows us to make inferences about the survival of individuals and populations as a whole, in addition to forecasting illness.

It does this in a few ways, but firstly by giving us a richer language to discuss survival data. This is where concepts like "censoring", etc. come into play. It should be noted that censoring here does not refer to the colloquial use of the word, but rather the statistical definition [1].

Survival analysis in practice includes estimating KM curves, fitting Cox PH models, etc. from this censored survival data. It's very possible that new and more robust methods have come out since I last touched the subject around 2015. Modern NNs may blow these models out of the water in terms of pure predictive power, but from a population-modeling perspective survival analysis would likely still be very useful.

[1] https://en.wikipedia.org/wiki/Censoring_(statistics)
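
To make the role of censoring concrete, here is a small from-scratch sketch of the KM (product-limit) estimate in plain Python. The data is invented, `kaplan_meier` is a name made up for this example, and in practice a library such as lifelines would normally be used instead:

```python
def kaplan_meier(durations, observed):
    """Product-limit estimate: [(time, survival)] at each distinct event time."""
    data = sorted(zip(durations, observed))
    n_at_risk = len(data)
    survival, curve, i = 1.0, [], 0
    while i < len(data):
        t = data[i][0]
        deaths = removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Censored subjects leave the risk set without counting as events.
        n_at_risk -= removed
    return curve

# Hypothetical follow-up times; observed=0 marks a censored subject.
durations = [2, 3, 3, 5, 8]
observed = [1, 0, 1, 1, 0]
km_curve = kaplan_meier(durations, observed)
for t, s in km_curve:
    print(t, round(s, 3))
```

The subject censored at time 3 still counts in the denominator at time 3 but never in the numerator, which is the whole point of keeping censored rows rather than dropping them.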