Survival analysis is one of the most useful and frequently used statistical tools in clinical trial analysis, especially in oncology. Although its name suggests that it only deals with patients’ survival, it is a very versatile technique! Keep reading if you want to learn more about the features and components of this analysis!
By Mercedes Ovejero Bruna
Senior Statistician/Data Scientist at Sermes CRO's Biostatistics & Data Management Unit
What do we use survival analysis for?
In many clinical studies, one of the main variables of interest is the time that passes until a certain event occurs: for example, until the patient's disease progresses, until an adverse event occurs, until the patient dies, etc. The technique is not limited to negative events, though. We can also study the time it takes for a patient to respond to a treatment, or the time until the patient is discharged from the hospital.
Basically, what is studied is the period between the (previously established) start of follow-up and the occurrence of the target event or, failing this, the end of the follow-up period if the event does not occur. Therefore, the most elementary analyses are built on two variables that are studied simultaneously: the time elapsed since the start of follow-up, and whether or not the event occurred during that time.
If the patient, for whatever reason, does not experience the event in the considered timeframe, that would be referred to, in the field of survival analysis, as a “censored case”.
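As a minimal sketch of how these two variables might be derived from raw dates, the Python snippet below turns each patient's follow-up into a (time, event) pair. The function name `to_time_and_event`, the choice of days as the time unit, and the patient dates are all hypothetical and only serve as an illustration.

```python
from datetime import date

def to_time_and_event(start, event_date, follow_up_end):
    """Return (time_in_days, event_observed) for one patient.

    event_date is None when the event was never observed.
    """
    if event_date is not None and event_date <= follow_up_end:
        # The event happened inside the follow-up window.
        return (event_date - start).days, True
    # Censored case: we only know the patient was event-free until follow-up ended.
    return (follow_up_end - start).days, False

# Hypothetical patients: (start of follow-up, event date or None, end of follow-up)
patients = [
    (date(2022, 1, 10), date(2022, 3, 1), date(2022, 12, 31)),
    (date(2022, 2, 1), None, date(2022, 12, 31)),
]
records = [to_time_and_event(*p) for p in patients]
print(records)  # [(50, True), (333, False)]
```

The second patient contributes 333 event-free days to the analysis even though the event itself was never seen, which is exactly what makes a censored case informative.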
Beware of censoring!
As has been pointed out, the events of interest do not always occur within the stipulated study time. These cases are called censored ones. Now, why can censored cases appear? The origin of these cases does not necessarily have to be something with a negative connotation, but the following are some of the circumstances that determine the definition of censored cases in a practical way:
In many of these cases, no information about the patient is available up to the end of the follow-up period, and, therefore, the only thing that is known is that the patient did not experience the event of interest while under observation. What is unknown is whether the patient has experienced the event at some other time.
From a technical point of view, censored cases can be grouped into left-censored, interval-censored, and right-censored cases. An interesting analysis of the types of censored cases can be found at . Although basic survival analysis may not consider the type of censored case we are dealing with, it is important to note that there are advances in the modelling by type of censored case. Reading Turkson et al. (2021) will allow you to get an idea of how to deal with these circumstances.
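One way to picture the three types is to describe each observation by the bounds we know for the true event time T. The sketch below assumes a hypothetical convention: time is measured from the start of follow-up, we know that `lower < T <= upper`, a lower bound of 0 means "no lower bound is known", and an infinite upper bound means "no upper bound is known". The function name `censoring_type` is illustrative and does not come from any library.

```python
import math

def censoring_type(lower, upper):
    """Classify one observation from the known bounds on its event time."""
    if lower == upper:
        return "exact"              # the event time was observed directly
    if math.isinf(upper):
        return "right-censored"     # event-free at `lower`, nothing known afterwards
    if lower == 0:
        return "left-censored"      # event already happened before time `upper`
    return "interval-censored"      # event happened between two visits

# Hypothetical observations, times in months since the start of follow-up
print(censoring_type(12, 12))        # exact
print(censoring_type(24, math.inf))  # right-censored
print(censoring_type(0, 6))          # left-censored
print(censoring_type(6, 12))         # interval-censored
```

Right censoring is the case described in the previous paragraphs: the patient was still event-free the last time they were observed.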
The (basic) recipe for survival analysis
The basic elements to prepare a good survival analysis can be grouped into:
What basic results are obtained in a survival analysis?
While the survival function reports the “non-occurrence” of the event (for example, the patient has not died), the risk function, usually called the hazard function, focuses on the “occurrence” of the event. This is very interesting because it lets us answer questions such as “at what point am I going to have a ‘spike’ in hospital discharges?” Curiously enough, this function is hardly ever reported, even though it often provides the more interesting information in clinical studies.
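To make the contrast between the two functions concrete, here is a minimal pure-Python sketch of the Kaplan-Meier estimator, a standard nonparametric estimate of the survival function. At each event time it also reports the proportion of patients at risk who experience the event, which can be read as a simple discrete estimate of the risk function. The function name `kaplan_meier` and the data are hypothetical. In practice, the packages mentioned at the end of this article would be the usual choice.

```python
def kaplan_meier(times, events):
    """Return rows of (time, hazard, survival) at each distinct event time.

    times  -- follow-up time for each patient
    events -- True if the event was observed, False if the case is censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    table = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_leaving += 1
            i += 1
        if n_events > 0:
            hazard = n_events / at_risk    # share of at-risk patients with the event at t
            survival *= 1.0 - hazard       # probability of remaining event-free beyond t
            table.append((t, hazard, survival))
        # Patients with an event and censored patients both leave the risk set.
        at_risk -= n_leaving
    return table

# Hypothetical follow-up times (in months) and event indicators
times = [2, 3, 3, 5, 8, 8, 10]
events = [True, True, False, True, False, True, False]
for t, hazard, survival in kaplan_meier(times, events):
    print(f"t={t}: hazard={hazard:.3f}, survival={survival:.3f}")
```

Censored patients never lower the survival curve directly, but they shrink the number of patients at risk, so every later event weighs more heavily.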
Challenges of these analyses
In conclusion, survival analyses allow us to study the time that passes until a certain event occurs. Although basic analyses are very intuitive to interpret, they can become very complex. This makes them a real challenge in contexts such as:
As open-source software recommendations, the R packages survival and survminer, as well as Python’s scikit-survival, are versatile tools for carrying out both basic and more advanced survival analyses.
References
Baethge, C., & Schlattmann, P. (2004). A survival analysis for recurrent events in psychiatric research. Bipolar Disorders, 6(2), 115-121.
Fox, J., & Weisberg, S. (2002). Cox proportional-hazards regression for survival data. An R and S-PLUS companion to applied regression, 2002.
Gómez, G., & Serrat, C. (2014). Correcting the bias due to dependent censoring of the survival estimator by conditioning. Statistics, 48(2), 295-314.
Gong, X., Hu, M., & Zhao, L. (2018). Big data toolsets to pharmacometrics: application of machine learning for time‐to‐event analysis. Clinical and translational science, 11(3), 305-311.
Kassambara, A., Kosinski, M., & Biecek, P. (2021). survminer: Drawing Survival Curves using ‘ggplot2’. R package version 0.4.9, https://CRAN.R-project.org/package=survminer.
Kleinbaum, D. G., & Klein, M. (2012). Extension of the Cox proportional hazards model for time-dependent variables. In Survival analysis (pp. 241-288). Springer, New York, NY.
Pölsterl, S. (2020). scikit-survival: A Library for Time-to-Event Analysis Built on Top of scikit-learn. Journal of Machine Learning Research, 21(212), 1–6.
Prinja, S., Gupta, N., & Verma, R. (2010). Censoring in clinical trials: review of survival analysis techniques. Indian journal of community medicine: official publication of Indian Association of Preventive & Social Medicine, 35(2), 217.
Ruth, D. M., Wood, N. L., & VanDerwerken, D. N. (2022). Fully nonparametric survival analysis in the presence of time-dependent covariates and dependent censoring. Journal of Applied Statistics, 1-15.
Therneau, T. (2022). A Package for Survival Analysis in R. R package version 3.3-1, https://CRAN.R-project.org/package=survival.
Turkson, A. J., Ayiah-Mensah, F., & Nimoh, V. (2021). Handling Censoring and Censored Data in Survival Analysis: A Standalone Systematic Literature Review. International Journal of Mathematics and Mathematical Sciences, 2021.
Xu, S., Shetterly, S., Powers, D., Raebel, M. A., Tsai, T. T., Ho, P. M., & Magid, D. (2012). Extension of Kaplan-Meier methods in observational studies with time-varying treatment. Value in Health, 15(1), 167-174.