How do you handle missing values in a random forest?

Is there a detailed comparison of the different forest missing data algorithms?
- A detailed comparison of the forest missing data algorithms implemented in "impute" is given in a paper with Fei Tang, "Random forest missing data algorithms" (2017). For more details about imputation and on-the-fly imputation (OTFI), I recommend consulting the help files for "rfsrc" and "impute" in randomForestSRC.
What are random forest classifiers (RFCs)?
- One particular family of models we use is random forest classifiers (RFCs). An RFC is a collection of trees, each grown independently using labeled and complete input training data. By complete we explicitly mean that there are no missing values, i.e. no NULL or NaN values.
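As a rough illustration of what "complete" means in practice, the sketch below checks a small table for NULL (`None`) or NaN entries and fills them before any forest would be trained. The helper names (`is_missing`, `is_complete`, `impute_column_medians`) and the toy data are hypothetical, and the column-median fill is only a simple stand-in, not a forest-based method.

```python
import math
from statistics import median


def is_missing(value):
    """Return True for None (NULL) or a float NaN."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def is_complete(rows):
    """Return True when no cell in the table is missing."""
    return not any(is_missing(v) for row in rows for v in row)


def impute_column_medians(rows):
    """Return a copy of rows with each missing cell replaced by the
    median of the observed values in the same column (a simple
    placeholder strategy, assumed here for illustration)."""
    n_cols = len(rows[0])
    medians = []
    for j in range(n_cols):
        observed = [row[j] for row in rows if not is_missing(row[j])]
        medians.append(median(observed))
    return [
        [medians[j] if is_missing(v) else v for j, v in enumerate(row)]
        for row in rows
    ]


# Toy feature table with one NULL and one NaN (made-up values).
rows = [
    [1.0, 10.0],
    [2.0, None],
    [float("nan"), 30.0],
    [4.0, 40.0],
]

completed = impute_column_medians(rows)
print(is_complete(rows), is_complete(completed))
```

A check like `is_complete` is cheap to run before training, so a table with gaps can be rejected or imputed explicitly rather than failing inside the tree-growing code.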
How to handle missing data in biomedical research?
- Missing data are common in statistical analyses, and imputation methods based on random forests (RF) are becoming popular for handling them, especially in biomedical research. Unlike standard imputation approaches, RF-based imputation methods do not assume normality or require the specification of a parametric model.
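One way to try forest-based imputation in Python is sketched below. It assumes scikit-learn and NumPy are installed and uses scikit-learn's `IterativeImputer` with a `RandomForestRegressor` as the per-column model. This is a missForest-style stand-in chosen for illustration, not the algorithm implemented in randomForestSRC, and the synthetic data and parameter values are made up.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401 (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)

# Synthetic data: the third column depends non-linearly on the first,
# so it is neither normal nor linear in the other features.
X = rng.normal(size=(200, 3))
X[:, 2] = X[:, 0] ** 2 + 0.1 * rng.normal(size=200)

# Knock out roughly 10% of the cells at random.
mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[mask] = np.nan

# Each column with gaps is regressed on the others with a random forest,
# cycling over the columns for a few rounds.
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=50, random_state=0),
    max_iter=5,
    random_state=0,
)
X_imputed = imputer.fit_transform(X_missing)

print(np.isnan(X_imputed).sum())
```

No distribution is specified anywhere in the sketch: the forest learns the relationship between columns from the observed cells, which is the property the answer above refers to.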

