Khalkhali, H.R and Esnaashari, O and Lotfnezhad Afshar, H and Jabbari, N (2016) Applying Data Mining Techniques to Extract Hidden Patterns about Breast Cancer Survival in an Iranian Cohort Study. JRHS, 16 (1). pp. 31-35.
Abstract
Background: Breast cancer survival has been analyzed with many standard data mining
algorithms, a group of which belong to the decision tree category. The ability of decision tree
algorithms to visualize and formulate hidden patterns among study variables was the main
reason for applying, in the current study, an algorithm from the decision tree category that
had not previously been used for this purpose.
Methods: The classification and regression trees (CART) algorithm was applied to a breast
cancer database containing information on 569 patients from 2007-2010. The Gini impurity
measure, which is used for categorical target variables, was utilized. The classification error,
which is a function of tree size, was estimated by 10-fold cross-validation. The performance of
the resulting model was evaluated in terms of accuracy, sensitivity, and specificity.
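As an illustration of the Gini impurity measure mentioned in the Methods, the following minimal sketch computes the impurity of a node and the weighted impurity of a candidate split. It is not the study's code: the function names and the toy labels ("alive", "dead") are assumptions chosen only for illustration and do not come from the study database.

```python
from collections import Counter


def gini_impurity(labels):
    """Gini impurity of a node: 1 - sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    counts = Counter(labels)
    return 1.0 - sum((c / n) ** 2 for c in counts.values())


def split_impurity(left, right):
    """Size-weighted Gini impurity of the two child nodes of a binary split."""
    n = len(left) + len(right)
    return (len(left) / n) * gini_impurity(left) + (len(right) / n) * gini_impurity(right)


# Hypothetical toy node with 10 patients (not data from the study).
parent = ["alive"] * 6 + ["dead"] * 4
left = ["alive"] * 5 + ["dead"] * 1   # e.g. patients on one side of a candidate split
right = ["alive"] * 1 + ["dead"] * 3  # patients on the other side

# Impurity reduction achieved by the split; CART prefers splits that maximize it.
gain = gini_impurity(parent) - split_impurity(left, right)
print(round(gini_impurity(parent), 4), round(gain, 4))
```

A pure node (all labels identical) has impurity 0, and a two-class node split evenly has the maximum impurity of 0.5.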
Results: The CART model produced a decision tree with 17 nodes, 9 of which were associated
with a set of rules. The rules were clinically meaningful. Expressed in if-then format, they
showed that Stage was the most important variable for predicting breast cancer survival. The
accuracy, sensitivity, and specificity were 80.3%, 93.5%, and 53%, respectively.
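For readers unfamiliar with the three reported criteria, the sketch below shows how accuracy, sensitivity, and specificity are conventionally derived from the four cells of a binary confusion matrix. The counts are hypothetical and are not the study's confusion matrix, which the abstract does not report; which outcome was treated as the positive class is likewise not stated here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Standard binary-classification metrics from confusion-matrix counts.

    tp: true positives, fn: false negatives,
    tn: true negatives, fp: false positives.
    """
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # share of all cases classified correctly
        "sensitivity": tp / (tp + fn),   # share of actual positives detected
        "specificity": tn / (tn + fp),   # share of actual negatives detected
    }


# Hypothetical counts for illustration only (not taken from the study).
metrics = classification_metrics(tp=90, fn=10, tn=30, fp=20)
print(metrics)
```

A large gap between sensitivity and specificity, as in the reported results, indicates that the model identifies one class much more reliably than the other.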
Conclusions: The model in the current study, the first one created with CART, was able to
extract useful hidden rules from a relatively small dataset.
Item Type: | Article |
---|---|
Uncontrolled Keywords: | Breast Neoplasms Survival Data Mining CART Decision Tree |
Subjects: | R Medicine > R Medicine (General) |
Depositing User: | Unnamed user with email gholipour.s@umsu.ac.ir |
Date Deposited: | 09 Oct 2018 07:51 |
Last Modified: | 10 Feb 2019 06:51 |
URI: | https://eprints.umsu.ac.ir/id/eprint/5189 |