BEGIN ARTICLE PREVIEW:
In machine learning when we build a model for classification tasks we do not build only a single model. We never rely on a single model since we have many different algorithms in machine learning that work differently on different datasets. We always have to build a model that best suits the respective data set so we try building different models and at last we choose the best performing model. For doing this comparison we cannot always rely on a metric like an accuracy score, the reason being for any imbalance data set the model will always predict the majority class. But it becomes important to check whether the positive class is predicted as the positive and negative class as negative by the model.
For this, we make use of Receiver Characteristics Curve – Area Under Curve that is plotted between True positive and False positive rates. In this article, we will learn more about the ROC-AUC curve and how we make use of it to compare different machine learning models to select the best performing model. For this experiment, we will make use of Pima-Indian Diabetes that can be downloaded from Kaggle.
What we will learn from this article?
END ARTICLE PREVIEW