Data Pre-processing Issues in Medical Data Classification

Ashwini Tuppad
Shantala Devi Patil

Abstract

With the digitalization of data and the rise of the World Wide Web, access to information has become easy and affordable. The Web and the Internet in particular have boosted research by facilitating access to large, publicly available medical datasets under open-access schemes. These developments have led to an explosion of data that varies in volume, variety and velocity, and is therefore referred to as big data. The availability of such medical big data has catalyzed research in medical predictive analytics. However, the true value of such data can be realized only when it is carefully processed and analyzed before inferences are drawn from it. Publicly available medical datasets contain noise in the form of missing values, outliers and data inconsistencies, which may negatively affect results. Pre-processing is therefore essential to eliminate noisy elements and make the data suitable for further analysis. This paper highlights the need for data pre-processing and explains the data pre-processing pipeline and its constituent stages. It also presents a comparative analysis of various data pre-processing techniques for handling missing values and outliers in a dataset.
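
As a rough illustration of the two problems the abstract names, the sketch below fills missing values and then flags outliers. The abstract does not say which techniques the paper compares, so median imputation and the interquartile-range (IQR) rule are assumptions chosen here for simplicity. The function names and the glucose readings are hypothetical.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [i for i, v in enumerate(values) if v < low or v > high]


# Hypothetical fasting-glucose readings; None marks a missing value.
glucose = [92, 105, None, 98, 110, 101, 480, None, 95]

filled = impute_median(glucose)   # missing entries become the median
flagged = iqr_outliers(filled)    # positions of suspected outliers
print(filled)
print(flagged)
```

The median is used rather than the mean because a single extreme reading would otherwise distort the imputed value. Whether a flagged value should be removed, capped, or kept is a domain decision that this sketch leaves open.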

Article Details

How to Cite
Ashwini Tuppad, & Shantala Devi Patil. (2023). Data Pre-processing Issues in Medical Data Classification. Journal of Advanced Zoology, 44(S6), 1079–1084. https://doi.org/10.17762/jaz.v44iS6.2361
Section
Articles