Data Preprocessing for Machine Learning in Python
Data preprocessing is a technique used to convert raw data into a clean data set. In simple words, preprocessing refers to the transformations applied to your data before feeding it to the algorithm.

Need of Data Preprocessing

Quality data: low-quality data gives low-quality mining results. Quality decisions must be based on quality data; for example, duplicate or missing data may cause incorrect or even misleading statistics.

Tasks of Data Preprocessing

Data preprocessing involves several steps, described below.

Data Cleaning

This is the first step in data preprocessing. The primary focus is on handling missing data and noisy data, detecting and removing outliers, and minimizing duplication and computed biases within the data.

Data Integration

This process is u