Comparative Analysis of Data Cleaning Tools Using SQL Server and Winpure Tool  
Authors : Dr. Abdelrahman Elsharif Karrar; Moez Mutasim Ali

 

Similarity-based data cleaning involves identifying “close” tuples, where closeness is evaluated using a variety of similarity functions chosen to suit the domain and application. Current approaches for efficiently implementing such similarity joins are tightly tied to the chosen similarity function. In this paper, we compare two data cleaning tools: Microsoft SQL Server 2012 Data Quality Services and WinPure Clean & Match. Data Quality Services is a knowledge-based system that performs both computer-assisted and interactive cleansing and matching processes using a created knowledge base. WinPure Clean & Match 2009 is the latest edition, following on from the award-winning Clean & Match 2007. It builds upon its data deduplication module and now features advanced fuzzy matching logic to identify and remove more duplicates. The two tools are compared on academic datasets, with the Weather dataset as input.
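The idea of identifying "close" tuples can be illustrated with a minimal sketch. It is not the matching logic of either Data Quality Services or WinPure, whose internals are not described here. The function names jaccard and close_pairs, the 0.5 threshold, and the sample records are all assumptions introduced for illustration, and word-set Jaccard similarity stands in for whichever similarity function suits a given domain.

```python
from itertools import combinations


def jaccard(a: str, b: str) -> float:
    """Jaccard similarity between the lowercase word sets of two strings."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0  # two empty records are treated as identical
    return len(ta & tb) / len(ta | tb)


def close_pairs(records, threshold=0.5, similarity=jaccard):
    """Return (i, j, score) for each pair of records scoring at or above threshold."""
    pairs = []
    # Naive all-pairs comparison: quadratic in the number of records.
    for (i, r), (j, s) in combinations(enumerate(records), 2):
        score = similarity(r, s)
        if score >= threshold:
            pairs.append((i, j, score))
    return pairs


# Hypothetical sample records, not taken from the paper's datasets.
records = [
    "Khartoum Weather Station",
    "khartoum weather station north",
    "Medina Airport",
]
print(close_pairs(records))  # [(0, 1, 0.75)]
```

Passing the similarity function as a parameter keeps the pairing loop independent of the chosen measure, so a different function can be swapped in without changing the rest of the sketch.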

 

Published In : IJCAT Journal Volume 3, Issue 7

Date of Publication : July 2016

Pages : 371-377

Figures : 07

Tables : 01

Dr. Abdelrahman Elsharif Karrar : Taibah University, Saudi Arabia.

Moez Mutasim Ali : University of Science and Technology, Sudan.

Keywords : Data Cleaning, Data Cleansing, Data Quality Services, WinPure, Datasets

Data Quality Services provides several steps to begin the cleaning process, whereas WinPure Clean & Match 2013 provides an easy way to begin. Data Quality Services offers the option either to clean the source data through an automated process or to go over the cleansing results manually and fix the issues that are found, whereas WinPure Clean & Match 2013 does not provide the manual option.
