After the cleaning process, the dataset is consistent with other similar datasets in the system, because all inconsistencies have been removed. The process differs from data validation and also involves removing typographical errors. Well-known techniques such as data transformation, statistical methods, parsing (to detect syntax errors) and duplicate elimination are used for data cleansing. Good, clean data needs to fulfill the criteria listed below:
• Accuracy: A combined measure of integrity, density and consistency.
• Completeness: Missing or anomalous data should be corrected.
• Density: The proportion of omitted values relative to the total number of values in the data should be known.
• Consistency: Concerned with contradictions and syntactical differences.
• Uniformity: Concerned with irregularities, such as the same kind of value recorded in different ways.
• Integrity: A combined measure of completeness and soundness.
• Uniqueness: Related to the number of duplicates in the data.
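Two of these criteria, density and uniqueness, are easy to measure directly. The following is a minimal sketch in Python, assuming records are held as dictionaries and that a missing value is stored as None; the sample records and the function names density and duplicate_count are hypothetical and chosen only for illustration.

```python
# Hypothetical sample records; None marks an omitted value.
records = [
    {"name": "Ann Lee", "email": "ann@example.com", "city": "Leeds"},
    {"name": "Bob Ray", "email": None, "city": "York"},
    {"name": "Ann Lee", "email": "ann@example.com", "city": "Leeds"},
    {"name": "Cy Dunn", "email": "cy@example.com", "city": None},
]


def density(rows):
    """Return the proportion of omitted (None) values among all values."""
    total = sum(len(row) for row in rows)
    missing = sum(1 for row in rows for value in row.values() if value is None)
    return missing / total if total else 0.0


def duplicate_count(rows):
    """Return how many rows exactly repeat an earlier row."""
    seen = set()
    duplicates = 0
    for row in rows:
        # Sort the items so that field order does not affect the comparison.
        key = tuple(sorted(row.items()))
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)
    return duplicates


print(f"omitted values: {density(records):.1%}")
print(f"exact duplicates: {duplicate_count(records)}")
```

Here 2 of the 12 values are omitted and one record is an exact repeat. Real data usually also contains near-duplicates that differ in spelling or formatting, which this exact comparison would not catch.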
The cleansing services offered by most data cleaning companies are:
• Removal of duplicate records.
• Tagging and identifying identical records or facts.
• Removing forged, bogus or untrue records.
• Data validation.
• Deleting outdated records.
• Comparing records against third-party lists, such as opt-in and opt-out lists, and removing the matching entries.
• Data cleansing, aggregation and organization.
• Identifying incomplete or misplaced facts or figures.
• Improving records, including product characteristics, assembly order and metaphors.
• Eliminating duplicate data or figures, which may look like similar records.
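Several of these services can be combined in one pass over the data. The sketch below is only an illustration under stated assumptions: each record is a dictionary with hypothetical name, email and updated fields, near-duplicates are matched by ignoring case and extra whitespace, and the opt-out list and cutoff date are made-up inputs. It is not how any particular cleansing company works.

```python
from datetime import date


def normalize(record):
    """Build a comparison key that ignores case and extra whitespace."""
    name = " ".join(record["name"].lower().split())
    email = record["email"].strip().lower()
    return (name, email)


def cleanse(records, opt_out_emails, cutoff):
    """Keep the first copy of each record that is current and not opted out."""
    opt_out = {email.strip().lower() for email in opt_out_emails}
    seen = set()
    kept = []
    for record in records:
        key = normalize(record)
        if key in seen:
            continue  # duplicate that merely looks like a similar record
        if key[1] in opt_out:
            continue  # address appears on the opt-out list
        if record["updated"] < cutoff:
            continue  # outdated record
        seen.add(key)
        kept.append(record)
    return kept


# Hypothetical customer records.
customers = [
    {"name": "Ann Lee", "email": "ann@example.com", "updated": date(2023, 5, 1)},
    {"name": "ann  lee ", "email": "ANN@example.com ", "updated": date(2023, 6, 1)},
    {"name": "Bob Ray", "email": "bob@example.com", "updated": date(2015, 1, 1)},
    {"name": "Cy Dunn", "email": "cy@example.com", "updated": date(2023, 2, 1)},
    {"name": "Di Fox", "email": "di@example.com", "updated": date(2022, 9, 9)},
]

cleaned = cleanse(customers, ["cy@example.com"], cutoff=date(2020, 1, 1))
print([record["name"] for record in cleaned])
```

With this sample data, the second Ann Lee entry is dropped as a duplicate, Bob Ray as outdated and Cy Dunn as opted out, leaving Ann Lee and Di Fox.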
The common challenges faced by data cleansing applications are:
• Information is often lost in the corrected data. Invalid and duplicate entries are rightly deleted, but entries whose information is limited or insufficient are often deleted too, and this leads to a loss of information.
• Data cleansing is highly expensive and time-consuming. It is therefore important to maintain the cleaned data effectively.
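The loss of information described above can be made visible by counting what is thrown away when incomplete entries are deleted. This is a rough sketch, again assuming dictionary records with None for missing values; the sample contacts and the function names drop_incomplete and values_lost are hypothetical.

```python
def drop_incomplete(rows):
    """Delete every row that has at least one missing (None) value."""
    return [row for row in rows if all(v is not None for v in row.values())]


def values_lost(rows):
    """Count the valid values discarded along with the incomplete rows."""
    return sum(
        sum(1 for v in row.values() if v is not None)
        for row in rows
        if any(v is None for v in row.values())
    )


# Hypothetical contact records.
contacts = [
    {"name": "Ann Lee", "email": "ann@example.com", "phone": "555-0100"},
    {"name": "Bob Ray", "email": None, "phone": "555-0101"},
    {"name": "Cy Dunn", "email": "cy@example.com", "phone": None},
]

print(len(drop_incomplete(contacts)), "complete row(s) kept")
print(values_lost(contacts), "valid value(s) discarded")
```

In this sample, deleting the two incomplete rows also discards four perfectly valid values, which is why it can be worth flagging incomplete entries for review rather than deleting them outright.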
Fortunately, the benefits far outweigh the challenges. As a result, most companies have adopted data cleansing, and its importance continues to grow.
Article Source: http://EzineArticles.com/4313959

