Do you have Dirty Data?

First…what is considered dirty data?

  • Unstructured / free text
  • Incomplete
  • No use of nomenclature (naming conventions)
  • Garbage / fictitious / ambiguous data
  • Many duplicates
  • Many data elements representing “old / non-moving” materials


How many times have you made a decision based on the wrong information? Having the right information is key to being successful, and that’s why accurate data is critical to everyday operations. For example, data mining on inaccurate, “dirty” data can be a waste of time for both the data engineer and the data requester. If requesters base their decisions on faulty data, their outcomes can suffer. Data integrity plays a bigger role than most people anticipate. Most data gets dirty through laziness, poorly planned input processes, and inconsistent data entry.


Data Management Services focus on increasing the effectiveness of procurement, inventory, and maintenance management functions across the enterprise by standardizing and enhancing MRO Material, Vendor, and Service Master data within the ERP, and by optimizing inventory management information.

Dirty data is a term used by information technology (IT) professionals for inaccurate information collected from data capture forms. It has several causes. In some cases, the information is deliberately distorted: a person may enter misleading or fictional personal information that appears real. Such dirty data may not be caught by an administrator or a validation routine because it looks legitimate. Duplicate data can be caused by repeat submissions, user error, or incorrect data joining. There can also be formatting issues or typographical errors. A common formatting issue comes from the different ways users prefer to enter phone numbers.

Gartner research shows that poor-quality customer data leads to significant costs, such as higher customer turnover, excessive expenses from customer contact processes like mail-outs, and missed sales opportunities. But companies are now discovering that data quality has a significant impact on their most strategic business initiatives, not only sales and marketing. Other back-office functions like budgeting, manufacturing, and distribution are also affected. Compliance and transparency are now at the top of the list of most companies’ data concerns, according to Gartner. 1
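
To make the phone-number formatting and duplicate problems described above concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the record fields (name, phone), the sample vendors, and the helper names normalize_phone and find_duplicates are all hypothetical, and the rule of dropping a leading "1" from 11-digit numbers assumes North American numbers.

```python
import re
from collections import defaultdict


def normalize_phone(raw: str) -> str:
    """Reduce a phone number to its digits so different formats compare equal.

    Assumption: an 11-digit number starting with "1" carries a North
    American country code, which is dropped.
    """
    digits = re.sub(r"\D", "", raw)  # strip everything except digits
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]
    return digits


def find_duplicates(records):
    """Group records that share a cleaned name and a normalized phone number."""
    groups = defaultdict(list)
    for rec in records:
        key = (rec["name"].strip().lower(), normalize_phone(rec["phone"]))
        groups[key].append(rec)
    # Only groups with more than one record are likely duplicates.
    return [group for group in groups.values() if len(group) > 1]


# Hypothetical vendor records, entered with inconsistent formatting.
records = [
    {"name": "Acme Supply", "phone": "(555) 010-2345"},
    {"name": "acme supply ", "phone": "555.010.2345"},
    {"name": "Bolt Works", "phone": "+1 555 010 9999"},
]

duplicates = find_duplicates(records)
for group in duplicates:
    print("Possible duplicates:", [rec["name"] for rec in group])
```

In this sketch the two Acme records land in the same group even though their names and phone numbers were typed differently, which is the kind of duplication a plain exact-match check would miss.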


Next week, check back for Dirty Data Part II – Why you should fix it, who benefits when you do, and how Ariba can help!