Machine learning tools for data – the bridge to a new era of AI

Added Friday 19 July 2019

Machine learning tools can support enterprises in building a solid information architecture as a platform for innovative AI initiatives. Paul Ranson, Commercial Director at the DOT Group, tells us more.

Data degrades. Even the most conservative estimates from leading analysts suggest that 20% of uncurated business data will be unfit for purpose within twelve months – incomplete, duplicated, obsolete or just plain wrong.

In the mix of data found in most enterprises, around 80% will typically be unstructured and scattered across the organisation. Of this 80%, up to 60% will be ROT (redundant, obsolete or trivial) – meaning that nearly half of all enterprise data may add no value at all.

How do these stats translate into business impact?

A report by IDG states that companies that use their data effectively grow 35% faster year on year. For that to happen, however, your data needs a high level of accuracy, consistency and completeness.

Any line of business working with poor quality data has an uphill struggle on its hands. To give a few real-world examples…

Marketing teams will waste budget and potentially irritate customers, because their mailshots and telemarketing campaigns are off target.

Data scientists trying to extract data from silos to identify patterns and recommend new go-to-market strategies will spend an inordinate amount of their expensive time finding and cleansing data.

When data is held in a myriad of systems, compliance and risk departments cannot readily demonstrate good governance around data protection to stringent regulatory standards. It’s difficult to respond promptly to subject access requests or prove compliance with customers’ right-to-be-forgotten requests.

In addition to these business-as-usual activities, forward-looking enterprises planning artificial intelligence initiatives will find their high hopes founder on the rocks of poor data. There is no AI without both a governed IA (information architecture) and sound data. The old adage ‘garbage in, garbage out’, coined back in the 1950s, has never been truer than in the world of AI.

These internal and external imperatives make the quality of data, and of the data infrastructure over which it flows, a key business issue.

All the while, contrary to what one might expect, IT departments remain the ultimate custodians of corporate data in most enterprises. As budgets tighten, CIOs and IT managers worry about the escalating costs of storing and managing exploding data volumes and the associated security risks.

Ensuring data is fit for purpose

Fortunately, there is a remedy, and it doesn’t have to be costly. Vendors have stepped up to the mark, developing machine learning tools that fix the problems caused by poor data quality and take the burden of data housekeeping off the shoulders of data owners and curators.

Machine learning is an application of AI that gives systems the ability to learn and improve from experience automatically, without being explicitly programmed. It is already proving remarkably powerful across a wide variety of analytics objectives, such as predicting customer churn or detecting fraud in online credit card transactions.
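
By way of illustration only, the Python sketch below shows the idea in miniature: a classifier is fitted to labelled historical examples (here a synthetic customer table, with made-up features and thresholds) rather than to hand-written rules, and is then scored on customers it has never seen. The library choice and column meanings are assumptions for the example, not a reference to any particular product.

    # A toy churn model: the system "learns from experience" by fitting to
    # labelled examples instead of following explicitly programmed rules.
    # All features, thresholds and labels below are synthetic and illustrative.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n = 1000
    X = np.column_stack([
        rng.integers(1, 60, n),    # months as a customer
        rng.poisson(3, n),         # support tickets raised
        rng.uniform(10, 120, n),   # average monthly spend
    ])
    # Synthetic label: short-tenure customers with many tickets churn more often.
    y = ((X[:, 0] < 12) & (X[:, 1] > 4)).astype(int)

    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
    model = RandomForestClassifier(random_state=0).fit(X_train, y_train)
    print("held-out accuracy:", model.score(X_test, y_test))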

IBM machine learning tools can automate the process of sifting data, profiling it, standardising formats, completing fields… providing clarity on the data, building up a history and learning as they go.
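
The IBM tools do this work through their own interfaces; purely to illustrate the kind of housekeeping being automated, here is a small hand-rolled pandas sketch that profiles a hypothetical customer table, standardises its formats and completes a missing field. The column names, mappings and fill rule are invented for the example.

    # Illustrative only: profile, standardise and complete a tiny customer table.
    import pandas as pd

    df = pd.DataFrame({
        "name":    [" Ann Lee ", "ann lee", "Bob Kay", None],
        "country": ["UK", "United Kingdom", "uk", "GB"],
        "spend":   [120.0, 120.0, None, 80.0],
    })

    # Profile each column: how complete is it, and how many distinct values?
    profile = pd.DataFrame({"missing": df.isna().sum(), "distinct": df.nunique()})
    print(profile)

    # Standardise formats: trim and title-case names, map countries to one code.
    df["name"] = df["name"].str.strip().str.title()
    df["country"] = df["country"].str.lower().map(
        {"uk": "GB", "united kingdom": "GB", "gb": "GB"}
    )

    # Complete a missing field with a simple column-level estimate.
    df["spend"] = df["spend"].fillna(df["spend"].median())
    print(df)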

While identifying data similarities or unifying data may not be the most exciting application of machine learning, it is one of the most beneficial and financially valuable, enabling enterprises to transform data from a liability to be mitigated into an opportunity to be seized. In doing so, machine learning is powering a new age of AI that can monetise that data.
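
As a rough illustration of what identifying data similarities looks like in practice, the sketch below flags likely duplicate records by fuzzy-matching names with Python’s standard-library difflib. The sample records and the 0.8 similarity threshold are assumptions for the example; production matching engines use far richer signals than a single name field.

    # Flag likely duplicates by comparing every pair of records on name similarity.
    from difflib import SequenceMatcher
    from itertools import combinations

    records = [
        {"id": 1, "name": "Acme Engineering Ltd"},
        {"id": 2, "name": "ACME Engineering Limited"},
        {"id": 3, "name": "Bright Ideas Consulting"},
    ]

    def similarity(a: str, b: str) -> float:
        """Similarity ratio in [0, 1]; 1.0 means identical after lower-casing."""
        return SequenceMatcher(None, a.lower(), b.lower()).ratio()

    for left, right in combinations(records, 2):
        score = similarity(left["name"], right["name"])
        if score > 0.8:  # assumed threshold for a "possible duplicate"
            print(f"possible duplicate: {left['id']} and {right['id']} ({score:.2f})")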

Blue-sky thinking?

Not at all. Whether you intend to store your data on premises or in the cloud, the DOT Group already has IBM-powered machine learning tools for implementing this approach to managing data and improving its quality:

  • IBM Cloud Pak for Data (formerly IBM Cloud Private for Data) simplifies and automates the way your organisation turns data into insights, bringing together data held on premises, in public clouds and in private clouds into a single, unified solution that takes only a day or two to set up.
  • IBM Watson Knowledge Catalog is a data catalogue that helps business users quickly and easily find, understand and use data. It creates a single source of truth for data engineers, data stewards, data scientists and business analysts.
  • The IBM Information Server Portfolio is a family of products that enables you to understand, cleanse, monitor, transform and deliver data.

Here at the DOT Group, we firmly believe that to maximise insight from your data and build a platform for successful AI initiatives, you must be able to know your data, trust your data and use it.

To book a free data quality assessment workshop, visit www.dotgroup.co.uk. For more information, download ‘Get the Facts - leveraging IBM capabilities to deliver quality data’.  

More Information

If you’d like to hear more about this, please complete the form below:
