Implementing Artificial Intelligence (AI) and Internet of Things (IoT) technologies in your operation is critical to running an efficient enterprise, but without a proper data governance strategy to identify and collect all relevant data, you are not using your new technology to its full potential. Here's how to determine and qualify data sourced from IoT and then used by AI.
The fundamental and recurring theme when evolving data governance programs within an organization, as AI becomes more pervasive, is data quality: deep learning, image processing, natural language processing, and other AI techniques can only process data effectively and deliver optimal results when that data is of high quality.
One unique perspective on data governance is the interdependence between AI, Big Data, and the Internet of Things. Our ultimate objective is to drive value through continuous improvement, and data governance across these three pillars is highly coupled, demanding a collective strategy to define and implement adequately.
This isn’t to say that IoT is the only data source we need to consider: archives, media, corporate documents, social media feeds, and data logs are all sources that AI models can process to extract insights leading to business value and continuous improvement.
But IoT provides a compelling and growing footprint as a data source from which AI can extract insights.
There will be more than 25 billion interconnected IoT devices across the globe by 2020, generating more than 500 zettabytes of data per year (1 zettabyte = 1,000,000,000,000,000,000,000 bytes). -Gartner
The landscape of IoT devices is heterogeneous and device standards are very diverse. This results in data-source ecosystems that demand carefully built data governance mechanisms to evaluate the quality of information being collected, processed, and transmitted.
So we’ve established that IoT is an important data source, and, even more importantly, one whose complexity demands priority over other information technologies in governance planning. Building a ground-up data governance strategy with the intent of continuous improvement, empowered through AI and driven by IoT data, is a unique undertaking.
Here is a specific example of how one can plan and implement a data governance process.
One of the ways of determining what data is important to collect is through a technique called ideation. This can be done from scratch or you can revisit a previous ideation deliverable and evolve it. The output of this is simple, the company has established ideas (technology agnostic) that define how their goals can be achieved. The ideas are mapped against impact versus feasibility and subsequently prioritized.
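Mapping ideas against impact versus feasibility can be made concrete with a simple scoring pass. The sketch below is purely illustrative; the idea names, the 1-to-5 scales, and the product-of-scores ranking are assumptions, not a prescribed method.

```python
# A minimal sketch (names and scores are hypothetical) of prioritizing
# ideation outcomes by impact versus feasibility.
from dataclasses import dataclass

@dataclass
class Idea:
    name: str
    impact: int       # 1 (low) to 5 (high) business impact
    feasibility: int  # 1 (hard) to 5 (easy) to implement

def prioritize(ideas):
    """Rank ideas so high-impact, high-feasibility items come first."""
    return sorted(ideas, key=lambda i: i.impact * i.feasibility, reverse=True)

ideas = [
    Idea("Establish baseline sensor metrics", impact=5, feasibility=4),
    Idea("AI-driven defect detection from images", impact=5, feasibility=2),
    Idea("Dashboard for plant temperature logs", impact=3, feasibility=5),
]

for idea in prioritize(ideas):
    print(f"{idea.name}: score {idea.impact * idea.feasibility}")
```

A real ideation workshop would weigh these dimensions qualitatively, but even a rough numeric pass makes the prioritization conversation explicit.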
For example, let's say one of our ideation outcomes looked to:
(a) address the gap in adequate baseline metrics, as a means to collect information from which an organization can perform continuous improvements.
(b) gain insights from the baseline metrics to understand critical problem areas.
The combination of these two ideas is simple, and together they are the foundation of continuous improvement. They are the necessary starting point for evolving an IoT, Big Data, and AI strategy and its corresponding data governance goals.
We can assume a gap in our baseline metrics existed prior to this exercise; once we identify these areas, we mark them as “critical goals to achieve” and then iterate further toward specific problems to solve.
Defining the details of our ideas, through a variety of means such as Agile user stories, Design Thinking WOW statements, low-fidelity prototypes, or more complex designs and architectures (including physical views of the proposed solutions), ultimately leads us to a sufficiently mature solution view. From the perspective of data governance, we can then begin to address quality, discoverability, security, and the other criteria that assure consistency and confidence in the collected data. Moreover, both our data governance model and our ideas can continue to evolve through an ongoing iterative approach.
I would like to focus on data quality, specifically IoT data that will be processed by AI-centric tools. Revisiting our ideas: the first, addressing the gap in adequate baseline metrics as a means to collect information for continuous improvement, can evolve into a complex architecture of IoT platforms that collect data in the form of temperature, vibration, images, pressure, and so on.
The second, directly coupled idea, gaining insights from the baseline metrics to identify critical problem areas, evolves into a specific set of AI platforms for image recognition and deep learning.
Let us drill down a little further, keeping in view the question of which information is critical to achieving our goals.
1. We need to gain insights from baseline metrics to understand where our critical problem areas are.
2. We need to collect various sensor data (images, temperature, etc.) so that these insights can be produced by AI platforms. This means our sensor data must be quality data.
So what does it mean for our sensor data to be of “quality”? The requirements of our AI platforms should drive the definition of quality data as part of our overall governance program development. We should also consider components of data quality that are agnostic to the AI platform, such as timeliness: from the perspective of the AI platform itself, timeliness may matter little, but it is very important to achieving our critical business goals.
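Quality criteria like these can be expressed as concrete gates on incoming readings. The sketch below is a hypothetical example: the field names, the 60-second timeliness window, and the temperature range are all assumptions made for illustration, not requirements from any specific platform.

```python
# A hypothetical sketch of quality gates for IoT sensor readings:
# completeness, plausible range, and timeliness. Field names and
# thresholds are illustrative assumptions.
import time

MAX_AGE_SECONDS = 60          # timeliness: reject stale readings
TEMP_RANGE = (-40.0, 125.0)   # plausible range for this hypothetical sensor

def check_reading(reading, now=None):
    """Return a list of quality violations (an empty list means the reading passes)."""
    now = time.time() if now is None else now
    violations = []
    if reading.get("temperature_c") is None:
        violations.append("incomplete: missing temperature_c")
    elif not TEMP_RANGE[0] <= reading["temperature_c"] <= TEMP_RANGE[1]:
        violations.append("implausible: temperature out of range")
    if now - reading.get("timestamp", 0) > MAX_AGE_SECONDS:
        violations.append("stale: reading exceeds timeliness window")
    return violations

fresh = {"temperature_c": 21.5, "timestamp": time.time()}
stale = {"temperature_c": 21.5, "timestamp": time.time() - 3600}
print(check_reading(fresh))  # []
print(check_reading(stale))  # ['stale: reading exceeds timeliness window']
```

Note that the timeliness gate is exactly the AI-agnostic criterion discussed above: the model could score a stale image or reading just as well, but the business outcome depends on acting on current data.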
For images captured to support baseline quality-control measurements, quality is defined by a combination of factors: the capture infrastructure, image resolution, object orientation, and even lighting conditions. This builds on the need for images captured through IoT sensors and processed through an AI algorithm to “gain insights from the baseline metrics to understand where our critical problem areas are” by “addressing the gap in adequate baseline metrics as a means to collecting information from which an organization can perform continuous improvement”.
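Two of those factors, resolution and lighting, lend themselves to an automated gate at capture time. The sketch below is an assumption-laden illustration: the minimum resolution, the acceptable brightness band, and the synthetic pixel data are all made up for the example, and a real deployment would calibrate thresholds against the AI model's actual requirements.

```python
# An illustrative quality gate for captured images: minimum resolution and a
# mean-brightness check as a crude proxy for adequate lighting.
# All thresholds here are hypothetical.
MIN_WIDTH, MIN_HEIGHT = 640, 480
BRIGHTNESS_RANGE = (40, 215)   # acceptable mean grayscale value (0-255)

def image_passes(width, height, pixels):
    """Accept an image only if resolution and mean brightness are adequate."""
    if width < MIN_WIDTH or height < MIN_HEIGHT:
        return False
    mean_brightness = sum(pixels) / len(pixels)
    return BRIGHTNESS_RANGE[0] <= mean_brightness <= BRIGHTNESS_RANGE[1]

# Simulated grayscale pixel data: a well-lit image versus an underexposed one.
well_lit = [128] * (640 * 480)
too_dark = [10] * (640 * 480)
print(image_passes(640, 480, well_lit))  # True
print(image_passes(640, 480, too_dark))  # False
```

Rejecting unusable frames at the sensor is cheaper than discovering, downstream, that the AI platform produced low-confidence insights from them.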
We’ve now built relevant data governance concepts from which we can implement specific requirements and controls in our IoT sensors, addressing quality criteria specific to our AI platform, with the goal of producing quality insights that ultimately achieve our critical business goals.
Konrad Kona