DATA IS THE AI DRIVER
April 24, 2019
Today, Artificial Intelligence (AI) is far more than a buzzword in the business world. Everyone seems to agree that AI technology can change the world in more than one way, whether by helping us revolutionize healthcare or by partnering with us to fight terrorism. In fact, its transformational effect is said to be as far-reaching as that of the Industrial Revolution. No wonder almost every business leader across the globe is looking for ways to capitalize on AI opportunities. They are keen to find out how AI capabilities can help them serve their customers better and where they should channel their AI investments to get the maximum benefit.
For now, let’s put aside what is possible with AI and ask: what is the foundation of an AI solution? Yes, you have got it right: the answer is data!
But first, data
You need to gather a huge amount of data to develop a machine learning model, and this is one of the struggles we keep coming back to: where do we get the data?
The answer is often existing data sets that are accessible to everyone, for example weather data and some of the satellite data available from IBM’s PAIRS system. We can also access traffic data, market data and so on, all of which can be used to expedite the development of AI solutions. In other words, we can source data from third parties and combine it in our applications to make them smarter.
A good example of this is the vegetation management system used by the utility industry to identify when and where vegetation encroaches on power lines. This knowledge equips utilities to clear the vegetation before an encroachment causes a power failure. The industry combines several kinds of data to do this: geographic information system (GIS) data to mark the power lines, satellite images to identify areas where there are trees, and weather data together with foliage information to learn about the kinds of trees present in those areas.
If you then add LIDAR scans to that data, you can extend the previous solution to identify potential equipment failures. This knowledge enables the utility companies to take the required actions at the right time to keep their power lines functional.
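The kind of data combination described for vegetation management can be sketched as a simple join on a shared key. The sketch below is only an illustration under stated assumptions: the segment IDs, field names, numbers, and the linear growth estimate are all invented for the example and do not come from PAIRS or from any real utility system.

```python
# Hypothetical inputs from three separate sources, keyed by power-line segment.
# Every value and field name here is made up for illustration.
gis_lines = {                      # GIS: current gap between line and canopy
    "seg-1": {"clearance_m": 4.0},
    "seg-2": {"clearance_m": 1.5},
    "seg-3": {"clearance_m": 6.0},
}
satellite_tree_cover = {           # satellite imagery: fraction of corridor with trees
    "seg-1": 0.6,
    "seg-2": 0.8,
    "seg-3": 0.0,
}
growth_m_per_month = {             # weather + foliage data: estimated canopy growth
    "seg-1": 0.25,
    "seg-2": 0.25,
    "seg-3": 0.25,
}


def combine(gis, cover, growth):
    """Join the three sources into one record per segment."""
    rows = []
    for seg_id, line in gis.items():
        rows.append({
            "segment_id": seg_id,
            "clearance_m": line["clearance_m"],
            "tree_cover": cover.get(seg_id, 0.0),
            "growth_m_per_month": growth.get(seg_id, 0.0),
        })
    return rows


def months_until_encroachment(row, min_clearance_m=1.0):
    """Estimate months until the canopy is within min_clearance_m of the line.

    Returns None when there are no trees or no growth to worry about.
    Assumes constant linear growth, which is a deliberate simplification.
    """
    if row["tree_cover"] <= 0 or row["growth_m_per_month"] <= 0:
        return None
    margin = max(row["clearance_m"] - min_clearance_m, 0.0)
    return margin / row["growth_m_per_month"]


def trimming_schedule(rows, horizon_months=6):
    """List (segment_id, months) for segments due within the horizon, soonest first."""
    due = []
    for row in rows:
        months = months_until_encroachment(row)
        if months is not None and months <= horizon_months:
            due.append((row["segment_id"], months))
    return sorted(due, key=lambda item: item[1])


rows = combine(gis_lines, satellite_tree_cover, growth_m_per_month)
schedule = trimming_schedule(rows)
print(schedule)
```

None of the three sources can answer the "when and where" question on its own; the schedule only appears once they are joined on the same segment key. A real system would join on geographic coordinates rather than a ready-made ID.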
Next challenge – shortening the development lifecycle
Now that we have the data, the next question is how to train on these data sets so that a smart system takes less time to build.
To speed up the process, we need to work with tools that need a minimal amount of manual training. Thanks to tools like IBM machine vision systems, it is now possible to manually label a small set of data, train on it, and then use the result together with machine learning to train on larger data sets. This shortens the development lifecycle and produces smarter systems faster.
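The idea of starting from a small, manually labeled set and letting the model extend itself to a larger one is often called self-training or pseudo-labeling. The sketch below is a generic illustration of that idea, not how IBM's tools work: the nearest-centroid classifier, the single "canopy height" feature, the labels, and the confidence threshold are all assumptions chosen to keep the example short.

```python
from statistics import mean


def fit_centroids(samples):
    """Train a nearest-centroid model: return {label: mean feature value}."""
    by_label = {}
    for x, y in samples:
        by_label.setdefault(y, []).append(x)
    return {y: mean(xs) for y, xs in by_label.items()}


def predict(centroids, x):
    """Return (label, margin), where margin is how much closer the best
    centroid is than the runner-up. A small margin means low confidence."""
    ranked = sorted(centroids.items(), key=lambda kv: abs(x - kv[1]))
    (best_label, best_c), (_, second_c) = ranked[0], ranked[1]
    margin = abs(x - second_c) - abs(x - best_c)
    return best_label, margin


def self_train(labeled, unlabeled, min_margin=2.0):
    """One round of self-training.

    1. Fit a model on the small hand-labeled set.
    2. Label the unlabeled samples, keeping only confident predictions.
    3. Refit on hand labels plus the kept pseudo-labels.
    """
    centroids = fit_centroids(labeled)
    pseudo = []
    for x in unlabeled:
        label, margin = predict(centroids, x)
        if margin >= min_margin:
            pseudo.append((x, label))
    return fit_centroids(labeled + pseudo), pseudo


# Hypothetical data: one feature (say, canopy height in metres) per sample.
labeled = [(1.0, "grass"), (2.0, "grass"), (8.0, "tree"), (9.0, "tree")]
unlabeled = [0.5, 3.0, 5.0, 7.5, 10.0]

model, pseudo = self_train(labeled, unlabeled)
print(pseudo)   # samples the model was confident enough to label itself
print(model)    # centroids after retraining on the enlarged set
```

The sample at 5.0 sits exactly between the two classes, so it is left unlabeled rather than guessed. Keeping that confidence filter is the main design choice: without it, early mistakes get fed back into training and reinforced.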
This is really cool stuff. Earlier, this kind of prediction took time because we had to train the systems manually. Luckily, we now have systems that train themselves and make these kinds of solutions possible within a shorter time span. It’s something we are doing, and it’s the trend we are seeing in the development of AI solutions. To conclude, quite a few data sets are available in the public domain; it’s up to us how we combine the data to make it interesting!
