Technological trends, like nutritional and fitness fads, often come and go in waves. A new trend appears on the horizon and before you know it, it’s everywhere, repeated endlessly till it’s robbed of all meaning and context and becomes just another buzzword. Then, just as suddenly as it starts, it recedes into the background, replaced by a new crop of technological, scientific or corporate jargon. But the trend does not die. It just disappears for a certain span of time and then comes back again, slightly changed in form and stronger than before. Something exactly like this is taking place with semantic technologies, previously known as the semantic web.
Talk about semantic technologies has been around for more than 40 years, and lately there has again been widespread interest in them. It is, in fact, the application of these technologies that makes possible the Natural Language Processing (NLP) capabilities of IBM Watson, the ongoing refinement of Google search results, and DBpedia’s provision of a knowledge base built from Wikipedia articles. Such technologies hold the key to the future of Artificial Intelligence (AI), as they will allow us to add knowledge and logic to AI as it exists today.
Before we get into how semantic technologies can change the world, it is necessary to understand what semantic technologies and machine learning are, for it is the combination of these two differing technologies that will lead to a quantum leap in the capabilities of AI in the near future. Semantic technologies use formal semantics to derive meaning from disparate sets of raw data. This is done by using tools, methods and techniques that help categorize and process data as well as define the relationships between different concepts and datasets. These relationships are described in a language that computers can understand and compute upon. Thus, semantic technologies allow computers not just to process strings of characters but also to store, manage and retrieve information based on meaning and logic.
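As a minimal sketch of what "relationships described in a language that computers can compute upon" might look like, the Python snippet below stores facts as subject-predicate-object triples and derives new facts with a simple transitive rule. The vocabulary (`is_a`, `subclass_of`), the example entities and the helper names are hypothetical and chosen only for illustration; real semantic stacks typically rely on standards such as RDF and OWL rather than hand-written code like this.

```python
# Toy knowledge base: each fact is a (subject, predicate, object) triple.
# All names below are illustrative assumptions, not a real ontology.
TRIPLES = {
    ("Rex", "is_a", "Dog"),
    ("Dog", "subclass_of", "Mammal"),
    ("Mammal", "subclass_of", "Animal"),
}


def superclasses(cls, triples):
    """Return every class reachable from `cls` via subclass_of links."""
    found = set()
    frontier = [cls]
    while frontier:
        current = frontier.pop()
        for subj, pred, obj in triples:
            if subj == current and pred == "subclass_of" and obj not in found:
                found.add(obj)
                frontier.append(obj)
    return found


def types_of(entity, triples):
    """Infer all types of `entity`: asserted ones plus their superclasses."""
    asserted = {o for s, p, o in triples if s == entity and p == "is_a"}
    inferred = set(asserted)
    for cls in asserted:
        inferred |= superclasses(cls, triples)
    return inferred


if __name__ == "__main__":
    # "Rex is an Animal" is never stated explicitly, yet it can be derived.
    print(sorted(types_of("Rex", TRIPLES)))
```

The point of the sketch is that the program answers from meaning (what kind of thing Rex is) rather than from matching the string "Animal" anywhere in the stored data.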
Another exciting branch of computer science is machine learning, which involves creating algorithms that help machines derive insights and make predictions from data. Such algorithms allow computers to learn and reach insights without being explicitly programmed to achieve any particular result. Given a mass of unstructured data, these algorithms allow the machines to construct models, based on which data-driven predictions can be made. Machine learning can be classified under the following broad categories:
Supervised Learning: Under supervised learning the machine is fed data as well as the desired or actual result. The computer then works out a relationship between the input and the output. For instance, when fed historical data on the number of website visits and the day, date and time of those visits, a computer can identify the inputs (day, date and time of visit) and the output (number of visits) and use them to create a model that predicts the number of visits in the future. This is known as supervised learning since the computer is given the result or goal that has to be generated (in this case, the computer is told to predict the number of visits).
Unsupervised Learning: In this type of learning, no goals are given to the computer; rather, it is up to the machine to detect patterns and find structure in the data provided. For instance, let’s say you want to group the visitors to your website according to the products they are interested in or what they read (their interests). You do not have to define the different types of visitors yourself; you can just feed in the data, and the computer segments the visitors according to the patterns it discerns. This can sometimes give rise to categories or segmentation criteria that you might not have anticipated. This is typical of unsupervised learning, where the goal or output has not been defined in advance.
Reinforcement Learning: This is a type of learning in which the computer or machine develops its own strategy and refines it through repeated interactions with a dynamic environment. For example, let’s say the goal is for a driverless car to drive down a road without touching the sides of the road. We do not need to specify the algorithm for achieving this. Instead, through trial and error, the computer will itself figure out how best to achieve this goal. It does so by sensing ‘feedback’ (i.e. sensing when the vehicle is touching the sides of the road), deciding on an action (steer right if it is touching the left side of the road and steer left if it touches the right side) and then comparing the result against the goal (to stay on the road without touching the sides). Given a sufficient number of tries, the machine will come up with a strategy to achieve the goal by itself and will refine this strategy further over subsequent trials.
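To make the three categories above more concrete, here is a rough Python sketch with one toy function per category. Everything in it is an assumption made for illustration: the data values are invented, the function names are hypothetical, and the specific techniques (a least-squares line for supervised learning, a two-cluster k-means for unsupervised learning, and tabular Q-learning for reinforcement learning) are common textbook choices, not the only possible ones.

```python
import random


def fit_line(xs, ys):
    """Supervised: least-squares fit of y = slope * x + intercept."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    slope = cov / var
    return slope, mean_y - slope * mean_x


def two_means(values, iterations=10):
    """Unsupervised: split 1-D values into two groups with basic k-means."""
    centroids = [min(values), max(values)]  # deterministic starting points
    clusters = [[], []]
    for _ in range(iterations):
        clusters = [[], []]
        for v in values:
            nearest = 0 if abs(v - centroids[0]) <= abs(v - centroids[1]) else 1
            clusters[nearest].append(v)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return clusters


ACTIONS = (-1, +1)  # steer left, steer right


def learn_to_steer(episodes=500, steps=20, alpha=0.5, gamma=0.9,
                   epsilon=0.2, seed=0):
    """Reinforcement: tabular Q-learning on road positions -1, 0 and 1.

    Moving to position -2 or +2 means touching a side of the road.
    """
    rng = random.Random(seed)
    q = {(pos, a): 0.0 for pos in (-1, 0, 1) for a in ACTIONS}
    for _ in range(episodes):
        pos = rng.choice((-1, 0, 1))
        for _ in range(steps):
            # Explore occasionally, otherwise take the best-known action.
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                action = max(ACTIONS, key=lambda a: q[(pos, a)])
            new_pos = pos + action
            touched = abs(new_pos) == 2
            reward = -1.0 if touched else 0.0  # feedback from the environment
            future = 0.0 if touched else max(q[(new_pos, a)] for a in ACTIONS)
            q[(pos, action)] += alpha * (reward + gamma * future - q[(pos, action)])
            if touched:
                break  # the episode ends when the car touches a side
            pos = new_pos
    return q


if __name__ == "__main__":
    # Supervised: inputs (hour of day) come with known outputs (visits).
    hours, visits = [9, 12, 15, 18], [100, 160, 220, 280]
    slope, intercept = fit_line(hours, visits)
    print("predicted visits at 21h:", slope * 21 + intercept)

    # Unsupervised: no labels, only pages read per visitor.
    print("visitor segments:", two_means([1, 2, 3, 20, 22, 25]))

    # Reinforcement: learned values should favour steering away from the sides.
    q = learn_to_steer()
    for pos in (-1, 0, 1):
        best = max(ACTIONS, key=lambda a: q[(pos, a)])
        print("position", pos, "-> steer", "left" if best == -1 else "right")
```

Note the difference in what each function receives: `fit_line` is given the answers (`ys`), `two_means` is given only raw values, and `learn_to_steer` is given neither and learns purely from the reward it gets after each action.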
Machine learning is at the forefront of some of the most exciting trends in AI, such as autopilot features in airplanes, content curation on Twitter, and spam filtering or email categorization in email programs. Much is expected from this field, and machine learning techniques are expected to continue to drive innovation with algorithms that help computers learn, refine responses and constantly adapt to new data and content inputs. Many machine learning frameworks have come out in recent years: Apache SINGA, TensorFlow, Azure ML Studio, Amazon Machine Learning, Caffe, Theano, Torch and many more are popular in the AI community. They cover a more or less wide spectrum of uses. For example, TensorFlow can be used for pattern recognition (images, acoustic signals, processes, …), language translation, etc.
One way of looking at these technologies is that using a knowledge base, as in Semantic Technologies, is a kind of top-down approach towards intelligence, where you feed in learning and knowledge from the top. It is this sort of approach to learning that makes human beings immensely powerful, as each successive generation of scientists and engineers can access and learn from the theories and store of knowledge developed by earlier generations. This is how, for instance, human babies learn how to perform certain tasks, simply by observing their parents and copying their actions. On the other hand, machine learning techniques can be said to be a bottom-up approach towards learning, where one is exposed to masses of unstructured data and works out patterns or correlations within it. This is perhaps the manner in which human babies learn to walk, i.e. not just by observing their parents but also through numerous tries until they refine a particular solution.
It is the combination of these two types of learning, i.e. the ability to access stored knowledge as well as the ability to figure out new and unique problems, that gives human beings much of their power. And it is this power that is promised by the combination of Semantic Technologies and machine learning. This is because semantic technologies add another layer to the web, allowing machines to capture and work on related concepts, their properties and relationships, the events that impact them and the rules that govern their behavior, rather than just matching words.
When combined with machine learning systems and graph databases, semantic technologies can help to build further semantic databases that help to interpret and answer human queries. This is typically the technology behind intelligent systems such as IBM Watson and Apple’s Siri, and other sophisticated combinations of machine learning and semantic technologies. For instance, chatbots, which use machine learning to give meaning to your request, are nowadays more or less well connected to a knowledge base in a given domain so that they can provide a relevant answer in that domain.
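The chatbot pattern described above can be sketched, very roughly, as two steps: a learned component guesses what the request means, and a knowledge base supplies the answer. The Python snippet below is a toy illustration under heavy assumptions: the training sentences, intent names, knowledge-base layout and function names are all hypothetical, and the deliberately naive word-count scorer merely stands in for a real machine learning model. It is not a description of how IBM Watson, Siri or any production chatbot is built.

```python
from collections import Counter, defaultdict

# Hypothetical labelled requests used to "train" the intent detector.
TRAINING = [
    ("what is the capital of france", "capital"),
    ("name the capital city of japan", "capital"),
    ("what is the currency of france", "currency"),
    ("name the currency used in japan", "currency"),
]

# Hypothetical domain knowledge base: (entity, relation) -> value.
KNOWLEDGE = {
    ("france", "capital"): "Paris",
    ("japan", "capital"): "Tokyo",
    ("france", "currency"): "euro",
    ("japan", "currency"): "yen",
}


def train(examples):
    """Count how often each word appears under each intent."""
    counts = defaultdict(Counter)
    for text, intent in examples:
        counts[intent].update(text.lower().split())
    return counts


def detect_intent(text, counts):
    """Pick the intent whose training words overlap most with the request."""
    words = text.lower().split()
    return max(counts, key=lambda intent: sum(counts[intent][w] for w in words))


def answer(text, counts, knowledge):
    """Combine the learned intent with a knowledge-base lookup."""
    intent = detect_intent(text, counts)
    for word in text.lower().split():
        if (word, intent) in knowledge:
            return knowledge[(word, intent)]
    return None  # no matching fact in the knowledge base


if __name__ == "__main__":
    model = train(TRAINING)
    print(answer("which currency does japan use", model, KNOWLEDGE))
    print(answer("tell me the capital of france", model, KNOWLEDGE))
```

The split of responsibilities is the design point: the learned part copes with wording it has never seen exactly, while the knowledge base keeps the facts explicit, so either side can be improved or extended independently.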
We at Capgemini have also been working on a number of exciting applications. A case in point is CBIoTS (pronounced cybiots), or Cockpit for Big IoT Systems, a cloud-based platform to control and command Ultra Large-Scale Systems (ULSS) of more than a million machines. CBIoTS is a multi-agent system (MAS), with each agent controlling and commanding a machine. These agents can communicate with each other as well as with the platform in a secure manner. If we connect sensors or actuators to the machines with which the agents are associated, then CBIoTS is capable of forming digital clones of very large systems. Such digital clones can help us detect anomalies in the existing system or even simulate different configurations digitally. CBIoTS is connected to IBM Watson and/or TensorFlow, which gives it powerful cognitive qualities for recognizing configurations involving a very large number of constituents. Since agents are active and collaborative, they can form communities of agents that are able to modify not only the configuration of the machines on which they are installed, but also the environment of these machines, thanks to the actuators attached to them. CBIoTS thus makes it possible to build very large intelligent and autonomous systems.
Of course, CBIoTS is just one among many projects that are attempting to bring together the capabilities of machine learning, deep learning, algorithms and knowledge bases as well as semantic technologies. However, there are significant challenges that need to be overcome before we can fully harness the potential of such technologies. One of the immediate stumbling blocks is computing power, because the knowledge bases that underpin semantic technologies need a great deal of computing capacity for inference. For instance, if I am building an ontology for project management, I need to define not only each and every concept related to project management, which is a very large domain of knowledge, but also the inter-relationships among these concepts as well as their relationships to each related field. In short, creating such ontologies is heavy work requiring a lot of time as well as processing and storage capacity.
Another critical challenge preventing the wider adoption of Semantic Technologies is a lack of properly skilled people. One of the basic issues here is that Semantic Technologies are multi-disciplinary and require mastery of a number of different disciplines. At present there are very few schools and universities that offer a specialization in this field, and part of the problem also seems to be a lack of awareness about it. This is a serious issue that needs to be overcome if Semantic Technologies are to be more widely adopted.
To conclude, it’s clear that Semantic Technologies can bring about an evolution in the capabilities of Artificial Intelligence, especially when used in combination with machine learning. There is already a lot of exciting work being done in this area, and it is becoming more and more clear that Semantic Technology is the future.
Natural Language Processing is one of the most important sectors to develop, as it is the key to truly enabling artificial intelligence. Apart from sentiment mining and opinion mining, one of the most interesting and challenging applications is language generation. Therefore, I find that Semantic Technologies are the way to collect more information in a data-driven age.