An Algorithm that Learns Everything about Anything

May 27, 2014
Sogeti Labs

We deal with a lot of algorithms on a regular day: searching on Google, reading your Facebook timeline, or getting one of the many “hey, you might also like this” suggestions around the web. For those of you unfamiliar with what an algorithm actually is, think of it as a set of rules that defines a sequence of operations to solve a problem or achieve a certain goal. Many algorithms also ‘learn’ and get better over time.

A system called LEVAN, which is short for Learn EVerything about ANything, is now being developed by a group of researchers out of the Allen Institute for Artificial Intelligence and the University of Washington. What’s really interesting about LEVAN is that it’s neither human-supervised nor unsupervised (like many deep learning systems), but what its creators call “webly supervised.”

It’s a fully automated approach for learning extensive models for a wide range of variations (e.g. actions, interactions, attributes and beyond) within any concept. The approach leverages vast resources of online books to discover the vocabulary of variance, and intertwines the data collection and modeling steps to alleviate the need for explicit human supervision in training the models. The resulting visual knowledge about a concept is organized in a way that enables a variety of applications across vision and NLP. Users have queried the online system to learn models for several interesting concepts, including breakfast, Gandhi and beautiful. To date, the system has models available for over 50,000 variations within 150 concepts, and has annotated more than 10 million images with bounding boxes.

This basically means LEVAN uses the web to learn what it needs to know. It scours Google Books Ngrams to learn common phrases associated with a particular concept, then searches for those phrases in web image repositories such as Google and Bing. For example, LEVAN knows that “fighting horses,” “saddle horse” and “eye horse” are all part of the larger concept of “horses,” and it knows what each one looks like. Another impressive feature is that, because LEVAN uses text and image references to teach itself concepts, it’s also able to learn when words or phrases mean the same thing. So while it might learn, for example, that “Mohandas Gandhi” and “Mahatma Gandhi” are both sub-concepts of “Gandhi,” it will also learn after analyzing enough images that they’re the same person.
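To make those two steps a bit more concrete, here is a minimal Python sketch of the general idea. It is not LEVAN's actual code: the n-gram counts, the three-number "image features," the thresholds, and the function names `discover_variations` and `merge_similar` are all made up for illustration, and a simple cosine similarity stands in for whatever visual comparison the real system performs.

```python
from math import sqrt

# Toy stand-in for n-gram counts; the numbers are invented for illustration.
NGRAM_COUNTS = {
    "mahatma gandhi": 900,
    "mohandas gandhi": 400,
    "indira gandhi": 700,
    "the gandhi": 5,
    "saddle horse": 300,
}

# Hypothetical image-feature vectors, one per phrase (also invented).
TOY_FEATURES = {
    "mahatma gandhi": [0.90, 0.10, 0.00],
    "mohandas gandhi": [0.88, 0.12, 0.01],
    "indira gandhi": [0.10, 0.90, 0.20],
}


def discover_variations(concept, ngram_counts, min_count=50):
    """Return frequent phrases whose last word is the concept."""
    return sorted(
        phrase
        for phrase, count in ngram_counts.items()
        if phrase.split()[-1] == concept and count >= min_count
    )


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def merge_similar(features, threshold=0.95):
    """Greedily group phrases whose feature vectors look nearly identical."""
    groups = []
    for phrase in sorted(features):
        for group in groups:
            # Compare against the first member of each existing group.
            if cosine(features[phrase], features[group[0]]) >= threshold:
                group.append(phrase)
                break
        else:
            groups.append([phrase])
    return groups


variations = discover_variations("gandhi", NGRAM_COUNTS)
groups = merge_similar({p: TOY_FEATURES[p] for p in variations})
print(variations)
print(groups)
```

With this toy data, the rare phrase “the gandhi” is filtered out as noise, and the two phrases whose made-up image features almost coincide end up in the same group, which mirrors the “same person” example above.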

Our world, both online and offline, will increasingly be run by algorithms. That will require a lot of computing power, but if you were wondering just how fast the AI space is moving: LEVAN was designed to run nicely on the Amazon Web Services cloud.

About the author

SogetiLabs gathers distinguished technology leaders from around the Sogeti world. It is an initiative explaining not how IT works, but what IT means for business.
