How Knowledge Graphs Will Make Intelligent Assistants More Intelligent

By Expectlabs @ExpectLabs

Learn about a key ingredient that will make intelligent assistants more intelligent in the next few years.

Couple this with the clip that talks about how smart assistants will improve in speech recognition accuracy.

TRANSCRIPT:

Hi, I’m Tim Tuttle and I’m the CEO and founder of Expect Labs. I’m doing a multi-part video blog series on the ways that we think intelligent assistants are going to get smarter over the next few years. I’ve talked about how we think intelligent systems are going to get much faster, we think they’re going to get more accurate, and we also think they’re going to get smarter and they’re going to be able to anticipate you better. Today I am going to talk about how these intelligent systems are going to get smarter.

Now, if you use intelligent assistants today, you’ll probably notice that they don’t always understand you. If you ask one a very simple question like, “where’s the Eiffel Tower?” it will probably give you the correct answer that it’s in Paris. But if you ask it a more esoteric question like, “does jQuery version 6 run in the WebKit browser?” it’s probably not going to have any idea, because obviously that’s using software programming jargon that it doesn’t particularly understand. If you say, “do I have a hairline fracture in my fifth metatarsal?” it probably won’t understand you because it doesn’t understand medical concepts.

So, this is going to change in the coming years because these systems will get smarter and smarter about different knowledge domains that they may not understand now. And the way that’s going to happen is through the expansion of the knowledge graph. Now I’ve already talked about knowledge graphs and how they’re going to make intelligent systems more accurate. But it’s also going to make them smarter about a broader range of domains. Today the largest graphs are on the order of hundreds of millions of concepts, or nodes, in the knowledge graph. So that raises the question: How big do these knowledge graphs have to get in order to understand all the concepts that I care about? Say I’m a lawyer, I want to understand legal terms. I’m a doctor, I want to understand medical conditions. If I’m an engineer, a software computer programmer, I want to understand things about programming and software packages.

So how big is the knowledge graph? Well, you can go through a thought exercise to think about how many concepts exist in the world in various categories, and then you add them all up and that might give you an indicator of how big this knowledge graph has to get. So think about if you wanted to capture the name and friend connections between every person in the world. Well, now there are 7 billion people living on the planet. If you consider all the people who lived in recent history, that’s probably in the low tens of billions, and so that gives you an upper bound on representing every person.

What about places? Well, today if you use a service like Foursquare or Google Places or Factual, they potentially have tens of millions of places, which are names of businesses, points of interest, and intersections in the US and around the world. If those systems grew to close to a billion, you would probably get to the point of being able to name every store on every street corner everywhere in the world, right? So again, we’re talking about a billion or so.

If you’re talking about interests, hobbies, and concepts, Facebook has a really good interest graph where they capture all the interests that people have entered into Facebook, and right now that’s in the hundreds of millions of concepts.

What about products? Let’s say you wanted to have a database reflecting the name of every product that exists. Well, Amazon’s doing a pretty good job at that already, and the size of their database is in the hundreds of millions. Capturing all the products that might exist in the future would certainly be in the single-digit billions, as a good estimate.

What about names of companies? Well, there are pretty good data sets that do this already and they’re certainly in the tens of millions. I expect in the next decade, if you want to have an entry in your knowledge graph for every single company, again, you’re talking about probably in the hundreds of millions.

Let’s say you wanted to have a knowledge graph entry for every single movie and video that was ever produced. Well, if you go to the full extent and include every home movie that’s been uploaded to YouTube, we already know how big that is, and that’s probably in the low billions, or tens of billions, of ideas that might be produced. You can continue this thought exercise further and further and narrow down to every professional discipline, whether it’s legal, medical, engineering, etc.
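The tally in this thought exercise can be sketched in a few lines of Python. This is only a rough illustration: the category names and the specific numbers are assumptions, chosen as the top of each range mentioned in the talk (for example, "low tens of billions" of people is read here as 30 billion), and they are not figures stated by the speaker.

```python
# Illustrative upper bounds per category. Every number here is an assumption:
# it takes the top of the rough range mentioned in the talk, not a measured value.
UPPER_BOUNDS = {
    "people": 30_000_000_000,     # "low tens of billions"
    "places": 1_000_000_000,      # "a billion or so"
    "interests": 900_000_000,     # "hundreds of millions"
    "products": 9_000_000_000,    # "single-digit billions"
    "companies": 900_000_000,     # "hundreds of millions"
    "videos": 90_000_000_000,     # "tens of billions"
}

ONE_TRILLION = 1_000_000_000_000


def total_concepts(bounds):
    """Add up the per-category estimates into one overall node count."""
    return sum(bounds.values())


total = total_concepts(UPPER_BOUNDS)

# Print the categories from largest to smallest, then the overall total.
for name, count in sorted(UPPER_BOUNDS.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name:<10} {count:>18,}")
print(f"{'total':<10} {total:>18,}")
print("under one trillion:", total < ONE_TRILLION)
```

Even with generous upper bounds for each of these categories, the sum stays well below a trillion, which is the point of the exercise. Adding further categories, such as the professional disciplines mentioned above, would raise the total.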

Regardless of how much you count up all the concepts that are there, it’s hard to get over into the trillions of very different concepts. So what that means is that if you add up all the things that would have to exist in this universal knowledge graph to capture every concept that exists and ever existed, you’re only talking about a knowledge graph that’s in the trillions, maybe hundreds of trillions. Is it possible, using technology that we have today, to create a representation that captures hundreds of trillions of concepts? Well, the closest analogs that we have right now are some of the largest search engines and databases that live on the web. And right now those data sets are approaching trillions in their size. Probably the largest search engines have hundreds of billions of web pages and documents indexed. We’re dangerously close to being able to handle data sets that capture the entirety of human knowledge in one searchable data set. So that’s encouraging about how these systems are going to get smarter by using a universal knowledge graph.