Researcher Kristine Schachinger has put together a convenient guide for novices that explains how entity search works and how Google uses its RankBrain machine-learning system to improve its results.
Recently it became known that Google uses a mechanism called RankBrain alongside its other algorithmic factors to increase the relevance of its search results.
In particular, RankBrain helps interpret complicated and/or ambiguous queries by connecting them to relevant topics. This lets Google return better results, which matters all the more because the system receives many millions of search queries every day, a remarkable volume even for Google.
The company itself describes RankBrain as one of the most important ranking signals in the Google algorithm.
“RankBrain is one of the hundreds of signals that go into the algorithm determining which results appear on a search results page and how they are ranked,” explains Greg Corrado, a senior research scientist at the company. “It took several months to deploy, and it is now deservedly the third most important signal, with an enormous impact on how well search works,” he adds.
Note: RankBrain is more of a “query processor” than a “ranking factor”. As of today, it is unclear exactly what function this new feature performs as a ranking signal, since ranking signals are normally related, more or less directly, to content.
It is worth noting that this signal is not the only significant change to the search engine of late. Google has been making a number of valuable corrections to its search engine over the past few years, ranging from algorithm updates to the design of the results page. Google has become an “animal” quite different from what we saw before Panda and Penguin.
These changes affected more than search; the company's structure evolved as well. With the creation of “Alphabet”, a parent company uniting Google's numerous projects, Google itself went from being a single organization to one division among several, and lost its unique standalone position.
RankBrain, however, differs significantly from previous changes. It is an attempt to refine query results, mostly within the Knowledge Graph, around the object of the search. Entity search as such is nothing new; the machine-learning component was added to the search algorithm only a few months ago.
So, what is this “entity search”? How does it work alongside RankBrain? And where is Google ultimately heading?
To answer these questions we need to go back a few years.
The launch of this algorithm caused major changes. It is fair to call it a fundamental overhaul of how organic search queries are processed: Google suddenly switched from searching “strings” (i.e., strings of query text) to searching things (that is, entities).
The new algorithm is actually the result of the company's productive work on building semantic search into its system. Google decided not only to pursue machine learning but also to understand and process natural language (NLP). There is no longer any need to match exact keywords: Google aims to understand your meaning as you enter your query.
The purpose of semantic search is to improve the precision of results by understanding searcher intent and the contextual meaning of terms as they appear in the searchable data space, whether the open web or a closed system. As a result, Google's responses become far more relevant to its users. Semantic search systems can view a query from various angles, taking into account context, location, intent, word variations, synonyms within specialized queries, concept matching, and natural-language queries. Major search engines such as Google and Bing incorporate elements of semantic search.
However, two years on, any Google user can confirm that the dream of semantic search has remained just that: a dream. It would be wrong to say Google meets none of the criteria, but the system is still vague and far from the idea of true semantic search.
For example, the system uses databases to identify and connect entities. A genuinely semantic engine, however, analyzes how context affects words, so it can assess and interpret meaning. Google has no such tool. By some measures it remains a navigation tool that falls short of the definition and bears little resemblance to semantic analysis proper.
Thus, while Google can identify known entities and the relations between them using data analysis and machine learning, it still cannot understand natural human language. Nor can it easily interpret conceptual attributes without additional explanation when those relations are weakly correlated or absent from Google's data store. Such explanations normally come from additional user input.
Naturally, Google can learn many of these definitions and relations over time if enough users search for a given set of terms. That is where RankBrain's machine learning comes in handy: instead of prompting the user to refine the query, the system makes its best guess about the information being sought, based on the person's perceived intent.
However, even with the help of RankBrain, Google cannot interpret meaning the way an ordinary person does, and that ability is part of the natural-language component of the definition of semantic search.
As a result, Google is, strictly speaking, not a semantic search engine. So what is it?
The Shift from “Strings” to “Things”
A post on Google's official blog explains: “We've been working on an intelligent model (in geek-speak, a ‘graph’) that understands real-world entities and their relationships to one another: things, not strings.”
As we already mentioned, Google today is very good at surfacing precise data. Do you need a weather forecast? Traffic data? A restaurant review? Google answers without even making you visit the relevant website: the information is displayed at the top of the results page, where it is most visible. That data is normally drawn from the Knowledge Graph and results from the aforementioned shift from “strings” to “things”.
This shift was a landmark for search engines built on data archives, especially as individual bits of data were placed into the Knowledge Graph. These data bits are what answer questions like “Who?”, “What?”, “Where?”, “When?”, “Why?” and “How?” Google can provide users with information they had not even realized they needed.
The shift toward entities had drawbacks as well. While Google became very successful at handling direct, data-driven queries, it stopped improving at making responses to complicated, compound queries more relevant. Such queries cannot be cleanly mapped to specific entities, known data, or data attributes, so the search engine handles them poorly.
As a result, when you enter such a complicated query, you can only hope to see a few relevant answers, and even those may be of little value to you. The results page increasingly becomes a set of potentially applicable responses rather than a list of specific answers to the question asked. Why does this happen?
Comprehensive Queries and How They Affect the Search
“RankBrain uses artificial intelligence to embed vast amounts of written language into mathematical entities, called ‘vectors’, that the computer can understand. If RankBrain sees a word or phrase it isn't familiar with, the machine can guess which words or phrases might have a similar meaning and filter the results accordingly, making it far more effective at handling search queries it has never seen before,” reports Bloomberg Business.
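The idea behind those “vectors” can be illustrated with a toy sketch. The three-dimensional embeddings and the vocabulary below are invented for illustration (real systems learn vectors with hundreds of dimensions from huge text corpora); the point is only that an unfamiliar phrase can be matched to whichever known phrase has the most similar vector.

```python
import math

# Hypothetical 3-dimensional "embeddings" for a few known phrases.
VECTORS = {
    "iced tea":  [0.90, 0.80, 0.10],
    "sweet tea": [0.85, 0.75, 0.15],
    "rooibos":   [0.70, 0.60, 0.20],
    "sugar":     [0.50, 0.90, 0.10],
    "beaker":    [0.10, 0.20, 0.90],
}

def cosine(a, b):
    """Cosine similarity: 1.0 means same direction, near 0 means unrelated."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def nearest_known(phrase_vector, vocabulary):
    """Guess the known phrase whose vector lies closest to an unfamiliar one."""
    return max(vocabulary, key=lambda w: cosine(phrase_vector, vocabulary[w]))

# An unfamiliar phrase arrives; suppose we only have its vector.
unknown = [0.88, 0.78, 0.12]
print(nearest_known(unknown, VECTORS))  # prints "iced tea"
```

The similarity math is real; the vectors are toys. A production system would compose a query vector from learned word embeddings rather than look one up.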
Do you want to see complicated queries in practice? Go to the search engine and test the system's capabilities: enter an unusual or loosely related set of terms and watch Google assemble a set of potentially suitable answers for you. How does it do that?
By activating its machine-learning system (RankBrain), Google searches for answers among the elements it already knows, trying to build, understand, and analyze cause-and-effect connections beyond the obvious ones. Normally, if a relation or an entity is unknown to Google (so that it cannot clearly determine its context and meaning), the system simply tries to guess.
Even when an entity is known, terms whose relevance to the rest of the query cannot be established are down-weighted until a connection emerges. Remember how Google used to show struck-through words it had dropped from your query? This works the same way; we simply no longer see the discarded search terms.
You can see this for yourself by typing your query into the search engine again. Pay attention to the suggestions offered in the dropdown as you type. Instead of entering your full query, simply select the most relevant suggestion from the list.
Have you noticed how much more accurate the results are when you use Google's phrasing? Here is why: Google cannot understand language without knowing what a particular word means, and it cannot recognize a link between entities until enough people have signaled how those attributes correlate. That is how entities work, in simplified search terms.
When we say “entities”, we mean nouns of all kinds: people, places, ideas, things. Google knows these entities, and their definitions are drawn from the databases the search engine references.
As we already mentioned, Google provides wonderful results for weather, movies, restaurants, and last night's football scores. It can supply definitions and related terms, and it even works as a digital encyclopedia. In other words, there is no problem when the entities and the cause-and-effect context of the data are well known. However, if the terms entered are unknown, or the connection between them is unclear, Google will fail to understand you. In that case it can only guess at the nature of your query and provide a more or less adequate response (from the system's standpoint).
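A minimal sketch of what entity lookup against such a database might look like. The triples and the `related()`/`answer()` helpers are illustrative assumptions, not Google's actual data model: a known relation yields a confident answer, while an unknown one forces the engine to admit it is guessing.

```python
# Toy knowledge base of (subject, relation, object) triples,
# loosely in the spirit of the Knowledge Graph.
TRIPLES = {
    ("iced tea", "is_a", "beverage"),
    ("iced tea", "common_ingredient", "lemon"),
    ("iced tea", "common_ingredient", "sugar"),
    ("rooibos", "is_a", "tea"),
    ("lemon", "is_a", "fruit"),
}

def related(subject, obj):
    """Return the stored relations linking two entities, if any."""
    return [r for (s, r, o) in TRIPLES if s == subject and o == obj]

def answer(subject, obj):
    """Answer confidently for known relations; otherwise fall back to a guess."""
    rels = related(subject, obj)
    if rels:
        return f"{subject} --{rels[0]}--> {obj}"
    return f"no stored relation between {subject!r} and {obj!r}; guessing"

print(answer("iced tea", "lemon"))  # known relation: common_ingredient
print(answer("iced tea", "goji"))   # unknown pair: the engine must guess
```

The design point is the branch in `answer()`: everything outside the stored triples drops the system from "answering" into "guessing", which mirrors the behavior described above.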
Google wants to turn the words displayed on a page into entities that mean specific things and carry related attributes. In effect, it is trying to create for a computer something similar to a human brain: artificial intelligence.
This is no easy task, but the work has long been underway. “Google is working on creating a gigantic internal system that holds information not only about each entity individually, but about the entire variety of entities in the world,” says Amit Singhal, the company's head of search.
How Does It Work?
For example, let's take “iced tea”, “lemons” and “glass”. These are all entities (things) surrounded by well-known cause-and-effect relations. When you search with these terms, Google provides plenty of relevant results. The system understands the request because the user's goal is quite clear.
- Now let's change the request: “iced tea”, “rooibos”, “glass”. In general, Google still understands our needs, but interpreting the user's intent is now slightly harder. Why? Because although rooibos is brewed as tea, it is not a common ingredient in iced tea.
- Let's make the task even more difficult: “iced tea”, “goji” and “glass”. Now Google falls back on a spread of potential results, trying to find options that are at least somehow related to the task at hand. Some results miss entirely, while others are relevant only to goji tea, not iced tea. The system is somewhat lost.
- It's time for the final transformation: “iced tea”, “dissolved sugar” and “glass”. Google is now completely lost trying to guess the meaning of this request. While all the entities relate to a recipe for brewing sweet tea, you will also see links to chemistry resources alongside websites with tea recipes. How come? The search engine simply cannot work out the correlation between the terms specified.
Now imagine that, instead of finishing your own query, you pick the suggestion from Google's dropdown menu that roughly matches your need. What does the system suggest? “Glass of sweet tea with lemon”. The word “sugar” is changed to “sweet”, and the word “dissolved”, which had stumped the system, is left out. As a result, you receive a response that matches your query ideally.
Why does it work like this?
What Google can do is understand that “iced tea” is a thing called iced tea, and that a “glass” is, likewise, a glass. In the last example, however, the system struggles to reconcile the word “dissolved” with “iced tea”, “sugar” and “glass”.
Since this query may refer to tea with ice and sugar in a glass, or to a sugar solution used in a lab, you receive a rather strange set of results. Some of them, obviously, are totally irrelevant to tea but quite relevant to “dissolved sugar”. You also see results related to both tea and sugar at once that are still not recipes for brewing sweet iced tea.
We most likely see such pages because RankBrain is at work, trying to decipher the user's intent. The mechanism attempts to determine the correlations between entities, but lacking sufficient means to do so, it falls back into the vortex of potentially suitable answers.
So we have a search query that Google must first evaluate against the entities and things in its database. It then analyzes the nature of the relations between those entities based on the cause-and-effect connections known to the system. Without a clear understanding of the user's intent, Google brings in RankBrain to pick and return the answers closest in meaning.
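That three-step flow (map query terms to known entities, score how strongly they are interrelated, and fall back to best-guess ranking when the relations are too weak) can be sketched as follows. The entity list, the pairwise relation scores, and the 0.5 confidence threshold are all invented for illustration.

```python
# Hypothetical entity vocabulary and pairwise relation strengths (0.0 to 1.0).
KNOWN_ENTITIES = {"iced tea", "lemon", "sugar", "glass", "rooibos"}
RELATION_STRENGTH = {
    frozenset({"iced tea", "lemon"}):   0.9,
    frozenset({"iced tea", "sugar"}):   0.8,
    frozenset({"iced tea", "glass"}):   0.7,
    frozenset({"iced tea", "rooibos"}): 0.3,
}

def interpret(query_terms):
    """Decide whether intent is clear enough for a direct answer."""
    # Step 1: map query terms onto known entities.
    entities = [t for t in query_terms if t in KNOWN_ENTITIES]
    # Step 2: score the relations between every pair of recognized entities.
    pairs = [(a, b) for i, a in enumerate(entities) for b in entities[i + 1:]]
    scores = [RELATION_STRENGTH.get(frozenset({a, b}), 0.0) for a, b in pairs]
    relatedness = sum(scores) / len(scores) if scores else 0.0
    coverage = len(entities) / len(query_terms)  # share of recognized terms
    confidence = relatedness * coverage
    # Step 3: answer directly, or hand off to best-guess ranking.
    if confidence >= 0.5:
        return "direct answer"
    return "best-guess result set"

print(interpret(["iced tea", "lemon", "glass"]))      # clear intent
print(interpret(["iced tea", "dissolved", "glass"]))  # unclear: fallback
```

The unrecognized term “dissolved” lowers coverage, which is what pushes the second query below the threshold and into the fallback, mirroring the iced-tea examples above.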
So, Where Is Google Heading?
Despite its experiments with RankBrain, the company has lost part of the US market. After launching Hummingbird, Google had to say goodbye to about 3% of its total user base. The results achieved, then, cannot be called unambiguously positive, and some commentators discuss only the shortcomings of the new updates.
It may be that Google needs to decide whether it is an answer engine or a search engine. Or it may separate these functions and work in both directions.
While it has been unable to create true semantic search, the company has built a fact-dependent system. RankBrain was added with the aim of producing more accurate results, because entity-based queries are often unclear not only in what the nouns mean, but also in how they relate to one another.
RankBrain will get better over time. The mechanism will learn new entities and remember the probabilistic relations between them, which will allow this add-on to deliver better results and be more useful for users than it is now. However, strange as it may sound, the system will keep working against itself, gradually losing users. Only time will tell how significant those losses will be on the system's journey to perfection.