The MindMeld API: Introducing the Ranking Dashboard

By Expectlabs @ExpectLabs

Make your search results more meaningful with the Ranking Dashboard. Take a quick video tour of the tool above and see how you can calibrate the importance of factors like relevance, recency, popularity, proximity, and any other custom ranking field that may be important to your data set. 

There are many ways you can optimize your content prior to using the dashboard. In this video, learn more about one of them. 

TRANSCRIPT:

In this screencast, I’ll explain what the Ranking Dashboard is and how you can use it to customize the search quality and ranking of documents for your application. To access the dashboard, first log in to your developer account. Click on your profile name, which will take you to the management console. There, you will see a list of the applications you’ve created. If you have a free account, you may only see one application. Now, click on that application and scroll down, where you’ll see links to all of our tools, such as our API Explorer, Crawl Manager, and Context Simulator. Now click on the Ranking Dashboard button and it will open the tool.

The Ranking Dashboard tool allows developers to view tunable ranking factors. For example, if you have crawled news articles for your apps, then the ranking factors that might be useful for you to set are relevance, recency, and popularity. If you have a local business document set that includes crawled Yelp or Foursquare data, then the ranking factors that are important to you are proximity and relevance.

There are small information buttons with question marks next to each ranking factor. When you click on a button, it takes you to our documentation page, which explains each ranking factor in more detail: what it means, what the value range is, and how it impacts the document results and search quality. Also, take a look at the bottom of the ranking factors box. There are two boxes that say “history since” and “history until,” which allow you to filter the uploaded session context to specific timeframes.
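As a rough illustration of what a “history since” / “history until” filter does, here is a minimal sketch. The entry structure, the filter_history function, and the choice of inclusive bounds are all assumptions made for this example; the transcript does not describe how the dashboard implements the filter.

```python
from datetime import datetime


def filter_history(entries, since=None, until=None):
    """Keep entries whose timestamp lies in [since, until].

    A bound left as None is treated as open-ended (an assumption).
    """
    return [
        entry for entry in entries
        if (since is None or entry["timestamp"] >= since)
        and (until is None or entry["timestamp"] <= until)
    ]


# Hypothetical session context: text entries with upload timestamps.
entries = [
    {"text": "Spanish restaurants", "timestamp": datetime(2014, 5, 1, 9, 0)},
    {"text": "Italian restaurants", "timestamp": datetime(2014, 5, 10, 9, 0)},
    {"text": "Chinese restaurants", "timestamp": datetime(2014, 5, 20, 9, 0)},
]

recent = filter_history(entries, since=datetime(2014, 5, 5))
window = filter_history(entries, since=datetime(2014, 5, 5), until=datetime(2014, 5, 15))

print([e["text"] for e in recent])
print([e["text"] for e in window])
```

Only the entries inside the chosen window remain as context, so only they would influence which documents are returned.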

Now I will show you how to tune the ranking for your document set.

If you’ve already crawled the document set from your own website or from other third-party websites, and you already have a set of documents available in your application, you are ready to use the Ranking Dashboard. First, you need to create a session so you can post content or text entries to it. Based on the content that you post, you’ll see document results on the right-hand side where it says “documents.” If you click on the “create a new session” button with a plus sign, a small window pops up where you can give the session your own name or accept the default session name.

Below it there is a small box that says, “create new text entry” where you can upload text entries to your session. The text entries become the context of your session and are used to query your indexed documents. For example, if your document set is related to news, you can say, “What’s the latest about Facebook?” Go ahead and click “submit” so the text entry is posted to your session. Based on this context, you can see some entities are extracted and a set of documents are returned that are related to Facebook. 

Let’s say initially you had a ranking factor of 0.5 for relevance and all other factors were 0. As a result, we see documents related to Facebook, but some of them are quite old, from a couple of years ago. Since these are news articles, recency is important. Let’s emphasize recency, which is based on the publication date of the news article. Set the recency to a non-zero value, say 0.5. Now you will see very recent documents at the top, and they are still related to Facebook. For time-sensitive documents, search quality will depend on when you last crawled the site and indexed fresh documents. If you crawled it a week ago, the newest documents would be from a week ago. You can always re-crawl the site using our Crawl Manager tool so you can have the newest documents float up to the top.
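To make the effect of the sliders concrete, here is a minimal sketch that assumes the final score is a simple weighted sum of per-factor scores. The transcript does not say how the factors are actually combined, so the formula, the Article class, the recency_score function, and the sample data are all illustrative assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Article:
    title: str
    relevance: float  # hypothetical text-match score in [0, 1]
    published: date


def recency_score(published, today, horizon_days=365):
    """Map a publication date to [0, 1]: 1.0 for today, 0.0 at horizon_days or older."""
    age_days = max((today - published).days, 0)
    return max(0.0, 1.0 - age_days / horizon_days)


def rank(articles, weights, today):
    """Sort articles by an assumed weighted sum of factor scores, highest first."""
    def score(article):
        return (
            weights.get("relevance", 0.0) * article.relevance
            + weights.get("recency", 0.0) * recency_score(article.published, today)
        )
    return sorted(articles, key=score, reverse=True)


today = date(2014, 6, 1)  # fixed so the example is reproducible
articles = [
    Article("Facebook history retrospective", relevance=0.9, published=date(2012, 3, 1)),
    Article("Facebook announces new feature", relevance=0.7, published=date(2014, 5, 28)),
]

relevance_only = rank(articles, {"relevance": 0.5, "recency": 0.0}, today)
with_recency = rank(articles, {"relevance": 0.5, "recency": 0.5}, today)

print([a.title for a in relevance_only])
print([a.title for a in with_recency])
```

With relevance alone, the older but slightly more relevant article comes first. Once recency is also weighted at 0.5, the recent article moves to the top while both results stay on topic.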

If your document source happens to have any location-related information, for example local business listings, you would want to set the proximity ranking factor. Now let’s open the Ranking Dashboard for an application that has crawled documents from Yelp.com. The documents in this application have location information in the form of longitude and latitude. The proximity ranking factor indicates how important the document’s proximity to the user’s location is to its ranking. The proximity is computed as the distance from the user’s location to the document’s location, and the shorter the distance, the higher the document is ranked. Before testing how the proximity ranking factor impacts search results, make sure that your user’s location is set. If it’s not, just set the latitude and longitude here. Each time you make changes to the user’s location, it gets saved into that user’s profile in our system. For Yelp data, here we set proximity to 0.49 and relevance to 0.8, and let’s set everything else to 0. I posted some text entries related to Spanish, Chinese, Japanese, and Italian restaurants. The last context that was uploaded was Chinese, so here we see 10 Chinese restaurants within 2 miles of the user’s current location. What if we want restaurants that are closer, like within 1 mile? Let’s go ahead and change the proximity radius. Now you see only 3 Chinese restaurants. That’s how it works.
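The distance-and-radius idea can be sketched with the haversine formula, a standard way to compute great-circle distance from latitude and longitude. The transcript does not say which distance formula the service uses, so treat this as one plausible approach. The document fields (name, latitude, longitude), the nearby function, and the sample coordinates are hypothetical.

```python
import math

EARTH_RADIUS_MILES = 3958.8  # approximate mean Earth radius


def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_MILES * math.asin(math.sqrt(a))


def nearby(documents, user_lat, user_lon, radius_miles):
    """Keep documents within radius_miles of the user, nearest first."""
    with_distance = [
        (haversine_miles(user_lat, user_lon, d["latitude"], d["longitude"]), d)
        for d in documents
    ]
    in_range = [pair for pair in with_distance if pair[0] <= radius_miles]
    return [d for _, d in sorted(in_range, key=lambda pair: pair[0])]


# Hypothetical listings, all due north of the user at increasing distances.
restaurants = [
    {"name": "Far Dumpling House", "latitude": 37.8249, "longitude": -122.4194},
    {"name": "Corner Wok", "latitude": 37.7790, "longitude": -122.4194},
    {"name": "Midtown Noodles", "latitude": 37.7949, "longitude": -122.4194},
]

user_lat, user_lon = 37.7749, -122.4194

print([d["name"] for d in nearby(restaurants, user_lat, user_lon, radius_miles=2)])
print([d["name"] for d in nearby(restaurants, user_lat, user_lon, radius_miles=1)])
```

Shrinking the radius from 2 miles to 1 mile drops the listings that are farther away, which is the same narrowing you see in the dashboard when the proximity radius is reduced.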

Another thing to note is that the ranking factors aren’t saved while you’re playing with them. If you’re satisfied with your configuration and want to apply these ranking factors to all of your queries, then click on the “save as default” link. This saves the ranking factor values in the application profile in our system. So when you make API calls to retrieve documents for this app, the ranking factors you set here will be applied. Also note that each time you make any changes to the sliders, it automatically refreshes the document results. Or you can hit the “refresh” button.

Let me briefly explain the other ranking factors. The popularity factor is based on how many users clicked on that particular document link to view the source page. If the popularity factor is set, then the documents with the most views will rank higher. You can test this feature by clicking on a couple of documents and refreshing the results. You will see that the clicked documents now show up at the top.

If you want to rank your documents based on factors other than the main four we provide, use customrank 1, 2, or 3. For example, if you want to rank based on user rating or review count, set these values in a document field named customrank1, customrank2, or customrank3. Once your documents are indexed with these fields, set the corresponding custom ranking factor to a non-zero value.
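Here is a minimal sketch of preparing documents with custom ranking fields before they are indexed. Only the customrank1, customrank2, and customrank3 field names come from the text above. The source fields (rating, review_count), the with_custom_rank function, and the sample document are hypothetical, and any required value range should be checked against the documentation.

```python
def with_custom_rank(document, rating_field="rating", review_field="review_count"):
    """Return a copy of the document with customrank1 and customrank2 filled in.

    The input field names are assumptions; a missing value defaults to 0.0.
    """
    enriched = dict(document)  # shallow copy so the original is left unchanged
    enriched["customrank1"] = float(document.get(rating_field, 0))
    enriched["customrank2"] = float(document.get(review_field, 0))
    return enriched


# Hypothetical business listing as it might look before indexing.
listing = {"title": "Corner Wok", "rating": 4.5, "review_count": 212}

enriched = with_custom_rank(listing)
print(enriched)
```

Keeping the mapping in one function makes it easy to change which value feeds which custom rank field before re-indexing.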

If you want to see how to make a curl call and get the exact same results you see in the Ranking Dashboard, click on the “get curl” link. You can copy the text you see there and paste it on a command line, and you should see the same results. This helps if you want to make the same call from your app.

You can also use the query window to post any Lucene format queries. So if you want to only search on the title, you can prefix the search term with the field name, such as “title:Pinterest.” This way you can filter on specific fields.
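Lucene’s query syntax writes a field-restricted search as field:term, with multi-word text wrapped in double quotes as a phrase. The following sketch builds such a clause. The field_query helper is hypothetical, and whether a given field name such as title exists depends on how your documents were indexed.

```python
import re

# Boolean operator keywords in Lucene's classic query syntax.
_OPERATORS = {"AND", "OR", "NOT"}


def field_query(field, text):
    """Build a field:term clause; anything but a plain word becomes a quoted phrase."""
    if re.fullmatch(r"\w+", text) and text not in _OPERATORS:
        return f"{field}:{text}"
    # Inside a quoted phrase, escape backslashes first, then double quotes.
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return f'{field}:"{escaped}"'


single = field_query("title", "Pinterest")
phrase = field_query("title", "Pinterest IPO")
combined = " AND ".join([single, field_query("title", "funding round")])

print(single)
print(phrase)
print(combined)
```

A clause built this way can be pasted into the query window. Quoting multi-word text keeps the words together as one phrase instead of letting the second word fall back to the default field.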

And that is how you use the ranking dashboard. Thanks for watching!

