Languages Magazine

The MindMeld API: Introducing Semantic Markup Tags

By Expectlabs @ExpectLabs

Adding semantic markup tags to your pages is an easy way to enhance the quality of your search results. Watch us reveal which markup tags are supported by the MindMeld API, how to incorporate them into your projects, and the awesome things that happen when you do.

Pair with this video that explains how to use real-time push events.

TRANSCRIPT:

So today I’m going to show you how to use the semantic markup tags that optimize how your webpages are crawled and how they’re turned into documents. With these semantic markup tags, you can enrich the representation of your documents, thereby improving search relevance and ranking. To begin, I’ve gone to the documentation page at developer.expectlabs.com, and there’s a section on crawler configuration. Go down to the bottom of this page, and there’s a section called optimizing your content using semantic markup. There’s a table in here showing the list of semantic markup tags you can use, like the type, URL, and site name. For each of these, if you insert it into the document as shown here, for example og:image, that will specify the image for the document. Our crawlers will pick this up and then provide better metadata for your documents.
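As a rough sketch of what inserting these tags into a page can look like, the Python snippet below renders a set of properties as meta elements in a page's head. The property names follow the Open Graph convention that og:image comes from. The helper names (`og_meta_tags`, `render_page`) and the example.com URL are hypothetical, and the table in the crawler configuration documentation remains the authority on which tags are actually supported.

```python
from html import escape
from typing import Mapping


def og_meta_tags(properties: Mapping[str, str]) -> str:
    """Render properties as Open Graph <meta> elements, one per line.

    Keys are given without the "og:" prefix, e.g. "image" or "image:width".
    """
    return "\n".join(
        f'<meta property="og:{escape(name, quote=True)}" '
        f'content="{escape(value, quote=True)}">'
        for name, value in properties.items()
    )


def render_page(title: str, body_html: str, properties: Mapping[str, str]) -> str:
    """Build a minimal HTML page whose <head> carries the given tags."""
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{escape(title)}</title>\n"
        f"{og_meta_tags(properties)}\n"
        "</head>\n"
        f"<body>\n{body_html}\n</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Hypothetical page and image URL, for illustration only.
    print(render_page(
        "Page 1",
        "<p>Hello world</p>",
        {"image": "https://example.com/hello.png"},
    ))
```

Values are HTML-escaped so that a title or description containing quotes does not break the attribute.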

Now I’ll show a real example of how we can do this. I’m going to go into the management console and I will create a new application, here it’s called Hello World 7, and I’m going to go into the Crawl Manager. Now I have a simple web server set up here; it’s a Node server, and it’s serving up three pages. I can go look at it. So it has a main page and then two very simple pages.

Let’s go crawl this. I’m here in the Crawl Manager for the new application I created and I’m going to go crawl the website. The crawler log here is telling us the crawl has started, it’s telling us there isn’t a robots file, and then it’s telling us the crawler’s finished. There’s a warning there which we’re going to ignore. It’s basically just saying that it’s found a very small number of pages, which is atypical, but in this case it’s correct. We have these three pages, and we can see the JSON associated with each one. Here we can see the text, the title, the URL, etc.

Now I’m going to go modify the web server to add a custom metadata tag, and we’ll re-crawl it and see what difference that makes. I’m going to go into my web server and use this URL that I’ve cleverly stashed away here. And I’ll use the right keys. OK, so I’ve added this meta element with og:image and with that URL, and I’ll restart my web server. Let’s just re-crawl it and see what happens. As you can see, there are the three pages, and on page one here the image is showing up.

And so if you go back to the table in the crawler configuration, here we’ve used the og:image. And there are all these other ones you can use too. You can specify the width and the height of that image. You can specify a description, a title, and lots of other stuff to improve the representation of your documents. That’s all for today, thanks for watching!
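To make the before-and-after of the re-crawl concrete, here is a small Python sketch of how a crawler could pick up og: properties from a page. It is not the MindMeld crawler and uses Python rather than the Node server from the demo; the class and function names, the sample pages, and the example.com URL are all assumptions made for illustration.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs from a page."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr_map = dict(attrs)
        prop = attr_map.get("property") or ""
        content = attr_map.get("content")
        if prop.startswith("og:") and content is not None:
            # Store the key without the "og:" prefix, e.g. "image".
            self.properties[prop[3:]] = content


def extract_open_graph(html_text):
    """Return a dict of the Open Graph properties found in html_text."""
    parser = OpenGraphParser()
    parser.feed(html_text)
    return parser.properties


# Hypothetical pages: the same page before and after adding the tag.
PAGE_BEFORE = "<html><head><title>Page 1</title></head><body>Hi</body></html>"
PAGE_AFTER = (
    "<html><head><title>Page 1</title>"
    '<meta property="og:image" content="https://example.com/hello.png">'
    "</head><body>Hi</body></html>"
)

if __name__ == "__main__":
    print(extract_open_graph(PAGE_BEFORE))
    print(extract_open_graph(PAGE_AFTER))
```

The first page yields no properties, while the second yields an image entry, which mirrors the image appearing on page one after the re-crawl.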
