
The MindMeld API: Introducing Multilingual Support

By Expectlabs @ExpectLabs

With MindMeld’s multilingual support, you can now understand your users in English, French, Portuguese, Spanish, Italian, German, and Russian! Explore how you can easily enable this feature by watching the video tutorial above.

Build on your MindMeld knowledge with this video, which explains how to improve your search results with semantic markup tags.

TRANSCRIPT:

Hello there. In this screencast, we’re going to show you how the MindMeld API supports multiple languages. One of the first places where you can specify the language is in the documents themselves: when MindMeld crawls your website, if a document contains a certain tag, we will know the language of that document. Specifically, there’s this el:language tag where you can specify the language of the document, following the convention that is used for language settings throughout the MindMeld API. We follow the three-letter language codes that you can find on this Wikipedia page. So that’s the first document endpoint, which is the crawler.

The second one is in the document itself. If you don’t use a crawler to automatically post documents to the backend, you can post a document directly, in which case you can also include this language attribute as part of the POST request to the backend. It follows the same convention, where you specify the language using the ISO 639-2 standard.
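As a rough illustration of posting a document directly, the Python sketch below builds a request body that carries a language attribute. The field names (title, text, language), the helper build_document_body, and the choice of the ISO 639-2/T variant (for example, deu rather than ger for German) are assumptions made for this example, not confirmed details of the MindMeld API; check the API reference for the real schema.

```python
import json

# ISO 639-2/T codes for the seven languages named in the article.
# Assumption: the API may instead expect the /B variants (e.g. "ger", "fre").
SUPPORTED_LANGUAGES = {
    "eng": "English",
    "fra": "French",
    "por": "Portuguese",
    "spa": "Spanish",
    "ita": "Italian",
    "deu": "German",
    "rus": "Russian",
}


def build_document_body(title, text, language):
    """Return a JSON string for a hypothetical document POST body."""
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language code: {language!r}")
    # Field names are illustrative only.
    return json.dumps(
        {"title": title, "text": text, "language": language},
        ensure_ascii=False,
        sort_keys=True,
    )


if __name__ == "__main__":
    print(build_document_body("WM-Tickets", "Es gibt noch Tickets.", "deu"))
```

Rejecting unknown codes before the request is sent keeps a two-letter code such as de from being posted where a three-letter code is expected.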

A third endpoint is the text entry. Now here is where things get interesting, because you don’t actually have to specify the language for the text entry endpoint. We will automatically try to detect what language the text is in. However, if you already know it because you’ve set the language top-down for the entire application, then you can also include the language attribute when you post a text entry.
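The optional nature of the language attribute on text entries can be sketched in Python as follows. The helper build_text_entry and the field names text and language are hypothetical; the only behavior taken from the transcript is that the language may be left out so that detection happens automatically.

```python
def build_text_entry(text, language=None):
    """Build a hypothetical text entry body.

    When language is None, the field is omitted entirely so that the
    backend can attempt automatic language detection.
    """
    entry = {"text": text}
    if language is not None:
        entry["language"] = language  # three-letter code, e.g. "eng"
    return entry


if __name__ == "__main__":
    # Language unknown: leave it to automatic detection.
    print(build_text_entry("Is Kenny Phillips still with the New York Giants?"))
    # Language fixed for the whole application: pass it explicitly.
    print(build_text_entry("A Copa do Mundo de futebol.", language="por"))
```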

Another endpoint that supports the language attribute is the entity. Typically this language tag will be set automatically, because the entity is derived from a text entry and we already know what the language is. But the API also supports posting entities directly, in which case you can include the language attribute.

Language also applies at the session level, when you request the most relevant documents at any given point, which is kind of the key functionality of the MindMeld API. In this case, you will get the most relevant documents in the language of your most recent text entries or entities. However, if you choose to, you can also specify a filter in the query string that narrows the results down to documents in a certain language.
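A language filter passed through the query string might look like the Python sketch below. The transcript does not name the query parameter, so LANGUAGE_FILTER_PARAM and the helper documents_query are placeholders invented for this example; substitute the real parameter name from the API reference.

```python
from urllib.parse import urlencode

# Hypothetical parameter name: the transcript only says that the filter
# is specified through the query string.
LANGUAGE_FILTER_PARAM = "language"


def documents_query(language=None):
    """Return a query string for a hypothetical session documents request.

    With no language, the query string is empty and results follow the
    language of the most recent text entries or entities.
    """
    params = {}
    if language is not None:
        params[LANGUAGE_FILTER_PARAM] = language
    return urlencode(params)


if __name__ == "__main__":
    print(documents_query())       # no filter
    print(documents_query("spa"))  # only Spanish documents
```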

Now the final place, and probably the most important, where you can set the language is the speech recognizer. So, for example, if you use the MindMeld JavaScript SDK, in the listener class you can specify the language through the lang property here, so that the recognizer knows whether to expect the audio in English or German or any other language that we support. In this case, we actually use BCP 47 language tags. So, for example, US English is en-US, and German would be de-DE.
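Because the speech recognizer takes region-qualified tags while the other endpoints take three-letter codes, an application that sets the language once may need to convert between the two. The Python sketch below shows one way to do that for the seven supported languages. The helper name bcp47_to_iso639_2 is hypothetical, and the ISO 639-2/T variants are an assumption about which three-letter codes the API expects.

```python
# Two-letter primary subtags mapped to ISO 639-2/T codes for the seven
# languages named in the article. Assumption: the /T variants are wanted.
ISO_639_1_TO_2 = {
    "en": "eng",
    "fr": "fra",
    "pt": "por",
    "es": "spa",
    "it": "ita",
    "de": "deu",
    "ru": "rus",
}


def bcp47_to_iso639_2(tag):
    """Map a tag such as 'en-US' to a three-letter code such as 'eng'.

    Only the primary language subtag is used; the region is ignored.
    Matching is case-insensitive, so 'EN-US' and 'en-us' both work.
    """
    primary = tag.split("-")[0].lower()
    if primary not in ISO_639_1_TO_2:
        raise ValueError(f"unsupported language tag: {tag!r}")
    return ISO_639_1_TO_2[primary]


if __name__ == "__main__":
    for tag in ("en-US", "de-DE", "es-ES", "pt-BR"):
        print(tag, bcp47_to_iso639_2(tag))
```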

So this is an example of an app that supports multiple languages, as you can see from this drop-down, and we’ve crawled a bunch of sports sites. So let’s start with an utterance in English.

"Is Kenny Phillips still with the New York Giants?" OK, so we get a bunch of results about Ken Philips. How about we try with German now? "Was ist das letzte mit der futbal Weitmeisterschaft?" OK, well we get a bunch of results about the World Cup, like this article here. Wow, there are still tickets available. Cool. Alright, so let’s try with Spanish now. We’ll go Spanish from Spain. "Hace de nunca al museo de combate de Las Vegas?" Well, yeah, so there’s this boxing museum in Vegas. How about we try some Portuguese? "A cupo do mondo do futebol." Oh nice. We got this article with a historical overview of all the World Cups.

So you can see how easy it is to get MindMeld to work with a bunch of languages.

