Multi-lingual support for task bots

The accuracy of a bot trained with Swiftmatch and mindmeld depends on several features used both in training the bot and in running inference on incoming user messages. The key features that determine whether a language is supported, and how accurate the bot is in that language, are:

  • Text (support for the language and the script it is written in)
  • Contextual spellcheck (whether spelling mistakes in the training data or the user query can be corrected)
  • Wordforms (handling of the various forms and synonyms of words present in the training data)

Task bot use cases that rely on system entities also require system entity support in the desired language.

Note

For entity recognition in task bots:

  • Custom list, regex, and free-form entity types are supported in all the languages listed below.
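
A regex entity matches on surface patterns rather than on language-specific models, which is one reason it can work regardless of the message language. The sketch below illustrates the idea with Python's standard `re` module. The `ORD-` pattern and the `extract_order_ids` helper are hypothetical and are not taken from the platform; how the platform itself evaluates regex entities is not described here.

```python
import re
from typing import List

# Hypothetical entity pattern: "ORD-" followed by exactly five digits.
ORDER_ID = re.compile(r"\bORD-\d{5}\b")

def extract_order_ids(message: str) -> List[str]:
    """Return every order ID found in the message, in order of appearance."""
    return ORDER_ID.findall(message)

# The same pattern applies whatever language surrounds the ID.
messages = [
    "Where is my order ORD-12345?",
    "¿Dónde está mi pedido ORD-67890?",
    "मेरा ऑर्डर ORD-24680 कहाँ है?",
]
for message in messages:
    print(extract_order_ids(message))
```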

Languages supported by task bots

The following languages are supported by LASER (the language encoder):

  • Arabic
  • Bulgarian
  • Catalan
  • Chinese
  • Croatian
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • Georgian
  • German
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Irish
  • Italian
  • Japanese
  • Korean
  • Lithuanian
  • Macedonian
  • Norwegian Bokmål
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish
  • Ukrainian
  • Vietnamese

The following languages are supported by Polymatch:

  • Arabic
  • Bulgarian
  • Catalan
  • Croatian
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • Georgian
  • German
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Irish
  • Italian
  • Korean
  • Lithuanian
  • Macedonian
  • Mongolian (mn)
  • Norwegian Bokmål
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish
  • Ukrainian
  • Vietnamese
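
The LASER and Polymatch lists differ in only a few entries, which is easy to miss when reading them side by side. As a minimal sketch, the two lists can be compared with plain Python sets. The names `LASER_LANGUAGES`, `POLYMATCH_LANGUAGES`, and `encoders_supporting` are illustrative and not part of any product API, and "Mongolian (mn)" is shortened to "Mongolian".

```python
from typing import List

# Languages common to both lists, as transcribed from this page.
COMMON = {
    "Arabic", "Bulgarian", "Catalan", "Croatian", "Danish", "Dutch",
    "English", "Finnish", "French", "Georgian", "German", "Greek",
    "Hebrew", "Hindi", "Hungarian", "Indonesian", "Irish", "Italian",
    "Korean", "Lithuanian", "Macedonian", "Norwegian Bokmål", "Polish",
    "Portuguese", "Romanian", "Russian", "Spanish", "Swedish", "Turkish",
    "Ukrainian", "Vietnamese",
}

LASER_LANGUAGES = COMMON | {"Chinese", "Japanese"}
POLYMATCH_LANGUAGES = COMMON | {"Mongolian"}

def encoders_supporting(language: str) -> List[str]:
    """Return the names of the encoders whose list includes the language."""
    encoders = []
    if language in LASER_LANGUAGES:
        encoders.append("LASER")
    if language in POLYMATCH_LANGUAGES:
        encoders.append("Polymatch")
    return encoders

only_laser = sorted(LASER_LANGUAGES - POLYMATCH_LANGUAGES)
only_polymatch = sorted(POLYMATCH_LANGUAGES - LASER_LANGUAGES)

print(only_laser)      # ['Chinese', 'Japanese']
print(only_polymatch)  # ['Mongolian']
print(encoders_supporting("Hindi"))  # ['LASER', 'Polymatch']
```

With the lists as transcribed here, Chinese and Japanese appear only under LASER, and Mongolian appears only under Polymatch.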

Languages vs entities supported

System entity types covered: Ordinal, Quantity, Cardinal, Money, Duration, Date/time, Person, Location.

The number of these eight entity types marked "Supported" for each language is listed below. Where the number is lower than eight, this table does not identify which specific types are supported.

Language            Supported entity types (of 8)
Arabic              6
Bulgarian           6
Catalan             8
Chinese             8
Croatian            8
Danish              6
Dutch               8
English             8
Finnish             6
French              8
Georgian            5
German              7
Greek               6
Hebrew              5
Hindi               5
Hungarian           4
Indonesian          3
Irish               6
Italian             8
Japanese            8
Korean              8
Lithuanian          3
Macedonian          6
Mongolian           6
Norwegian Bokmål    7
Polish              6
Portuguese          8
Romanian            8
Russian             6
Spanish             8
Swedish             8
Turkish             6
Ukrainian           6
Vietnamese          4


Languages for text preprocessing

Language    Text    Spellcheck    Wordforms
English
Spanish
Portuguese
Russian
Turkish
French
German
Italian
Arabic
Polish
Dutch
Danish
Korean
Norwegian
Swedish
Finnish
Ukrainian
Hebrew
Greek
Bulgarian
Catalan
Croatian
Georgian
Hungarian
Irish
Hindi
Bengali
Punjabi
Marathi
Telugu
Vietnamese
Tamil
Urdu
Javanese
Gujarati
Persian
Bhojpuri
Hausa
Kannada
Indonesian
Yoruba
Malayalam
Odia
Maithili
Burmese
Uzbek
Sindhi
Romanian
Pashto
Magahi
Malay
Nepali
Assamese
Afrikaans
Albanian
Amharic
Armenian
Azerbaijani
Basque
Belarusian
Bosnian
Czech
Esperanto
Estonian
Galician
Icelandic
Kazakh
Kurdish
Latvian
Lithuanian
Macedonian
Malagasy
Serbian
Sinhala
Slovak
Slovenian
Somali
Swahili
Tagalog
Tajik
Tatar

Features supported for other languages:

Language    Text    Spellcheck    Common system entities    Wordforms
Chinese
Japanese
Thai
Burmese
Khmer