Multi-lingual support for task bots

The accuracy of a bot trained with Swiftmatch and MindMeld depends on several features that are used both when training the bot and when running inference on incoming user messages. The key features that determine whether a language is supported, and how accurately, are:

  • Text (support for the language and the script it is written in)
  • Contextual spellcheck (whether spelling mistakes in the training data or the user query can be corrected)
  • Wordforms (handling various forms and synonyms for words present in training data)

Task bot use cases that rely on system entities also require system entity support in the desired language.

Note

For entity recognition in task bots:

  • Custom list, regex, and free-form entity types are supported in all the languages listed below.

Languages supported by task bots

The following languages are supported by LASER (the language encoder):

  • Arabic
  • Bulgarian
  • Catalan
  • Chinese
  • Croatian
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • Georgian
  • German
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Irish
  • Italian
  • Japanese
  • Korean
  • Lithuanian
  • Macedonian
  • Norwegian Bokmål
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish
  • Ukrainian
  • Vietnamese

The following languages are supported by Polymatch:

  • Arabic
  • Bulgarian
  • Catalan
  • Croatian
  • Danish
  • Dutch
  • English
  • Finnish
  • French
  • Georgian
  • German
  • Greek
  • Hebrew
  • Hindi
  • Hungarian
  • Indonesian
  • Irish
  • Italian
  • Korean
  • Lithuanian
  • Macedonian
  • Mongolian (mn)
  • Norwegian Bokmål
  • Polish
  • Portuguese
  • Romanian
  • Russian
  • Spanish
  • Swedish
  • Turkish
  • Ukrainian
  • Vietnamese
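
The two lists above overlap almost entirely, so a lookup can make the differences explicit. The following is a minimal sketch that only encodes the two lists as written on this page. The names `COMMON_LANGUAGES`, `LASER_LANGUAGES`, `POLYMATCH_LANGUAGES`, and `encoders_for` are hypothetical and are not part of any product API.

```python
# Languages that appear in BOTH lists on this page (transcribed by hand).
COMMON_LANGUAGES = frozenset({
    "Arabic", "Bulgarian", "Catalan", "Croatian", "Danish", "Dutch",
    "English", "Finnish", "French", "Georgian", "German", "Greek",
    "Hebrew", "Hindi", "Hungarian", "Indonesian", "Irish", "Italian",
    "Korean", "Lithuanian", "Macedonian", "Norwegian Bokmål", "Polish",
    "Portuguese", "Romanian", "Russian", "Spanish", "Swedish", "Turkish",
    "Ukrainian", "Vietnamese",
})

# Chinese and Japanese appear only in the LASER list;
# Mongolian appears only in the Polymatch list.
LASER_LANGUAGES = COMMON_LANGUAGES | {"Chinese", "Japanese"}
POLYMATCH_LANGUAGES = COMMON_LANGUAGES | {"Mongolian"}


def encoders_for(language):
    """Return the encoders whose list includes `language`.

    Hypothetical helper: the match is case-sensitive and uses the
    English language names exactly as they are spelled in the lists.
    """
    name = language.strip()
    found = []
    if name in LASER_LANGUAGES:
        found.append("LASER")
    if name in POLYMATCH_LANGUAGES:
        found.append("Polymatch")
    return found


if __name__ == "__main__":
    for lang in ("English", "Japanese", "Mongolian", "Thai"):
        print(lang, encoders_for(lang))
```

Building both sets from a shared base keeps the three differences between the lists visible in one place, which is easier to audit than two long literals.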

Languages vs entities supported

Language | Ordinal | Quantity | Cardinal | Money | Duration | Date/time | Person | Location
Arabic Supported Supported Supported Supported Supported Supported
Bulgarian Supported Supported Supported Supported Supported Supported
Catalan Supported Supported Supported Supported Supported Supported Supported Supported
Chinese Supported Supported Supported Supported Supported Supported Supported Supported
Croatian Supported Supported Supported Supported Supported Supported Supported Supported
Danish Supported Supported Supported Supported Supported Supported
Dutch Supported Supported Supported Supported Supported Supported Supported Supported
English Supported Supported Supported Supported Supported Supported Supported Supported
Finnish Supported Supported Supported Supported Supported Supported
French Supported Supported Supported Supported Supported Supported Supported Supported
Georgian Supported Supported Supported Supported Supported
German Supported Supported Supported Supported Supported Supported Supported
Greek Supported Supported Supported Supported Supported Supported
Hebrew Supported Supported Supported Supported Supported
Hindi Supported Supported Supported Supported Supported
Hungarian Supported Supported Supported Supported
Indonesian Supported Supported Supported
Irish Supported Supported Supported Supported Supported Supported
Italian Supported Supported Supported Supported Supported Supported Supported Supported
Japanese Supported Supported Supported Supported Supported Supported Supported Supported
Korean Supported Supported Supported Supported Supported Supported Supported Supported
Lithuanian Supported Supported Supported
Macedonian Supported Supported Supported Supported Supported Supported
Mongolian Supported Supported Supported Supported Supported Supported
Norwegian Bokmål Supported Supported Supported Supported Supported Supported Supported
Polish Supported Supported Supported Supported Supported Supported
Portuguese Supported Supported Supported Supported Supported Supported Supported Supported
Romanian Supported Supported Supported Supported Supported Supported Supported Supported
Russian Supported Supported Supported Supported Supported Supported
Spanish Supported Supported Supported Supported Supported Supported Supported Supported
Swedish Supported Supported Supported Supported Supported Supported Supported Supported
Turkish Supported Supported Supported Supported Supported Supported
Ukrainian Supported Supported Supported Supported Supported Supported
Vietnamese Supported Supported Supported Supported


Languages for text preprocessing

Language | Text | Spellcheck | Wordforms
English ✓ ✓ ✓
Spanish ✓ ✓
Portuguese ✓ ✓
Russian ✓ ✓
Turkish ✓ ✓
French ✓ ✓
German ✓ ✓
Italian ✓ ✓
Arabic ✓ ✓
Polish ✓ ✓
Dutch ✓ ✓
Danish ✓ ✓
Korean ✓ ✓
Norwegian ✓ ✓
Swedish ✓ ✓
Finnish ✓ ✓
Ukrainian ✓
Hebrew ✓
Greek ✓
Bulgarian ✓
Catalan ✓
Croatian ✓
Georgian ✓
Hungarian ✓
Irish ✓
Hindi ✓
Bengali ✓
Punjabi ✓
Marathi ✓
Telugu ✓
Vietnamese ✓
Tamil ✓
Urdu ✓
Javanese ✓
Gujarati ✓
Persian ✓
Bhojpuri ✓
Hausa ✓
Kannada ✓
Indonesian ✓
Yoruba ✓
Malayalam ✓
Odia ✓
Maithili ✓
Burmese ✓
Uzbek ✓
Sindhi ✓
Romanian ✓
Pashto ✓
Magahi ✓
Malay ✓
Nepali ✓
Assamese ✓
Afrikaans ✓
Albanian ✓
Amharic ✓
Armenian ✓
Azerbaijani ✓
Basque ✓
Belarusian ✓
Bosnian ✓
Czech ✓
Esperanto ✓
Estonian ✓
Galician ✓
Icelandic ✓
Kazakh ✓
Kurdish ✓
Latvian ✓
Lithuanian ✓
Macedonian ✓
Malagasy ✓
Serbian ✓
Sinhala ✓
Slovak ✓
Slovenian ✓
Somali ✓
Swahili ✓
Tagalog ✓
Tajik ✓
Tatar ✓

Features supported for other languages:

Language | Text | Spellcheck | Common system entities | Wordforms
Chinese ✓ ✓ ✓
Japanese ✓ ✓ ✓
Thai ✓
Burmese ✓
Khmer ✓