Locales

Locales and Locale Strings

Overview

This page describes the standard General Translation uses to represent locales and languages, and lists the currently supported locales.


Locale strings

General Translation uses a variant of the BCP 47 language tag standard to represent locales and languages. BCP 47 is the IETF Best Current Practice (BCP) for identifying languages in both spoken and written forms. These tags provide a uniform way to specify languages, allowing applications to adapt content, formatting, and behavior to the user's locale.

Language tags are composed of one or more subtags separated by the "-" character. In the order they appear in a tag, the subtags are:

  • Language Subtag: Represents the primary language, e.g., en for English, es for Spanish.
  • Script Subtag (optional): Indicates the writing script, e.g., Latn for Latin script.
  • Region Subtag (optional): Specifies a country or region, e.g., US for the United States, FR for France.
  • Variant Subtag (optional): Identifies a specific variation of a language, e.g., arevela for Eastern Armenian.

When these subtags are combined, we refer to the result as a Locale or Locale tag.
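
As a rough illustration of how subtags combine into a locale tag, the sketch below joins them in the order BCP 47 expects (language, script, region, variant) and applies the conventional casing. The LocaleParts type and the buildLocaleTag function are hypothetical names made up for this example; they are not part of any General Translation API.

```typescript
// Hypothetical container for the four subtags; not a General Translation type.
interface LocaleParts {
  language: string;
  script?: string;
  region?: string;
  variant?: string;
}

// Joins subtags in BCP 47 order (language, script, region, variant),
// using the conventional casing: "en", "Latn", "US", "arevela".
function buildLocaleTag(parts: LocaleParts): string {
  const subtags: string[] = [parts.language.toLowerCase()];
  if (parts.script) {
    const s = parts.script;
    // Script subtags are conventionally title-cased.
    subtags.push(s.charAt(0).toUpperCase() + s.slice(1).toLowerCase());
  }
  if (parts.region) {
    // Region subtags are conventionally upper-cased.
    subtags.push(parts.region.toUpperCase());
  }
  if (parts.variant) {
    subtags.push(parts.variant.toLowerCase());
  }
  return subtags.join("-");
}

console.log(buildLocaleTag({ language: "EN", region: "us" })); // "en-US"
```

BCP 47 tags are case-insensitive, so the casing here is a readability convention rather than a requirement.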

Commonly Used Tags

In practice, most language tags consist of two subtags: a language and a region. Here are some common examples:

Language Tag    Description
en-US           English as used in the US
es-ES           Spanish as used in Spain
fr-CA           French as used in Canada
zh-CN           Simplified Chinese (China)
de-DE           German as used in Germany

Extended Tags

Language tags can include additional subtags for more specificity:

  • Example: hy-Latn-IT-arevela
    • hy: Armenian (language)
    • Latn: Latin (script)
    • IT: Italy (region)
    • arevela: Eastern Armenian (variant)

This tag represents Eastern Armenian written in Latin script, as used in Italy.
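
The breakdown above can be sketched in code. The following is a deliberately simplified parser, assuming a tag contains at most one each of language, script, region, and variant; full BCP 47 also allows extensions, private-use subtags, and multiple variants, which this sketch does not handle. The names ParsedLocale and parseLocaleTag are hypothetical and do not come from General Translation.

```typescript
// Hypothetical result type for this example only.
interface ParsedLocale {
  language: string;
  script?: string;
  region?: string;
  variant?: string;
}

// Simplified pattern: language (2-3 letters), optional script (4 letters),
// optional region (2 letters or 3 digits), optional single variant
// (5-8 alphanumerics, or 4 characters starting with a digit).
const SIMPLE_TAG =
  /^([a-z]{2,3})(?:-([a-z]{4}))?(?:-([a-z]{2}|[0-9]{3}))?(?:-([a-z0-9]{5,8}|[0-9][a-z0-9]{3}))?$/i;

// Returns the subtags found in the tag, or null if it does not fit the pattern.
function parseLocaleTag(tag: string): ParsedLocale | null {
  const m = SIMPLE_TAG.exec(tag);
  if (m === null) return null;
  const parsed: ParsedLocale = { language: m[1] };
  if (m[2]) parsed.script = m[2];
  if (m[3]) parsed.region = m[3];
  if (m[4]) parsed.variant = m[4];
  return parsed;
}

// language "hy", script "Latn", region "IT", variant "arevela"
console.log(parseLocaleTag("hy-Latn-IT-arevela"));
```

In a JavaScript runtime that provides it, the built-in Intl.Locale class is likely a better choice than a hand-written pattern; the regular expression here is only meant to make the subtag positions visible.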

Exceptions to BCP 47 in GT

🚧 This section is currently under construction. 🚧


Supported locales

This section lists all locales that are currently supported by General Translation.

A note on low-resource languages

Our system uses some of the most advanced LLMs on the market to provide accurate translations; however, these models have limitations. Certain languages may not be supported by the model provider you have selected, or by any of the available providers. These languages are known as "low-resource languages".

Low-resource languages can vary between models, so if you specify a preferred model provider in your configuration, you may want to check the list of supported languages for that provider.

Official list

  • af🇿🇦
    Afrikaans
  • am🇪🇹
    Amharic
  • ar🇪🇬
    Arabic
  • ar-AE🇦🇪
    Arabic (United Arab Emirates)
  • ar-EG🇪🇬
    Arabic (Egypt)
  • ar-LB🇱🇧
    Arabic (Lebanon)
  • ar-MA🇲🇦
    Arabic (Morocco)
  • ar-SA🇸🇦
    Arabic (Saudi Arabia)
  • bg🇧🇬
    Bulgarian
  • bn🇧🇩
    Bangla
  • bs🇧🇦
    Bosnian
  • ca🌍
    Catalan
  • cs🇨🇿
    Czech
  • cy🏴󠁧󠁢󠁷󠁬󠁳󠁿
    Welsh
  • da🇩🇰
    Danish
  • de🇩🇪
    German
  • de-AT🇦🇹
    Austrian German
  • de-CH🇨🇭
    Swiss High German
  • de-DE🇩🇪
    German (Germany)
  • el🇬🇷
    Greek
  • el-CY🇨🇾
    Greek (Cyprus)
  • el-EL🌍
    Greek (EL)
  • en🇺🇸
    English
  • en-AU🇦🇺
    Australian English
  • en-CA🇨🇦
    Canadian English
  • en-GB🇬🇧
    British English
  • en-NZ🇳🇿
    English (New Zealand)
  • en-US🇺🇸
    American English
  • es🇪🇸
    Spanish
  • es-419🌍
    Latin American Spanish
  • es-AR🇦🇷
    Spanish (Argentina)
  • es-CL🇨🇱
    Spanish (Chile)
  • es-CO🇨🇴
    Spanish (Colombia)
  • es-ES🇪🇸
    European Spanish
  • es-MX🇲🇽
    Mexican Spanish
  • es-PE🇵🇪
    Spanish (Peru)
  • es-US🇺🇸
    Spanish (United States)
  • es-VE🇻🇪
    Spanish (Venezuela)
  • et🇪🇪
    Estonian
  • fa🇮🇷
    Persian
  • fi🇫🇮
    Finnish
  • fil🇵🇭
    Filipino
  • fr🇫🇷
    French
  • fr-BE🇧🇪
    French (Belgium)
  • fr-CA🇨🇦
    Canadian French
  • fr-CH🇨🇭
    Swiss French
  • fr-CM🇨🇲
    French (Cameroon)
  • fr-FR🇫🇷
    French (France)
  • fr-SN🇸🇳
    French (Senegal)
  • gu🇮🇳
    Gujarati
  • he🇮🇱
    Hebrew
  • hi🇮🇳
    Hindi
  • hr🇭🇷
    Croatian
  • hu🇭🇺
    Hungarian
  • hy🇦🇲
    Armenian
  • id🇮🇩
    Indonesian
  • is🇮🇸
    Icelandic
  • it🇮🇹
    Italian
  • it-CH🇨🇭
    Italian (Switzerland)
  • it-IT🇮🇹
    Italian (Italy)
  • ja🇯🇵
    Japanese
  • ka🇬🇪
    Georgian
  • kk🇰🇿
    Kazakh
  • kn🇮🇳
    Kannada
  • ko🇰🇷
    Korean
  • la🇻🇦
    Latin
  • lt🇱🇹
    Lithuanian
  • lv🇱🇻
    Latvian
  • mk🇲🇰
    Macedonian
  • ml🇮🇳
    Malayalam
  • mn🇲🇳
    Mongolian
  • mr🇮🇳
    Marathi
  • ms🇲🇾
    Malay
  • my🇲🇲
    Burmese
  • nl🇳🇱
    Dutch
  • nl-BE🇧🇪
    Flemish
  • nl-NL🇳🇱
    Dutch (Netherlands)
  • no🇳🇴
    Norwegian
  • pa🇮🇳
    Punjabi
  • pl🇵🇱
    Polish
  • pt🇧🇷
    Portuguese
  • pt-BR🇧🇷
    Brazilian Portuguese
  • pt-PT🇵🇹
    European Portuguese
  • ro🇷🇴
    Romanian
  • ru🇷🇺
    Russian
  • sk🇸🇰
    Slovak
  • sl🇸🇮
    Slovenian
  • so🇸🇴
    Somali
  • sq🇦🇱
    Albanian
  • sr🇷🇸
    Serbian
  • sv🇸🇪
    Swedish
  • sw🇹🇿
    Swahili
  • sw-KE🇰🇪
    Swahili (Kenya)
  • sw-TZ🇹🇿
    Swahili (Tanzania)
  • ta🇮🇳
    Tamil
  • te🇮🇳
    Telugu
  • th🇹🇭
    Thai
  • tl🇵🇭
    Filipino
  • tr🇹🇷
    Turkish
  • uk🇺🇦
    Ukrainian
  • ur🇵🇰
    Urdu
  • vi🇻🇳
    Vietnamese
  • zh🇨🇳
    Chinese
  • zh-CN🇨🇳
    Chinese (China)
  • zh-HK🇭🇰
    Chinese (Hong Kong SAR China)
  • zh-SG🇸🇬
    Chinese (Singapore)
  • zh-TW🇹🇼
    Chinese (Taiwan)
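
When a requested locale is not in the list, one common approach is to fall back to a less specific tag by dropping trailing subtags, loosely modeled on the Lookup scheme in RFC 4647. The sketch below shows that idea against a small excerpt of the list above. It is an assumption for illustration, not a description of how General Translation resolves locales, and SUPPORTED_EXCERPT and resolveSupportedLocale are hypothetical names.

```typescript
// A small excerpt of the official list, for illustration only.
const SUPPORTED_EXCERPT: string[] = [
  "en", "en-US", "en-GB", "es", "es-419", "fr", "fr-CA", "zh", "zh-TW",
];

// Returns the closest supported tag by repeatedly dropping the last subtag
// ("en-IN" -> "en"), or undefined if no prefix of the tag is supported.
// Comparison is case-insensitive; the result uses the casing from the list.
function resolveSupportedLocale(
  requested: string,
  supported: string[]
): string | undefined {
  const lowered = supported.map((s) => s.toLowerCase());
  const subtags = requested.toLowerCase().split("-");
  while (subtags.length > 0) {
    const index = lowered.indexOf(subtags.join("-"));
    if (index !== -1) return supported[index];
    subtags.pop();
  }
  return undefined;
}

console.log(resolveSupportedLocale("en-IN", SUPPORTED_EXCERPT)); // "en"
console.log(resolveSupportedLocale("FR-ca", SUPPORTED_EXCERPT)); // "fr-CA"
```

Simple truncation has limits: a tag such as zh-Hant-TW would fall back to zh rather than zh-TW, so a production implementation may need smarter matching.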

Notes

  • General Translation uses Locale Tags (Locales) to identify languages and regions internally.
