Thai Text Care for Unity

This library adds Thai language support to Unity's TextMeshPro. It provides Thai word segmentation, so that lines wrap at word boundaries, and a Thai font glyph fixer for overlapping vowels and tone marks.

🔥 See it in Action: Try WebGL Demo 🔥 (Best Viewed on PC)<br>Buy Me a Coffee: If this project made your day a bit easier, consider buying me a coffee.

Tested On:

Overview

<br> <p align="center"> <img src="https://github.com/phanphantz/GameDevSecretSauce/blob/main/Assets/ThaiFontDoctor/ThaiTextCare_GIF.gif" width="65%"> </p> <p align="center"> <i>Thai Word Segmentation</i> </p> <br> <p align="center"> <img src="https://github.com/phanphantz/GameDevSecretSauce/blob/main/Assets/ThaiFontDoctor/ThaiFontDoctor_GIF.gif" width="65%"> </p> <p align="center"> <i>Thai Glyph Adjustment Automation</i> </p> <br>

ThaiTextNurse

This component tokenizes the Thai text on a TextMeshPro component and separates the words with a Zero Width Space (U+200B). The separator is invisible, but it gives TextMeshPro a place to break the line, so Thai text wraps at word boundaries. Just attach it to any TextMeshPro component and you're all set!

<img src="https://github.com/phanphantz/GameDevSecretSauce/blob/main/Assets/ThaiFontDoctor/ThaiTextNurse.jpeg" width="1000">
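To make the Zero Width Space idea concrete, here is a minimal, language-neutral sketch written in Python. It is not the ThaiTextNurse implementation: the greedy longest-match strategy, the function names `segment` and `insert_separators`, and the two-word dictionary are all assumptions made for illustration.

```python
ZWSP = "\u200b"  # Zero Width Space: invisible, but allows a line break


def segment(text, dictionary):
    """Greedy longest-match segmentation (an assumed strategy, for illustration).

    Characters that match no dictionary word are grouped into one
    'unknown' token so that combining marks stay with their base letter.
    """
    max_len = max((len(word) for word in dictionary), default=0)
    tokens, unknown, i = [], "", 0
    while i < len(text):
        match = ""
        # Try the longest candidate first, then shrink.
        for length in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + length]
            if candidate in dictionary:
                match = candidate
                break
        if match:
            if unknown:
                tokens.append(unknown)
                unknown = ""
            tokens.append(match)
            i += len(match)
        else:
            unknown += text[i]
            i += 1
    if unknown:
        tokens.append(unknown)
    return tokens


def insert_separators(text, dictionary, separator=ZWSP):
    """Join the segmented words with the separator."""
    return separator.join(segment(text, dictionary))


words = {"สวัสดี", "ครับ"}  # hypothetical two-word dictionary
wrapped = insert_separators("สวัสดีครับ", words)
```

The resulting string looks identical on screen, yet a text renderer may now break the line between the two words.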

Key Features

Under The Hood

Handling the Dictionary

Scripting

Runtime & Editor

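    using TMPro;
    using UnityEngine;
    // Also import the namespace that declares ThaiTextNurse (see the package source).

    public class ThaiTextCare_ExampleScript : MonoBehaviour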
    {
        [SerializeField] TMP_InputField inputField;
        [SerializeField] TMP_InputField separatorInputField;
        [SerializeField] TMP_Text outputText;
        [SerializeField] TMP_Text wordCountText;
        [SerializeField] ThaiTextNurse nurse;

        void Start()
        {
            nurse.OnTokenized += result => RefreshWordCount(result.WordCount);
            inputField.onValueChanged.AddListener(OnOriginalMessageChanged);
            separatorInputField.onValueChanged.AddListener(OnSeparatorChanged);
        }

        void OnOriginalMessageChanged(string input)
        {
            //this outputText has a ThaiTextNurse attached, so it will be tokenized
            outputText.text = input;

            //Do not get the tokenized text here, rely on OnTokenized event binding in Start() instead
        }

        void OnSeparatorChanged(string value)
        {
            nurse.Separator = value;
        }

        void RefreshWordCount(int count)
        {
            wordCountText.text = count.ToString("N0") + " Words";
        }
    }

The static methods tokenize a string directly:

    // Example input string in Thai
    string inputText = "สวัสดีครับ";

    // Using SafeTokenize to tokenize the input
    string tokenizedResult = ThaiTextNurse.SafeTokenize(inputText);
    Debug.Log("Tokenized Result: " + tokenizedResult);

    // Using TryTokenize for more control
    if (ThaiTextNurse.TryTokenize(inputText, out TokenizeResult result))
    {
        Debug.Log("Successfully tokenized!");
        Debug.Log("Tokenized String: " + result.Result);
        Debug.Log("Word Count: " + result.WordCount);
    }
    else
    {
        Debug.LogError("Tokenization failed.");
    }

Editor-Only

ThaiFontDoctor

ThaiFontDoctor is a ScriptableObject that processes TextMeshPro's TMP_FontAsset, automating adjustments to glyph pairs based on predefined Glyph Combinations.

When you set a Glyph Combination, you specify which Thai character glyphs should pair together and the offset to apply to each pair. ThaiFontDoctor then updates the GlyphAdjustmentTable in your TMP_FontAsset in real time, making it easy to fine-tune how vowels and tone marks appear in your TMP_Text components.

Take a look at ThaiFontDoctor.asset for an example. This ThaiFontDoctor instance already contains some common GlyphCombinations that solve typical Thai font issues:

<img src="https://github.com/phanphantz/GameDevSecretSauce/blob/main/Assets/ThaiFontDoctor/ThaiFontDoctor_Example.png" width="1000">

Key Features

Glyph Presets

Here are the available GlyphPresets and their glyph members:

| ThaiGlyphPreset | Display Name | Glyph Members |
| --- | --- | --- |
| AllConsonants | ก - ฮ | ก, ข, ฃ, ค, ฅ, ฆ, ง, จ, ฉ, ช, ซ, ฌ, ญ, ฎ, ฏ, ฐ, ฑ, ฒ, ณ, ด, ต, ถ, ท, ธ, น, บ, ป, ผ, ฝ, พ, ฟ, ภ, ม, ย, ร, ล, ว, ศ, ษ, ส, ห, ฬ, อ, ฮ |
| AscenderConsonants | พยัญชนะหางบน | ป, ฝ, ฟ, ฬ |
| DescenderConsonants | พยัญชนะหางล่าง | ฎ, ฏ |
| AllUpperGlyphs | อักขระด้านบน | - ิ, - ี, - ึ, - ื, - ็, - ั, - ์, - ่, - ้, - ๊, - ๋ |
| UpperVowels | สระบน | - ิ, - ี, - ึ, - ื, - ็, - ั |
| ToneMarks | วรรณยุกต์ | - ่, - ้, - ๊, - ๋ |
| ThanThaKhaat | ทัณฑฆาต | - ์ |
| LeadingVowels | สระหน้า | เ-, แ-, โ-, ไ-, ใ- |
| AllFollowingVowels | สระหลัง | -ะ, - ำ, -า, -ๅ |
| SaraAum | สระอำ | - ำ |
| LowerVowels | สระล่าง | - ุ, - ู |
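As a rough illustration of what a Glyph Combination expands to, the sketch below pairs every member of one preset with every member of another and attaches a single offset to each pair. It is written in Python only to show the idea; the function `expand_combination`, the record fields, and the offset values are hypothetical and do not reflect ThaiFontDoctor's actual data layout.

```python
from itertools import product

# Preset members taken from the table above.
ASCENDER_CONSONANTS = ["ป", "ฝ", "ฟ", "ฬ"]
# Upper vowels written as escapes because they are combining marks.
UPPER_VOWELS = ["\u0E34", "\u0E35", "\u0E36", "\u0E37", "\u0E47", "\u0E31"]


def expand_combination(first_glyphs, second_glyphs, x_offset, y_offset):
    """Return one adjustment record per (first, second) glyph pair.

    The record layout is an assumption made for this sketch.
    """
    return [
        {"first": ord(a), "second": ord(b), "x": x_offset, "y": y_offset}
        for a, b in product(first_glyphs, second_glyphs)
    ]


# Hypothetical offset: nudge upper vowels left when they follow a
# consonant with an ascender, so the two shapes do not collide.
pairs = expand_combination(ASCENDER_CONSONANTS, UPPER_VOWELS, -0.05, 0.0)
```

One combination of a 4-member preset with a 6-member preset therefore stands for 24 individual pair adjustments, which is why presets save so much manual work.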

Limitations

How to Install Thai Text Care

There are two ways to install the library: via the Package Manager (recommended), or by downloading this repository and placing it in your Unity project.

Package Manager Installation

Installing the package via the Package Manager allows you to easily install or update ThaiTextCare as a third-party library.

  1. In the Unity Editor, go to Window > Package Manager
  2. Click the + button and choose Add package from git URL...
  3. Use this link to install the package: https://github.com/phanphantz/ThaiTextCare-for-Unity.git
  4. That's it! You're all set!

Other Considerations

<p align="center"> <br> <img src="https://github.com/phanphantz/GameDevSecretSauce/blob/main/Assets/ThaiFontDoctor/ThaiTextCare_ExamplesFolder.jpeg" width="200"> <br> </p>

Known Issues

Future Plans

Thai Font Modification using FontForge

To remap your Thai font's extended character glyphs to the correct Unicode code points, follow these steps:

  1. Download FontForge
  2. Open the desired font with FontForge (on Windows, you can use Open With... from the right-click menu and choose FontForge)
  3. Map each extended character to the correct Unicode value manually: right-click the target glyph > Glyph Info... > then edit the Unicode field.

A Python script can also be used to automate the process. You can copy glyphs by their names to the target Unicode, but the challenge is that many fonts use different naming conventions for their characters.

According to the C90 encoding, the extended glyphs are:

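| Code | Description | C90 Category |
| --- | --- | --- |
| U+F700 | uni0E10.descless | base.descless |
| U+F701 to U+F704 | uni0E34.left to uni0E37.left | upper.left |
| U+F705 to U+F709 | uni0E48.lowleft to uni0E4C.lowleft | top.lowleft |
| U+F70A to U+F70E | uni0E48.low to uni0E4C.low | top.low |
| U+F70F | uni0E0D.descless | base.descless |
| U+F710 to U+F712 | uni0E31.left, uni0E4D.left, uni0E47.left | upper.left |
| U+F713 to U+F717 | uni0E48.left to uni0E4C.left | top.left |
| U+F718 to U+F71A | uni0E38.low to uni0E3A.low | lower.low |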
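The table above can be turned into a lookup that a FontForge Python script might use. The sketch below is only a starting point under stated assumptions: `build_c90_map` and `remap_font` are hypothetical helper names, the font is assumed to name its glyphs exactly as in the Description column (many fonts do not, as noted above), and the FontForge part has not been verified against real fonts.

```python
# (first PUA code point, base code points in order, glyph-name suffix),
# one entry per row of the C90 table above.
C90_ROWS = [
    (0xF700, [0x0E10], "descless"),
    (0xF701, range(0x0E34, 0x0E38), "left"),
    (0xF705, range(0x0E48, 0x0E4D), "lowleft"),
    (0xF70A, range(0x0E48, 0x0E4D), "low"),
    (0xF70F, [0x0E0D], "descless"),
    (0xF710, [0x0E31, 0x0E4D, 0x0E47], "left"),
    (0xF713, range(0x0E48, 0x0E4D), "left"),
    (0xF718, range(0x0E38, 0x0E3B), "low"),
]


def build_c90_map():
    """Map each extended code point to its expected glyph name."""
    mapping = {}
    for first_pua, bases, suffix in C90_ROWS:
        for offset, base in enumerate(bases):
            mapping[first_pua + offset] = f"uni{base:04X}.{suffix}"
    return mapping


def remap_font(src_path, dst_path):
    """Assign C90 code points to glyphs found by name; return missing names.

    Must be run with FontForge's bundled Python, which provides the
    `fontforge` module. Glyph names are assumed to match the table.
    """
    import fontforge  # imported here so the mapping works without FontForge

    font = fontforge.open(src_path)
    missing = []
    for codepoint, name in build_c90_map().items():
        if name in font:
            font[name].unicode = codepoint
        else:
            missing.append(name)  # the font uses another naming convention
    font.generate(dst_path)
    font.close()
    return missing
```

Any names returned by `remap_font` still need to be mapped by hand in Glyph Info, or added to the script under the names your font actually uses.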

Here’s a summary of the Thai characters in each row:

  4. After modifying the glyphs, export the font and use it to create a TMP_FontAsset in Unity.

Acknowledgements

You are all true heroes🙏

With gratitude, <br>Phun