UTF-8
Latest revision as of 17:47, 11 September 2024
IMPORTANT!! As of Eggdrop 1.9.4, Eggdrop successfully works around the default UTF handling issues in Tcl 8.x versions. That means if you're using a recent version, you probably don't need to do these steps! Tcl 9 and above should not have an issue with UTF-8.
HOWEVER
Eggdrop's ability to handle UTF does not mean Tcl scripts running on Eggdrop were written to handle UTF. For that reason, if you're using a script that isn't displaying emojis properly, check out that section below FIRST.
There are four common issues users encounter when dealing with UTF-8 encoding.
HTTP scripts output "funny" characters
If a script that pulls information from an HTTP source isn't outputting characters properly (letters with accents, etc., are displayed as other, incorrect characters), take a look at https://tcl.tk/man/tcl8.6/TclCmd/http.htm#M61 for an explanation, then try adding set ::http::defaultCharset utf-8 before the HTTP request in the script.
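The "funny" characters are what UTF-8 bytes look like when they are decoded with a single-byte charset. A minimal sketch of that effect outside Tcl, assuming an iconv binary is installed; the sample word "café" is an arbitrary choice, and ISO-8859-1 stands in for whatever wrong charset gets assumed:

```shell
#!/bin/sh
# "café" encoded as UTF-8: the é is the two bytes 0xC3 0xA9 (octal 303 251).
# Misread those bytes as ISO-8859-1 and re-encode the result as UTF-8,
# which is effectively what happens when the wrong charset is assumed.
garbled=$(printf 'caf\303\251' | iconv -f ISO-8859-1 -t UTF-8)

printf 'intended: caf\303\251\n'
printf 'garbled:  %s\n' "$garbled"
```

On a UTF-8 terminal the second line shows two characters where the é should be, which is the typical symptom described above.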
Emojis don't display properly
Prior to Eggdrop version 1.9.4, perhaps the most common problem was people attempting to use Unicode emojis (thumbs up, skull and crossbones, party hats, etc.) and incorrectly concluding that their Eggdrop did not support UTF-8 encoding at all. The inability to use emojis is a result of Tcl 8.x (by default) not supporting Unicode characters that need more than 3 bytes in UTF-8, which means Tcl cannot handle code points higher than U+FFFF (more information on this can be found at https://wiki.tcl-lang.org/page/Unicode+and+UTF-8). The remedy for this issue requires you to download a current version of the Tcl source from https://www.tcl.tk/software/tcltk/download.html and manually compile it (this cannot be done by installing Tcl via a package manager such as apt or yum).
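To see where the 3-byte limit falls, it can help to count the bytes of a few sample characters. This is only an illustrative sketch; the helper name utf8_len and the sample characters are not from Eggdrop or Tcl:

```shell
#!/bin/sh
# Count the bytes in a UTF-8 sequence given as printf octal escapes.
# utf8_len is a hypothetical helper used only for this illustration.
utf8_len() {
    printf "$1" | wc -c | tr -d ' '
}

utf8_len '\303\251'          # U+00E9 (e with acute accent): 2 bytes
utf8_len '\342\202\254'      # U+20AC (euro sign): 3 bytes, within the limit
utf8_len '\360\237\221\215'  # U+1F44D (thumbs up): 4 bytes, above U+FFFF
```

Accented letters and most other non-emoji characters fit in 3 bytes or fewer, which is why they can work even when emojis do not.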
Step 1
Open the generic/tcl.h file in your editor.
Search for
#define TCL_UTF_MAX 3
and replace it with
#define TCL_UTF_MAX 6
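If you would rather script the edit than make it by hand, something along these lines may work. It is a sketch under assumptions: the define sits on a single line as shown above (possibly with tabs between the tokens), and patch_utf_max is a made-up helper name. The demonstration runs against a throwaway file, not a real Tcl source tree.

```shell
#!/bin/sh
# Change "#define TCL_UTF_MAX 3" to "... 6" in the file given as $1.
# Whitespace between tokens is matched loosely (assumption: the real
# header may use tabs). A temp file is used because "sed -i" behaves
# differently across systems.
patch_utf_max() {
    header=$1
    tmp="$header.tmp.$$"
    sed -E '/^#[[:space:]]*define[[:space:]]+TCL_UTF_MAX[[:space:]]/ s/3[[:space:]]*$/6/' \
        "$header" > "$tmp" && cat "$tmp" > "$header" && rm -f "$tmp"
}

# Demonstration on a scratch file containing only the define.
demo=$(mktemp)
printf '#define TCL_UTF_MAX\t\t3\n' > "$demo"
patch_utf_max "$demo"
grep TCL_UTF_MAX "$demo"
```

For a real build you would call patch_utf_max generic/tcl.h from the top of the unpacked Tcl source, then check the result with grep before compiling.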
Step 2
Follow the instructions included with the download to compile and install Tcl.
Step 3
You'll need to recompile Eggdrop, and may need to specify --with-tcllib and --with-tclinc to point to the new location you installed Tcl to. A correct ./configure command may look similar to this:
./configure --with-tclinc=/usr/local/include/tcl.h --with-tcllib=/usr/local/lib/libtcl8.6.so
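The exact paths depend on where Tcl was installed. The sketch below assembles the command from an install prefix and a Tcl version and checks that both files exist first. The include/ and lib/ layout and the helper name eggdrop_configure_cmd are assumptions for illustration, so adjust them to your system.

```shell
#!/bin/sh
# Print a ./configure command for a Tcl installed under prefix $1 with
# version $2. The include/ and lib/ layout is an assumption.
eggdrop_configure_cmd() {
    prefix=$1
    version=$2
    inc="$prefix/include/tcl.h"
    lib="$prefix/lib/libtcl$version.so"
    for f in "$inc" "$lib"; do
        [ -f "$f" ] || { echo "missing: $f" >&2; return 1; }
    done
    echo "./configure --with-tclinc=$inc --with-tcllib=$lib"
}

eggdrop_configure_cmd /usr/local 8.6 || echo "adjust the prefix or version"
```

If either file is reported missing, Tcl was probably installed under a different prefix than the one passed in.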
Locale issues
If you're still having issues with UTF-8, type locale on your host machine. If the output looks similar to this, congrats! You have UTF-8 support and Eggdrop should be able to handle non-emoji Unicode characters.
LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=
If it does *not* look like that (specifically, if the LANG variable does not end in UTF-8), then you can change your locale to one that supports UTF-8, and Eggdrop will pick up on that change. To do this:
Step 1
Visit https://help.ubuntu.com/community/Locale (or a similar page for the OS flavor of your choice). In short, you'll want to view the locales available on your system by running
locale -a
and find the UTF-8 setting that matches your preferred language. Open the /etc/default/locale file in your editor.
Search for
LANG=<something>
and replace it with
LANG=de_DE.UTF-8
or whichever UTF-8 locale from the locale -a output makes sense for you.
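The locale -a list can be long, so it may be easier to filter it down to the UTF-8 entries. A small sketch, assuming the usual name spellings such as en_US.utf8 or en_US.UTF-8; utf8_locales is a hypothetical helper name:

```shell
#!/bin/sh
# Keep only locale names that advertise UTF-8. Spelling varies by system
# (for example "en_US.utf8" versus "en_US.UTF-8"), and some names carry
# an "@modifier" suffix.
utf8_locales() {
    grep -i -E '\.utf-?8(@.*)?$'
}

locale -a 2>/dev/null | utf8_locales || echo "no UTF-8 locales found"
```

If nothing is listed, the system has no UTF-8 locale installed, and the "UTF-8 not found on system" section applies.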
Step 2
Recompile Eggdrop.
If you are still having issues after making these changes, re-read the Emojis section.
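Before going back to the Emojis section, it may be worth confirming that the new setting is actually visible in the shell you start Eggdrop from, since a change to /etc/default/locale typically only applies to new login sessions. A minimal check, with is_utf8_lang as a made-up helper name:

```shell
#!/bin/sh
# Succeed if the locale name in $1 ends in a UTF-8 codeset.
is_utf8_lang() {
    case "$1" in
        *.UTF-8|*.utf-8|*.UTF8|*.utf8) return 0 ;;
        *) return 1 ;;
    esac
}

if is_utf8_lang "${LANG:-}"; then
    echo "LANG=$LANG uses UTF-8"
else
    echo "LANG=${LANG:-<unset>} does not use UTF-8"
fi
```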
UTF-8 not found on system
If the locale trick does not work, or you are on a system that does not have a UTF locale pack (or you are unable to install one), you can try to force UTF-8 by doing the following.
Step 1
Open the eggdrop1.9.2/src/tcl.c file in your editor.
Search for (around line 650)
if (encoding == NULL) {
  encoding = "iso8859-1";
}
and insert the following right after it
encoding = "utf-8";
Step 2
Recompile Eggdrop.
How can I check what encoding is set?
- If you have the .tcl command enabled, you can run
.tcl encoding system
to check what encoding Tcl is trying to use.