UTF-8: Difference between revisions

From EggWiki

Latest revision as of 17:47, 11 September 2024

IMPORTANT!! As of Eggdrop 1.9.4, Eggdrop works around the default UTF handling issues in Tcl 8.x versions. That means if you're using a recent version, you probably don't need to do these steps! Tcl 9 and above should not have an issue with UTF-8.
HOWEVER
Eggdrop's ability to handle UTF does not mean that Tcl scripts running on Eggdrop were written to handle UTF. For that reason, if you're using a script that isn't displaying emojis properly, check out that section below FIRST.

Users commonly run into four issues when dealing with UTF-8 encoding.

HTTP scripts output "funny" characters

If a script that pulls information from an HTTP source isn't outputting characters properly (for example, accented letters are displayed as other, incorrect characters), take a look at https://tcl.tk/man/tcl8.6/TclCmd/http.htm#M61 for an explanation, then try adding set ::http::defaultCharset utf-8 before the HTTP request in the script.

Emojis don't appear in various other ways

Prior to Eggdrop version 1.9.4, perhaps the most common problem was people attempting to use Unicode emojis (thumbs up, skull and crossbones, party hats, etc.) and incorrectly concluding that their Eggdrop did not support UTF-8 encoding at all. The inability to use emojis results from Tcl 8.x, by default, not supporting Unicode characters that need more than 3 bytes in UTF-8, which means Tcl cannot handle code points higher than U+FFFF (more information on this can be found at https://wiki.tcl-lang.org/page/Unicode+and+UTF-8). The remedy requires you to download a current version of the Tcl source from https://www.tcl.tk/software/tcltk/download.html and compile it manually (this cannot be done by installing Tcl via a package manager such as apt or yum).
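To see why emojis are affected while accented letters are not, it can help to count the bytes each character needs in UTF-8. The following is only an illustrative sketch: byte_count is a hypothetical helper name, and the octal escapes spell out the UTF-8 bytes of é (U+00E9), € (U+20AC), and the thumbs-up emoji (U+1F44D).

```shell
# byte_count: hypothetical helper that prints the number of bytes read on stdin.
byte_count() {
  wc -c | tr -d '[:space:]'
}

# UTF-8 bytes written as octal escapes, which POSIX printf understands.
printf '\303\251' | byte_count; echo          # é, U+00E9
printf '\342\202\254' | byte_count; echo      # €, U+20AC
printf '\360\237\221\215' | byte_count; echo  # thumbs up, U+1F44D
```

The first two characters fit within 3 bytes; the emoji, whose code point lies above U+FFFF, needs 4.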

Step 1

Open the generic/tcl.h file in your editor.

Search for

#define TCL_UTF_MAX 3

and replace it with

#define TCL_UTF_MAX 6
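If you would rather script this edit than make it by hand, something like the sketch below may work. It is not part of the Tcl or Eggdrop instructions: set_tcl_utf_max is a hypothetical helper name, and the pattern assumes the header contains a single "#define TCL_UTF_MAX <number>" line, possibly with extra spaces or tabs. The demonstration runs on a scratch file rather than on the real header.

```shell
# set_tcl_utf_max FILE VALUE
# Hypothetical helper: rewrites the "#define TCL_UTF_MAX <n>" line in FILE
# so that <n> becomes VALUE. Whitespace between the tokens is preserved.
set_tcl_utf_max() {
  file=$1
  value=$2
  tmp="$file.tmp.$$"
  sed "s/^\(#[[:space:]]*define[[:space:]][[:space:]]*TCL_UTF_MAX[[:space:]][[:space:]]*\)[0-9][0-9]*/\1$value/" "$file" > "$tmp" &&
    mv "$tmp" "$file"
}

# Demonstration on a scratch file, not on generic/tcl.h itself.
scratch=$(mktemp)
printf '#define TCL_UTF_MAX 3\n' > "$scratch"
set_tcl_utf_max "$scratch" 6
cat "$scratch"
```

On a real source tree, the equivalent call would be set_tcl_utf_max generic/tcl.h 6, run from the top of the Tcl source directory. Check the result in your editor afterwards, since the layout of tcl.h can differ between Tcl releases.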

Step 2

Follow the instructions included with the download to compile and install Tcl.

Step 3

You'll need to recompile Eggdrop, and may need to specify --with-tcllib and --with-tclinc to point to the new location you installed Tcl to. A correct ./configure command may look similar to this:

./configure --with-tclinc=/usr/local/include/tcl.h --with-tcllib=/usr/local/lib/libtcl8.6.so
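If Tcl was installed somewhere other than /usr/local, the two paths change together. The sketch below only prints a candidate command so you can review it before running it. eggdrop_configure_cmd is a hypothetical helper name, and the include/tcl.h and lib/ layout under the prefix, as well as the libtcl8.6.so file name, are assumptions carried over from the example above.

```shell
# eggdrop_configure_cmd [PREFIX] [LIBNAME]
# Hypothetical helper: prints (does not run) a ./configure command for a Tcl
# installed under PREFIX. Both defaults mirror the example command above.
eggdrop_configure_cmd() {
  prefix=${1:-/usr/local}
  libname=${2:-libtcl8.6.so}
  printf './configure --with-tclinc=%s/include/tcl.h --with-tcllib=%s/lib/%s\n' \
    "$prefix" "$prefix" "$libname"
}

eggdrop_configure_cmd /usr/local
eggdrop_configure_cmd /opt/tcl
```

Confirm that both files actually exist at the printed paths before running the command in the Eggdrop source directory.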


Locale issues

If you're still having issues with UTF-8, type locale on your host machine. If the output looks similar to this, congrats! You have UTF-8 support, and Eggdrop should be able to handle non-emoji Unicode characters.

LANG=en_US.UTF-8
LANGUAGE=
LC_CTYPE="en_US.UTF-8"
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE="en_US.UTF-8"
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"
LC_ALL=

If it does *not* look like that (specifically, if the LANG variable does not end in UTF-8), you can change your locale to one that supports UTF-8, and Eggdrop will pick up on that change. To do this:

Step 1

Visit https://help.ubuntu.com/community/Locale (or a similar page for the OS flavor of your choice). In short, you'll want to view the locales available on your system by running

locale -a

and find the UTF-8 setting that matches your preferred language. Open the /etc/default/locale file in your editor.

Search for

LANG=<something>

and replace it with

LANG=de_DE.UTF-8

or, obviously, whichever language makes sense to use for you.
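The lookup and the edit in this step can be sketched in shell as follows. This is a rough illustration rather than a tested procedure: utf8_locales and set_lang are hypothetical helper names, the demonstration works on scratch data instead of the live system, and changing the real /etc/default/locale would normally require root.

```shell
# utf8_locales: hypothetical filter that keeps only the UTF-8 entries from
# `locale -a`-style input (names ending in UTF-8 or utf8, in any case).
utf8_locales() {
  grep -i -E '\.utf-?8$'
}

# set_lang FILE LOCALE: hypothetical helper that replaces the LANG= line
# in FILE, or appends one if FILE has no LANG= line yet.
set_lang() {
  file=$1
  new=$2
  tmp="$file.tmp.$$"
  if grep -q '^LANG=' "$file"; then
    sed "s/^LANG=.*/LANG=$new/" "$file" > "$tmp" && mv "$tmp" "$file"
  else
    printf 'LANG=%s\n' "$new" >> "$file"
  fi
}

# Demonstration on scratch data, not on the live system.
printf 'C\nPOSIX\nde_DE.utf8\nen_US.UTF-8\n' | utf8_locales
scratch=$(mktemp)
printf 'LANG=C\nLANGUAGE=\n' > "$scratch"
set_lang "$scratch" en_US.UTF-8
cat "$scratch"
```

On a real system, the first helper would be fed from locale -a (locale -a | utf8_locales), and the locale passed to set_lang should be one of the names it prints.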

Step 2

Recompile Eggdrop.

If you are still having issues after making these changes, re-read the Emojis section.

UTF-8 not found on system

If the locale trick does not work, or you are on a system that does not have a UTF locale pack (or you are unable to install one), you can try to force UTF-8 by doing the following.

Step 1

Open the eggdrop1.9.2/src/tcl.c file in your editor.

Search for (around line 650)

if (encoding == NULL) {
  encoding = "iso8859-1";
}

and insert the following right after it

encoding = "utf-8";
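Because the line number varies between Eggdrop releases, it may be quicker to search for the fallback than to scroll to it. A minimal sketch, where find_encoding_fallback is a hypothetical helper name and the demonstration uses a scratch file standing in for src/tcl.c:

```shell
# find_encoding_fallback FILE
# Hypothetical helper: prints the line number and text of every line in FILE
# that mentions the iso8859-1 fallback.
find_encoding_fallback() {
  grep -n 'iso8859-1' "$1"
}

# Demonstration on a scratch file that mimics the snippet above.
scratch=$(mktemp)
printf 'if (encoding == NULL) {\n  encoding = "iso8859-1";\n}\n' > "$scratch"
find_encoding_fallback "$scratch"
```

Against a real source tree, the call would be find_encoding_fallback src/tcl.c from the top of the Eggdrop directory; the new assignment goes after the closing brace that follows the reported line.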

Step 2

Recompile Eggdrop.

How can I check what encoding is set?

  • If you have the .tcl command enabled, you can run
.tcl encoding system

to check what encoding Tcl is trying to use.