PHP patches

35050 - Capital "I" letters in func/class method names do not work with Turkish locale

Fix for PHP bug 35050

First patch version - zend_operators.c.diff.gz
Second patch version (after comments by Marcus Boerger) - zend_operators.c.v2.diff.gz

The ascii_tolower() function was written by Björn JACKE for the Lynx browser. I contacted Björn, and he confirmed that the code can be used under any license.

> You wrote patch for Lynx
> http://j3e.de/linux/patches/lynx2-8-5dev16-localefix-bj.diff
> 
> Lynx is licensed under GPL. Can I use your ascii_tolower() function in PHP
> under PHP license? http://www.opensource.org/licenses/php.php

yes, you can use that patch under PHP license or any other license you want.

Cheers
Bjoern

The patch is written for PHP 5.2.5-dev. It should also work with 6.0-dev and with PHP 5.2.1 or later, that is, with any version dated later than 2006-12-05.

Please note that I know when zend_tolower() was introduced in zend_operators.c, but I don't have information about all the changes Stas made in that commit. If you want to fix an older PHP version, you will have to hunt down all tolower() calls and replace them with zend_ascii_tolower().
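The idea behind a locale-independent replacement for tolower() can be sketched in a few lines of C. This is only an illustration, not the code from the patch: the names ascii_tolower_sketch() and ascii_strtolower_sketch() are hypothetical, and the sketch assumes an ASCII-compatible execution character set.

```c
#include <stddef.h>

/* Hypothetical helper: lowercase one byte using ASCII rules only.
 * The current locale is never consulted, so 'I' always maps to 'i'. */
static int ascii_tolower_sketch(int c)
{
    if (c >= 'A' && c <= 'Z') {
        return c + ('a' - 'A');
    }
    return c; /* everything else, including 8-bit bytes, is left alone */
}

/* Hypothetical helper: lowercase a buffer of len bytes in place. */
static void ascii_strtolower_sketch(char *s, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++) {
        s[i] = (char)ascii_tolower_sketch((unsigned char)s[i]);
    }
}
```

With such a helper, a name like "GetInfo" is lowercased to "getinfo" no matter which LC_CTYPE locale is active, which is the behaviour that function and method lookup needs.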

According to the PHP developers, the patch breaks other locales. I don't have information about any broken features and don't understand how it can break things. If PHP is used on Windows, zend_tolower() acts the same way as the patched version: on Windows, PHP uses the locale-unaware _tolower_l() function instead of the locale-aware tolower(). The patch might break things if you have 8-bit class and method names and expect the 8-bit symbols to be case-insensitive. If you have such code, it is unportable and depends on a specific system locale.
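The 8-bit case above can be illustrated with a small comparison routine. It is a sketch under stated assumptions, not PHP's actual lookup code: fold_ascii() and ascii_names_equal() are hypothetical names, and the example bytes assume ISO 8859-1 encoded names.

```c
/* Hypothetical helper: fold one byte to lowercase using ASCII rules only. */
static unsigned char fold_ascii(unsigned char c)
{
    return (c >= 'A' && c <= 'Z') ? (unsigned char)(c + ('a' - 'A')) : c;
}

/* Hypothetical helper: return 1 if two NUL-terminated names match
 * when only the letters A-Z are folded, otherwise 0. */
static int ascii_names_equal(const char *a, const char *b)
{
    while (*a != '\0' && *b != '\0') {
        if (fold_ascii((unsigned char)*a) != fold_ascii((unsigned char)*b)) {
            return 0;
        }
        a++;
        b++;
    }
    return *a == *b; /* equal only if both strings ended together */
}
```

Under this rule "GetInfo" and "getinfo" still match, while the single bytes 0xC4 and 0xE4 (upper- and lower-case A with diaeresis in ISO 8859-1) do not. Code that relies on such bytes being folded works only under a matching locale.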

If you need a workaround for other PHP versions and can't change your PHP build, just set the LC_CTYPE locale to C. This also deals with programming mistakes in PHP scripts. If the LC_CTYPE locale is set to C, gettext translations must call bind_textdomain_codeset() for every gettext domain in use.

setlocale(LC_ALL, 'tr_TR.UTF-8');
setlocale(LC_CTYPE, 'C');

Test results

make test - unpatched, patched, difference.

Zend/bench.php - unpatched and patched.

turkish.php tests:
 C locale - unpatched and patched
 tr_TR.ISO8859-9 locale - unpatched and patched
 tr_TR.UTF-8 locale - unpatched and patched


Last modified: Tue Sep 18 15:56:54 EEST 2007