PHP: How to handle a URI or URL with non-ASCII characters such as Chinese/Japanese/Korean (CJK)




This tutorial uses CodeIgniter, but the same approach should work with any PHP framework or with raw PHP.

While working with CodeIgniter, I needed to grab a URI containing Chinese characters, such as:

http://webserver.com/令吉大贬伤内需‧白文春-0fu15fw

with CodeIgniter's URI class:

 $query = $this->uri->segment(2);

and use $query as a parameter to search my database and pull out the relevant data.

The trouble is that $query becomes

%E4%BB%A4%E5%90%89%E5%A4%A7%E8%B4%AC%E4%BC%A4%E5%86%85%E9%9C%80%E2%80%A7%E7%99%BD%E6%96%87%E6%98%A5-0fu15fw

instead of 令吉大贬伤内需‧白文春-0fu15fw. Because the URL segment has not been decoded, the database query searches for the percent-encoded string and always returns an empty result.
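The percent-encoded form appears because each byte of a non-ASCII character's UTF-8 encoding is sent in the URL as %XX. The following is a minimal sketch in raw PHP that reproduces the encoded string with rawurlencode(); the variable names are only illustrative.

```php
<?php
// Illustrative only: reproduce the percent-encoded form that reaches PHP.
$title = '令吉大贬伤内需‧白文春-0fu15fw';

// rawurlencode() percent-encodes every byte except letters, digits and -_.~
$encoded = rawurlencode($title);

echo $encoded, PHP_EOL;

// The encoded string and the original are different byte sequences,
// so a lookup that uses the encoded one will not match the stored title.
var_dump($encoded === $title); // bool(false)
```

Each Chinese character here takes three bytes in UTF-8, which is why it shows up as three %XX groups.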

To fix this problem, pass the segment through PHP's urldecode() function. For example:

 $query = urldecode($this->uri->segment(2));

This ensures that the non-ASCII characters in the URL are decoded properly. I hope this tutorial is useful to you.
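For raw PHP, the same idea can be sketched as follows. This is only an illustration under some assumptions: the URL is passed in as a string (in a real request it would typically come from $_SERVER['REQUEST_URI']), and decode_last_segment() is a hypothetical helper name, not a CodeIgniter or PHP built-in.

```php
<?php
// Hypothetical helper (not part of CodeIgniter or PHP): return the
// decoded last segment of a URL path.
function decode_last_segment(string $url): string
{
    // parse_url() with PHP_URL_PATH returns only the path component.
    $path = (string) parse_url($url, PHP_URL_PATH);

    // Split on "/" and keep the non-empty segments.
    $segments = array_values(array_filter(explode('/', $path), 'strlen'));
    $last = $segments ? end($segments) : '';

    // Turn the %XX sequences back into the original bytes.
    return urldecode($last);
}

$url = 'http://webserver.com/%E4%BB%A4%E5%90%89%E5%A4%A7%E8%B4%AC%E4%BC%A4%E5%86%85%E9%9C%80%E2%80%A7%E7%99%BD%E6%96%87%E6%98%A5-0fu15fw';
$query = decode_last_segment($url);

echo $query, PHP_EOL;

// Sanity check: the /u modifier makes preg_match() fail on invalid UTF-8.
var_dump(preg_match('//u', $query) === 1);
```

Note that urldecode() also turns a literal + into a space. If your URL segments may contain a plus sign, rawurldecode() leaves it untouched and may be the safer choice.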

References:

http://unicode.org/charts/PDF/U0000.pdf

http://php.net/manual/en/function.urldecode.php





By Adam Ng


