UTF-8 is an encoding method that allows you to mix languages and scripts within a single document without needing to switch between different character sets.
Everything is moving away from the old standard ISO-8859-1 to UTF-8.
UTF-8 is great because it allows you to use a wider range of characters. For example, Greek:
Τη γλώσσα μου έδωσαν ελληνική
το σπίτι φτωχικό στις αμμουδιές του Ομήρου.
Μονάχη έγνοια η γλώσσα μου στις αμμουδιές του Ομήρου.
από το Άξιον Εστί
του Οδυσσέα Ελύτη
But sometimes there can be problems in the transition from ISO-8859-1 to UTF-8.
There is a whole range of examples:
The list could go on.
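To give a feel for the kind of thing that goes wrong, here is a minimal sketch (my own illustration, not one of the cases above). It assumes the mbstring extension is loaded, and it takes the UTF-8 bytes for "é" and treats them as if they were ISO-8859-1:

```php
<?php
// Hypothetical demonstration: the same two bytes read under two charsets.
$utf8 = "\xC3\xA9";              // "é" encoded as UTF-8 (two bytes)
$bytes = bin2hex($utf8);         // hex dump of those bytes

// Pretend the bytes are ISO-8859-1 and convert them to UTF-8 again.
// Each byte is turned into a character of its own, which garbles the text.
$garbled = mb_convert_encoding($utf8, 'UTF-8', 'ISO-8859-1');

echo $bytes, "\n";               // c3a9
echo $garbled, "\n";             // Ã©
```

One accented letter comes out as two odd characters, which is usually the first sign that something in the chain still thinks it is dealing with ISO-8859-1.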
If you're planning on using UTF-8 (which you should), there is a simple way to set your website to this character set.
Using PHP:
<?php header('Content-Type: text/html; charset=UTF-8'); ?>
Now, I could go into the Content-Type itself and serving pages as application/xhtml+xml if you're using XHTML 1.1 or higher. But I won't, because IE is crap and doesn't support it.
I might talk about it later, because it isn't really related to the character set.
One thing to note: MySQL 4.0 doesn't *really* support UTF-8, so that Greek mightn't work. That is one of the reasons I am looking to move to MySQL 4.1 (and also for sub-query support).
Does that PHP print out this?:
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
That is what I meant. Forgot the damn encoding...
Header information is a bit different from meta data. The meta tag just states what the encoding of the page *should* be, but doesn't set it. (meta: data about data).
To get browsers to use UTF-8 you must send an HTTP header telling the browser to switch to that encoding. The PHP code above does that. But you need to remember that header data *must* be sent before any HTML is output.
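A rough sketch of how you could guard against sending the header too late. The function name send_utf8_header is made up for this example; header() and headers_sent() are the standard PHP functions:

```php
<?php
// Hypothetical helper: send the charset header only while it is still possible.
function send_utf8_header(): bool
{
    // headers_sent() reports whether PHP has already started sending output.
    if (headers_sent()) {
        return false; // too late: header() would only raise a warning now
    }
    header('Content-Type: text/html; charset=UTF-8');
    return true;
}

// Call this at the very top of the script, before any echo or literal HTML.
$ok = send_utf8_header();
```

If $ok comes back false, something (even a blank line before the opening PHP tag) has already been output.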
So you should use both the PHP code and the META html code.
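Put together, that could look something like the sketch below. The render_page function and the sample title and body are invented for illustration, and the file itself is assumed to be saved as UTF-8:

```php
<?php
// Hypothetical page builder: returns markup whose meta tag repeats the charset.
function render_page(string $title, string $body): string
{
    // Escape the title using the same charset the page declares.
    $safeTitle = htmlspecialchars($title, ENT_QUOTES, 'UTF-8');

    return <<<HTML
<html>
<head>
<meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />
<title>{$safeTitle}</title>
</head>
<body>
{$body}
</body>
</html>
HTML;
}

// 1. The real HTTP header, sent before any output.
header('Content-Type: text/html; charset=UTF-8');

// 2. The page itself, carrying the matching meta tag.
echo render_page('Άξιον Εστί', '<p>Τη γλώσσα μου έδωσαν ελληνική</p>');
```

The header is what the browser sees first; the meta tag is there as a matching declaration inside the document itself.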
And, just for the record, the "META html code" dale mentions above should actually be "meta HTML code", if you're using XHTML...
<pedanticism />
Does that PHP print out this?:
Just curious... Just trying to figure out the difference between declaring that element through PHP and just writing the HTML out.