utf8mb4 ✅🤜🏽 – Fun with MySQL
20th of September, 2018
This is a tech post. I recently came across a data-storage problem that I’d like to share.
As you might know, I earn money by running a web development agency. Years ago, I even used to work as a web programmer. Not so much anymore, but I still like to keep up a bit and learn about interesting things. One of those things is utf8mb4 in MySQL databases.
A few weeks ago I was debugging a weird Emoji problem with my colleague Maxim. He is a great programmer; I used to be an average one at best. Still, it took the two of us some time to find the solution.
Why would the character “✅” get saved into our utf8-enabled MySQL database without a problem, but “🤜🏽” would not?
After a while of changing code and seeing what would happen (you know, standard web developer stuff), I remembered that a good friend of mine, Andreas Reich, had sent over a random “hey, check out this weird MySQL problem” article a while ago. I had skimmed it but had no practical use for this new knowledge at the time.
Now I did. Here’s the solution to MySQL’s encoding problems.
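It turns out that these new skin-color Emojis (and plenty of others) need four bytes per character in UTF-8, and MySQL is a bit broken in handling utf8 encoding: its “utf8” character set only stores up to three bytes per character. “✅” fits into three bytes, so it gets saved. “🤜🏽” is made up of two characters, the fist and the skin-tone modifier, and each of them needs four bytes, so it does not.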
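If you want to see the difference for yourself, here is a minimal Python sketch that counts the UTF-8 bytes per character. The helper names (utf8_byte_lengths, fits_in_mysql_utf8) are made up for this post and are not part of any library; the three-byte limit is the one MySQL’s “utf8” character set applies.

```python
def utf8_byte_lengths(text):
    """Return (code point, UTF-8 byte count) for every code point in text."""
    return [(f"U+{ord(ch):04X}", len(ch.encode("utf-8"))) for ch in text]


def fits_in_mysql_utf8(text):
    """True if every code point needs at most 3 bytes, the limit of MySQL's 'utf8'."""
    return all(len(ch.encode("utf-8")) <= 3 for ch in text)


check_mark = "\u2705"             # the check mark emoji
fist = "\U0001F91C\U0001F3FD"     # right-facing fist + medium skin tone modifier

print(utf8_byte_lengths(check_mark), fits_in_mysql_utf8(check_mark))
print(utf8_byte_lengths(fist), fits_in_mysql_utf8(fist))
```

The check mark comes out as a single three-byte character, while the fist is two code points of four bytes each, which is exactly what the old “utf8” character set cannot hold.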
The TL;DR of this post would be:
If you use a MySQL database (and would like to store special Emojis), always use “utf8mb4” as the character set, not the broken “utf8”. It’s also best to convert your existing databases.
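For existing databases, the conversion could look roughly like the sketch below. It only builds the SQL statements as strings, so you can read them before running anything. The database name “blog”, the table name “posts” and the choice of the utf8mb4_unicode_ci collation are assumptions for illustration, not what we actually used; take a backup first and try it on a copy.

```python
CHARSET = "utf8mb4"
COLLATION = "utf8mb4_unicode_ci"  # assumed collation; pick the one that suits your data


def quote_identifier(name):
    """Quote a MySQL identifier with backticks, doubling any backtick inside it."""
    return "`" + name.replace("`", "``") + "`"


def alter_database_sql(database):
    """Statement that changes the default character set for new tables in a database."""
    return (
        f"ALTER DATABASE {quote_identifier(database)} "
        f"CHARACTER SET = {CHARSET} COLLATE = {COLLATION};"
    )


def alter_table_sql(table):
    """Statement that converts an existing table and its text columns to utf8mb4."""
    return (
        f"ALTER TABLE {quote_identifier(table)} "
        f"CONVERT TO CHARACTER SET {CHARSET} COLLATE {COLLATION};"
    )


# Hypothetical names, replace them with your own.
print(alter_database_sql("blog"))
print(alter_table_sql("posts"))
```

Changing the database default only affects tables created afterwards, which is why each existing table gets its own statement. The connection from your application also has to use utf8mb4, otherwise the Emojis can still get mangled on the way in.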