Does anybody know of tables of Unicode codepoint frequency in real-world text for use in building compression tables?
I could calculate them from the material I have available to me, but that seems likely to disadvantage the many people whose text isn't in English.
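For reference, the do-it-yourself calculation would look roughly like the Python sketch below. The function name and the sample strings are placeholders made up for illustration. It counts code points (not bytes or grapheme clusters) and does no normalization, and the result is only as representative as the corpus fed into it, which is exactly the concern above.

```python
from collections import Counter

def codepoint_frequencies(texts):
    """Return {code point: relative frequency} over an iterable of strings."""
    counts = Counter()
    for text in texts:
        # Iterating a Python str yields one item per Unicode code point.
        counts.update(ord(ch) for ch in text)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {cp: n / total for cp, n in counts.items()}

# Hypothetical sample corpus; a usable table would need text from many languages.
sample = ["hello", "h\u00e9llo", "\u3053\u3093\u306b\u3061\u306f"]
freqs = codepoint_frequencies(sample)

# Show the five most frequent code points in the sample.
for cp, f in sorted(freqs.items(), key=lambda kv: -kv[1])[:5]:
    print(f"U+{cp:04X} {chr(cp)!r} {f:.3f}")
```

Swapping `sample` for lines read from a larger multilingual corpus would give the same kind of table at scale.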