I would say Groovy is the best scripting language on the JVM platform.

And character encoding is the worst thing in the world. I use UTF-8 to keep my files compatible with other OSes, but stupid Windows decides to use GB2312 when I use Chinese.

So, in the screenshot, you can see I successfully integrated Groovy scripting into my program, but somehow the JVM failed to translate the file-related strings from GB2312 to UTF-8.

According to Stack Overflow, UTF-8 without BOM is the default encoding from PowerShell Core v6 onward. But sadly, I'm on v5.

About the charset.

I used to force the JVM to use UTF-8 by adding "-Dfile.encoding=UTF-8".
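To see what that flag actually changes, a small check like the one below can help. It is only a sketch: the class name EncodingInfo is made up for illustration, and "sun.jnu.encoding" is an implementation-specific property (present on OpenJDK/HotSpot builds as far as I know, possibly null elsewhere), not a documented API.

```java
import java.nio.charset.Charset;

public class EncodingInfo {
    // Collect the encoding-related settings the running JVM reports.
    static String describe() {
        StringBuilder sb = new StringBuilder();
        // Charset used by default when converting between bytes and Strings.
        sb.append("default charset = ").append(Charset.defaultCharset()).append('\n');
        // The property that -Dfile.encoding=UTF-8 sets.
        sb.append("file.encoding = ").append(System.getProperty("file.encoding")).append('\n');
        // Assumption: implementation-specific property; getProperty returns null if absent.
        sb.append("sun.jnu.encoding = ").append(System.getProperty("sun.jnu.encoding")).append('\n');
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe());
    }
}
```

Running it once with the flag and once without shows which values the flag overrides on a given machine.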

Just a moment ago, I disabled it and somehow it works. It looks like PowerShell can print UTF-8 in a GB2312 environment, but Java's File returns whatever the filesystem gives it, which in this case might be GB2312. Forcing the JVM to run in UTF-8 makes it treat that string as UTF-8 instead of translating it from GB2312.
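The effect I'm guessing at can be reproduced in isolation. This is a minimal sketch, not what the JVM does internally: the class name MojibakeDemo and the filename "中文.txt" are made up, and it assumes the JDK ships the GB2312 charset (full JDK builds normally do).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class MojibakeDemo {
    // Encode text with one charset, then decode the same bytes with another.
    static String reinterpret(String text, Charset encodeWith, Charset decodeWith) {
        byte[] raw = text.getBytes(encodeWith);
        return new String(raw, decodeWith);
    }

    public static void main(String[] args) {
        // Assumption: GB2312 is available; forName throws if it is not.
        Charset gb2312 = Charset.forName("GB2312");
        String name = "\u4e2d\u6587.txt"; // "中文.txt", a hypothetical filename

        // Decoding with the charset the bytes were written in gives the name back.
        System.out.println(reinterpret(name, gb2312, gb2312));
        // Treating the same bytes as UTF-8 garbles the Chinese characters.
        System.out.println(reinterpret(name, gb2312, StandardCharsets.UTF_8));
    }
}
```

The second line comes out garbled because the GB2312 bytes are not valid UTF-8 sequences, so the decoder substitutes replacement characters instead of translating them.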

And on Windows, even if your terminal is UTF-8, you still get GB2312 filenames from the fs.

And somehow the fs can take UTF-8 filenames and handle them correctly. Then why not use UTF-8 by default? Why?
