
I would say Groovy is the best scripting language on the JVM platform.

And character encoding is the worst thing in the world. I use UTF-8 to keep my files compatible with other OSes, but stupid Windows decides to use GB2312 when I use Chinese.

So, in the screenshot, you can see I successfully integrated Groovy scripting into my program, but somehow the JVM failed to translate the file-related strings from GB2312 to UTF-8.

According to StackOverflow, from PowerShell Core v6+, UTF-8 without BOM is the default encoding. But sadly, I'm on v5.

Of course, Kotlin might be the best option for CLI scripting. Like, instead of `entry.ctime.isBefore(now.minusDays(3))`, I can define something like `ZonedDateTime.isBeforeNDays(days: Long)`.
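
For comparison, here is a rough sketch of the same helper in plain Java. Java has no extension functions, so a static method is the closest equivalent. The class name `DateHelpers` and the explicit `now` parameter are my own assumptions for illustration, not part of any library.

```java
import java.time.ZoneOffset;
import java.time.ZonedDateTime;

class DateHelpers {
    // Hypothetical helper: true if `time` is more than `days` days before `now`.
    // Mirrors the expression time.isBefore(now.minusDays(days)).
    static boolean isBeforeNDays(ZonedDateTime time, ZonedDateTime now, long days) {
        return time.isBefore(now.minusDays(days));
    }

    public static void main(String[] args) {
        ZonedDateTime now = ZonedDateTime.now(ZoneOffset.UTC);
        ZonedDateTime fiveDaysAgo = now.minusDays(5);
        ZonedDateTime yesterday = now.minusDays(1);

        System.out.println(isBeforeNDays(fiveDaysAgo, now, 3));
        System.out.println(isBeforeNDays(yesterday, now, 3));
    }
}
```

The call site becomes `DateHelpers.isBeforeNDays(entry.ctime, now, 3)`, which is exactly the kind of noise an extension function would hide.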

But 1) Kotlin scripting is still experimental, 2) Groovy scripting only needs 1 jar to work.

The only remaining thing is to test whether it works with GraalVM's native image. (I would guess it won't. Groovy seems to generate JVM bytecode on the fly, and the native image doesn't have a JVM at all.)

About the charset.

I used to force the JVM to use UTF-8 by adding "-Dfile.encoding=UTF-8".

Just a moment ago, I disabled it and somehow it works. Looks like PowerShell can print UTF-8 in a GB2312 environment, but Java's File returns whatever the filesystem gives it, which in this case might be GB2312. Forcing the JVM to run in UTF-8 makes it treat those bytes as a UTF-8 string instead of translating them from GB2312.
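
A minimal sketch of that kind of mismatch, independent of where it actually happens in my setup: take the GB2312 bytes of a Chinese string and decode them once as GB2312 and once as UTF-8. The class name `CharsetMismatchDemo` is made up, and this assumes the JDK ships the GB2312 charset (regular JDK builds do, a trimmed runtime might not).

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {
    // Decode raw bytes using the given charset.
    static String decode(byte[] raw, Charset charset) {
        return new String(raw, charset);
    }

    public static void main(String[] args) {
        // Assumption: GB2312 is available in this JDK.
        Charset gb2312 = Charset.forName("GB2312");

        // "\u4e2d\u6587" is the word for "Chinese", written with escapes so the
        // source file encoding cannot interfere with the demo itself.
        String original = "\u4e2d\u6587";
        byte[] raw = original.getBytes(gb2312);

        String translated = decode(raw, gb2312);                  // bytes read as what they are
        String misread = decode(raw, StandardCharsets.UTF_8);     // bytes wrongly assumed to be UTF-8

        System.out.println("decoded as GB2312 matches: " + translated.equals(original));
        System.out.println("decoded as UTF-8 matches:  " + misread.equals(original));
    }
}
```

The first decode round-trips cleanly, the second does not, which looks a lot like the garbage in my screenshot.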

And on Windows, even if your terminal is UTF-8, you still get GB2312 filenames from the fs.

And, somehow, the fs can take UTF-8 filenames and handle them correctly. Then why not use UTF-8 by default? Why?
