@freemo OK, thanks, that updates my knowledge base.
Based on my experience on the JVM, when I switched from the JVM's native threads to Kotlin coroutines (which are built on threads but can share them, so there is less thread suspension overall), I got a free performance boost. I assume Go can achieve something similar. If so, I would say there is a free optimization available without substantially redesigning the algorithm.
Also, I always prefer strongly typed languages when cooperating with other developers. Python makes me panic when I don't know the type of variable x.
@skyblond JVM? That's Java, I thought we were talking about Python?
Sounds like you implemented threading poorly in Java and moved to a language where the way you built it was more appropriate to that language... I mean sure, they might be able to write code in Rust better than in Python simply because they are better at writing code for Rust than they are for Python... but that's not the language's fault.
> Also, I always prefer strongly typed languages when cooperating with other developers. Python makes me panic when I don't know the type of variable x.
Based on the context here you mean statically typed, not strongly typed.
@freemo oh.. my bad. I did a quick Google search and found that Python is quite different from the JVM. On the JVM, a thread is always a kernel thread. In Python, it might or might not be (StackOverflow told me this). So when I initially thought about the GIL, I thought it would act like a monitor lock on the JVM, which causes a kernel thread to suspend and carries a relatively large penalty.
And after another Google search, yes, I do mean statically typed and strongly typed, where you need to declare the type of a variable and cannot change it on the fly.
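To make the static vs. strong distinction concrete, here is a minimal sketch in Python (the variable name `x` and the helper `add_mixed` are just for illustration). It shows that Python is dynamically typed, since a name can be rebound to a value of another type, yet still strongly typed, since it refuses to mix `str` and `int` implicitly:

```python
x = 1        # x is bound to an int
x = "one"    # rebinding the same name to a str is allowed: dynamic typing


def add_mixed():
    # str + int is not silently converted in Python
    return "1" + 1


try:
    add_mixed()
    mixed_ok = True
except TypeError:
    # strong typing: the mismatch is an error, just one found at run time
    mixed_ok = False

print(type(x).__name__, mixed_ok)
```

A statically typed language would reject both the rebinding and the mixed addition before the program runs; Python only reports the second one, and only when that line executes.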
@skyblond Yeah, in Java, multithreading for the purpose of leveraging multiple CPUs is pretty straightforward and similar to any other language.
Python really is the odd one out, where multithreading is a bit of a cop-out as it doesn't actually run in parallel across CPUs, and thus requires either dropping into C to bypass the GIL or multi-process handling... which is quite ugly to do efficiently.
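A rough way to see this for yourself, as a sketch rather than a proper benchmark: run the same CPU-bound function serially and then on a thread pool, and compare wall-clock times. The function `busy_sum` and the workload sizes in `JOBS` are made up for illustration.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def busy_sum(n):
    # Pure-Python, CPU-bound loop with no I/O. On a GIL-enabled build only
    # one thread executes Python bytecode at a time.
    total = 0
    for i in range(n):
        total += i * i
    return total


JOBS = [200_000] * 4  # hypothetical workload sizes

start = time.perf_counter()
serial_results = [busy_sum(n) for n in JOBS]
serial_seconds = time.perf_counter() - start

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=4) as pool:
    threaded_results = list(pool.map(busy_sum, JOBS))
threaded_seconds = time.perf_counter() - start

print(f"serial:   {serial_seconds:.3f}s")
print(f"threaded: {threaded_seconds:.3f}s")
```

On a standard CPython build the two timings usually come out close to each other, since the threads take turns holding the GIL. Swapping in `concurrent.futures.ProcessPoolExecutor` is the multi-process route; that version needs the usual `if __name__ == "__main__":` guard and pays for pickling arguments and results between processes.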
> The key limitation of co-routines is that other "threads" do not have a chance to run until the current code hits a "yield". This also means you don't need to bother with locks and stuff like with true multi-tasking, hence the efficiency for an interpreted language.
The whole point of multithreading in this discussion is its ability to leverage multiple CPUs like it does in other languages... This sounds like they are sharing one thread, since only one is running at a time.
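The "nothing else runs until you hit a yield" behaviour from the quote can be sketched with `asyncio` (the names `worker`, `run`, and the `cooperative` flag are illustrative only). Both coroutines share a single thread; they only interleave when one of them awaits:

```python
import asyncio


async def worker(name, steps, cooperative, log):
    for i in range(steps):
        log.append(f"{name}{i}")
        if cooperative:
            # Explicit yield point: hands control back to the event loop
            await asyncio.sleep(0)


async def run(cooperative):
    log = []
    await asyncio.gather(
        worker("a", 2, cooperative, log),
        worker("b", 2, cooperative, log),
    )
    return log


interleaved = asyncio.run(run(True))   # workers take turns at each await
blocked = asyncio.run(run(False))      # "a" finishes before "b" gets a turn

print(interleaved)  # ['a0', 'b0', 'a1', 'b1']
print(blocked)      # ['a0', 'a1', 'b0', 'b1']
```

Since only one coroutine runs at any moment, the shared `log` list needs no lock, but by the same token none of this uses a second CPU.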
As a JVM lover, I am jealous of the ability to call C code directly from CPython.
And yes, coroutines are powerful. I'm using Kotlin coroutines and they are a huge (free) improvement over Java's native threads.
@freemo coroutines can run on multiple threads, where the "tasks" can yield and a thread from a pool can switch to another "task" without suspension or something. At least Kotlin coroutines can, and according to sdgathman, Python can do it too. That's the ultimate free boost you can get by just switching to another tech.
But if Python can do that, then my earlier hypothesis that switching to Go would give you a free boost is wrong.
Agreed, while Python's biggest shame-to-fame is its inability to leverage multiple CPUs through multithreading (or coroutines)... the flip side is that it is so trivial to interact with C code that it makes up for that in a unique way that has its own value.
If you need to do a lot in C but want the convenience of a high-level language where you can get away with it, Python is great... if you need a high-level language that is CPU-intensive, efficient, and easy to write without wanting to touch C, then Python is a horrible choice.
Personally, I use Python a lot (just finished a 2-year project in Python)... but over the years I have found it just isn't a suitable language for most things, because of these very reasons.
And to your other comment... using a thread pool to run coroutines doesn't get you anything here because multithreading in Python won't span multiple CPUs... you'd still have to use multiple processes.
It would be great if Python could be ported to other platforms. The flexibility of Python is great for exploring things like new network structures, etc. But once you have decided most of the specifications, it's better to use something like C or Java to build a more robust code base (so a typo won't screw you up, LOL).
And I have to say, I'm really jealous of Python on machine learning stuff, where the JVM is (almost) completely ignored until people need a more robust way to develop, only to find out that operator overloading in Java is a complete nightmare and they have to write something like "a = a.mul(b)".
Python is a nice little language for some things... The more I use it the more I personally dislike it... the lack of multi-CPU support is a deal breaker for sure... but I also just find the language a bit ugly and hackish in its presentation... which I could have gotten past if it wasn't for the CPU issue.
For high-level stuff I much prefer Java if I need something rigorous and formal, or Ruby if I need something fast and loose... of course that's for traditional OO high-level stuff... For other categories, stuff like Haskell is a lot of fun too.
Yeah, on my last project I tried moving to PyPy for exactly that reason. It turned out I couldn't move because of some dependency problem or something... We had tons of people trying to figure out how to multi-CPU our application and it was one dead end after another and would have effectively required a rewrite.
But it looks like Jython only supports Python 2? Python 3 is still in the future...
You can call C from any high-level language. The difference is that the barrier in Python is very low, while in Java it is very high. Moreover, in Java it is a very expensive operation to cross the barrier between native code and the JVM, unlike in Python.
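For a sense of how low that barrier is, here is a minimal sketch using the standard-library `ctypes` module to call `strlen` from the C runtime. It assumes a POSIX-like system where the C library can be located; the lookup differs on Windows.

```python
import ctypes
import ctypes.util

# find_library may return None on some systems; CDLL(None) then falls back
# to the symbols already loaded into the current process (POSIX only).
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s);
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"hello")
print(length)  # 5
```

No glue code in C and no build step are involved; the only real work is declaring the argument and return types so `ctypes` converts values correctly.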
JNI is still Java-oriented, which requires you to write some glue code to translate between the C world and the JVM world (but with JNI, you can also call Java code from C).
Now I'm using JNA, which is black magic that automatically generates the proxy to call native shared libraries without any additional C code, at the cost of performance.
Compared to Java functions, yes, but overall no. According to StackOverflow, each JNI call costs several nanoseconds more than a Java function call. The main overhead is copying data from the Java heap to native memory. But if you use something like a native buffer, there should be no such overhead.
If you load a big dataset in Java using a byte array and then want to pass it to, let's say, some BLAS implementation, then good luck copying all that data. But with native buffers, you can just pass the pointer to that buffer to the BLAS implementation and you're good to go.
https://stackoverflow.com/questions/13973035/what-is-the-quantitative-overhead-of-making-a-jni-call
@freemo @trinsec Based on my limited knowledge of Python, the multithreading part is pretty heavy: if I recall correctly, you need a new Python process to start a new thread (sounds similar to the JVM). And Go is pretty good at multithreading (I mean user-mode threads). If not limited by I/O, I would assume a Go implementation would speed up some of the process. It might also ease the load on developers, considering Go offers some great built-in multithreading structures.