
Have I mentioned yet today how much I hate Python multiprocessing, or writing CPU-limited algorithms in Python in general... my god, why could they not fix the GIL problem in Python 3, are the devs really this lazy!


@freemo A day on is like 1500 hours, so ya it could have been today! 😎

@freemo FYI this kind of thing is kinda disrespectful. Python is a remarkably successful language and a lot of incredibly smart people work very hard on it.

When you start working on a project as widely used and complicated as Python, you realize that backwards compatibility concerns very quickly make seemingly simple things incredibly difficult.

@freemo Ripping out the GIL has been the subject of multiple multi-year projects that have failed for various reasons.

Honestly, I would love to see the GIL gone, but I'm not sure how much better that would make things in a lot of cases. Python tends to be slow for a number of reasons, so the most common idiom is that when you need real performance on something CPU bound, you drop into a lower level language like Rust or C (at which point it's very easy to drop the GIL, incidentally *also* making it easier to parallelize).

@pganssle Oh and as for dropping into a low level language, if I were to do that, particularly on this project where the multiprocessing component can't be easily isolated from the rest, then there would be very little if any of what I'm doing actually in Python.

At that point I would just pick a language that didnt have these issues but still high level and avoid python all together. If i wanted to code in C I would have.

@freemo In a lot of cases, that lower level language is actually just Cython, which is a superset of Python that compiles to C and basically reads and writes like Python.

And in most cases, that lower level language code is already written *for you*, since you can usually find a library that does most of the hard work for you in C. Numpy, scipy, pandas, scikit-learn, etc, are all examples of this from the scientific/data stack, but there are equivalents in other stacks as well.

If you want to do a lot in parallel, check out Dask.

@pganssle In this case I am already using numpy and scipy, and pandas which i think also has some low level bindings.

As far as I can tell none of them do parallel processing out of the box (if they did I probably wouldnt need to do it myself). There were some projects that add it on, parallellal or whatever it was called, modin, and I think one other. Tried all three none of them really helped and had marginal improvements.

With that said I **have** been able to multiprocess this and the performance has gone from 2 hours to process a one-year simulation down to 10 minutes. Not as good as I could get with better parallelization tools, but good enough to work with for now. My main issue is how convoluted multiprocessing is rather than whether it can do it at all.
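For readers following along, here is a minimal sketch of the kind of fan-out described above, using the standard library's `multiprocessing.Pool`. It is not the code from this project: `simulate_day`, `run_serial`, and `run_parallel` are hypothetical names, and the workload is a made-up CPU-bound loop standing in for one day of a simulation. It assumes each day can be computed independently, which is the case where a process pool helps most.

```python
import math
from multiprocessing import Pool


def simulate_day(day: int) -> float:
    """Hypothetical CPU-bound stand-in for one day of a simulation."""
    total = 0.0
    for step in range(1, 5_000):
        total += math.sin(day + step) / step
    return total


def run_serial(days: range) -> list[float]:
    """Baseline: every day is processed in the calling process."""
    return [simulate_day(day) for day in days]


def run_parallel(days: range, workers: int = 4) -> list[float]:
    """Fan the days out to worker processes, each with its own interpreter and GIL."""
    with Pool(processes=workers) as pool:
        # chunksize batches tasks so less time goes to pickling and IPC overhead;
        # map() returns results in the same order as the input.
        return pool.map(simulate_day, days, chunksize=16)


if __name__ == "__main__":
    # The guard matters: with the "spawn" start method, workers re-import this module.
    days = range(365)
    print("results match:", run_parallel(days) == run_serial(days))
```

Because arguments and results are pickled between processes, this pattern tends to pay off only when each task does much more work than it costs to ship its inputs and outputs, which is one source of the awkwardness complained about in this thread.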

I tried Dask through modin and it didn't do much speedup. I may need to just bite the bullet and try pure Dask at this point, though I'm not sure it will solve my problem. Even if I could get my numpy and pandas operations parallel, the core of the issue is realtime processing of streaming data, and neither numpy nor pandas handles that well, which suggests to me that Dask either won't fix the issue or, if it does, will be even more convoluted than just using multiprocessing.

Either way, for now i am able to get things working with multiprocessing but I really dont like what the code looks like to get it there, or the compromises I had to make that limit me from seeing better performance.

@pganssle Yea I suppose I should have kept my criticism to the language and the tech and not extend it to the developers, just in the sense of disrespect. So I will take that aspect.

But I stand by what I have said. I have worked with countless languages and know hundreds, and of all the relatively mainstream and significantly developed languages out there I've never found one that feels as sloppily put together as Python does.

Don't get me wrong I've used python lightly for years. It has a place and for some things it works well enough to be a non-issue. But overall it feels like it wasnt developed to the level of quality i expect of a mainstream language.

@freemo It's incredibly popular and widely used by an *increasing* fraction of the ecosystem for a good reason.

Your criticisms are not uncommon, and they tend to be leveled at Python by people who come from other languages and then don't try and learn the "Python way" to do things.

In any case, I am not a Python maximalist. I suspect that you are not really giving Python a fair shot, or are judging based on some subjective measure like aesthetic, but you have every right to your opinion.

If you think you are going to need to use Python a lot in the future for whatever reason, it would probably be a good idea to start coming up with objective measures of what you dislike about it and see how experienced Python programmers would solve the problems. You may be surprised with how competent and versatile Python actually is.

@pganssle

> In any case, I am not a Python maximalist. I suspect that you are not really giving Python a fair shot,

If i werent giving python a fair shot I wouldnt be using it and I wouldnt be rewriting my code 5x over to make it more pythonic where I can.

I am more than happy to keep trying to use it until it becomes impossible and don't intend to jump ship anytime soon. Likewise I will keep complaining and hold my criticisms until one of those rewrites, which hopefully get me closer to idiomatic Python, produces code elegance and performance that I can praise.

For the moment the journey does not have me impressed, time will tell if that opinion changes. But for now I clearly am "giving it a chance", I just can and will continue to complain until those chances I keep giving it prove my preconceptions wrong (and maybe they will).

@freemo Sorry, I think I may not have fully understood the context here. It sounded like you were being forced to use Python for some reason and were complaining about this need.

I think you may have a better experience if instead of complaining about why it doesn't fit your preconceptions you asked, "How do people do X?" Usually if other people complain, "Oh yeah that's basically impossible", then yeah it's a sore spot 😛

@freemo Not saying there's nothing worth complaining about, TBH, but I've been involved in a lot of language design discussions in Python lately, and I've actually killed at least one "white whale", see, e.g. this thread, which resolved an issue that another `datetime` maintainer said he tried to fix *10 years earlier* and was stymied: mail.python.org/archives/list/

The fact of the matter is that a lot of the stuff that superficially looks sloppy is actually part of a complex and consistent ecosystem. It's third-order consequences of something, or it's something that was designed in another context but cannot be changed for backwards compat reasons.

And frankly, I wouldn't be surprised to find that a lot of the stuff that makes Python really good is tied up in the stuff that makes it need wrapper libraries and such.

@pganssle I didn't realize you were an actual Python dev... in that case I understand why my earlier statements were taken personally and seen as rude. I apologize for that. As I said I should have kept it to the technicals.

@freemo Well, regardless, I'd like to think I'd feel similarly even if I weren't a CPython core dev (and I have no personal stake in the GIL, I didn't design it and I didn't even become a core dev until well after the 2→3 migration).

In my time in open source, I've found that a lot of the stuff that seems obviously idiotic actually frequently has some real, hard problems at its core. I gave a keynote at PyConf Hyderabad about this recently: youtube.com/watch?v=aRqulQUgiI

My experiences have really emphasized the value of humility in evaluating technical decisions.

@pganssle I wouldn't doubt that the GIL is a non-trivial problem to fix, especially after the fact.. But it has been done in non-standard Python interpreters, and frankly when it comes to the developers of core languages I do have an expectation that they be the best and are capable of solving really hard problems. If they weren't then the language probably isn't a great language, as it won't live up to its potential.

@freemo The hard part isn't fixing it for random non-standard interpreters. The hard part is fixing it in the *core* interpreter without breaking all the stuff built on top of it.

Like, it's easy to swap out the tires on your car when the thing is on a jack and not even fully assembled. It's a lot harder to swap them out while driving down the highway at 65 miles per hour while rushing someone to the hospital 😛

@freemo Probably the most promising thing on the horizon for the GIL removal (and many other problems caused by the way the C API works) is HPy: github.com/hpyproject/hpy

That still basically involves rewriting all C extensions to use handles instead of manually managed reference counts, and there is very little appetite for another "break the universe" change when people generally have a number of good solutions for this problem already.

@pganssle Are you saying the issue is that it would break third party libs (if so i suppose you are also saying that third party libs dont tend to work with non-standard python interpreters either?)?

@freemo Yes and yes.

PyPy, for example, does have a GIL, and has an extensive compatibility layer to support the C API. My understanding is that for a long time PyPy didn't work *at all*, or worked very poorly, when used with anything that uses the C API.

Even now, it's touch-and-go, and many third party libraries aren't tested against PyPy.

@freemo I am not sure which alternate interpreters don't have a GIL, but I would be shocked if they had the level of compatibility with third party (and private/proprietary) libraries and applications that would be required for upstreaming into CPython.

Another issue is that you may be able to get perfect compatibility at the cost of performance degradation for everything else. Having no GIL but being 30% slower is not a good trade-off for most people, particularly in a mature software ecosystem that evolved with the presence of the GIL.

@pganssle Fair, but then all the more reason that my original assertion holds true: it should have been fixed when they moved to Python 3, before all the third party apps were rewritten. Trying to fix it after the fact, as I said, is another backwards compatibility nightmare by the sound of it.

@freemo I'm not sure the GIL-ectomy had come to fruition in time for the deadline, and the 2→3 migration was actually not so bad compared to how it could have been if Python 3 had *also* made extensive changes to the C API.

From what I can tell, the 2 to 3 migration nearly killed Python as a language, and might well have actually killed it if it had been any worse. Hard to Monday morning quarterback on this.

@pganssle It certainly is 90% of why I am critical of the language.. Its also my biggest gripe with haskell... breaking backwards compatibility is one thing I expect my languages not to do and if they do it im very hesitant to use them ever again, at least if its a major break.

@pganssle do you know of any sort of atomic-float or atomic-int implementations, sometimes called thread safe counters?

I am not talking about just implementing my own lock around it, that's what I'm about to do. Some languages make the "+=" and "++" operators atomic at a low level so you can do it without needing to lock, thus getting performance improvements... I'm not seeing anything like that in Python.

@freemo I don't know of any offhand, sorry. Julien Danjou has an article on it that you've probably seen: julien.danjou.info/atomic-lock

I'm mildly surprised I don't have an easy answer for this honestly. Seems like something that should at least be in toolz or boltons or something.

I admittedly don't do a ton of concurrency stuff that would need this (in Python), though, so there may be something obvious I've missed.

@pganssle handrolling a solution is trivial so thats not a problem, and ultimately that is exactly what I did.

The issue isnt that i wanted a pre-made class that had this behavior. Its that a true atomic counter can not be implemented by hand. Sure the implementation you linked behaves the same, but at a low level its using a lock. True atomic counters behave this way without engaging an actual lock and thus are much faster under certain scenarios.
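To make the distinction concrete, here is a minimal sketch of the hand-rolled, lock-based kind of counter being discussed. It is not the code from this project or from the linked article: `LockedCounter`, `hammer`, and `count_with_threads` are hypothetical names. As noted above, it is thread safe but not lock-free, since every increment acquires a `threading.Lock`.

```python
import threading


class LockedCounter:
    """Thread-safe counter that serializes updates with a lock (not lock-free)."""

    def __init__(self, start: int = 0) -> None:
        self._value = start
        self._lock = threading.Lock()

    def increment(self, amount: int = 1) -> int:
        # The read-modify-write happens entirely while the lock is held.
        with self._lock:
            self._value += amount
            return self._value

    @property
    def value(self) -> int:
        with self._lock:
            return self._value


def hammer(counter: LockedCounter, times: int) -> None:
    """Increment the shared counter repeatedly from one thread."""
    for _ in range(times):
        counter.increment()


def count_with_threads(n_threads: int = 8, per_thread: int = 10_000) -> int:
    """Run several threads against one counter and return the final value."""
    counter = LockedCounter()
    threads = [
        threading.Thread(target=hammer, args=(counter, per_thread))
        for _ in range(n_threads)
    ]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return counter.value


if __name__ == "__main__":
    print(count_with_threads())  # expected to equal n_threads * per_thread
```

The cost being complained about is the lock acquisition on every call; a hardware-level atomic increment would avoid that, but the sketch above does not attempt it.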

@pganssle Not really forced to use Python, no.. there was some pressure to use it simply because there were more Python libraries for algorithmic trading (specifically to access the broker APIs themselves) than in other languages.. but if I was really against Python that would not have stopped me, especially considering that most of the project is hand rolled anyway (algorithms of my own design).

> I think you may have a better experience if instead of complaining about why it doesn't fit your preconceptions you asked, "How do people do X?" Usually if other people complain, "Oh yeah that's basically impossible", then yeah it's a sore spot

That would depend on the intended goal of complaining... If you are talking about how people might respond to me, then yes, I agree. If I were in an IRC room asking for Python help, for example, my tone would be very different and I wouldn't complain so directly. On here however my goal isn't really to get help (though advice and help is always welcome and considered).. it's mostly me complaining because it makes me laugh, starts some decent conversations, and it feels good to vent to my friends.

Now if what you mean is that my own mentality might be different if I didn't complain and I might be more productive.. well I'd agree with you if we were talking about an all encompassing mentality. But generally that's not how it tends to play out with me... sure I complain, but in the same breath I'm googling how to do shit in Python the right way, looking at 20 different examples, and pulling those that appear most elegant. I've rewritten this code probably 5 times in various ways as I do that. So trust me, while complaining might be off putting to you and others who might be more able or willing to help (and that is an unintended side effect), it isn't really slowing me down when it comes to learning the pythonic way of doing things.

@freemo Multiprocessing or multithreading? When I first wanted to figure out managing multiple programs simultaneously in python, I found how python does threading in one script very confusing. I opted to start multiple scripts as sub-processes instead and it was very logical to me.
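The "several scripts as sub-processes" approach mentioned above can be sketched with the standard library's `subprocess` module. This is an illustration under assumptions, not the poster's setup: `run_workers` is a hypothetical helper, and the worker is a one-line program passed with `-c` so the example stays self-contained; in practice each child would more likely be a separate script file.

```python
import subprocess
import sys


def run_workers(n_workers: int = 3) -> list[str]:
    """Start several independent Python processes and collect what each prints."""
    # Hypothetical worker: squares the number it receives as its first argument.
    worker_code = "import sys; print(int(sys.argv[1]) ** 2)"
    procs = [
        subprocess.Popen(
            [sys.executable, "-c", worker_code, str(i)],
            stdout=subprocess.PIPE,
            text=True,
        )
        for i in range(n_workers)
    ]
    # All children are already running; communicate() waits for each one
    # and returns its captured stdout.
    return [proc.communicate()[0].strip() for proc in procs]


if __name__ == "__main__":
    print(run_workers())
```

Each child is a separate interpreter with its own GIL, so nothing is shared implicitly; data moves only through arguments, pipes, files, or sockets, which is part of why this style can feel more straightforward than threads in one script.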

@freemo

Isn't needing processor power normally what causes a migration to, say, C from Python? That said, I feel your pain. But I'm curious to understand why you're wrestling with Python on its weaker fronts?

@musingsole Largely because Python offers a suite of tools that make the peripheral work much easier. All the greatest algorithmic trading tools appeared to be in Python.

That said, Python can (as I recently did) achieve the performance, it's not incapable of it. It just makes the process more painful than it needs to be compared to other languages.. even another high level language that allows for easier shared variables between threads would have solved the problem.

The idea that C is noticeably more performant is a bit of a myth. I code in C often and there are many good reasons to pick C, namely low level access to hardware, and it is the more direct and natural way to do GPGPU for certain types of algorithms... but outside of those two points it really isnt something I would pick for performance reasons as you really arent going to see any noticeable performance gain from C over many other high level languages **if** you code it right.
