@pganssle Just an FYI: now that I am past some of the multiprocessing headaches and have found some easier workarounds that aren't as convoluted as my earlier attempts, I can say that, while it's still not my favorite WRT multiprocessing, it is a lot more pleasant than my early experiences.

@pganssle Though I will say one thing has been infuriating me: I can't figure out what's going on, and there is nothing on Google... Apparently if I create more than about 500 multiprocessing RLocks I get an out-of-memory error (and we are talking about a 64 GB system). I am on Docker and I am pretty sure it's actually related to the shm size, which, when upped from 64 MB to 2 GB, was able to handle the 500 RLocks (before that I couldn't even get that many)... But I'm not sure, as it doesn't appear to actually **fill** the shm by nearly that much (though it does fill it by a few hundred MB now)...

Very weird problem, but eh, for now I'm just limiting my algorithm's lookback to 500 minutes and it works.
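One way to narrow this down is to measure how much of the shared-memory mount is in use before and after creating a batch of RLocks. The following is only a minimal sketch: the helper name `shm_used_after_locks` is hypothetical, and it assumes the container's shared memory is mounted at `/dev/shm`. Whether the locks themselves are what fills shm is exactly the open question, so this only reports numbers.

```python
import multiprocessing as mp
import shutil


def shm_used_after_locks(n_locks, shm_path="/dev/shm"):
    """Create n_locks multiprocessing RLocks and report mount usage.

    Returns (locks, used_before, used_after), with usage in bytes.
    shm_path is an assumption; point it at wherever shm is mounted.
    """
    used_before = shutil.disk_usage(shm_path).used
    # Keep references so the locks stay alive while we measure.
    locks = [mp.RLock() for _ in range(n_locks)]
    used_after = shutil.disk_usage(shm_path).used
    return locks, used_before, used_after
```

Calling `shm_used_after_locks(100)` and then `shm_used_after_locks(500)` inside the container, and comparing the two deltas, would show whether usage grows with the number of locks or comes from something else (for example, data shared between the processes).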

Other than that, I figured out most of the multiprocessing problems and have gotten what used to take 2 hours to run down to a minute.

@freemo Weird. If you are doing so much in parallel and it's a big part of your operation (and you think it's worth it to explore this further) it might make sense to try out using Cython or Numba with a function that releases the GIL, then use multithreading instead of multiprocessing.

Running hundreds of processes and serializing / deserializing your data probably creates a ton of overhead.
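The threads-instead-of-processes idea can be sketched as below. This is not a Cython or Numba example: as a stand-in, it uses `hashlib`, whose hash functions release the GIL while processing large buffers, so the worker threads can run in parallel without pickling data between processes. The function names `digest` and `parallel_digests` are hypothetical; a real version would swap `digest` for a Cython function with a `nogil` section or a Numba function compiled with `nogil=True`.

```python
import hashlib
from concurrent.futures import ThreadPoolExecutor


def digest(chunk: bytes) -> str:
    # Stand-in for a GIL-releasing kernel: hashlib drops the GIL
    # while hashing large buffers.
    return hashlib.sha256(chunk).hexdigest()


def parallel_digests(chunks, max_workers=4):
    # Threads share memory, so chunks are passed by reference
    # rather than serialized and copied into worker processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input chunks.
        return list(pool.map(digest, chunks))
```

Because `pool.map` returns results in input order, the output lines up with `chunks` just as a sequential loop would.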


@freemo Though TBH I'd kinda love to have a problem that admits an embarrassingly parallel solution as an excuse to write something significant in Rust to test out that "fearless concurrency".

Closest I've come is this: gitlab.com/pganssle/metadata-b

There's a big queue that theoretically could be read in parallel, but I think the file system access ends up blocking, because adding multithreading into the mix doesn't seem to have meaningfully sped anything up.

@pganssle Massive parallelism is a lot of fun in any language; even in Python, with all its annoyances, it's still a fun task. Seeing my CPUs all light up to 100% (I have 32 CPU cores) and my run time go from 2 hours to a minute or two is very pleasurable.
