@jonny In auto-pi-lot, do you ever run into a situation where a zmq request is never replied to and things get stuck? I took a look in `node.py` and couldn't tell whether you do some kind of timeout and message handling.
I'm asking because I'm thinking of rewriting part of our infrastructure to use REQ/REP zmq calls (these are currently local function calls; this would be the first step in moving away from Matlab).
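For context on the "stuck" case: a REQ socket that never gets its reply cannot send again, so one common workaround is to poll with a timeout and, on expiry, close the socket and open a fresh one before retrying. Below is a minimal sketch of that idea, not what autopilot's `node.py` does. The names `zmq_req_factory` and `request_with_retry` are made up for illustration, and the pyzmq import is kept inside the factory so the retry logic can be tried with any socket-like object.

```python
from typing import Callable, Optional


def zmq_req_factory(endpoint: str) -> Callable[[], object]:
    """Return a function that opens a fresh pyzmq REQ socket to `endpoint`."""
    import zmq  # imported lazily: only needed when real sockets are used

    ctx = zmq.Context.instance()

    def make():
        sock = ctx.socket(zmq.REQ)
        sock.setsockopt(zmq.LINGER, 0)  # don't block on close with unsent data
        sock.connect(endpoint)
        return sock

    return make


def request_with_retry(
    make_socket: Callable[[], object],
    payload: bytes,
    timeout_ms: int = 1000,
    retries: int = 3,
) -> Optional[bytes]:
    """Send `payload`, wait up to `timeout_ms` for a reply, retry on timeout.

    A new socket is opened for every attempt, because a REQ socket that
    timed out is still waiting for its reply and cannot send again.
    Returns the reply bytes, or None if every attempt timed out.
    """
    for _ in range(retries):
        sock = make_socket()
        try:
            sock.send(payload)
            # poll() returns 0 if nothing arrived within timeout_ms
            if sock.poll(timeout_ms):
                return sock.recv()
        finally:
            sock.close()
    return None
```

With a real server this would be called as `request_with_retry(zmq_req_factory("tcp://127.0.0.1:5555"), b"ping")`, where the endpoint is a placeholder. The caller decides what to do when it gets `None` back instead of blocking forever.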
@jonny Interesting. Currently we are using pub/sub, so I don't have to worry about getting out of sync in a req/rep loop. But for some reason Matlab/jeromq does miss messages sometimes, and our current hack is to just send messages meant for Matlab a few times. I'm not really happy with that and would like something more robust. It's low priority, as things are working...
@jerlich
Not sure how easy it is in MATLAB to do threading/processes, but ya, having something that can just wait on messages is a p good idea
@jerlich
One of the little unreasonably effective tricks in autopilot that makes it work better than it has any right to is that the (tornado) messaging handler is actually at the root of the running process, and most of the program logic happens in threads spawned from message callbacks. That pattern seems to keep messaging highly available. Dealers/routers also seem to have a more reliable implementation than pub/sub, which I tried early on and found worked well for really broad fanouts but not so well for normal messaging.
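To make the "handler at the root, logic in threads" idea concrete, here is a rough stdlib-only sketch. It is not autopilot's code: a `queue.Queue` stands in for the tornado/zmq stream, and `MessageLoop`, `on`, and `inbox` are invented names. The point is only that the loop hands each message to a new thread and goes straight back to receiving, so a slow handler never blocks the next message.

```python
import queue
import threading


class MessageLoop:
    """Receive loop that lives at the root of the process."""

    def __init__(self):
        self.inbox = queue.Queue()  # stand-in for the zmq socket/stream
        self.handlers = {}          # message key -> callback
        self.workers = []           # threads spawned so far

    def on(self, key, handler):
        """Register `handler` to be called with the value of `key` messages."""
        self.handlers[key] = handler

    def run(self):
        """Dispatch messages until a None sentinel arrives."""
        while True:
            msg = self.inbox.get()
            if msg is None:
                break
            key, value = msg
            handler = self.handlers.get(key)
            if handler is None:
                continue  # unknown message type: drop it and keep listening
            # Run the handler in its own thread so the loop stays responsive.
            worker = threading.Thread(target=handler, args=(value,), daemon=True)
            worker.start()
            self.workers.append(worker)
```

In a real setup the `inbox.get()` call would be replaced by whatever delivers incoming messages (for example an `on_recv` callback on a zmq stream), but the dispatch-to-thread step would look much the same.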