Well, I think I finally got to the bottom of why the matrix server has been slow in the past. Now that I have good stats on resources I can better address it... Should have matrix running much smoother soon.

@trinsec

@freemo @trinsec I do hear the matrix developers say there will be a go implementation to replace the slow and resource demanding python version. Is that still in alpha stage?

(As a jealous JVM lover, I don't trust Python 😂 )

@skyblond

That I have no idea.. But if they think go automatically means faster then it will probably be an abysmal failure.

@trinsec

@freemo @trinsec Based on my limited knowledge of python, the multithreading part is pretty heavy, if I recall correctly, you need a new python process to start a new thread (sounds familiar with JVM :ablobthinking: ). And go is pretty good at multithreading (I mean user-mode threads). If not limited by the IO, I would assume a go implementation will speed up some of the process. Maybe also ease the load on developers, considering go offers some great built-in multithreading structures.

@skyblond

Multithreading is ugly, but not exactly heavy... you can do threading without multiprocess but you have to disengage the GIL, which involves a little bit of magic... it's an ugly pattern to be sure, but not resource intensive.
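To illustrate the point about the GIL: in CPython, blocking calls implemented in C (such as `time.sleep` or most socket I/O) release the GIL while they wait, so plain threads can overlap without extra processes. A minimal sketch, assuming stock CPython; `blocking_call` is a hypothetical stand-in for an I/O call or a C extension that releases the GIL:

```python
import threading
import time

def blocking_call(seconds):
    # time.sleep releases the GIL while it waits, as most blocking I/O does in CPython
    time.sleep(seconds)

start = time.perf_counter()
threads = [threading.Thread(target=blocking_call, args=(0.2,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()
elapsed = time.perf_counter() - start

# The four waits overlap, so the total is close to one wait rather than four
print(f"4 x 0.2s waits took {elapsed:.2f}s")
```

Pure-Python computation would not overlap this way, since only one thread can execute bytecode at a time.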

@trinsec

@freemo ok, thanks, that's some updates to my knowledge base :ablobthinking:

Based on my experience on the JVM, when I switched from the JVM's native threads to Kotlin coroutines (which are based on threads but are able to share threads, so less thread suspension overall), I got a free performance boost. I assume go can achieve a similar thing. If so, I would say there is a free optimization without largely redesigning the algorithm.

Also, I always prefer strong typed languages when co-op with other developers. Python makes me panic when I don't know what the type of variable x :ablobgrimace:

@skyblond JVM? That's Java, thought we were talking about python?

Sounds like you implemented threading poorly in java and moved to a language where the way you built it was more appropriate to that language... I mean sure, they might be able to write code in Rust better than in Python simply because they are better at writing code for Rust than they are for Python... but that's not the language's fault.

> Also, I always prefer strong typed languages when co-op with other developers. Python makes me panic when I don't know what the type of variable x :ablobgrimace:

Based on the context here you mean statically typed, not strongly typed.

@freemo oh.. my bad. I did a quick google search and found Python is quite different from the JVM. On the JVM, a thread is always a kernel thread. On Python, it might or might not be (StackOverflow told me this). So when I initially thought about the GIL, I thought it would act like a monitor lock on the JVM, which causes a kernel thread to suspend and has a relatively huge penalty.

And after another google search, yes, I do mean statically typed and strongly typed, where you need to declare the type of a variable and cannot change it on the fly.

@skyblond Yea in java multithreading with the purpose of leveraging multiple CPUs is pretty straightforward and similar to any other language.

Python really is the odd one out where multithreading is a bit of a cop-out as it doesn't actually run in parallel and across CPUs and thus requires either C-python to bypass the GIL or multi-process handling.. which is quite ugly to do efficiently.
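The multi-process route can be sketched with the standard library's `concurrent.futures.ProcessPoolExecutor`, which runs each task in its own interpreter so the GIL of one process does not block the others. This is only an illustration, not anything from synapse; `count_primes` and the chosen limits are hypothetical stand-ins for real CPU-bound work:

```python
from concurrent.futures import ProcessPoolExecutor

def count_primes(limit):
    """CPU-bound stand-in: count primes below limit by trial division."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count

if __name__ == "__main__":
    limits = [20_000, 20_000, 20_000, 20_000]
    # Each call runs in a separate worker process, so the work can spread across CPUs
    with ProcessPoolExecutor() as pool:
        print(list(pool.map(count_primes, limits)))
```

Arguments and results are pickled between processes, which is part of why doing this efficiently gets ugly for large data.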

C-Python can link with and interact with multi-threaded C code no problem. See https://github.com/sdgathman/pymilter/blob/master/miltermodule.c for an example of the C code end (key object is PyThreadState).

C-Python code itself does not multi-thread, except through multi-processing. (Jython is a Java python implementation that does multi-thread.)

HOWEVER, what is ultimately more efficient than multi-threading for pure python or most other interpreted code is co-routines. Remember those from Knuth's "Fundamental Algorithms"? Python has built-in support for very clean and efficient co-routines - key syntax is the "yield" statement. They work very well ad-hoc, and there are frameworks like the "twisted" library that provide consistent interaction between many parts.

The key limitation of co-routines is that other "threads" do not have a chance to run until the current code hits a "yield". This also means you don't need to bother with locks and stuff like with true multi-tasking, hence the efficiency for an interpreted language.

Another name for this is "cooperative multi-threading". It is harder to do in a language like C with linear stacks. Python stack frames are linked lists - and the resulting conceptual tangle of stack frames when hundreds of complex co-routine tasks cooperate gave rise to the name of the "twisted" library.
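A minimal sketch of cooperative multitasking with `yield`, assuming nothing beyond the standard library; `worker` and `run_round_robin` are hypothetical names, and this toy scheduler is far simpler than what twisted provides:

```python
from collections import deque

def worker(name, steps, log):
    """A cooperative task: do one unit of work, then yield control."""
    for i in range(steps):
        log.append(f"{name}:{i}")
        yield  # nothing else runs until this task reaches a yield

def run_round_robin(tasks):
    """Step each generator-based task in turn until all have finished."""
    queue = deque(tasks)
    while queue:
        task = queue.popleft()
        try:
            next(task)
        except StopIteration:
            continue  # task finished, drop it from the queue
        queue.append(task)

log = []
run_round_robin([worker("a", 2, log), worker("b", 3, log)])
print(log)  # ['a:0', 'b:0', 'a:1', 'b:1', 'b:2']
```

No locks are needed around `log` because a task can only be interrupted at its own `yield`.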

@sdgathman

> The key limitation of co-routines is that other "threads" do not have a chance to run until the current code hits a "yield". This also means you don't need to bother with locks and stuff like with true multi-tasking, hence the efficiency for an interpreted language.

The whole point of multithreading in this discussion is its ability to leverage multiple CPUs like it does in other languages... This sounds like they are sharing one thread since only one is running at a time.

@skyblond @trinsec

Yes, for CPU bound tasks, it does not help. But synapse is largely IO bound between network API and database calls. The apache/nginx HTTPS processing and the PostgreSQL database all run in different hardware threads. This normally occupies 2 cpus. With more cpus, synapse lets you split the co-routines into "workers" in 2 or more processes.
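Why this works for IO-bound code can be sketched with `asyncio`: while one co-routine waits, the others make progress on the same thread. This is an illustration only, not synapse code (synapse is built on twisted); `fake_request` is a hypothetical stand-in for a network or database call:

```python
import asyncio
import time

async def fake_request(name, delay):
    # asyncio.sleep stands in for waiting on a socket or a database reply
    await asyncio.sleep(delay)
    return name

async def main():
    start = time.perf_counter()
    # All five waits run concurrently on a single thread
    results = await asyncio.gather(*(fake_request(f"req{i}", 0.2) for i in range(5)))
    return results, time.perf_counter() - start

results, elapsed = asyncio.run(main())
print(results, f"{elapsed:.2f}s")
```

Five 0.2 second waits finish in roughly 0.2 seconds total, with only one CPU involved.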

The Go implementation (Dendrite) involves rethinking a lot of the design - it is not "GO IS FASTER!!!" Among other things, it aims to support (optional) fully decentralized (single user) operation that does not rely on DNS (DNS has been effectively centralized through ICANN since 1996 or so). The monolithic binaries produced by GoLang are easier for end-users to deploy in that situation.

OT: TLS has also been centralized via the shadowy "TLS cabal" that decides which CAs are "trusted" in popular browsers. The way to fight this is browsers that limit trust in a CA. E.g. "trust this CA for domains ending in .roger.org only". Currently, trust is all or nothing.

The safest way for power users to play with this is a browser extension that can "veto" trust in a CA after examining the Cert. That way, worst case fail in case of bugs is rejecting a CA that should have been trusted, or trusting one the browser would have trusted by default.

@sdgathman

As a JVM lover, I am jealous of the ability to call C code directly from CPython.

And yes, coroutine is powerful. I'm using kotlin coroutines and it's a huge (free) improvement of Java's native thread.

@freemo co-routines can run on multiple threads, where the "tasks" can yield and the thread from a pool can switch to another "task" without suspension or something. At least kotlin coroutine can, and according to sdgathman, Python can do it too. That's the ultimate free boost you can get by just switching to another tech.

But if Python can do that, then my earlier hypothesis that switching to go would give you a free boost is wrong.

@skyblond

Agreed, while python's biggest shame-to-fame is its inability to leverage multiple cpus through multithreading (or coroutines).. the flip side is that it is so trivial to interact with C-code that it makes up for that in a unique way that has its own value.
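How low the barrier to C is can be sketched with the standard `ctypes` module, which needs no glue code at all for simple calls. A minimal example, assuming a POSIX system where the C library can be located (on Linux, `CDLL(None)` also exposes libc symbols if `find_library` returns `None`):

```python
import ctypes
import ctypes.util

# Assumption: POSIX platform; this lookup does not work the same way on Windows
libc = ctypes.CDLL(ctypes.util.find_library("c"))

# Declare the C signature: size_t strlen(const char *s)
libc.strlen.argtypes = [ctypes.c_char_p]
libc.strlen.restype = ctypes.c_size_t

length = libc.strlen(b"matrix")
print(length)  # 6
```

For anything performance-critical, a hand-written extension module like the pymilter one linked in this thread is the more usual route.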

If you need a language where you need to do a lot in C but want the convenience of a high-level language where you can get away with it, python is great... if you need a high level language that is cpu-intensive and efficient and easy to write without wanting to touch C, then python is a horrible choice.

Me personally, I use python a lot (just finished a 2-year project in python)... but over the years have found it just isn't a suitable language for most things, because of these very reasons.

@sdgathman

@skyblond

And to your other comment.. using a thread pool to run coroutines doesn't get you anything here because multi-threading in python won't span multiple CPUs.. you'd have to use multiple processes still.
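One way to combine the two, sketched here under the assumption of stock CPython and the standard library only: keep the co-routines on one thread and hand CPU-bound pieces to a process pool with `loop.run_in_executor`. `checksum` is a hypothetical stand-in for real CPU-heavy work:

```python
import asyncio
from concurrent.futures import ProcessPoolExecutor

def checksum(n):
    """CPU-bound stand-in: sum of squares below n."""
    return sum(i * i for i in range(n))

async def main(sizes):
    loop = asyncio.get_running_loop()
    # A ThreadPoolExecutor here would still be serialized by the GIL;
    # separate processes are what let the work use several CPUs
    with ProcessPoolExecutor() as pool:
        futures = [loop.run_in_executor(pool, checksum, n) for n in sizes]
        return await asyncio.gather(*futures)

if __name__ == "__main__":
    print(asyncio.run(main([10, 100, 1000])))
```

The event loop stays responsive while the worker processes compute, at the cost of pickling arguments and results across the process boundary.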

@sdgathman

@freemo

It will be great if Python can be ported to other platforms. The flexibility of Python is great for exploring things like new network structures, etc. But once you have decided most of the specifications, it's better to use something like C or Java to build a more strong code base (so a typo won't screw you up, LOL)

And I have to say, I'm really jealous of Python on machine learning stuff, where the JVM is (almost) completely ignored until people need a more robust way to develop, only to find out that Java is a complete nightmare for operator overloading and they have to write something like "a = a.mul(b)"

@sdgathman

@skyblond

Python is a nice little language for some things... The more I use it the more I personally dislike it.. the lack of multi-cpu is a deal breaker for sure.. but I also just find the language a bit ugly and hackish in its presentation.. which I could have gotten past if it wasn't for the cpu issue.

For high level stuff I much prefer java if I need something rigorous and formal, or ruby if I need something fast and loose... of course that's for traditional OO high level stuff.. For other categories stuff like Haskell is a lot of fun too.

@sdgathman

Jython leverages multiple cpus, and I believe PyPy (compiled python - the compiler being itself written in compiled python) does as well. C-Python is not the only implementation.

I find Jython in particular very handy for interacting with large mostly Java applications such as Adempiere (or Idempiere).

@sdgathman

Yea, on my last project I tried moving to pypy for exactly that reason. It turned out I couldn't move because of some dependency problem or something... We had tons of people trying to figure out how to multi-cpu our application and it was one dead end after another and would have effectively required a rewrite.

@skyblond

@sdgathman

But it looks like Jython only supports Python 2? Python 3 is still in the future...

@freemo

C-Python is handy for running big Fortran libraries for linear algebra (matrix ops, etc) optimized for utilizing vector processing. If I were to build an external route optimizer for Cjdns, I would use C-Python to do the json API and populate sparse matrices for Modified Nodal Analysis. I really want to see how "Electric Routing" works in the real world, and Cjdns is a great protocol for it since it uses source routing.
You can call C code from Java with JNI (Java Native Interface) - register C functions that get run for class methods. It's no more difficult than the C-API for C-Python. https://www.baeldung.com/jni

@sdgathman

You can call C from any high level language. The difference is the barrier in python is very low, in java it is very high. Moreover, in java it is a very expensive operation to cross the barrier from native to jvm, unlike in python.

@skyblond

@freemo @sdgathman

JNI is still Java oriented, which requires you to write some glue code to translate the difference between C and JVM world. (but with JNI, you can call Java code in C)

Now I'm using JNA, which is a black magic that automatically generates the proxy to call native shared libraries without any additional C code, with the cost of performance.

@skyblond - is the overhead for native methods in the JVM so very high? It was never a bottleneck for me, even for relatively low level stuff like accessing Posix message queues.

@sdgathman

Compared to Java functions, yes, but overall no. According to StackOverflow, each JNI call costs several nanoseconds more than a Java function call. The main overhead is copying data from the Java heap to native memory. But if you use something like a native buffer, there should be no such overhead.

If you load a big dataset in Java using a byte array, then you want to pass it to, let's say some BLAS implementation, then good luck copying all those data. But with native buffers, you can just pass the pointer of that buffer to the BLAS implementation and you're good to go.

stackoverflow.com/questions/13

@skyblond @freemo @trinsec that's dendrite. i think it's almost feature complete, my trials with a single user instance worked well.

there is no automatic tool to switch to it from synapse though, i don't know if it's planned either.
