Follow

holy hell python doesnt even have a built in red-black tree implementation, just a hashmap style Dict... There is a third party implementation but the best one is something like 8 months since the last release and the repo only has 6 commits...

How do people even survive in ?

Β· Β· 5 Β· 0 Β· 2

@freemo semi-serious deflections:

do you need one for reasons other than performance?
how much active maintenance does a basic data structure really need?

@Ademan

do you need one for reasons other than performance?

Yes and the performance difference would be very significant (so much so it is worth the extra development time)

But it also has the tendency to make code much simpler in many cases as well, which doesnt apply to me in this case, but also a valid reason for the need in any language.

how much active maintenance does a basic data structure really need?

Probably not much but when I see something with very little time invested into it I worry it may be slow to fix bugs or not well flushed out. In this case I could be wrong (I am taking the risk and using it so we will see).

@freemo
Because it's easier for people to learn and use in comparison to and ?

@freemo I don’t think I’ve ever needed such a thing, I suspect most people haven’t, which is why there’s nothing built-in.

You may find what you want by looking instead at how people solve similar problems in Python. Might be something out there that has a red-black tree or similar at its core but isn’t advertised as such.

@pganssle Depends what you mean by “need”. Obviously you could always roll your own Treemap so in the strictest sense you never truly “need” it.

But presuming you are relatively active coder I can guarantee you that you have been in positions where having it would have either (or both) made your code simpler and/or more efficient.

Chances are you accomplished the same end result by either using a hashmap (ordinary dictionary) or a hashset, and just repeatedly sorting it by calling a sort function every time you needed a sorted version of the data (which you may have even done after every new insert in some extreme cases). Which means inefficient code (often significantly so) and more lines of code and more logic than you might need. Not to mention the potential for all sorts of issues on top of that if it is a multithreaded environment.

I have used and needed Tree maps more times than I can count though could have hacked around it easy enough if i had really wanted to.

@pganssle And yes, Treemaps and Navigable maps in most languages are backed by balanced red-black trees underneath or something very similar. So if you’ve ever seen a sorted map/dict in a language it was likely a red-black tree.

@freemo I am a very active coder, what I mean to say is that I have basically never been in a situation where my code was slow enough that it was a problem (and I do a lot of low-level library programming so that’s often “slow at all”) and benchmarks showed that dictionary hash map lookup is the bottleneck and switching implementations would fix it.

Python has a huge amount of overhead for most operations, so often the way to get high performance when you need it is to basically call an API that is a wrapper around an optimized implementation in a low-level language (e.g. numpy).

Admittedly, it may also be that the kind of programming I do is not likely to hit this problem, but I imagine the fact that there’s not an obvious choice here means that it’s either not a very common problem or this is an XY problem and you are overlooking more idiomatic solutions to your overall problem.

@freemo I don’t know the specifics of your situation so I may be way off base, but I often find that when people start using a new language or ecosystem they reach for the familiar idioms from their more comfortable language and don’t realize that in this new language you are meant to structure things differently. I have been guilty of this myself many, many times.

@pganssle Well having your code not be slow enough to force you to use it is, of course, not that uncommon. though in those cases your code was probably still significantly slower than it needed to be… But as i said it isnt just about code performance it is also about code complexity. Even if you never needed the performance improvement it would have forced your code in many situations to be more complex and bug-prone then it needed to be.

In terms of code elegance pretty much everyone has hit it if you ever needed a map where the keys had a sorting order (sorting it yourself every time you modify the map is added code and prone to issues).

In terms of performance you’d only need it if you needed a sorted-key map and that map was particularly large and needed to be re-sorted often.

@freemo I mean, there’s sortedcontainers, which is very popular and seems relatively well-maintained. It’s implemented in pure python and I don’t think it’s a red-black tree, so if it’s a performance bottleneck then you may need something else, but if you just want to simplify your code, then that should work.

@pganssle in my case its a performance concern. I am basically implementing a pass-through array cache that remembers balues read from the underlying array-like object and stores it in a local array it needs to know what values it has seen though which makes it quite complicated as it would need to save ranges of seen values. Whole point of such a cahce is performance.

@pganssle so the keys i a sorted containers implementation are required to be comparable and dont need to be hashable?

@freemo python is mostly for people who want to learn programming, and it’s mostly used for mathematics and machine learning / ai libraries
.

@louiscouture while it is often a first choice in programming language for the noobs I wouldnt say its limited to that at all. It is often used for some pretty advanced math/AL/ML which is why it surprises me it lacks such fundamental features as a tree map considering they are pretty necessary to make a lot of those things if you want them to run efficiently.

@freemo I’m still a student in software engineering and hasn’t really reached AI yet but doesn’t it make no sense to do some demanding calculations on a language that is really slow.

I get the appeal of interpreted languages as it can be run on multiple machines, but then why not use faster alternatives, like Java ?

@louiscouture most mature languages will be close enough to as fast as any other not to matter in and of itself. It isnt that python isnt fast, its more that python doesnt make it easy or pleasant to make your code fast.. ITs multiprocessing element is shit and requires you to jump through hoops and many of its built-in libraries dont leverage multiple CPUs and obscure things away in such a way that it isnt always easy to build that in.. But all that said it can (and many examples of where it has been) made to do things as fast as any other language.

@louiscouture @freemo
ERPnext, Odoo, Tryton (which itself is the base for other systems like GNU Health, Occhiolino), ansible, Slat, OpenStack, mailman, Plone, a lot of Web-Application (based on e.g. Django, Pyramid or Tornado (Facebooc)). BorgBackup, DockerCompose, GNUmed. Used a lot in InfoSec, e.g. in exploit development, detecting vulnerabilites, etc. e.g. GRR Rapid Response.
1/2

@louiscouture @freemo
I missed some introduction, making the message sound a bit harsh. Sorry.

Anyhow, as you can see Python is used for a huge variety of platforms and applications.

@freemo
When I came from other programming languages to Python, I was missing trees, double-linked listes, sort-algos etc, too. But I quickly learned live being easier and me being more productive if I don't need to care about these.

Dicts are acceptable for most (of my) needs, being O(1) in average.
wiki.python.org/moin/TimeCompl

If I would dare for trees, I would checkout one of the many packages available pypi.org/search/?q=AVL+tree
(AVL being a real subset of red-black) or try numpy

@kirschwipfel I fail to see how lacking fundemental tools that not just improve performance but reduce the complexity of your own code somehow makes life easier.

@freemo
Can't see how red-black trees improve performance here: dicts are O(1), for red-black trees it is O(log n).

Can't see how a tree reduces complexity of my code. In fact (beside performance) I don't want to care about the datastructure used. Just put elements in and get them out. Whether tree or dict: I would the interface expect to be the same.

Anyhow: Algorithms never belong to the language, but to the library.

@kirschwipfel hashtrees are only O(1) when looking up a specific value, they are far less efficient when you wish to traverse the keys of a hashmap in sorted order or to ask "which key comes after key X in the sort order"

Hashmaps are, and should be, the prefered choice when they suit the task. I am not implying we should have tree maps and not hash maps, that would be just as harmful.. but a good programmer (one who wants to keep their code efficient and minimal in complexity) needs both and both are used in different situations, and each is significantly more performant in their respective applications.

@freemo I agree their might be use-cases where a tree actually makes sense.

Please check out some of the many (AVL) tree packages at PyPi. I'm quite confident there are well-maintained ones.

@kirschwipfel Thank you! when i come around again and am ready to address this issue (I shelved it for a much simpler approach that is 90% as good) I certainly will. Good to know they are out there.

Sign in to participate in the conversation
Qoto Mastodon

QOTO: Question Others to Teach Ourselves
An inclusive, Academic Freedom, instance
All cultures welcome.
Hate speech and harassment strictly forbidden.