to all of the statistical nerds out there counting memory errors in #cpp to justify the existence of #rust: a logic error in a system that uses explicit memory management often culminates in a memory error. That doesn't make it a memory management error, it's still a logic error, and patching it up with more memory management is over-complicated at best and silently incorrect at worst.
@namark I don't think "patching with more memory management" is accurate. Rust focuses on *correctness* of ownership and mutability, especially at compile time.
By analogy to types: a logic error in a dynamically typed language often culminates in a type confusion ("undefined is not a function"). Does it mean that statically typed languages solving it with "more types" are over-complicated and silently incorrect at worst?
@namark In a statically typed language you annotate functions with types, so that when you make a logic error that causes passing nonsense data, it will be caught at compile time.
Rust is just a logical extension of that idea. You annotate functions also with memory management rules, so that when you make a logic error that causes incorrect sharing or freeing of data, it will be caught at compile time.
@kornel yes I would not count type errors such as "undefined is not a function" in a dynamically typed language to statistically evaluate the usefulness of static typing, because I know there would be a lot of false positives that are caused by logic errors and not typing errors. Type conversion is a fix for a type error but it would not fix the underlying logic error. You have to fully understand the code, follow some guidelines or get lucky to identify the logic error from the type error, even if it is reported to you at compile time.
Correctness of memory management is not correctness of program.
@namark I'm surprised you're not seeing the analogy. Do you really not think that statically typed languages reduce defects in programs?
If you had traced 2/3rds of your bugs to type confusion (two + two == "22"), wouldn't you like static typing that makes these problems literally impossible to happen?
Note that rewrite from dynamic types to static types is not merely a hack that casts values, it's a logic-bug-discovering exercise, e.g. urllib3 found *logic* bugs https://sethmlarson.dev/blog/2021-10-18/tests-arent-enough-case-study-after-adding-types-to-urllib3
@namark but besides, your reduction of Rust to just memory management is still inaccurate.
Rust also has sum types. Option replaces null, and makes it a compile-time error to not check for absence of a value. Result won't let you use values without checking for error condition.
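A minimal sketch of the Option/Result point, assuming a hypothetical config lookup (`find_port`, `parse_port` and `port_or_default` are illustrative names, not from any real API):

```rust
use std::collections::HashMap;
use std::num::ParseIntError;

// Hypothetical lookup: absence is part of the return type, there is no null.
fn find_port<'a>(config: &'a HashMap<String, String>, key: &str) -> Option<&'a str> {
    config.get(key).map(|s| s.as_str())
}

// Hypothetical parser: failure is part of the return type.
fn parse_port(raw: &str) -> Result<u16, ParseIntError> {
    raw.trim().parse::<u16>()
}

// The caller cannot reach the u16 without handling both "missing" and "invalid".
fn port_or_default(config: &HashMap<String, String>, key: &str, default: u16) -> u16 {
    match find_port(config, key) {
        None => default,
        Some(raw) => match parse_port(raw) {
            Ok(port) => port,
            Err(_) => default,
        },
    }
}
```

Whether falling back to a default is the right reaction is still a decision for the programmer; the types only guarantee that the decision is made somewhere.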
Automatic destruction means there are no risky "goto cleanup" with confused state.
Mutex wrapping its content ensures you literally can't make a mistake of accessing the data without locking the mutex first.
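A small sketch of the Mutex point, assuming a hypothetical shared counter (`count_in_parallel` is an illustrative name). The value lives inside the `Mutex`, so the only path to it goes through `lock()`:

```rust
use std::sync::{Arc, Mutex};
use std::thread;

// The counter is stored inside the Mutex; there is no way to name it
// without going through lock() first.
fn count_in_parallel(threads: usize, increments: usize) -> usize {
    let counter = Arc::new(Mutex::new(0usize));
    let mut handles = Vec::new();

    for _ in 0..threads {
        let counter = Arc::clone(&counter);
        handles.push(thread::spawn(move || {
            for _ in 0..increments {
                // lock() returns a guard; the data is reachable only through it,
                // and the lock is released when the guard is dropped.
                let mut guard = counter.lock().unwrap();
                *guard += 1;
            }
        }));
    }

    for handle in handles {
        handle.join().unwrap();
    }

    let total = *counter.lock().unwrap();
    total
}
```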
@namark I also disagree with the generalization that all memory errors are symptoms of logic errors. They sometimes are, but there's also a fair share of just stupid direct mistakes, like not handling integer overflow in buffer sizes (e.g. the Stagefright exploit) or using different buffer length when allocating vs copying. These can be eliminated by construction (you can't have a logic bug in code that doesn't need to be written).
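A rough sketch of what "eliminated by construction" could look like for those two mistakes. The helper names `buffer_bytes` and `copy_samples` are hypothetical:

```rust
// Hypothetical helper: compute a buffer size in bytes without silent wraparound.
fn buffer_bytes(count: usize, element_size: usize) -> Option<usize> {
    // checked_mul returns None on overflow instead of wrapping.
    count.checked_mul(element_size)
}

// Allocation and copy share one length by construction: the Vec is built
// from the source slice, so there is no second length to get wrong.
fn copy_samples(src: &[u32]) -> Vec<u32> {
    src.to_vec()
}
```

For example, `buffer_bytes(usize::MAX, 2)` yields `None` rather than a small wrapped-around size.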
@kornel you are either completely missing the point, or intentionally derailing now, go read the OP, I'm talking about the statistic of memory errors in c++, and its use as an indication of how important/common memory errors that rust eliminates are. It's useless because most logic errors result in memory errors in c++. You are counting these logic errors towards your total of memory management errors. And once again logic errors are not memory errors, that's what I'm trying to explain to you, while you keep imagining that I said that all memory errors are logic errors.
@namark I understand, and I think the statistic is still valid.
1. There are memory errors that are directly related to "logic" of memory management, not a symptom of a deeper error, and fixing just memory management would fix 100% of the problem, with absolutely no more hidden problems.
2. Rust does not just focus on the case 1 as you imply. It does prevent deeper logic errors too, so it can help with ultimate root causes of logic errors that manifest as memory errors.
@kornel now you are literally arguing that most logic errors are memory errors, that's how religious you are about your beloved language.
Here is a pseudocode example to demonstrate to you a logic error, that results in a memory error, that is not itself a memory error, or has nothing to do with memory management. This is apparently utterly incomprehensible to you:
got some resource x
if(condition)
    some operation that invalidates x
use x
memory error on last line... how do you solve this problem? You could replace the first operation with equivalent that does not invalidate. You could replace x with an equivalent shared resource that does not invalidate on that operation. You may take a copy of x, and keep the original for use after invalidation. You may rearrange the code so that the second use comes before the condition. All of these options will solve the memory error, but none of them would be correct. The correct solution in this contrived example is to return early in the condition body(or add an else after the if, or whatever other branching technique you use), and never use x in that case, because it was an error condition for that particular subroutine. You will never know this from the memory error alone, you need to understand the business logic of the piece of code, to figure that out. The memory error was not the cause, or even a hint of what the problem is. In fact it was a distraction.
And fixing memory errors without fixing the logic is not better than nothing, it's worse, because it complicates the code and lets the underlying logic error "fester behind a coat of paint". Otherwise it's a segfault. Both are equally bad for security but that is beside the point. Any discussion of silent bugs vs a loud bugs is about development and maintenance, not security. For security any kind of bug in a critical part of the program is a disaster.
@namark As for your example, it is interesting, and it is what Rust addresses.
You would get an error such as "x used after a move".
Rust functions define ownership, so "some operation" could invalidate x only if it took ownership. Therefore, returning in that conditional would be the only way to make it compile. Unless you decide to refactor "some operation" to borrow temporarily instead, and then the caller would have a choice.
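A sketch of the pseudocode example from earlier in the thread, translated to Rust under stated assumptions: `Resource`, `invalidate`, `use_resource` and `run` are all hypothetical names, and the early return is only one of the fixes that compile.

```rust
// Hypothetical resource; the names are illustrative only.
struct Resource {
    data: Vec<u8>,
}

// Taking `Resource` by value means this function consumes (invalidates) it.
fn invalidate(x: Resource) -> usize {
    x.data.len()
}

// Borrowing means the caller keeps the resource afterwards.
fn use_resource(x: &Resource) -> usize {
    x.data.iter().map(|&b| b as usize).sum()
}

fn run(condition: bool) -> Result<usize, String> {
    let x = Resource { data: vec![1, 2, 3] };
    if condition {
        let released = invalidate(x);
        // Without this return, the `use_resource(&x)` below fails to compile
        // with error[E0382]: borrow of moved value: `x`.
        return Err(format!("error condition, released {} bytes", released));
    }
    Ok(use_resource(&x))
}
```

The compiler only rejects the use after the move. Choosing between returning early, cloning `x`, or changing `invalidate` to borrow is still left to whoever understands the business logic.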
@namark You're claiming that this complicates and distracts, but I disagree.
Functions clearly declaring if they invalidate or only view their inputs clarify their intent. They require programmers to think which is the right solution (with guidance from the compiler).
In Rust you would not get a segfault either way. Not because it'd hide the memory error, but because it'd force logically correct sharing or moving of data, according to how function interfaces are explicitly designed for.
@kornel are you even reading what I'm writing or are you just a rust commercial? I don't think I can chew it up any further for you, just try again if you will, and present a better argument, I can't engage any further with you bringing up points unrelated to what I'm saying, I'll have to just repeat myself.
@namark We're clearly talking past each other.
I think I understand your argument, but we have different perspective and assumptions, so when presented with the same information, we draw different conclusions. That's why I'm trying to add context/background to the discussion which affects my judgement, but you may find it unrelated or irrelevant, because that's not what you have based your judgement on.
@kornel you are assuming I don't know the very basics of the languages I'm talking about and spewing out unrelated marketing slogans. But alright I'll try one more time and hope you'll put a bit more effort into it. I assure you, I'm not trying to destroy your favorite language, or definitively prove that it is entirely useless (I'll do that later maybe). I'm making a very specific point. So here we go again:
Memory errors are not impossible in rust, they are possible they just happen at compile time. My example was in pseudo code and conceptual, so whether the error happens at compile time or runtime doesn't matter. My point is that the error in question has nothing to do with memory management, and cannot be solved with memory management techniques, it's a completely unrelated logic error. If you solve the memory error in the most obvious way you will not solve the logic error. You can satisfy the rust compiler, but the problem will still remain. Therefore rust by the virtue of being rust does not eliminate this class of errors that statistically you would count as memory errors in C++, because that's how they are more often than not manifested technically. If you find yourself inclined to pursue the route of "it's better than nothing" again, I addressed that when I presented the example and in the OP, so please take that into consideration and continue with arguments from that point. If you think that regardless of all of that rust is still useful, that's fine, but irrelevant.
@kornel *but irrelevant in this particular context
@namark I think your example is a good demonstration of a legitimate problem.
I do agree that the solution may not be most straightforward "add return" (which is the one that Rust would suggest for invalidating function).
@namark I assume you're arguing that "if 70 out of 100 of bugs are 'memory error' bugs, then with Rust you'd still have the ~70 bugs, but they would be classified as logic bugs".
Which I think is too pessimistic. I estimate it would be more like 5 memory bugs remaining unfixed + 20 bugs still hiding as logic bugs, but I would expect the majority to be truly correctly prevented or fixed.
Here's why:
@namark I accidentally a thread: https://mastodon.social/web/@kornel/107491575767457178
@kornel Your latest reply in this thread finally makes sense to me, you can put it right after the OP and it will fit perfectly. It's also fortunate that you have split it exactly here, as for the purposes of this argument I can't meaningfully engage with your opinions and prediction regarding the statistic. I'll read them, but can't promise I'll have much to say.
The only direct argument I see here is your implication that my own opinions and prediction regarding the statistic are somehow relevant to the argument I'm trying to make. They are not. I've presented a class of errors that potentially invalidate or heavily skew any naive memory error statistic, so the burden is now on the "statistical nerds out there counting memory errors in #cpp to justify the existence of #rust" to empirically prove that their statistics are not significantly affected by this. Any other argument will not be a statistical argument. You just can't deduce statistics. You might as well present your arguments without the statistic at that point.
Now I also think that it is practically impossible to actually diligently identify all such logic errors, so I expect them to instead resort to some high arcane probabilistic sampling mumbo-jumbo, which being way beyond my comprehension will force me to concede and issue a public apology to the international community of "statistical nerds out there counting memory errors in #cpp to justify the existence of #rust" for helping to refine their statistics.
Also from a cursory glance at the split thread, I see you keep going back to the "it's better than nothing" argument, with the "better know in advance" formulation, so I'll try to elaborate on that more: Better to know what in advance? That you have a memory error when you actually have a logic error? What it takes from you to identify the logic error from the memory error, is exactly the same as what it takes to not make the error - overall expressiveness of the code, familiarity with the codebase and time. You have these things - you will not make the logic error. You don't have them - you will patch the false positive memory error and leave the logic error as a time-bomb, or at the very least an unnecessary hurdle for the future readers, compromising the overall expressiveness of the code, anyone's familiarity with the code-base, and everyone's time. In my eyes both outcomes of "hidden logic error", or "indirectly mitigated logic error" are simply worse than "unknown logic error", but even if you think that one of them is worse while the other is better, overall it makes things equal, that is rust does not definitively address these any better than c++ does, so they are still false positives in the statistics of "errors that rust will fix because it's memory safe", or even "errors that rust will help with because it's memory safe".
@namark In terms of data, the closest I have is: https://github.com/rust-fuzz/trophy-case
There you can see that in Rust memory errors are rare. Logic errors do exist. Granted, you could interpret it as a support for your argument that it merely changed memory errors to other logic errors. But I think it still supports the literal statistical argument of Rust fixing *memory errors*, because the statistic does matter for security-conscious software that can't tolerate memory errors, but may survive logic bugs.
@namark Your point rests on assumption that Rust makes programmers fix bugs by addressing memory management symptoms, rather than true root causes, and C/C++ doesn't.
But I think it's unsupported, and demonstrably false. Here's a counter-example:
Double-free in C/C++ can be shallowly patched by nulling out the pointer after free. I've seen it recommended as the best practice. free(null) is even guaranteed safe. This fix addresses memory management, without addressing root cause.
@namark
C++ has copy constructors, which can also gloss over unexpectedly diverging code execution, and unclear architecture/data flow in the program by implicitly copying/sharing of data wherever it's used. So C++ also has features that can shallowly please allocator/sanitizer/valgrind without making programmer think for a second about deeper logic errors.
So the question is: are Rust's features substantially worse for this? I say no, because "getting" borrow checking requires more awareness.
@kornel nah, nothing in my argument about c++ being better... I do think c++ is better... but that's not in this argument.
@namark I understood "rust must justify its existence" as implication that Rust needs to improve over the status quo, which is C++.
@kornel yes, so being equal is not enough... how does that mean that I claim that c++ is better? The new thing needs to be better for it to have value, otherwise it's just a waste of time... I mean if rust is just an experiment you are having fun with, no judgement there, but that's not how it is advertised. If you are not one of the "nerds trying to justify existence of rust by comparing it to c++", that's great, that would make you the happiest rust fan I've met so far.
@namark "time-bomb, or at the very least an unnecessary hurdle for the future readers, compromising the overall expressiveness of the code"
This does not match my experience.
Rust blurs the line between memory management, types & program logic. So fixing mem errors tends to fix logic errors, & clarifies program's intent. e.g you must mark which data is shared, you must mark which functions invalidate their inputs. This helps callers understand what they do, and where the data goes.
@kornel this is beside the main point, but you seem to imply here that rust will suggest an early return in my example, which would be absolutely bonkers, so I decided to try it out
the early return might be obvious to you because I told you it's an error condition, but otherwise I don't think it's obvious at all.
@kornel It's also funny that the compiler chooses to explain default move semantics in such an apologetic fashion. If it's so unnatural, that you need to explain in great detail, why make it default? In c++ the move would have been explicit, no need to explain, it's there, someone wrote it so they obviously didn't want the value to be used from that point on, there is no question whether they wanted a copy or a shared state or reference semantics.
@namark Rust explains the problem using Rust's terminology. Understanding it is part of learning the language. It's "apologetic" (trying to be precise and helpful), because strict ownership is difficult for people coming from languages that don't enforce it.
Move by default is an excellent default. It avoids unintentional copies, without subtleties of RVO &etc.
Rust also doesn't allow an empty state after move, which catches additional logic bugs, and makes dtor codegen better.
@namark BTW, I've just caught a real logic bug using Rust's mem management feature.
let this_row = image.get_row(y);
let next_row = image.get_row(y+1);
compare(this_row, next_row);
error: can't borrow image more than once.
That's because fn get_row(&mut self, y) computed rows lazily to `self.temp_buffer` and returned that, so I got the same row twice. I knew this, because I wrote the hack myself, but I forgot about it. Rust noticed &mut image lifetime is connected to row's lifetime.
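A reconstruction of that situation as a sketch. The `Image` layout, the `rows_differ` helper and the body of `get_row` are assumptions made for illustration; only the `&mut self` signature and the `temp_buffer` field come from the description above.

```rust
// Hypothetical image type mirroring the description: rows are computed
// lazily into one shared scratch buffer.
struct Image {
    width: usize,
    pixels: Vec<u8>,
    temp_buffer: Vec<u8>,
}

impl Image {
    // Returns a slice into `temp_buffer`, so the result keeps `self`
    // mutably borrowed for as long as the row is alive.
    fn get_row(&mut self, y: usize) -> &[u8] {
        let start = y * self.width;
        self.temp_buffer.clear();
        self.temp_buffer
            .extend_from_slice(&self.pixels[start..start + self.width]);
        &self.temp_buffer
    }
}

fn rows_differ(image: &mut Image, y: usize) -> bool {
    // let this_row = image.get_row(y);
    // let next_row = image.get_row(y + 1); // error[E0499]: second mutable borrow
    // One way to compile: copy the first row out before asking for the next.
    let this_row = image.get_row(y).to_vec();
    let next_row = image.get_row(y + 1);
    this_row.as_slice() != next_row
}
```

Copying the row is just the shortest change that compiles here, not necessarily the right design for the type.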
@kornel Saying where the move occurred would be enough to be precise and helpful. The apologetic part is when it goes ahead and gets insecure about Box not having an implicit copy. Because it knows value semantics are natural and easy to understand, as all literal types in all languages ever use value semantics. And it knows it chose an utterly confusing default with move semantics.
In C++ language default is value semantics, that is exactly the same as literal types, and otherwise you need to be explicit. Const reference semantics is conventional default (that is every c++ programmer is taught to always accept function parameters as const&, unless they have a reason to not do that). Most other languages also prefer that combination (value and const ref), but decide implicitly which to use based on types. This implicit choice that is different based on type is a common source of confusion in pretty much all garbage collected languages.
Rust on the other hand, not only makes the same sort of confusing implicit decision, but also chooses the most confusing default - move semantics (for the sake of performance no less), I presume because it just can't wait to show off the error messages.
Regarding your code example. How is comparing two rows of an image a logic error? You clearly have a memory management error. The embarrassing part is that, as you say, it's an intentional hack and, with the amount of tools rust gives you to get things right, it probably took more effort than a proper solution would. Just based on the code you provided and the premise of lazy evaluation, get_row must return an object that has shared ownership of the underlying resource and unique ownership of a buffer that it will use. If that's not easy to do in rust, I don't even know what's the point of the language anymore.
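One possible reading of that design, sketched in Rust under loud assumptions: `Image`, `Row` and their fields are hypothetical, `Rc` stands in for "shared ownership", and each `Row` owns its own lazily filled buffer.

```rust
use std::rc::Rc;

// Hypothetical image whose pixel storage can be shared by row objects.
struct Image {
    width: usize,
    pixels: Rc<Vec<u8>>,
}

// Each row shares the underlying pixels and owns its own scratch buffer.
struct Row {
    source: Rc<Vec<u8>>,
    start: usize,
    width: usize,
    buffer: Option<Vec<u8>>,
}

impl Image {
    // Takes &self, so any number of rows can exist at once.
    fn get_row(&self, y: usize) -> Row {
        Row {
            source: Rc::clone(&self.pixels),
            start: y * self.width,
            width: self.width,
            buffer: None,
        }
    }
}

impl Row {
    // Fills this row's private buffer on first access, then reuses it.
    fn data(&mut self) -> &[u8] {
        if self.buffer.is_none() {
            let end = self.start + self.width;
            self.buffer = Some(self.source[self.start..end].to_vec());
        }
        self.buffer.as_deref().unwrap()
    }
}
```

With this shape, two rows obtained from the same image can be compared directly, at the cost of one buffer per row.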
Again this is clearly a memory management issue with a memory management solution. I see no logic errors in what you presented unless you want to imply that the entire image type and its interfaces are one giant logic error, just because you couldn't implement proper memory management for it. I wonder if this is how C programmers end up not being able to tell sizes of files. "Ah, that is clearly a logical impossibility, cause I can't do that easily in this handicap of a language". If the underlying resource can't be shared, or can't provide random access to rows (even if lazy evaluated), then your image should be called image_stream or something, and get_row should be called skip_rows, and the rust compiler will not teach such things.
@namark Are you familiar with the Tullock Spike argument?
That cars would have been safer if they all had a sharp spike sticking straight out of the steering wheel. Drivers being clearly aware of the danger would pay more attention, drive better, and therefore a clearly less safe car would be safer in practice.
@namark So I think your argument about paying attention to more than just satisfying memory safety is a light variant of the Tullock spike argument. Programmers shouldn't let their guard down when the compiler fixes the immediate problem, and should remain focused and alert about deeper issues in the program. Therefore, making it too easy to solve memory management makes programmers complacent/lazy/inattentive to other, possibly more important issues.
I'm not saying that logic errors are memory errors. You've misinterpreted my "there exists" statement as "for all".
I'm saying that there are memory errors (let's say a half of bugs classified as memory errors) that are just shallow and not a symptom of anything beyond the memory management problem itself. For example, if you forget to multiply a buffer size in `malloc` by the `sizeof(element)` that's all there is. You fix the buffer size, and the bug is fixed entirely.
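For comparison, a tiny sketch of the same allocation in Rust, where the length is counted in elements and the element size comes from the type (`zeroed_buffer` is a hypothetical helper name):

```rust
// Allocate room for `count` elements; the element size is taken from the
// type, so there is no separate `sizeof` multiplication to forget.
fn zeroed_buffer(count: usize) -> Vec<u64> {
    vec![0u64; count]
}
```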