It's a common practice in Unity to start the name of a scene, prefab, etc. with an underscore so it will be listed first in the project window.
This is because things are sorted lexicographically and underscores come before letters in ASCII, right?
Well, no.
So what the heck is Unity using to sort things? Did they ... did they really add the underscore as a special case?
@peterdrake Sorting by ASCII order would be terrible anyway because then upper and lower cases of the same letter would end up in totally different sections in the results. Any competent alphabetic search would split and sort between letters, numbers, and punctuation first.
Also, I have not heard of using the underscore that way. That sounds like a sign that you need to refactor instead of hacking stuff in. I always name things based on their form/function. If I prefix it with anything, it's the type (e.g. "MatSteel").
"Sorting by ASCII order would be terrible anyway because then upper and lower cases of the same letter would end up in totally different sections in the results. Any competent alphabetic search would split and sort between letters, numbers, and punctuation first."
I disagree. In theory, all items in a given directory would be capitalized the same way, so this shouldn't matter, right?
More importantly, everybody else does it by ASCII. For example, Python:
sorted(["_b", "B", "a", "_a", "c", "C", "_"])
['B', 'C', '_', '_a', '_b', 'a', 'c']
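(To spell out why that result looks the way it does, here is a small sketch that just inspects the code points involved. Nothing here is Unity- or .NET-specific; it only shows that the underscore sits between the uppercase and lowercase letters in ASCII.)

```python
# Code points of the characters from the example above.
for ch in ["B", "C", "_", "a", "c"]:
    print(ch, ord(ch))

names = ["_b", "B", "a", "_a", "c", "C", "_"]

# Python's default string comparison works code point by code point.
by_code_point = sorted(names)

# Sorting on an explicit list of code points gives the same order.
explicit = sorted(names, key=lambda s: [ord(ch) for ch in s])
assert by_code_point == explicit
```

So an underscore prefix only sorts first under this rule if everything else starts with a lowercase letter.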
For the problem I'm running into, apparently Pixel Crushers' Dialogue System sorts the quests by ASCII, not the Unity/C# way. When I wanted certain quests to show up first, I put underscores at the beginnings of those quests' behind-the-scenes names. I had assumed, incorrectly (based on the aforementioned behavior), that the underscore came before letters. Now I have to repeat that extensive surgery.
"I have not heard of using the underscore that way."
I believe Bond uses it in his (excellent) book. Also, for example:
https://forum.unity.com/threads/editor-project-window-tab-sorting.67916/#post-434218
Finally, I can't find where this surprising sorting behavior is documented, either for Unity or C#.
@peterdrake @andykorth Maybe I've just gotten used to the .NET/Windows way. 🤷‍♂️
Here's some more info about how it works behind the scenes: https://learn.microsoft.com/en-us/dotnet/api/system.globalization.compareoptions?view=net-7.0
@LouisIngenthron @peterdrake Oh nice find, that's a much better document than I found!
@LouisIngenthron @andykorth Thanks.
I can't imagine how I would have ever found that page, even if I was looking for it.
@peterdrake @andykorth I went to the String.CompareTo page first, and it led me there.
@peterdrake @LouisIngenthron
You can find information about it by looking for "sort collations". Unicode does define one (the Unicode Collation Algorithm).
I think C# matches the SQL stuff Microsoft provides here, but I'm not sure:
https://learn.microsoft.com/en-us/ef/core/miscellaneous/collations-and-case-sensitivity#database-specific-information
A lot of it is, understandably, complicated by different locales (where does á sort vs. a?).
@peterdrake Unity’s project browser uses a custom function called EditorUtility.NaturalCompare() which tries to do “human like” sorting. I’d tell you more but the implementation is on the native side 🙃
@thebeardphantom The documentation gives absolutely no hint as to this behavior, but points out another shocking decision.
https://docs.unity3d.com/ScriptReference/EditorUtility.NaturalCompare.html
Not sure how one would run across this documentation without knowing the name.
@peterdrake By shocking decision you mean how it mentions how numbers are sorted?
@thebeardphantom Yes. Lexicographic is a single, easy-to-understand rule. Stacking up a bunch of special cases (especially if undocumented) is a path to unpleasant surprises.
@peterdrake I think the idea is to mimic how file browsers in OS’s sort based on similar rules. Windows will sort filenames using special rules for numbers. It also sorts underscores in the same way.
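(For anyone wondering what "special rules for numbers" looks like in practice, here is a minimal Python sketch of a natural-sort key. It is only an illustration of the general idea, not Unity's or Windows' actual implementation; `natural_key` and the file names are made up for the example.)

```python
import re

def natural_key(name):
    """Split a name into text and digit runs so digit runs compare by value."""
    parts = re.split(r"(\d+)", name)
    # With one capture group, odd positions in `parts` hold the digit runs.
    # Each chunk is tagged so numbers and text are never compared directly.
    return [(0, int(p), "") if i % 2 else (1, 0, p.lower())
            for i, p in enumerate(parts)]

files = ["Level10", "Level2", "Level1"]

plain = sorted(files)                     # character by character
natural = sorted(files, key=natural_key)  # digit runs compared as integers
```

With plain sorting, "Level10" lands before "Level2" because "1" comes before "2"; with the natural key, 2 comes before 10.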
@peterdrake I suspect they are just sorting the way .NET does by default, for example:
string[] words = new string[]{"_b", "B", "a", "_a", "c", "C", "_"};
Array.Sort(words);
Sorts out as:
"_, _a, _b, a, B, c, C"
So not in ASCII order at all; otherwise every capital would come before any lowercase letter.
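(As a rough way to think about that ordering, here is a Python sketch that reproduces it for this one list. It is an approximation built on assumptions: punctuation before letters and digits, case ignored at first, lowercase before uppercase as a tie-break. It is not .NET's actual culture-aware comparison, and `approx_key` is a made-up helper.)

```python
def approx_key(name):
    # Primary level: punctuation sorts before letters/digits, case is ignored.
    primary = [(1 if ch.isalnum() else 0, ch.lower()) for ch in name]
    # Tie-break level: lowercase before uppercase at the first difference.
    tiebreak = [ch.isupper() for ch in name]
    return (primary, tiebreak)

words = ["_b", "B", "a", "_a", "c", "C", "_"]
result = sorted(words, key=approx_key)
print(", ".join(result))
```

The point is only that a multi-level comparison like this keeps "a", "B", "c", "C" together alphabetically, which a plain code point sort cannot do.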
The other fun thing: back when Unity was Mac-only, we started our object names with an asterisk instead of an underscore. But Windows does not like that!