@freemo Happy to help & earn my keep as a mod.
Do you know if there's a way to run more complex queries than simple searches? For example, if I could get the set of local users who have a link in their bio and zero posts, that would include virtually all the SEO spam accounts but pare away most of the real users. It'd be much easier to work through that list. Another useful one would be the set of users who have posted only toots containing links.
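A rough sketch of the first filter, assuming the account list has already been exported as plain records. The field names `note` (the bio), `statuses_count` and `is_local` are assumptions for illustration, not a confirmed admin query interface:

```python
import re

# Any http(s) URL in the bio counts as "has a link".
LINK_PATTERN = re.compile(r"https?://\S+", re.IGNORECASE)


def likely_seo_spam(accounts):
    """Return local accounts that have a link in the bio and zero posts."""
    return [
        acct
        for acct in accounts
        if acct.get("is_local")
        and acct.get("statuses_count", 0) == 0
        and LINK_PATTERN.search(acct.get("note", ""))
    ]


# Hypothetical sample records, purely for illustration.
sample_accounts = [
    {"username": "cheap_widgets", "is_local": True, "statuses_count": 0,
     "note": "Best widgets at https://example.com/widgets"},
    {"username": "real_person", "is_local": True, "statuses_count": 120,
     "note": "Blog at https://example.org"},
    {"username": "lurker", "is_local": True, "statuses_count": 0,
     "note": "Just here to read."},
]

flagged = [acct["username"] for acct in likely_seo_spam(sample_accounts)]
```

The result is only a shortlist for manual review; a real user who links a homepage and has not posted yet would be flagged too.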
Friends, today was the anniversary of the Polytechnique attack. Please take a moment to consider the obstacles still faced by women and minority STEM students in many parts of the world, and, if you can, lend your support as they confront those obstacles.
@freemo oh I see. That's less different than I thought initially - actually pretty close to how I do it fast, but when I slow down to type it out for someone else, I break out the `(n-k)!` term as a separate step.
@freemo I'm not sure I understand how you reason it from that description. Do you mean that you first arrive at the `k!` and `(n-k)!` factors in the denominator and only recognise the `n!` numerator at the end?
Might benefit from a discussion of why `nCk = n!/(k!(n-k)!)`. I don't use combinatorics often enough to have it memorised, so I re-derive the `nCk` and `nPk` formulae every time I need them. Here's my reasoning:
- If you want to arrange `n` objects, you can choose any of `n` for the first, any of the `n-1` remaining for the second, ... down to one for the `n`th. So `n` objects can be arranged in `n! = n(n-1)...1` ways.
- If you want to choose `k` of `n` objects, you can do this by ordering the `n` objects and taking the first `k`. So as a first, overcounted tally, `nCk = n!`
- However, you don't care about ordering of the chosen objects. If you choose two of `{1, 2, 3, 4}`, the orderings `{1, 2, 3, 4}` and `{2, 1, 3, 4}` should not be treated as distinct. Since there are `k!` ways to order the `k` objects you chose, you need to divide that out. Now we have `nCk = n!/k!`
- Equally, you don't care about ordering of the non-chosen objects. In the above example, `{1, 2, 3, 4}` and `{1, 2, 4, 3}` should not be treated as distinct. There are `n-k` of these objects, so `(n-k)!` ways to order them, which again should be divided out. Finally, we arrive at `nCk = n!/(k!(n-k)!)`
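The reasoning above can be checked by brute force for small `n`. This is a minimal sketch (the function names are mine): it orders all `n` objects, keeps the first `k` as an unordered set, and counts how many orderings collapse onto each selection.

```python
from collections import Counter
from itertools import permutations
from math import factorial


def selection_multiplicities(n, k):
    """Map each k-subset to the number of orderings of n objects that yield it."""
    return Counter(frozenset(order[:k]) for order in permutations(range(n)))


def n_choose_k(n, k):
    """Closed form: divide out orderings of the chosen and non-chosen objects."""
    return factorial(n) // (factorial(k) * factorial(n - k))


counts = selection_multiplicities(4, 2)

# Every selection is produced by exactly k! * (n-k)! orderings ...
assert set(counts.values()) == {factorial(2) * factorial(4 - 2)}
# ... so the number of distinct selections is n! / (k! (n-k)!).
assert len(counts) == n_choose_k(4, 2)
```

This only scales to small `n`, since it enumerates all `n!` orderings, but that is enough to confirm both division steps.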
@Shamar ah that helps.
> Alice is in Europe and wants to ensure Bob (who is in the US) that when they connect to a certain Eve's IP, their packets will reach an ethernet in the US.
> Bob trust Alice.
Bob can host a VPN, secure proxy, SSH tunnel, etc., through which Alice connects to Eve. Then as long as Bob's ping to Eve stays below what he could expect for a transoceanic RTT, he knows that both he and Alice are connected to the Eve in the US.
If a BGP attack redirects Bob's traffic intended for Eve-US to Eve-CN, his ping will jump. If a BGP attack redirects Alice's traffic intended for Bob to an imposter, she will see a mismatch between Bob's certificate and the imposter's.
@realcaseyrollins Aero engineering has drilled that acronym into me as Maximum Gross Takeoff Weight; it's hard to read it as anything else
@Shamar Can you be a bit more specific about what you're trying to achieve? With this talk about BGP trickery, we're out of the scope of your original question, which was about the computer reachable at a certain IP address - now you're asking for statements about the computer that *should* be reachable at that address if not for malfeasance by the network.
My understanding is that you have a scenario where Alice wants to prove to Bob that she (or rather, the computer under her control) is physically close to Bob, where the network may be maliciously misrouting packets and/or forging responses.
Ping can carry a payload - and according to spec, the reply contains the same payload as the request. But you could invent a derivative where that rule isn't observed. So it would go something like this:
1. Bob sends a ping with an unpredictable payload.
2. Alice computes a hash of the original payload, signs the hash, and sends a ping response with the signature as the new payload.
3. Bob checks that the response time is inconsistent with a distant interlocutor, computes the same hash as Alice did, and verifies the signature against her public key.
Bob's initial payload must be unpredictable so that Alice cannot precompute the response and send it before receiving Bob's message. The payload may have a specific format, though - which would allow Alice to respond correctly to normal pings (i.e. echo the unmodified payload) on the same interface.
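A minimal sketch of the three steps above, leaving out the network transport. Loud assumption: Python's standard library has no public-key signatures, so an HMAC over a pre-shared key stands in for "signs the hash"; a real version would use Alice's private key and Bob would verify against her public key. The 50 ms threshold and all names are hypothetical.

```python
import hashlib
import hmac
import os

MAX_RTT_SECONDS = 0.05  # hypothetical cutoff for "not a distant interlocutor"


def make_challenge(size=32):
    """Step 1: Bob picks an unpredictable payload."""
    return os.urandom(size)


def alice_respond(payload, key):
    """Step 2: hash the payload, then 'sign' the hash (HMAC stand-in)."""
    digest = hashlib.sha256(payload).digest()
    return hmac.new(key, digest, hashlib.sha256).digest()


def bob_verify(payload, response, rtt_seconds, key, max_rtt=MAX_RTT_SECONDS):
    """Step 3: reject slow replies, then check the signature over the hash."""
    if rtt_seconds > max_rtt:
        return False
    digest = hashlib.sha256(payload).digest()
    expected = hmac.new(key, digest, hashlib.sha256).digest()
    return hmac.compare_digest(expected, response)
```

In use, Bob would time the gap between sending the challenge and receiving the response, and pass that in as `rtt_seconds`.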
Of course if Alice is uncooperative, she could do things to make herself appear further from Bob, but it is hard to appear closer. Similarly, she may be unable to prove location if she's on e.g. a satellite connection - her communication takes exactly the same path from anywhere in roughly a quarter of the globe, so Bob can't tell if she's next door or 10,000 km away.
@Shamar Not that I know of; technically, the duplicate *is* the ethernet responding to a certain public IP at that point. If there's a specific impersonator (or a list of impersonators), you could test for it with a traceroute like you said: watch for a hop through an address on your blacklist.
@Shamar "Safest" in the sense of fewest false positives? Ping from a host you know is in the vicinity of where he claims to be. It's hard to fake fast ping from a geographically distant host.
There's a tradeoff between low false positives and high false negatives, though. If a host has high latency, he could be either far away or nearby on e.g. satellite internet.
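To put a number on "hard to fake": a signal can't outrun light, so a measured round trip caps how far away the responder can be. A small sketch; the function name is mine, and the 2/3 factor for fibre is a rough assumption rather than a measured figure.

```python
SPEED_OF_LIGHT_KM_PER_S = 299_792.458  # in vacuum


def max_distance_km(rtt_ms, propagation_factor=1.0):
    """Upper bound on one-way distance implied by a round-trip time.

    propagation_factor scales the signal speed relative to light in vacuum;
    something like 2/3 is a common rough assumption for optical fibre.
    """
    one_way_seconds = (rtt_ms / 1000.0) / 2.0
    return one_way_seconds * SPEED_OF_LIGHT_KM_PER_S * propagation_factor


vacuum_bound = max_distance_km(10.0)         # about 1500 km
fibre_bound = max_distance_km(10.0, 2 / 3)   # about 1000 km
```

The bound only works in one direction: a low ping rules out a distant host, while a high ping says nothing either way.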
@realcaseyrollins Possibly, although not all instances are aware of one another, so there might be some edge cases where that strategy fails. I don't understand the server-to-server communication in the Fediverse as well as I understand the client-to-server communication, so I can't invent an exact scenario to exploit it.
What benefit are you hoping to get from this? If you're going to disambiguate homonymic users, users with alts won't have their names shortened anyway.
@realcaseyrollins It's an obstacle to impersonation. On Twitter you can get away with "parody accounts" but that's frowned upon in most of the Fediverse.
For example, let's assume I register @realcaseyrollins@example.com and set it up with your profile picture and display name. Then I can post whatever I want there, and boost it on my main account. If the domain is hidden, it's indistinguishable from me boosting your real account - and sure, the user could open the compose-reply window to see if the mention has "@counter.fedi.live" or "@example.com", but there isn't any indication even that something's amiss and warrants investigation.
@freemo @design_RG @arteteco @Sphinx great, thanks very much!
@freemo @design_RG @arteteco @Sphinx I'm interested in filling this role, and I thank you sincerely for the nomination. Would you mind estimating the time commitment expected of a moderator?
@2ck I think you're meant to ignore the vertical stroke, so it's a "<-shaped" recovery. Basically just that the two segments have diverging outcomes.
@dragfyre Cunard runs transatlantic voyages, as well as longer services where you can book the segments individually (on the one I just pulled up, the legs are Southampton - Dubai - Hong Kong - Sydney - Singapore - Dubai - Southampton). Their service is paused due to the pandemic, though.
@freemo Good point. I figured the primary motivation for following a QOTO user (thereby making your instance a peer of QOTO) is that you think the guy you're following is interesting enough you want to see what he posts. But maybe the connection between the two is more tenuous than that, and my assumption was unjustified.
@freemo Actually, thinking about it, peer count would have the reverse bias (i.e. large instances would seem better connected, while insularity favours small instances). If you have probability `P` of following a user, independent of his instance, the probability you follow someone on an instance with `N` users is `1-(1-P)^N`. This likelihood increases with `N`, for `0 < P < 1`, so large instances will have more remote followers, and in principle from more remote instances.
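A quick numerical sketch of that formula (the function name and the sample value of `P` are mine, chosen only for illustration):

```python
def follow_probability(p, n):
    """Chance of following at least one of n users, each independently with probability p."""
    return 1.0 - (1.0 - p) ** n


# Hypothetical per-user follow probability.
P = 0.01
by_size = {n: follow_probability(P, n) for n in (1, 10, 100, 1000)}

# The probability rises with instance size and saturates towards 1.
values = [by_size[n] for n in (1, 10, 100, 1000)]
assert values == sorted(values)
```

With `P = 0.01`, an instance of 100 users is already followed with probability around 0.63, and growth beyond that adds progressively less.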
Unsurprisingly given the above, Mastodon.social outranks us by several hundred peers. This isn't *that* much, given the disparity in size, but I don't know if that's because QOTO users are on average more interesting :-) or because QOTO's followers already make up close to two thirds of the Fediverse, so it's impossible to do better than about 150% of our count. I feel like this relationship is probably something like a logistic curve, where you get diminishing returns as your instance size approaches infinity.
@freemo @olamundo @design_RG I don't think having a lower insularity than M.S is actually a meaningful comparison, as their user count is more than forty times greater than QOTO's. It seems reasonable to expect that large instances will score higher measures of insularity, even given the same behaviour as small instances.
Quick aside: insularity is defined as the ratio of mentions of local users to mentions of all users.
Consider a Fediverse where all the users are instance-agnostic in their communications; that is, the likelihood any pair of users communicate is independent of whether or not they are on the same instance. Now imagine the limiting case, with just two servers, one a single-user instance and one a giant instance with everyone else. The small instance will have near-zero insularity (the only mentions of local users will be if the user names himself for some reason) and the large instance will have near-100% insularity (the only mentions *not* of local users will be conversations involving this one particular user). The insularity scores differ wildly, even though the users have the same behaviour - they're just a function of where you draw boundaries to group users.
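The limiting case can be worked through numerically. This sketch assumes the simplest instance-agnostic model, where every user mentions every other user equally often and nobody mentions himself; the function name is mine.

```python
def uniform_insularity(local_users, all_users):
    """Insularity of an instance when mentions are spread evenly over all other users."""
    if all_users < 2 or not 1 <= local_users <= all_users:
        raise ValueError("need 1 <= local_users <= all_users and all_users >= 2")
    # Each local user mentions every other local user once ...
    local_mentions = local_users * (local_users - 1)
    # ... and every other user in the Fediverse once.
    all_mentions = local_users * (all_users - 1)
    return local_mentions / all_mentions


single_user = uniform_insularity(1, 1000)   # 0.0
giant = uniform_insularity(999, 1000)       # just under 1.0
```

Identical behaviour, opposite scores: under this model the result is `(n-1)/(N-1)` for an instance with `n` of `N` users, so it depends only on instance size.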
It seems like there ought to be a way to normalise this, something like (local mentions * all users) / (all mentions * local users), but quantifying the "all users" part seems hard. How would you qualify a user as potentially mentionable, how would you treat dormant accounts, etc. - such questions would affect the outcome. Absent such a normalisation, I think insularity scores should really only be compared between instances with similar numbers of active users.
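A sketch of that normalisation as code, with made-up counts. The function name is mine, and `all_users` is simply taken as given here, which sidesteps exactly the part that is hard to quantify.

```python
def normalised_insularity(local_mentions, all_mentions, local_users, all_users):
    """(local mentions * all users) / (all mentions * local users)."""
    if all_mentions == 0 or local_users == 0:
        raise ValueError("need at least one mention and one local user")
    return (local_mentions * all_users) / (all_mentions * local_users)


# Hypothetical counts for two instance-agnostic instances in a 1000-user
# Fediverse where everyone mentions everyone else once (no self-mentions).
small = normalised_insularity(90, 9_990, 10, 1_000)             # 10 users
large = normalised_insularity(979_110, 989_010, 990, 1_000)     # 990 users
```

The raw insularity of these two instances is about 0.009 and 0.99; after normalisation they come out at roughly 0.9 and 1.0, so most of the size effect is gone. A score well above 1 would then suggest a genuine preference for local users.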