@hansw@mastodon.social if someone replies to your post, your original post is not indexed; only their reply is. If you reply to their post, your reply is not indexed either, because you as the author chose not to be indexed.
Here's some background on the scalability issue, in case you're interested.
Let's say someone who has been indexed for a long time decides that they no longer want to be indexed. We would have to search for them in every post that mentions them. We can't do this at scale without an index, as otherwise it means scanning through millions of rows. We can't really justify creating an index on mentions either, as its only use case would be speeding up the unindexing of posts that mention a user.
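To make the trade-off concrete, here is a minimal sketch in Python. It is not how fedisearch stores data: the `posts` dict, its field names, and the helper functions are all hypothetical. It only contrasts scanning every post with keeping a reverse index from a mentioned account to the posts that mention it.

```python
from collections import defaultdict

# Hypothetical in-memory stand-in for the post store; field names are assumptions.
posts = {
    1: {"author": "alice@example.social", "mentions": ["bob@example.social"]},
    2: {"author": "carol@example.social", "mentions": []},
    3: {"author": "carol@example.social",
        "mentions": ["bob@example.social", "alice@example.social"]},
}

def find_mentions_by_scan(posts, account):
    """Full scan: looks at every post, so cost grows with the total number of posts."""
    return sorted(pid for pid, post in posts.items() if account in post["mentions"])

def build_mention_index(posts):
    """Reverse index: mentioned account -> set of ids of posts mentioning it."""
    index = defaultdict(set)
    for pid, post in posts.items():
        for account in post["mentions"]:
            index[account].add(pid)
    return index

def find_mentions_by_index(index, account):
    """Lookup cost depends only on how often the account is mentioned."""
    return sorted(index.get(account, ()))

mention_index = build_mention_index(posts)
```

The index makes the lookup cheap, but it has to be stored and kept up to date for every post, which is the cost that is hard to justify for a single use case.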
@hansw@mastodon.social @fedisearch_com we are committed to working on that, but there isn't yet a solution that scales to the size of our data. Committing to an arbitrary date of your choice seems unrealistic. However, if you DM me all the account URIs that you own, I can do a one-time search and delete for you. Hope this addresses the immediate concern.
We are launching https://fedisearch.com , a website for doing full text search on fediverse content.
This could be very useful for users of small instances that lack full text search, and for just about anyone else who is looking for a blazing fast fediverse search engine.
We have a well-behaved indexer that honors users' preferences for opting out of status indexing.
Please give it a try and let us know what you think.
@freemo
True true. Adding some behind-the-scenes insight: our crawler only visits mastodon, pleroma and misskey instances, because it only knows how to parse content from those instances. Some hubzilla posts get indexed because they appeared on the federated timelines of those three types of instances.
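As a rough illustration of that filter, here is a sketch that decides whether to crawl an instance from its NodeInfo document, which reports the server software under `software.name`. How the real crawler identifies instance types isn't stated here, so the use of NodeInfo, the `SUPPORTED_SOFTWARE` set, and the `is_crawlable` helper are all assumptions.

```python
# Software the crawler knows how to parse (taken from the post above).
SUPPORTED_SOFTWARE = {"mastodon", "pleroma", "misskey"}

def is_crawlable(nodeinfo: dict) -> bool:
    """Return True if an already-fetched NodeInfo document names supported software.

    Hypothetical helper: fetching and validating the document is out of scope.
    """
    software = nodeinfo.get("software")
    if not isinstance(software, dict):
        return False
    name = software.get("name")
    return isinstance(name, str) and name.lower() in SUPPORTED_SOFTWARE
```

A hubzilla instance would be skipped by this check, yet its posts could still be seen when they show up on the federated timeline of a crawled instance.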
Yup, this is indeed the approach we are aiming for. However, we are still facing performance problems when applying this method to mega threads where many people are mentioned.
@freemo looks great!
@hansw regarding indexing of @ mentions, we are still exploring ways to achieve it.
Does the opting out section in https://fedisearch.com/about help?
With help from @8zu 🎉 , we recently localised https://fedisearch.com into Chinese and Japanese
你好!中文用户
こんにちは!日本のユーザー
I get your point, and we take users' privacy very seriously. I think the main problem here is that Pixelfed doesn't give users a way to opt out of search indexing. Mastodon does, and we honor that setting.
The same settings that block google from indexing you also block fedisearch. We have an insignificant amount of traffic compared to google... so if being indexed is the main worry, we feel we shouldn't be the main antagonist 😅
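For anyone curious what honoring that setting can look like, here is a minimal sketch. It assumes the opt-out reaches crawlers as a `robots` meta tag containing `noindex` on the user's public pages, which is the usual way to keep general search engines out. The parser class and the `page_opts_out` helper are hypothetical, not fedisearch's actual code.

```python
from html.parser import HTMLParser

class RobotsMetaParser(HTMLParser):
    """Records whether a page carries <meta name="robots" content="...noindex...">."""

    def __init__(self):
        super().__init__()
        self.noindex = False

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        if (attr.get("name") or "").lower() != "robots":
            return
        directives = {d.strip().lower() for d in (attr.get("content") or "").split(",")}
        # "none" is shorthand for "noindex, nofollow".
        if "noindex" in directives or "none" in directives:
            self.noindex = True

def page_opts_out(html: str) -> bool:
    """Hypothetical helper: True if the page asks robots not to index it."""
    parser = RobotsMetaParser()
    parser.feed(html)
    return parser.noindex
```

An indexer following this sketch would drop any page or profile for which `page_opts_out` returns True, the same signal a general-purpose search engine would respect.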
I'll leave the link to Toot! https://apps.apple.com/us/app/toot/id1229021451 , a well-received iOS app for mastodon. It is commercial and proprietary, yet has 4.5/5 stars on 132 ratings.
And, publicly endorsed by mastodon https://joinmastodon.org/apps
Withdrawing from further discussions
@hansw I understand the concern, and personally hate spammers a lot too.
We try to strike a balance between out-of-the-box utility and respect for people's desire to stay out of the spotlight, and it has indeed been hard.
If we search pixelfed + a username on Google, we'd find many results. It wouldn't be fair for Google to grab all the search results without opt-in while a smaller, niche-focused project has to knock on doors, right?
Not saying that defaulting to opt-out is right, but we'd just like some extra leniency as we try to do something good for this community.
@namark I don't see mastodon and other communities publicly denying closed-source software access to the network. Would you care to substantiate?
The use of analytics solutions on fedisearch is fully disclosed in our privacy policy; the data collected are not personally identifiable.
@hansw for sure. Looks like we need to implement self-serve immediate opt-out sooner.
@hansw I think this is a very valid concern. Sorry we let this one slip through. We will resolve this.
@hansw if you are talking about licenses in the context of open-source licenses (GPL, BSD for example), they only apply to open-source software. In our case, no source code license is granted.
@hansw your fediverse software should support this feature, just like mastodon and pleroma do. If it doesn't, it would be wise to ask your admin to add this feature.
Hi, I'm Justin, admin of fedisearch.com.
Fedisearch is a search engine for fediverse (mastodon, misskey and pleroma) content. Fedisearch respects privacy and robots noindex directives.