I don't know who needs to hear this, but:
Even if scraping people's copyright-protected work from the internet and feeding it to your AI as training data is fair use (something that is far from certain), posting a copy of your training set very likely is not.