I've just read the Vice article about the Transgender Dataset.
There is nothing wrong (in my cautious opinion) with training AI on any data we have access to.
BUT there is so much wrong with:
1) doing "science" with an obviously transphobic rationale like "hormone-taking terrorists";
2) processing and publishing sensitive data without consent or "substantial public interest";
3) faking efforts to contact the people whose data is being used.
It is important to consider how and why we process information, sensitive or not. Our goal should be to avoid doing harm at every step, not merely to follow the law and ethical standards.
As part of one project, we downloaded a public dataset of active Polish psychologists and attached their public emails to it. The dataset is available on GitHub, but the emails are hidden so that predatory journals cannot exploit them.
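
Keeping contact details out of a published copy could look roughly like the sketch below. This is not the project's actual code: the column name `email`, the function `public_copy`, and the sample row are all made up for illustration.

```python
import csv
import io

# Hypothetical column name; adjust to whatever the real dataset uses.
SENSITIVE_COLUMNS = {"email"}


def public_copy(csv_text, sensitive=SENSITIVE_COLUMNS):
    """Return the CSV text with the sensitive columns dropped."""
    reader = csv.DictReader(io.StringIO(csv_text))
    # Keep every column that is not marked as sensitive.
    kept = [name for name in reader.fieldnames if name not in sensitive]
    out = io.StringIO()
    writer = csv.DictWriter(
        out, fieldnames=kept, extrasaction="ignore", lineterminator="\n"
    )
    writer.writeheader()
    for row in reader:
        # extrasaction="ignore" silently skips the dropped columns.
        writer.writerow(row)
    return out.getvalue()


# Made-up sample row, not real data.
private = "name,city,email\nJan Kowalski,Warszawa,jan@example.org\n"
public = public_copy(private)
print(public)
```

The full file stays private and only the output of `public_copy` is uploaded, so the emails never enter the public repository or its history.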