@pganssle @eumiro Hi there. I built a corpus of 650k names with countries and ethnicities from PubMed. HIH! https://gist.github.com/mazieres/0b905a30b1fc9bdbb36237575fe276c8#file-namograph-ipynb
@mazieres @firstname.lastname@example.org Very nice data set, and pretty cool analysis, though this does seem to be only surnames, and it doesn’t preserve capitalization.