Does anyone have a good (open) corpus or generator of human names that covers a wide range of the kinds of names people can have?

Preferably tagged with ethnicity or nationality. The names don’t have to be real, just representative, maybe?

@mazieres Very nice data set, and pretty cool analysis, though this does seem to be only surnames, and it doesn’t preserve capitalization.



Why don’t you look at “baby names” websites? They have all that. E.g., Although it’s probably not free to scrape… But they also list some sources at


@FailForward @mazieres That is for given names. It’s not terribly difficult to find lists of given names or lists of surnames, but I’d like more variety. Many people have multiple given names, multiple last names, no last name, no given name, patronymics, etc.
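
To make the kinds of variety listed above concrete, here is a rough Python sketch of a generator that models name structure explicitly instead of assuming "one given name plus one surname". Everything in it is an assumption for illustration: the `PersonName` class, the `STRUCTURES` table, the tiny placeholder pools, and the toy "+sson" patronymic rule are all made up, not taken from any real corpus or library.

```python
import random
from dataclasses import dataclass, field
from typing import List, Optional

# Placeholder tokens only (hypothetical); a real corpus would replace these pools.
GIVEN = ["Ana", "Bo", "Chidi", "Dara", "Emre", "Fen"]
FAMILY = ["Alder", "Birch", "Cedar", "Dogwood", "Elm"]

# Hypothetical structure table: (number of given names, has patronymic, number of family names).
STRUCTURES = {
    "given_family": (1, False, 1),
    "multi_given": (2, False, 1),
    "double_family": (1, False, 2),
    "mononym": (1, False, 0),       # no last name
    "family_only": (0, False, 1),   # no given name
    "patronymic": (1, True, 0),     # patronymic instead of a family name
}


@dataclass
class PersonName:
    given: List[str] = field(default_factory=list)
    patronymic: Optional[str] = None
    family: List[str] = field(default_factory=list)
    family_first: bool = False
    structure: str = ""  # tag recording which template produced the name

    def display(self) -> str:
        """Join the non-empty parts in the order this name uses."""
        given = " ".join(self.given)
        family = " ".join(self.family)
        if self.family_first:
            parts = [family, given, self.patronymic]
        else:
            parts = [given, self.patronymic, family]
        return " ".join(p for p in parts if p)


def generate(structure: str, rng: random.Random) -> PersonName:
    """Build one name following the named structure template."""
    n_given, has_patronymic, n_family = STRUCTURES[structure]
    # Toy rule: parent's given name + "sson"; real patronymic rules vary by language.
    patronymic = rng.choice(GIVEN) + "sson" if has_patronymic else None
    return PersonName(
        given=rng.sample(GIVEN, n_given),
        patronymic=patronymic,
        family=rng.sample(FAMILY, n_family),
        structure=structure,
    )


if __name__ == "__main__":
    rng = random.Random(42)  # seeded so runs are repeatable
    for structure in STRUCTURES:
        name = generate(structure, rng)
        print(f"{structure:14} {name.display()}")
```

Keeping the structure tag on each generated name means test data can be filtered by shape (say, only mononyms), and the same field could carry an ethnicity or nationality label if a tagged source turns up.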
