Mid-2010s Big Tech: suck all your personal data into the cloud and sell it for targeted advertising purposes.
Mid-2020s Big Tech: suck all your personal data into the cloud to feed ML models that let entities masquerade as mediocre humans.

How long will it take for us to realize Big Tech is not looking out for your individual flourishing and well-being? 🤪

@jaredwhite I lost my insurance card. As you probably know, as an American this can be dangerous to one's health.

Fortunately, I had taken a photo of it.

Unfortunately, I had no idea when, or even what year.

... fortunately, you can type "Insurance card" into Google Photos and it will find it for you. It knows what cards look like.

Thanks for ingesting all my data and looking after my well-being, Big Tech.

@mtomczak I think you're confusing recognizing objects inside of a photo with generative AI. I'm very happy to be able to dictate to Siri or whatever and have it recognize speech, or search for text and have it find that inside of an image. That's wholly different than generative AI. (Also even in cases of mere recognition there have been major issues around privacy…one reason Apple made a big deal about on-device ML and also differential privacy techniques when applied to cloud data analysis.)
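For readers unfamiliar with the differential privacy techniques mentioned above: the basic idea is to add calibrated noise to individual data points so no single person's answer is trustworthy on its own, while aggregate statistics remain recoverable. A minimal illustrative sketch (this is the classic randomized-response mechanism, not Apple's actual implementation; the `p_honest` parameter and function names are made up for the example):

```python
import random

def randomized_response(truth: bool, p_honest: float = 0.75) -> bool:
    """Report the true answer with probability p_honest; otherwise
    report a fair coin flip. Any individual report is deniable."""
    if random.random() < p_honest:
        return truth
    return random.random() < 0.5

def estimate_true_rate(reports, p_honest: float = 0.75) -> float:
    """Invert the noise: E[observed rate] =
    p_honest * true_rate + (1 - p_honest) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1 - p_honest) * 0.5) / p_honest

# Simulate 100k users, ~30% of whom have some sensitive attribute.
random.seed(0)
truths = [random.random() < 0.3 for _ in range(100_000)]
reports = [randomized_response(t) for t in truths]
print(round(estimate_true_rate(reports), 2))
```

The server only ever sees the noisy reports, yet the estimated rate comes out close to the true 30%. Real deployments use more sophisticated mechanisms, but the trade-off is the same: individual privacy in exchange for slightly noisier aggregates.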

@jaredwhite I mean those tools (the training of the image models, the training of the audio models) were created using vast quantities of data collected by means that people increasingly find questionable.

Google voice recognition, for example, was trained on the labeled dataset from Goog411. A lot of problems that were intractable three-ish decades ago became trivial when Big Data warehouses allowed machine learning developers to start utterly swimming in labeled samples.

I should look up how Apple trained up their systems though. I know they were working on them for decades before the Big Data approach became popular.

@mtomczak Agreed, there have been tons of gotchas around a lot of this stuff. I mentioned Apple as perhaps being more forthcoming about their efforts to avoid the more icky examples but even there they've had screwups like the wrong people having direct access to Siri recordings without a proper opt-in/disclosure that they were storing people's Siri conversations.

My strong belief is that *all* data should be excluded from ML training datasets by default, with explicit opt-in. People should know what they're signing up for.

Qoto Mastodon
