@jaredwhite I lost my insurance card. As you probably know, as an American this can be dangerous to one's health.
Fortunately, I had taken a photo of it.
Unfortunately, I had no idea when, or even what year.
... fortunately, you can type "Insurance card" into Google Photos and it will find it for you. It knows what cards look like.
Thanks for ingesting all my data and looking after my well-being, Big Tech.
@jaredwhite I mean those tools (the image models, the audio models) were trained on vast quantities of data collected by means that people increasingly find questionable.
Google voice recognition, for example, was trained on the labeled dataset from Goog411. A lot of problems that were intractable three-ish decades ago became trivial when Big Data warehouses allowed machine learning developers to start utterly swimming in labeled samples.
I should look up how Apple trained their systems, though. I know they were working on them for decades before the Big Data approach became popular.
@mtomczak Agreed, there have been tons of gotchas around a lot of this stuff. I mentioned Apple as perhaps being more forthcoming about their efforts to avoid the ickier practices, but even there they've had screwups, like the wrong people having direct access to Siri recordings without proper opt-in/disclosure that people's Siri conversations were being stored at all.
My strong belief is that *all* data should be excluded from ML datasets by default, opt-in only. People should know what they're signing up for.
@mtomczak I think you're confusing recognizing objects inside of a photo with generative AI. I'm very happy to be able to dictate to Siri or whatever and have it recognize speech, or search for text and have it found inside an image. That's wholly different from generative AI. (Also, even in cases of mere recognition there have been major privacy issues, which is one reason Apple made a big deal about on-device ML and about differential privacy techniques when analyzing data in the cloud.)
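For anyone curious what "differential privacy" means in practice: one of the simplest local variants (the kind you'd run on-device before anything reaches a server) is randomized response, where each device randomly flips its answer before reporting, and the server debiases the aggregate. This is a toy sketch of that idea, not Apple's actual implementation, and the function names are mine:

```python
import math
import random

def randomized_response(true_bit, epsilon):
    """Report the true 0/1 value with probability e^eps / (1 + e^eps),
    otherwise report the flipped value. No single report reveals the truth."""
    p_truth = math.exp(epsilon) / (1 + math.exp(epsilon))
    return true_bit if random.random() < p_truth else 1 - true_bit

def estimate_frequency(reports, epsilon):
    """Recover an unbiased estimate of the true frequency of 1s
    from a batch of noisy reports."""
    p = math.exp(epsilon) / (1 + math.exp(epsilon))
    observed = sum(reports) / len(reports)
    # observed = p * f + (1 - p) * (1 - f), solve for f:
    return (observed + p - 1) / (2 * p - 1)
```

The point is that any individual report is deniable (it may have been flipped), yet over many users the noise averages out and the aggregate statistic is still accurate.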