Datasette Extract is a new Datasette plugin that uses GPT-4 (and the new GPT-4 Vision) to extract structured data from unstructured text and images and insert it into a SQLite database table. Here's a video demonstrating the plugin:

youtube.com/watch?v=g3NtJatmQR

@simon This is super interesting, thank you for sharing. I’ve been trying to create a Custom GPT to generate structured events data for my neighborhood events calendar (a hobby community service project) from info I paste in. I have to keep reminding GPT-4 to use tabs between columns so I can copy & paste into a Google spreadsheet, even though those instructions are in the system configuration. This is so much more elegant.

@smach @simon How do you do validation on the produced data? How many tests did you run? What's the error rate?

@mapto @smach I've not done any formal validation, but I've been using it on an ad hoc basis for a couple of months now and, provided the input data is reasonably clean, the error rate has been minimal.

Mistakes like the one in the video, where an event showed only a month and day and the model guessed 2019 instead of 2024, happen all the time though - you end up having to iterate on the instruction prompts quite a bit
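
One way to catch that kind of wrong-year guess is a quick sanity check on the extracted rows before trusting them. This is only a rough sketch in Python, not anything the plugin does itself: the `flag_suspicious_years` helper, the `date` field name, and the sample rows are all made up for illustration, and it assumes dates come back as ISO `YYYY-MM-DD` strings.

```python
from datetime import date


def flag_suspicious_years(rows, expected_year, date_field="date"):
    """Return rows whose date is missing, unparseable, or outside expected_year.

    Hypothetical helper: assumes each row is a dict and that dates were
    extracted as ISO 8601 strings such as "2024-04-06".
    """
    flagged = []
    for row in rows:
        value = row.get(date_field)
        try:
            parsed = date.fromisoformat(value)
        except (TypeError, ValueError):
            # Missing or malformed dates need a human look too.
            flagged.append(row)
            continue
        if parsed.year != expected_year:
            flagged.append(row)
    return flagged


# Made-up sample rows standing in for extracted events.
rows = [
    {"name": "Spring fair", "date": "2024-04-06"},
    {"name": "Book swap", "date": "2019-05-11"},
]
suspicious = flag_suspicious_years(rows, expected_year=2024)
```

Anything that ends up in `suspicious` is worth reviewing by hand, or is a hint that the instruction prompt needs to state the year explicitly.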
