Datasette Extract is a new Datasette plugin that uses GPT-4 (and the new GPT-4 Vision) to extract structured data from unstructured text and images and insert it into a SQLite database table. Here's a video demonstrating the plugin:
@simon This is super interesting, thank you for sharing. I’ve been trying to create a Custom GPT to generate structured events data for my neighborhood events calendar (a hobby community service project) from info I paste in. I have to keep reminding GPT-4 to use tabs between columns so I can copy & paste into a Google spreadsheet, even though those instructions are in the system configuration. This is so much more elegant.
@mapto @smach I've not done any formal validation, but I've been using it on an ad hoc basis for a couple of months now and, provided the input data is reasonably clean, the error rate has been minimal
Mistakes like the one in the video, where an event showed only a month and day and the model guessed 2019 instead of 2024, happen all the time though - you end up having to iterate on the instruction prompts quite a bit
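One way to guard against that kind of missing-year guess is to fill in the year in code rather than leaving it to the model. The following is a minimal sketch and not something the plugin does: the function name `fill_missing_year` and the "assume the next upcoming occurrence" rule are both assumptions made for illustration.

```python
import calendar
from datetime import date

# Map "january" -> 1, "february" -> 2, ... using the standard library's
# month names (English under the default locale).
MONTHS = {name.lower(): i for i, name in enumerate(calendar.month_name) if name}


def fill_missing_year(month_day: str, today: date) -> date:
    """Turn a string like 'March 5' into a full date.

    Hypothetical helper: assumes the event is upcoming, so it picks the
    first occurrence of that month and day on or after `today`.
    """
    month_name, day_text = month_day.split()
    month = MONTHS[month_name.lower()]
    day = int(day_text)
    candidate = date(today.year, month, day)
    # If the date has already passed this year, assume next year's occurrence.
    if candidate < today:
        candidate = date(today.year + 1, month, day)
    return candidate


if __name__ == "__main__":
    print(fill_missing_year("March 5", date(2024, 2, 1)))
    print(fill_missing_year("January 10", date(2024, 2, 1)))
```

The same idea could also be expressed in the instruction prompt itself, for example by stating today's date there, but a post-processing step like this at least makes the assumption explicit and testable.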