Has anyone ever heard of software that takes an epub, divides it up into chunks and then creates an RSS feed of those chunks dated at regular intervals? That way you could read a book alongside your blogs in your new reader. If not, should I make a thing like this?
I think I read about something similar that turns audiobooks into podcast feeds.
Boosts welcome
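As a sketch of the feed-generating half of the idea (stdlib only; the function name and structure are mine, and actually splitting the epub into chunks is left out), it might look something like:

```python
from datetime import timedelta
from email.utils import format_datetime
from xml.sax.saxutils import escape

def chunks_to_rss(title, chunks, start, interval_days=1):
    """Build an RSS 2.0 feed where chunk N is dated N intervals after
    `start`, so a feed reader surfaces one chunk per interval."""
    items = []
    for i, chunk in enumerate(chunks):
        pub = start + timedelta(days=interval_days * i)
        items.append(
            "<item>"
            f"<title>{escape(title)}, part {i + 1}</title>"
            f"<description>{escape(chunk)}</description>"
            # RSS requires RFC 822-style dates; format_datetime emits them.
            f"<pubDate>{format_datetime(pub)}</pubDate>"
            "</item>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        '<rss version="2.0"><channel>'
        f"<title>{escape(title)}</title>"
        f"{''.join(items)}"
        "</channel></rss>"
    )
```

A reader subscribed to such a feed would then see "part 1", "part 2", ... appear at the chosen cadence, just like new blog posts.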
On December 1 at 15:00 UTC, as part of #PyData Global 2022, I am leading a tutorial on #Bayesian Decision Analysis.
Learn more and register here: https://buff.ly/3gDgFLh
PyData Global uses pay-what-you-can pricing, with donations based on location, so it is accessible to all!
@pganssle @jugmac00 @hynek Officially announced today as well:
https://github.com/actions/setup-python/issues/544#issuecomment-1332535877
Inspired by @mkennedy, and the work I'm doing on profiling for Python data processing jobs, some initial scattered thoughts on how performance differs between web applications and data processing, and why they therefore require different tools.
1. Web sites are latency-focused. Web applications typically require very low latency (milliseconds!) from a _user_ perspective. Throughput matters from a website operator's perspective, but that's more about cost.
Just did first pass of fork()-based multiprocessing profiling for Sciagraph (https://sciagraph.com), a profiler for #Python #datascience pipelines.
First test passed, now to polish it up.
Notes:
1. In case you are not aware, #Python's `multiprocessing` on Linux is BROKEN BY DEFAULT (https://pythonspeed.com/articles/python-multiprocessing/).
2. As a result, this code is quite evil.
3. I am so so happy I am writing software in #Rust. Writing a robust profiler of this sort in C++ would've been way beyond my abilities.
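The broken-by-default part is that Linux defaults to fork(), which copies the parent's lock state into children and can deadlock threaded programs. A minimal sketch of the usual workaround, explicitly requesting the "spawn" start method (the helper names here are mine):

```python
import multiprocessing

def square(x):
    return x * x

def safe_map(func, items):
    # Request the "spawn" start method explicitly: children start from
    # a fresh interpreter instead of fork()ing a copy of the parent,
    # so they can't inherit locks held by the parent's threads.
    ctx = multiprocessing.get_context("spawn")
    with ctx.Pool(processes=2) as pool:
        return pool.map(func, items)
```

Call it under an `if __name__ == "__main__":` guard; spawn re-imports the main module in each child, so unguarded module-level code would run again there.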
The release candidate for tox 4 - a complete rewrite of the project - is now out; see https://tox.wiki/en/rewrite/changelog.html#v4-0-0rc1-2022-11-29. Please try it: if no show-stoppers are reported, it will become a stable release on the 6th of December 2022. I'd hate to break your CI, so test it beforehand. 😀 Thanks! https://pypi.org/project/tox/4.0.0rc1/
If anyone else finds this kind of thing useful, I'd totally love it if someone else started using this project. Particularly if you are the kind of person who is going to make lots of improvements to the front-end and then send me PRs 😉
I keep coming up with interesting improvements for this project, but I only have so much time to work on stuff like this.
I started this application in December 2016, before I knew anything about databases, so I hacked together a pseudo-DB out of YAML files, because I wanted to be able to edit the files by hand if I screwed up. As this "database" grew, parsing huge YAML files became a bottleneck; I lived with this for years, but recently, I managed to switch over to using a SQLite database!
This was surprisingly easy because I already had a pseudo-ORM, and I just load the whole "database" into memory at startup. But I'm still not using the features of a "real database": my "queries" are basically Python code iterating over dictionaries and such.
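For flavor, the load-everything-then-iterate pattern looks roughly like this (the schema and names are made up, not the app's actual code):

```python
import sqlite3

def load_table(path, table):
    # Read an entire table into memory as a list of dicts, so
    # "queries" can remain plain Python iteration over rows.
    # (The table name is interpolated directly, so it must be trusted.)
    con = sqlite3.connect(path)
    try:
        con.row_factory = sqlite3.Row
        return [dict(row) for row in con.execute(f"SELECT * FROM {table}")]
    finally:
        con.close()

# A "query" is then just a comprehension:
# long_books = [b for b in load_table("af.sqlite", "books")
#               if b["duration"] > 10 * 3600]
```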
I really like the "segmented" feed, which breaks up books along chapter and/or file boundaries, recombining them to minimize total deviation from 60-minute files. I like to listen to audiobooks in ~60 minute chunks, and this automates the process of chunking them up for me.
The implementation was a rare example where dynamic programming was useful in the wild (and not just in job interviews): https://github.com/pganssle/audio-feeder/blob/1a07c8ffa7c7b548471f979382fedb653ce6ee5a/src/audio_feeder/segmenter.py#L45-L102
Thanks to @njs for suggesting the approach and basically implementing it flawlessly on the first try.
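The shape of that dynamic program, as a toy sketch (my own simplified version; the real segmenter.py linked above handles more than this):

```python
def segment(durations, target=3600.0):
    # Partition chapter durations (seconds) into contiguous runs,
    # minimizing the summed deviation of each run from `target`.
    n = len(durations)
    prefix = [0.0]  # prefix[i] = total duration of the first i chapters
    for d in durations:
        prefix.append(prefix[-1] + d)
    # best[i] = (minimal cost for the first i chapters, last split point)
    best = [(0.0, 0)] + [(float("inf"), 0)] * n
    for i in range(1, n + 1):
        for j in range(i):
            run = prefix[i] - prefix[j]
            cost = best[j][0] + abs(run - target)
            if cost < best[i][0]:
                best[i] = (cost, j)
    # Walk the split points backwards to recover the runs.
    runs, i = [], n
    while i > 0:
        j = best[i][1]
        runs.append(durations[j:i])
        i = j
    return runs[::-1]
```

Greedy chunking can paint itself into a corner (e.g. leaving a tiny orphan segment at the end); the DP considers every split of each prefix, so the last segment is balanced against all the earlier ones.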
I've also created this probably convenient docker-compose repository for (somewhat) easily deploying `audio-feeder`: https://github.com/pganssle/audio_feeder_docker
Now featuring ✨🌟✨*installation instructions*✨🌟✨ (so fancy).
Yesterday I released version 0.6.0 of my audiobook RSS server, `audio-feeder`: https://github.com/pganssle/audio-feeder
It takes your directory of audiobooks and generates an RSS feed for each one, so that you can listen to them in your standard podcast listening flow.
I'm particularly happy with the new feature "rendered feeds", which uses `ffmpeg` behind the scenes to generate alternate feeds where the audiobook is broken up along different lines.
I was thinking about this while considering how I might optimize the resource consumption of a program where I'm ~the only user. Then I thought, "Hmmm... if I spend that same time working on an open source project currently used by millions of people, even a modest improvement could probably save more energy than my homebrew project will consume in its entire lifetime."
Fun stuff.
It would be interesting to live in a world where most people used this or something like it in production: https://www.sciagraph.com/
I'm kinda curious to know stuff like, "How much electricity would it save if time zone conversions in pandas were 20% more efficient?"
I have been using Git a long, long time. I have worked on Git clients and libraries. At some places I've worked, I am the person folks go to when they need Git help.
And yet, only today did I learn that you can pass -m to commit twice (or more), and it will do the right thing: each successive message becomes a new paragraph. That's handy for the convention of a short one-line summary followed by a more detailed body.
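In other words (the commit message here is invented, and the demo runs in a throwaway repo):

```shell
# Each -m becomes its own paragraph, so the first is the one-line
# summary and the rest form the body of the commit message.
cd "$(mktemp -d)"
git init -q
git config user.email demo@example.com
git config user.name demo
git commit --allow-empty -q \
    -m "Add segmented feeds" \
    -m "Feeds are recombined to stay close to 60 minutes per file."
git log -1 --format=%B
```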
Programmer working at Google. Python core developer and general FOSS contributor. I also post some parenting content.