Okay, you may not like it, but I rented a 1TB storage box from Hetzner for 3 euros a month, just to get that foot off my neck. It’s omega cheap and mountable via CIFS, so life is good for now. I’m still interested in what I described in the OP, and I even started scribbling some Python, but I’m too scared of fucking anything up right now.
The annoying part of writing that script was discovering that the filenames on disk don’t match the filenames in the URLs. E.g., given this URL:
https://lemmy.org.il/pictrs/image/e6a0682b-d530-4ce8-9f9e-afa8e1b5f201.png.
You’d expect to find e6a0682b-d530-4ce8-9f9e-afa8e1b5f201.png somewhere inside volumes/pictrs, right…? That’s not how it works: the filenames on disk have the exact same format, but they don’t match the ones in the URLs.
So my plan was to find non-local posts in the post table, check whether the thumbnail_url column starts with lemmy.org.il (assuming that means my instance cached it), then find the file by downloading it via the URL and scanning the pictrs directory for files that match the downloaded file’s exact size in bytes. Once a candidate is found, compare checksums to be sure it’s the same file, then delete it and delete its post entry in the database.
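A rough sketch of the size-then-checksum matching step, in Python since that’s what the script was started in. It assumes the bytes served at the URL are identical to the bytes stored on disk, which may not hold if pict-rs converts or re-encodes images when serving them. The names find_matching_files and sha256_of_file are made up for illustration, and nothing here deletes anything:

```python
import hashlib
from pathlib import Path


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large images are not read into memory at once."""
    digest = hashlib.sha256()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_matching_files(downloaded: bytes, root: Path) -> list[Path]:
    """Return files under root whose size and SHA-256 both match the downloaded bytes.

    Hypothetical helper: `downloaded` is the body fetched from the image URL,
    `root` is the pictrs directory (e.g. volumes/pictrs).
    """
    target_size = len(downloaded)
    target_hash = hashlib.sha256(downloaded).hexdigest()
    matches = []
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        # Cheap filter first: only hash files with the exact same byte size.
        if path.stat().st_size != target_size:
            continue
        # Size matched, so confirm with a checksum before trusting it.
        if sha256_of_file(path) == target_hash:
            matches.append(path)
    return matches
```

The downloaded bytes could come from something like urllib.request.urlopen(url).read(). Returning the matches instead of deleting them makes it easy to do a dry run and eyeball the list before touching any files or database rows.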
When I get close to 1TB I’ll get back here for this idea… :P
I’m with you