https://www.youtube.com/watch?v=u9lowRlI0EQ
windows
zip:
https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.Windows.-.Extract.only.zip
exe:
https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.Windows.-.Installer.exe
os x
app:
https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.OS.X.-.App.dmg
linux
tar.gz:
https://github.com/hydrusnetwork/hydrus/releases/download/v353/Hydrus.Network.353.-.Linux.-.Executable.tar.gz
source
tar.gz:
https://github.com/hydrusnetwork/hydrus/archive/v353.tar.gz
I had a great week. I finished the basics of the new file maintenance system I wanted, cleaned up the duplicate filter a little more, and fixed a bunch of bugs.
file maintenance system
There are a number of large file re-checking jobs the client wants to do, both now and in the future. Going back to figure out more accurate video durations and image rotations, discovering webms that were formerly incorrectly detected as mkvs, eventually integrating videos into the duplicate checking system, all of these will require a combined whack of maintenance CPU that I don't want to hit all at once. I have previously sketched out some disparate systems for these jobs, but none were really doing the trick, so this week I unified it all into one nice system that can handle all sorts of jobs. This new system is simple for now but will get more work in future.
You do not have to do anything, but if you pay attention to your maintenance work, you may notice some new file metadata and thumbnail jobs running in the background or on shutdown. This new system fits into normal maintenance just like database analyzing or repository processing. You can govern whether it is permitted to run in regular idle time and/or shutdown time under
options->maintenance and processing and also change its in-built 'throttle', which limits the number of files it will work on (rather than running full bore on what may in future be quite large jobs). The default throttle is 200 files every day, which for most jobs on most machines will be about 30 seconds to three minutes of work. Do not expect it to do much work yet.
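If a sketch helps, here is roughly how a throttle like this can work. To be clear, this is illustration only, not the client's actual code; all the names here (MaintenanceQueue, FILES_PER_DAY, process_file_job) are made up:

import time

FILES_PER_DAY = 200  # the default throttle from the options

def process_file_job(job):
    pass  # placeholder: regen metadata, rebuild a thumbnail, and so on

class MaintenanceQueue:
    def __init__(self):
        self.pending_jobs = []  # (file_hash, job_type) pairs waiting for work
        self.work_log = []      # timestamps of recently finished files

    def files_done_in_last_day(self):
        one_day_ago = time.time() - 86400
        self.work_log = [t for t in self.work_log if t > one_day_ago]
        return len(self.work_log)

    def do_some_work(self):
        # called in idle/shutdown maintenance time; stops as soon as the
        # daily budget is spent, so big queues drain slowly over many days
        while self.pending_jobs and self.files_done_in_last_day() < FILES_PER_DAY:
            job = self.pending_jobs.pop(0)
            process_file_job(job)
            self.work_log.append(time.time())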
Existing file regeneration routines now work through this system, and it does its job much better than before. If you hit
right-click->regenerate->x on some thumbnails, the job now runs in a regular popup button (rather than the locking 'modal' one from before), letting you keep browsing while it works. And if you select more than 50 thumbnails (think, say, right-clicking on 2,000 video files and saying to regenerate their thumbnails if they are the wrong size), you will now get the option to schedule that big job for later, at which point those 2,000 jobs will end up in the normal idle maintenance queue, to work at 200 files a day or whatever you wish.
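Again as illustration only, with invented names, the 'do it now or later?' decision looks something like this:

DEFER_THRESHOLD = 50  # selections larger than this get the 'do it later?' dialog

def do_job_now(job):
    pass  # placeholder: run the regen immediately, behind a popup button

def schedule_thumbnail_regen(selected_hashes, idle_queue, user_says_later):
    jobs = [(h, 'regen_thumbnail') for h in selected_hashes]
    if len(jobs) > DEFER_THRESHOLD and user_says_later():
        # hand the whole batch to the idle maintenance queue, where the
        # daily throttle will drain it a couple of hundred files at a time
        idle_queue.extend(jobs)
    else:
        for job in jobs:
            do_job_now(job)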
This system is fairly opaque at the moment. You can trigger it with the thumbnail right-click, and certain db operations may schedule new jobs for it, but there is no UI to review it yet. In the coming weeks, I expect to write a new 'review' window off the
database menu that will let you review total pending jobs, start work manually, and add and remove pending jobs
en masse through the regular search interface. I'll slowly integrate more of the client into it as well, letting it add more jobs into the queue by itself.
Let me know how this all works for you!
duplicate filter
The duplicate filter interface got some more work this week, particularly in cleaning up some of my original version's over-engineering. The actions you can choose on the right panel are now split more clearly into 'yes, these files are duplicates, and here is how' decisions vs the 'alternates' and 'not dupes' decisions. Also, 'this file is better' is now split into two buttons for 'delete the worse file' and 'keep both'. This 'delete or not' is split at the shortcut level into two actions as well, if you wish to map both. Existing shortcuts (left-click by default in the filter) will update to the 'and delete the worse file' version.
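For the curious, the auto-update could be as simple as a lookup table. The action strings here are the real ones (they appear in the changelog below), but the migration function itself is just an illustrative sketch:

LEGACY_ACTION_UPDATES = {
    'duplicate_filter_this_is_better': 'duplicate_filter_this_is_better_and_delete_other',
    'duplicate_filter_not_dupes': 'duplicate_filter_false_positive',
}

def update_shortcut_action(action):
    # anything not in the table, including the new
    # 'duplicate_filter_this_is_better_but_keep_both', passes through untouched
    return LEGACY_ACTION_UPDATES.get(action, action)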
The complicated 'duplicate action options' object (which governs how to merge metadata across duplicates) therefore no longer handles file deletion. It is also now only attached to the 'better/worse' and 'files are the same' actions–we never found a good reason to merge metadata across all alternates or 'not duplicates', so I have removed it completely. If you want a complicated file delete action, hitting the 'custom action' button now asks you if you wish to delete the file you are looking at, the other one, or both.
Also, to reduce confusion with alternates–which are also technically not duplicates–'not duplicates' is now renamed across the program to the more precise 'not related/false positive'. The 'false positive' action is a record in the db saying 'despite the similar files search thinking these files were related, it was incorrect, so do not bring it up again'.
My hope is that filtering is a bit faster here. If two duplicate files are of very different quality, it is still easy to delete the bad one, but if they are closer in quality and you want to keep both, it is now just one click.
As for the big db-level rewrite, I prepped the duplicate db code for it this week. I am standing at the cliff-edge and feel great about jumping off, so next week I hope to get started on the new code properly and migrate one or both of the current 'alternates' and 'false positive' data to the new system.
the rest
I fixed an issue with the recent 'collect by' session saving where the accompanying sort was not being renewed on a session load. Also, several problems with collected media and sort by 'approx bitrate' are fixed.
There's a new checkbox under
options->sort/collect that makes it so the default sort is updated every time you click a new sort in regular browsing. It sounds like a pain but is actually pretty neat!
The 'all local files' domain is now hidden from view in new page selection and the tag autocomplete dropdown if you are not in advanced mode. This domain, which is fairly technical and covers trash, 'my files', and the sometimes-hidden repository update files, is often confusing to new users and is rarely useful even for people who know what it does.
If you use the client's local booru and need to override its host when you copy an external link, this option has moved from
options->connection to the local booru's
manage services panel. You can now override scheme and port as well! The old host override option is gone completely, and the only other place it was used, the
manage upnp dialog, now fetches this info more efficiently and fails more gracefully.
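As a rough illustration of how such an override might compose a link, here is a sketch. It assumes the local booru's default port of 45866 and a '/gallery?share_key=' path, and the helper names are invented:

def get_external_ip():
    return '127.0.0.1'  # placeholder; the real client can fetch this via upnp

def external_booru_url(share_key, scheme=None, host=None, port=None):
    # any part not overridden in the manage services panel falls back to a default
    scheme = scheme or 'http'
    host = host or get_external_ip()
    port = port or 45866  # assumed default local booru port
    return f'{scheme}://{host}:{port}/gallery?share_key={share_key}'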
full list
- duplicate filter:
- duplicate action options no longer handle file deletion
- renamed 'not duplicates' across the program to 'not related' or 'false positive'
- 'alternates' and 'not related/false positive' duplicate actions no longer have duplicate action options. no content merge now occurs on these actions
- the duplicate filter hover panel now splits 'this is better' decisions into two buttons–whether to delete or keep the worse file
- when selecting 'custom action' in the duplicate filter hover panel, it now asks if you would like to delete the current file, the other file, or both
- the 'duplicate_filter_this_is_better' shortcut action will be auto-updated to 'duplicate_filter_this_is_better_and_delete_other'. an alternate 'duplicate_filter_this_is_better_but_keep_both' is now also available
- the 'duplicate_filter_not_dupes' shortcut action will be auto-updated to 'duplicate_filter_false_positive'
- separated the buttons on the duplicate filter hover panel to more carefully split 'yes, files are duplicates' vs other decisions
- in prep for the duplicate db overhaul, refactored all PHash search code and Duplicate management code apart
- misc other prep work for duplicate db overhaul
- .
- file maintenance:
- wrote a new unified manager to handle various long-term file maintenance tasks like regenerating file metadata and thumbnails
- options to govern how this manager can run are now in options->maintenance and processing. you can enable it for idle and shutdown maintenance time and give it a throttle to limit how fast it will work on files, defaulting to 200 per day
- unified the previous db-level attempts at file maintenance into the new system, which supports async job queueing, and moved the regen code up to the new manager, out of the db lock
- unified a variety of file and thumbnail regen code to work through the new simpler and saner path
- the right-click->regen thumbnail commands now run through the new manager and no longer need a modal popup. you can keep browsing while they work. they will also not hang the ui as the old system could on big jobs
- when right-click->regenning more than 50 thumbnails, you now get a dialog asking if you want to do the job now or put it off until later
- file maintenance tasks can now run in shutdown time! you will get previews of the jobs with file counts and status progress reports on the shutdown splash
- cleaned up some file extension renaming and dupe-removing code
- in future, I will move the current file integrity check to this new system and have some ui to prompt and set up other big jobs, like fixing various historical misparsing issues
- resizing thumbnails down during the thumbnail fade is now more efficient
- moved the ClientFilesManager to ClientFiles.py
- .
- the rest:
- the 'manage upnp' dialog now moves the duplicated external ip display from the column up to the status text at the top. it fetches the ip after the initial mappings fetch is done. this ip is no longer affected by the external host override option
- cleaned up options->connection page and removed the now defunct external host override option
- the manage services page for the local booru now has optional override for scheme, host, and port for the 'copy external url' function
- fixed an issue with the recent 'collect by' session saving where a restored session that needed a collect was not sorted
- fixed an issue with collections being sorted by approx bitrate
- added a new checkbox to options->sort/collect to set it so the default sort updates every time you choose a new sort anywhere
- fixed an issue with 'remove trashed files from view', which was incorrectly removing on 'all local files' pages
- the 'all local files' file domain, which is frequently confusing to new users, is now no longer an option for new file pages or the autocomplete file domain if the user is not in advanced mode
- the client now searches for versions of urls both with and without a final '/' character when looking up file url import status at the db level and in import lists (there is a small sketch of this after the list). system:known_url is unfortunately still an inefficient mess
- improved how the server code deals with some connectionLost errors
- cleaned up and unified some older dialog button code
- fixed a problem in manage tag siblings when petitioning existing pairs and then cancelling when asked for a reason
- fixed a miscount issue when uploading pending tags while many new tags are coming in. progress would sometimes be -754/1,234, ha ha
- db maintenance, repository sync, and file maintenance processing will all now wake on a force idle mode call
- deleted some old code
- misc fixes and cleanup
- some misc gui layout fixes
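As mentioned in the url status item above, here is a small sketch of the trailing-slash lookup idea. The function names are illustrative, not the client's real code:

def candidate_urls(url):
    # try both the given form and its trailing-slash twin
    if url.endswith('/'):
        return (url, url[:-1])
    return (url, url + '/')

def get_url_import_status(url, status_lookup):
    # status_lookup stands in for the db/import-list check, e.g. a dict
    for candidate in candidate_urls(url):
        if candidate in status_lookup:
            return status_lookup[candidate]
    return None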
next week
I have quite a few smaller jobs waiting for me, so other than the new duplicate db tables, those are the top priority. Some UI bugs to deal with, maybe some Client API work, an experimental jpeg quality estimator, possibly support for some new filetypes, and hopefully a fun new way to quickly add very complicated OR search predicates thanks to a clever user's work.
Just a note, E3 is coming up soon and I will take my shitposting vacation week for it as usual. I think it'll be 356 that's delayed a week.