https://www.youtube.com/watch?v=NBpReDYG1Fk
windows
zip:
https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.Windows.-.Extract.only.zip
exe:
https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.Windows.-.Installer.exe
os x
app:
https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.OS.X.-.App.dmg
linux
tar.gz:
https://github.com/hydrusnetwork/hydrus/releases/download/v354/Hydrus.Network.354.-.Linux.-.Executable.tar.gz
source
tar.gz:
https://github.com/hydrusnetwork/hydrus/archive/v354.tar.gz
I had a great week. The first duplicates storage update is done, and I got some neat misc fixes in as well.
false positives and alternates
The first version of the duplicates system did not store 'false positive' and 'alternates' relationships very efficiently. Furthermore, it was not until we really used it in real scenarios that we found the way we wanted to logically apply these states was also not being served well. This changes this week!
So, 'false positive' (up until recently called 'not dupes') and 'alternates' (which are 'these files are related but not duplicates', and sitting in a holding pattern for a future big job that will allow us to process them better) are now managed in a more intelligent storage system. On update, your existing relationships will be auto-converted.
This system uses significantly less space, particularly for large groups of alts like many game cg collections, and applies relationships
transitively by its very structure. Alternates are now completely transitive, so if you have a A-alt-B set (meaning one group of duplicate files A has an alternate relationship to another group of duplicate files B) and then apply A-alt-C, the relationship B-alt-C will also apply without you having to do anything. False positive relationships are more tricky, but they are stored significantly more efficiently and also apply on an 'alternates' level, so if you have A-alt-B and add A-fp-M, B-fp-M will be automatically inferred. It may sound odd at first to think that something that is false positive to A could be nothing but false positive to B, but consider: if our M were same/better/worse to B, it would be
in B and hence transitively alternate to A, which it cannot be as we already determined it was not related to A.
Mistakenly previously allowable states, such as a false positive relationship within an alternates group, will be corrected on the update. Alternates will take precedence, and any subsequently invalid false positives will be considered mistakes and discarded. If you know there are some problems here, there is unfortunately no easy way at the moment to cancel or undo an alternate or false positive relationship, but once the whole duplicates system is moved over I will be able to write a suite of logically correct and more reliable reset/break/dissolve commands for all the various states in which files can be related with each other. These 'reset/set none' tools never worked well in the old system.
The particularly good news about these changes is it cuts down on filtering time. Many related 'potential' duplicates can be auto-resolved from a single alternate or false positive decision, and if you have set many alts or false positives previously, you will see your pending potentials queue shrink considerably after update, and as you continue to process.
With this more logically consistent design, alternate and false positive counts and results are more sensible and consistent when searched for or opened from the
advanced mode thumbnail right-click menu. All the internal operations are cleaner, and I feel much better about working on an alternates workflow in future so we can start setting 'WIP' and 'costume change'-style labels to our alternates and browsing them more conveniently in the media viewer.
The actual remaining duplicate relations, 'potential', 'same quality', and 'this is better', are still running on the old inefficient system. They will be the next to work on. I would like to have 'same quality' and 'this is better' done for 356, after E3.
the rest
The tag blacklists in downloaders'
tag import options now apply to the page's tags after tag sibling processing. So, if you banned 'high heels', say, but the site suddenly starts delivering 'high-heel shoes', then as long as that 'high-heel shoes' gets mapped to 'high-heels' in one of your tag services, it should now be correctly filtered. The unprocessed tags are still checked as before, so this is really a double-round of checking to cover more valid ground. If you were finding you were chasing many different synonyms of the tags you do not like, please let me know how this now works for you.
The annoying issue where a handful of thumbnails would sometimes stop fading in during heavy scrolling seems to be fixed! Also, the thumbnail 'waterfall' system is now cleverer about how it schedules thumbnails that need regeneration. You may notice the new file maintenance manager kicking in more often with thumbnail work.
Another long-term annoying issue was the 'pending' menu update-flickering while a lot of tag activity was going on. It would become difficult to interact with. I put some time into this this week, cleaning up a bunch of related code, and I think I figured out a reliable way to make the menus on the main gui stop updating as long as one is open. It won't flicker and should let you start a tags commit even when other things are going on. The other menus (like the 'services' menu, which will update on various service update info) will act the same way. Overall menu-related system stability should be improved for certain Linux users as well.
The new 'fix siblings and parents' button on
manage tags is now a menu button that lets you apply siblings and parents from all services or just from the service you are looking at. These commands overrule your 'apply sibs/parents across all services' settings. So, if you ever accidentally applied a local sibling to the PTR or
vice versa, please try the specific-service option here.
full list
- duplicates important:
- duplicates 'false positive' and 'alternates' pairs are now stored in a new more efficient structure that is better suited for larger groups of files
- alternate relationships are now implicitly transitive–if A is alternate B and A is alternate C, B is now alternate C
- false positive relationships remain correctly non-transitive, but they are now implicitly shared amongst alternates–if A is alternate B and A is false positive with C, B is now false positive with C. and further, if C alt D, then A and B are implicitly fp D as well!
- your existing false positive and alternates relationships will be migrated on update. alternates will apply first, so in the case of conflicts due to previous non-excellent filtering workflow, formerly invalid false positives (i.e. false positives between now-transitive alternates) will be discarded. invalid potentials will also be cleared out
- attempting to set a 'false positives' or 'alternates' relationship to files that already have a conflicting relation (e.g. setting false positive to two files that already have alternates) now does nothing. in future, this will have graceful failure reporting
- the false positive and alternate transitivity clears out potential dupes at a faster rate than previously, speeding up duplicate filter workflow and reducing redundancy on the human end
[Expand Post]
- unfortunately, as potential and better/worse/same pairs have yet to be updated, the system may report that a file has the same alternate as same quality partner. this will be automatically corrected in the coming weeks
- when selecting 'view this file's duplicates' from thumbnail right-click, the focus file will now be the first file displayed in the next page
- .
- duplicates boring details:
- setting 'false positive' and 'alternates' status now accounts for the new data storage, and a variety of follow-on assumptions and transitive properties (such as implying other false positive relationships or clearing out potential dupes between two groups of merging alternates) are now dealt with more rigorously (and moreso when I move the true 'duplicate' file relationships over)
- fetching file duplicate status counts, file duplicate status hashes, and searching for system:num_dupes now accounts for the new data storage r.e. false positives and alternates
- new potential dupes are culled when they conflict with the new transitive alternate and false positive relationships
- removed the code that fudges explicit transitive 'false positive' and 'alternate' relationships based on existing same/better/worse pairs when setting new dupe pairs. this temporary gap will be filled back in in the coming weeks (clearing out way more potentials too)
- several specific advanced duplicate actions are now cleared out to make way for future streamlining of the filter workflow:
- removed the 'duplicate_media_set_false_positive' shortcut, which is an action only appropriate when viewing confirmed potentials through the duplicate filter (or after the ' show random pairs' button)
- removed the 'duplicate_media_remove_relationships' shortcut and menu action ('remove x pairs … from the dupes system'), which will return as multiple more precise and reliable 'dissolve' actions in the coming weeks
- removed the 'duplicate_media_reset_to_potential' shortcut and menu action ('send the x pairs … to be compared in the duplicates filter') as it was always buggy and lead to bloating of the filter queue. it is likely to return as part of the 'dissolve'-style reset commands as above
- fixed an issue where hitting 'duplicate_media_set_focused_better' shortcut with no focused thumb would throw an error
- started proper unit tests for the duplicates system and filled in the phash search, basic current better/worse, and false positive and alternate components
- various incidences of duplicate 'action options' and similar phrasing are now unified to 'metadata merge options'
- cleaned up 'unknown/potential' phrasing in duplicate pair code and some related duplicate filter code
- cleaned up wording and layout of the thumbnail duplicates menu
- .
- the rest:
- tag blacklists in downloaders' tag import options now apply to the parsed tags both before and after a tag sibling collapse. it uses the combined tag sibling rules, so feedback on how well this works irl would be appreciated
- I believe I fixed the annoying issue where a handful of thumbnails would sometimes inexplicitly not fade in after during thumbgrid scrolling (and typically on first thumb load–this problem was aggravated by scroll/thumb-render speed ratio)
- when to-be-regenerated thumbnails are taken off the thumbnail waterfall queue due to fast scrolling or page switching, they are now queued up in the new file maintenance system for idle-time work!
- the main gui menus will now no longer try to update while they are open! uploading pending tags while lots of new tags are coming in is now much more reliable. let me know if you discover a way to get stuck in this frozen state!
- cleaned up some main gui menu regeneration code, reducing the total number of stub objects created and deleted, particularly when the 'pending' menu refreshes its label frequently while uploading many pending tags. should be a bit more stable for some linux flavours
- the 'fix siblings and parents' button on manage tags is now a menu button with two options–for fixing according to the 'all services combined' siblings and parents or just for the current panel's service. this overrides the 'apply sibs/parents across all services' options. this will be revisited in future when more complicated sibling application rules are added
- the 'hide and anchor mouse' check under 'options->media' is no longer windows-only, if you want to test it, and the previous touchscreen-detecting override (which unhid and unanchored on vigorous movement) is now optional, defaulting to off
- greatly reduced typical and max repository pre-processing disk cache time and reworked stop calculations to ensure some work always gets done
- fixed an issue with 'show some random dupes' thumbnails not hiding on manual trashing, if that option is set. 'show some random dupes' thumbnail panels will now inherit their file service from the current duplicate search domain
- repository processing will now never run for more than an hour at once. this mitigates some edge-case disastrous ui-hanging outcomes and generally gives a chance for hydrus-level jobs like subscriptions and even other programs like defraggers to run even when there is a gigantic backlog of processing to do
- added yet another CORS header to improve Client API CORS compatibility, and fixed an overauthentication problem
- setting a blank string on the new local booru external port override option will now forego the host:port colon in the resultant external url. a tooltip on the control repeats this
- reworded and coloured the pause/play sync button in review services repository panel to be more clear about current paused status
- fixed a problem when closing the gui when the popup message manager is already closed by clever OS-specific means
- misc code cleanup
- updated sqlite on windows to 3.28.0
- updated upnpc exe on windows to 2.1
next week
The duplicates work this week took more time than I expected. I still have many small jobs I want to catch up on, so I am shunting my rotating schedule down a week and doing a repeat. I will add some new shortcuts, some new tab commands, and hopefully a clipboard URL watcher and a new way of adding OR search predicates.