https://www.youtube.com/watch?v=3-K94kdbTS8
windows
zip:
https://github.com/hydrusnetwork/hydrus/releases/download/v412/Hydrus.Network.412.-.Windows.-.Extract.only.zip
exe:
https://github.com/hydrusnetwork/hydrus/releases/download/v412/Hydrus.Network.412.-.Windows.-.Installer.exe
macOS
app:
https://github.com/hydrusnetwork/hydrus/releases/download/v412/Hydrus.Network.412.-.macOS.-.App.dmg
linux
tar.gz:
https://github.com/hydrusnetwork/hydrus/releases/download/v412/Hydrus.Network.412.-.Linux.-.Executable.tar.gz
source
tar.gz:
https://github.com/hydrusnetwork/hydrus/archive/v412.tar.gz
I had a great week catching up on smaller jobs, improving search speeds, and adding a 'lite' 407->408 update mode for HDD users who sync with the PTR. There are also a couple of new applications for the Client API.
The update this week will take a few seconds to a few minutes as new database indices are created.
sibling and search speeds
Some PTR-syncing HDD users have reported that the new siblings update code, most importantly in step 407->408, takes way too long for them - perhaps more than 24 hours. I have written a little yes/no dialog popup into the update step that talks about this and optionally activates a 'lite' mode that does not apply siblings. This still requires some basic cache copying work, but it is significantly less. If you are still on 407 or before and have been waiting to update, please give this a go and let me know how it works for you.
The 'manage where tag siblings apply' dialog now has some red text to warn about the high CPU/HDD cost of applying many siblings to a large number of tags. I am still not happy with the 'monolithic' way this db work goes, so when I get stuck into the parents cache, I will write an asynchronous system that does this work in the background, pausable and resumable, without interrupting browsing and so on, much like I did with repository processing.
Some things have been running slowly since the siblings update (e.g. a search that mixes wildcard tags with regular tags), so I went through every instance of my new optimisation code, fixing bugs, testing it at large scale, and smoothing spikes out further. Tag, namespace, wildcard, tag presence/count, known url, and file note searches should all be more reasonable. A neat new tag search pre-optimisation routine, which checks autocomplete counts for expected result size before deciding how to search, now works for more sorts of tags and also kicks in for namespace and wildcard searches, which now break their work into smaller and simpler pieces. I also added and reshaped some database indices, which will ensure that more unusual search types and general operations can still run efficiently. The update will take a few seconds to a few minutes as tag indices are regenerated.
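To illustrate the general idea behind a count-based pre-optimisation, here is a minimal sketch. It is not the hydrus code: the in-memory dictionaries, the tag data, and the function names are all hypothetical stand-ins, and the real routine works against the database and makes more nuanced decisions. The sketch only shows why consulting expected counts first can help: the tag with the smallest expected result set is searched first, so every later step filters a small candidate set.

```python
from typing import Dict, List, Set

# Hypothetical in-memory stand-in for the tag -> files mappings in the database.
TAG_TO_FILE_IDS: Dict[str, Set[int]] = {
    "character:samus aran": {1, 2, 3},
    "series:metroid": {2, 3, 4, 5},
    "blue eyes": set(range(1, 5001)),
}

# Hypothetical stand-in for the autocomplete cache: tag -> expected file count.
AUTOCOMPLETE_COUNTS: Dict[str, int] = {
    tag: len(file_ids) for tag, file_ids in TAG_TO_FILE_IDS.items()
}


def plan_search_order(tags: List[str]) -> List[str]:
    """Order tags so the one with the smallest expected count is searched first.

    Tags missing from the cache are treated as having zero results, so they
    run first and let the search stop early.
    """
    return sorted(tags, key=lambda tag: AUTOCOMPLETE_COUNTS.get(tag, 0))


def search(tags: List[str]) -> Set[int]:
    """Return the ids of files that have every tag in `tags`."""
    ordered = plan_search_order(tags)
    if not ordered:
        return set()

    # Start from the smallest expected result set.
    results = set(TAG_TO_FILE_IDS.get(ordered[0], set()))

    for tag in ordered[1:]:
        if not results:
            break  # nothing left to filter, so skip the expensive tags
        candidates = TAG_TO_FILE_IDS.get(tag, set())
        # Only the current (small) result set is walked, not the big tag.
        results = {file_id for file_id in results if file_id in candidates}

    return results
```

With this toy data, a search for all three tags starts from the three 'character:samus aran' files and never iterates over the 5,000 'blue eyes' files.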
I have learned a bunch about speeding up multi-predicate searches recently - how to get it wrong and how to get it right. I have a plan to speed up rating and known url results, which are still generally not able to speed up with multiple predicates on large clients.
new client api applications
A user has been working hard at making a web browser for the client via the Client API, called Hydrus Web. It is now ready at https://github.com/floogulinc/hydrus-web ! If you have a bit of networking experience, please check it out - it allows you to browse your client on your phone!
Also, Anime Boxes, a booru-browsing application, is adding Hydrus as an experimental browseable 'server' this week, also through the Client API. Check it out at https://www.animebox.es/ !
I also updated the Client API help to talk more about HTTPS and connections across the internet, here:
https://hydrusnetwork.github.io/hydrus/help/client_api.html
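As a rough illustration of the /add_urls/add_url changes in this release (the 'service_names_to_additional_tags' and 'filterable_tags' parameters described in the full list), here is a minimal sketch that builds such a request. The address, the access key, the 'my tags' service name, and the example URL and tags are placeholders and assumptions, and the helper function name is made up for this example. Please check the Client API help for the authoritative request format.

```python
import json
import urllib.request
from typing import Dict, List

# Assumptions: a Client API service on the local machine at this address,
# and a placeholder access key. Replace both with your own values.
API_BASE = "http://127.0.0.1:45869"
ACCESS_KEY = "replace-with-your-access-key"


def build_add_url_request(
    url: str,
    additional_tags: Dict[str, List[str]],
    filterable_tags: List[str],
) -> urllib.request.Request:
    """Build (but do not send) a POST request for /add_urls/add_url."""
    body = {
        "url": url,
        # Added as 'additional' tags, not filtered by tag import options.
        "service_names_to_additional_tags": additional_tags,
        # Merged with parsed tags and filtered by the tag import options.
        "filterable_tags": filterable_tags,
    }
    return urllib.request.Request(
        API_BASE + "/add_urls/add_url",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Hydrus-Client-API-Access-Key": ACCESS_KEY,
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Hypothetical example values; 'my tags' is assumed to be a local tag service.
request = build_add_url_request(
    "https://example.com/post/123",
    {"my tags": ["imported via api"]},
    ["character:samus aran", "series:metroid"],
)

# To actually send it, with the client running and the key permitted:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```

The request is built separately from sending so the payload can be inspected first. If you expose the Client API over HTTPS or across the internet, the address and certificate handling will differ from this local example.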
full list
- client api:
- added Hydrus Web, https://github.com/floogulinc/hydrus-web, to the Client API page. It allows you to access your client from any web browser
- added Anime Boxes, https://www.animebox.es/, to the Client API page. This booru-browsing application can now browse hydrus!
- the /add_urls/add_url command's 'service_names_to_tags' parameter now correctly acts like 'additional' tags, and is no longer filtered by any tag import options that may apply. that old name still works, but the more specific synonym 'service_names_to_additional_tags' is now supported and recommended (issue #456)
- the /add_urls/add_url command now takes a 'filterable_tags' parameter, which will be merged with any parsed tags and will be filtered in the same per-service way according to the current tag import options.
- the client api help is updated to talk about this, and the client api version is now 14
- updated client api help to talk about http/https
- .
- the rest:
- the 407->408 update step now opens a yes/no dialog before it happens to talk about the big amount of CPU and HDD work coming up. it offers the previous 'full' version that takes all the work, and a 'lite' version that applies no siblings and is much cheaper. if you have been waiting on a PTR-syncing HDD client, this should let you update in significantly less time. there is still some copy work in lite mode, but it should not be such a killer
- the 'manage where tag siblings apply' dialog now has big red warning text talking about the large CPU/HDD work currently involved in very big changes
- a bunch of file-location loading and searching across the program has the opportunity to run very slightly faster, particularly on large systems. update will take a few seconds to make these new indices
- namespace and subtag tag searches and other cross-references now have the opportunity to run faster. update will take another couple of minutes to drop and remake new indices
- gave tag and wildcard search a complete pass, fixing and bettering my recent optimisations, and compressing the core tag search optimisation code to one location. thank you for the feedback everyone, and sorry for the recent trouble as we have migrated to the new sibling and optimisation systems
- gave untagged/has_tags/has_count searches a similar pass, mostly fixing up namespace filtering
- gave the new siblings code a similar pass, ensuring a couple of fetches always run the fast way
- gave url search and fetch code a similar pass, accounting better for domain cross-referencing and file cross-referencing
- fixed a typo bug when approving/denying repository file and mapping petitions
- fixed a bug when right-clicking a selection of multiple tags that shares a single subtag (e.g. 'samus aran' and 'character:samus aran')
- thanks to some nice examples of unusual videos that were reported as 1,000fps, I improved my fallback ffmpeg metadata parsing to deal with weird situations more cleverly. some ~1,000fps files now reparse correctly to sensible values, but some either really produce 1000 updates a second due to malformation or bad creation, or are just handled that way due to a bug in ffmpeg that we will have to wait for a fix for
- the hydrus jpeg mime type is now the correct image/jpeg, not image/jpg, thanks to users for noticing this (issue #646)
- searching for similar files now requires up to 10,000x less sqlite query initiation overhead for large queries. the replacement system has overhead of its own, but it should be faster overall
- improved error handling when a database cannot connect due to file system issues
- the edit subscription(s) panels should be better about disabling the ui while heavy jobs, like large subscription resets, are running
- the edit subscription(s) panels now do not allow an 'apply' if a big job is currently disabling the ui
- cancelling a manage subscriptions call when missing query logs were detected no longer causes a little error
- if a long-running asynchronous subscription job lasts beyond its parent's life, it now handles errors better
- .
- boring details:
- improved a pre-optimisation decision tool for tag search that consults the autocomplete cache for expected end counts in order to make a better decision. it now handles subtag searches and multiple namespace/subtag searches such as for wildcards
- wrote fast tag lookup tools for subtag and multiple namespace/subtag
- fixed some bad simple tag search optimisation code, which was doing things in the wrong order!
- optimised simple tag search optimisations when doing subtag searches
- polished simple tag search code a bit more
- added brief comments to all the new cross joins to reinforce their intention
- greatly simplified the multiple namespace/subtag search used by wildcards
- fixed and extended tag unit tests for blacklist, filterable, additional, service application, overwrite deleted filterable, and overwrite deleted additional
- added a unit test for tag whitelist
- extended the whole 'external tags' pipeline to discriminate between filterable and additional external tags, and cleaned up several parts of the related code
- moved the edit subscription panel asynchronous info fetch code to my new async job object
- cleaned up one last ugly 'fetch query log containers' async call in edit subscriptions panel
- moved the edit subscription(s) panels asynchronous log container code to my new async job object
- misc code cleanup
next week
More small jobs and other bug fixes. Nothing too huge, so I can have a 'clean' release before I go for the big parents cache in 414. I am starting to feel a bit ill, so there's a chance it will be a light week.