/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.



(9.32 KB 480x360 s_sQcYNHTcY.jpg)

Version 358 hydrus_dev 07/03/2019 (Wed) 23:38:21 Id: 5f3432 No. 13106
https://www.youtube.com/watch?v=s_sQcYNHTcY

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v358/Hydrus.Network.358.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v358/Hydrus.Network.358.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v358/Hydrus.Network.358.-.OS.X.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v358/Hydrus.Network.358.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v358.tar.gz

I had a great week doing duplicates work and fixing bugs.

duplicates

I split this big duplicates storage overhaul into three jobs, and this week marks the third and final job done. Like alternates and duplicates information, potential pairs are now stored in a unified and more efficient way. On the front end, you may notice your potential pairs queue shorten again this week. It will also shrink faster as you process in the duplicate filter, which will present more 'useful' duplicate pairs first and apply your decisions more intelligently at the db level.

All the code is simpler except for one key area. If you notice the 'show some random potentials' button, the duplicate filter, or potential counts taking way too long to load, please let me know about your situation. It is possible I will have to revisit this complicated 'join', although in my tests it is performing well.

Also, I have added new logic to stop alternate pairs coming up in the duplicate filter repeatedly, as some users have experienced. These relationships are now more concrete, and this concreteness is plugged into duplicate merge operations and so on. You may see one more round of alternates appearing, and then they will be saved properly.

Now that everything is stored on the new system, there are two main jobs remaining: re-adding various administrative commands like remove/dissolve to properly undo relationships, and adding some options and improvements to the duplicate filter workflow.

the rest

Pixiv changed their format recently, so hydrus's default parser broke. This should be automatically fixed this week. Thanks to the user who sent in this fix.

The issue where mouse scroll events were not being caught when a media viewer did not have focus is also fixed.

The 'watcher' page now reports file and check status in the 'status' column! I missed this somehow when I added it for the gallery downloader. This makes it just a little easier to see what a list of threads is currently doing.

I may have fixed the problem where exiting manage tags from a media viewer sometimes drops focus back to the main gui. Please let me know if you still get this (and if so, whether you know a way to reliably repeat this behaviour).

I improved some of the network engine's 'this connection ended early' checks. This may have fixed some issues users had downloading images and page data from some unreliable servers, but if it does not, please send me any incomplete jpegs and the URLs they came from so I can check further on my end. Also, the whole system is more strict about response lengths now, so if you discover false-positive network failures here, please report them.

Also, some server issues related to last week's client api authentication improvements (such as file repository file upload sometimes breaking) should be fixed.
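To make the storage change above a little more concrete, here is a minimal sketch of storing potential pairs between duplicate groups along with their search distance and serving the closest pairs first. The table, column, and function names here are invented for illustration and are not hydrus's actual schema or code.

import sqlite3

# hypothetical schema, not hydrus's real one: potential pairs live between
# duplicate *groups* (smaller_group_id < larger_group_id keeps each pair unique),
# along with the similar-files search distance that discovered them
db = sqlite3.connect(':memory:')
db.execute('''CREATE TABLE potential_duplicate_pairs (
    smaller_group_id INTEGER,
    larger_group_id INTEGER,
    distance INTEGER,
    PRIMARY KEY (smaller_group_id, larger_group_id))''')

def add_potential_pair(group_id_a, group_id_b, distance):
    smaller, larger = sorted((group_id_a, group_id_b))
    db.execute('INSERT OR IGNORE INTO potential_duplicate_pairs VALUES (?, ?, ?)',
        (smaller, larger, distance))
    # keep the best (smallest) known distance if the pair is rediscovered later
    db.execute('''UPDATE potential_duplicate_pairs SET distance = MIN(distance, ?)
        WHERE smaller_group_id = ? AND larger_group_id = ?''',
        (distance, smaller, larger))

def get_pairs_to_filter(limit=10):
    # serve closer (more likely true duplicate) pairs first
    return db.execute('''SELECT smaller_group_id, larger_group_id, distance
        FROM potential_duplicate_pairs ORDER BY distance ASC LIMIT ?''',
        (limit,)).fetchall()

def discard_pair(group_id_a, group_id_b):
    # when two groups are merged as duplicates, or set alternate/false positive,
    # the pair between them is no longer 'potential' and gets culled
    smaller, larger = sorted((group_id_a, group_id_b))
    db.execute('''DELETE FROM potential_duplicate_pairs
        WHERE smaller_group_id = ? AND larger_group_id = ?''', (smaller, larger))

add_potential_pair(1, 2, 0)
add_potential_pair(3, 4, 2)
print(get_pairs_to_filter())  # the distance-0 pair is served first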
new client api library

If you would like to work on the Client API using Node.js, check out the new module a user wrote here:

https://github.com/cravxx/hydrus.js

This is now in the help along with the rest of the API here:

https://hydrusnetwork.github.io/hydrus/help/client_api.html
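If you would rather poke the Client API directly (say from Python rather than the Node.js module above), it is just JSON over HTTP. A minimal sketch, assuming the API is enabled on its default port 45869 and you have generated an access key under services->manage services; the key below is a placeholder.

import json
import urllib.request

API = 'http://127.0.0.1:45869'
ACCESS_KEY = '0123456789abcdef' * 4  # placeholder - replace with the key you generated

def api_get(path):
    request = urllib.request.Request(API + path)
    request.add_header('Hydrus-Client-API-Access-Key', ACCESS_KEY)
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())

# /api_version reports the api version; /verify_access_key reports what your key may do
print(api_get('/api_version'))
print(api_get('/verify_access_key'))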
full list

duplicates:
- the final large data storage overhaul work of the duplicates big job is done–potential duplicate information is now stored more sensibly and efficiently. potential pair information is now stored between duplicate file groups, rather than the files themselves. when duplicate file groups are merged, or alternate or false positive relationships set, potentials are merged and culled appropriately
- your existing potential data will be updated. the current potential pairs queue size will shrink as duplicate potential relationships are merged
- the duplicate filter now presents file kings as comparison files when possible, increasing pair difference and decision value
- potential pair information is now stored with the 'distance' between the two files as found by the similar-files search system. the duplicate filter will serve files with closer distance first, which increases decision value by front-loading likely duplicates instead of alts. distance values for existing potential pair info are estimated on update, so if you have done search distance 2 or greater and would like to fill in this data accurately to get closer potentials first, you might like to reset your potential duplicates under the cog icon (bear in mind this reset will schedule a decent whack of CPU for your idle maintenance time)
- setting an alternate relationship on a pair is now fixed more concretely, ensuring that in various search expansions or resets the same pair will not come up again. this solves some related problems users have had trying to 'fix' larger alternate groups in place–you may see your alternates compared one last time, but that should be the final go. these fixed relationships are merged as intra-alternate group members merge due to duplicate-setting events
- a variety of potential duplicates code has been streamlined based on the new duplicate group relationship
- improved how a second-best king representative of a group is selected in various file relationship fetching jobs when the true king is not permitted by the search domain
- one critical part of the new potential duplicates system is more complicated. if you experience much slower searches or count retrievals IRL, please let me know your details
- expanded duplicates unit tests to test potential counts for all tested situations
- fixed a bug where alternate group merging would not cull now-invalid false-positive potential pairs

the rest:
- updated the default pixiv parser to work with their new format–thank you to the user who provided this fix
- fixed the issue where mouse scroll events were not being processed by the main viewer canvas when it did not have focus
- file page parsers that produce multiple urls through subsidiary page parsers now correctly pass down associated urls and tags to their child file import items
- updated to wx 4.0.6 on all built platforms–looks like a bunch of bug fixes, so fingers crossed this improves some stability and jank
- updated the recent server access-key-arg-parsing routine to check access from the header before parsing args, which fixes an issue with testing decompression bomb permission on file POST requests on the file repository. generally improved code here to deal more gracefully with failures
- the repositories now max out at 1000 count when fetching pending petition counts (speeding up access when there are large queues)
- the repositories now fetch petitions much faster when there are large queues
- frames and dialogs will be slightly more aggressive about ensuring their parents get focus back when they are closed (rather than the top level main gui, which sometimes happens due to window manager weirdness)
- rewrote a bad old legacy method of refocusing the manage tags panel that kicks in when the 'open manage tags' action is processed by the media viewer canvas but the panel is already open
- hitting 'refresh account' on a paused service now gives a better immediate message rather than failing after a delay with a confusing 'bad login' error
- improved login errors' text to specify the exact problem raised by the login manager
- fixed a problem in the duplicates page when a status update is called before the initial db status fetch is complete
- the manage tag siblings panel now detects if the pair you wish to add connects to a loop already in the database (which is a rare but possible case). previously it would hang indefinitely! it now cancels the add, communicates the tags in the loop, and recommends you break it manually
- added a link to https://github.com/cravxx/hydrus.js , a node.js module that plugs into the client api, to the help
- a variety of user-started network jobs such as refreshing account and testing a server connection under manage services now only attempt connection once (to fail faster as the user waits)
- the 'test address' job under manage services is now asynchronous and will not hang the ui while it waits for a response
- fixed some unstable thread-to-wx code under the 'test access key' job under manage services
- improved some file handling to ensure open files are closed more promptly in certain circumstances
- fixed some unstable thread-to-wx communication in the ipfs review services panel
- improved the accuracy of the network engine's 'incomplete download' test and bandwidth reporting to work with exact byte counts when available, regardless of content encoding. downloads that provide too few bytes in ways that were previously not caught will be reattempted according to the normal connection reattempt rules. these network fixes may solve some broken jpegs and json some users have seen from unreliable servers (a rough sketch of this kind of check follows this post)
- fixed watcher entries in the watcher page list not reporting their file and check download status as they work (as the gallery downloader does)
- the client api will now deliver cleaner 400 errors when a given url argument is empty or otherwise fails to normalise (previously it was giving 500s)
- misc cleanup

next week

I had hoped to do some IPFS work this week, but I ran out of time to do it properly. This is now the main job for next week. Otherwise, I will do some of this final duplicates work and some misc small jobs.
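As for the 'incomplete download' item above, the idea boils down to comparing the exact number of body bytes received against what the server declared, when it declares one. A minimal sketch of that kind of check with urllib; this is an illustration of the general idea, not the network engine's actual code.

import urllib.request

def fetch_strict(url):
    # read the body, then compare against the declared Content-Length (if any);
    # a short read means the connection ended early and the job should be retried
    with urllib.request.urlopen(url) as response:
        body = response.read()
        declared = response.headers.get('Content-Length')
    if declared is not None and len(body) != int(declared):
        raise IOError('incomplete download: got %d of %s bytes' % (len(body), declared))
    return body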
(315.20 KB 1097x776 client_2019-07-04_17-15-54.png)

(12.90 KB 2098x365 client_2019-07-04_17-17-23.png)

Ok, so I'm doing a very cursory pass on my files, just a quick appealing/unappealing look. These are some really old imported files, 1 year 9 months, from before my db fucked up images. Now, I know what this program can find with a speculative search, and I also know this artist has a habit of fuck-off large file sizes, so I wanted to see everything for said images, given there were 3 and 5 copies of nearly the exact same image, or at least close enough to trip a speculative search. But I got nothing. So just to make sure I'm right, I loaded them into a temp db and did the speculative search. That got results–in fact it got every image, for both of the two sets. So, just wondering, any idea when a toggle to reset my entire duplicate files thing may come in? It would be nice if I could keep alternate mappings, but if I need to reset all that to get all my files in the dup filter it wouldn't be too bad.
Would it be possible to get an option to add an "inbox items" column to the watchers list? I'm rarely interested in the total number imported, and would rather use that screen real estate to know whether any given thread has updated. On that subject, I would really appreciate the ability to customise ListCtrls in general: to remove columns I don't need, and reorder them as wanted.
>>13087 Here. Can confirm I'm able to upload files again, thanks!
Getting this error when trying to run some file lookup scripts in the manage tags window. It only errors on the scripts that use the GET query type; POST still works fine. Didn't happen in v356.
UnboundLocalError
local variable 'f' referenced before assignment
Traceback (most recent call last):
File "include\HydrusThreading.py", line 342, in run
callable( *args, **kwargs )
File "include\ClientGUITagSuggestions.py", line 483, in THREADFetchTags
parse_results = script.DoQuery( job_key, file_identifier )
File "include\ClientParsing.py", line 2722, in DoQuery
parsing_text = self.FetchParsingText( job_key, file_identifier )
File "include\ClientParsing.py", line 2688, in FetchParsingText
if f is not None:
UnboundLocalError: local variable 'f' referenced before assignment
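For anyone curious what that traceback means: the fetch code only binds its file variable on some paths, but references it unconditionally afterwards, so when the earlier branch is skipped, Python hits the name before it exists. A guess at the general shape of this class of bug and its usual fix; the function names are invented and this is not the actual hydrus code.

def fetch_parsing_text(file_identifier, query_type):
    # buggy shape: 'f' is only bound on one branch...
    if query_type == 'POST':
        f = open(file_identifier, 'rb')
    # ...but is referenced on every path, so a GET query raises
    # UnboundLocalError: local variable 'f' referenced before assignment
    if f is not None:
        f.close()

def fetch_parsing_text_fixed(file_identifier, query_type):
    f = None  # the usual fix: bind the name up front
    try:
        if query_type == 'POST':
            f = open(file_identifier, 'rb')
        # ... do the actual fetch/parse work here ...
    finally:
        if f is not None:
            f.close()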
>>13116 Argh, thank you for this report. I apologise–this was something I slipped in late, and it missed proper testing. I am 99.7% certain I can have it fixed for 359.
>>13109 I will be doing smaller 'tidy-up' duplicates work for the next two weeks or so. I've got adding in an 'eligible files rescan' or similar maintenance task in my todo list for this work, so it should appear in a little bit.
>>13110 Yeah, I am very keen to add column selection and resize memory to the listctrls. I still have I think about 18 old listctrls to move to my new class, and then everything will be on the same system. I plan to extend the new class and have global options storage to remember columns and sizes at that point. I keep chipping away at this every cleanup week, and I'll get there eventually. An inbox count column is a great idea–I have my downloaders only present 'new files' for similar reasons.
>>13113 Great, thanks for letting me know.
>>13133 Good to hear, can't wait for that. I have had a bit of a problem. So, I will en masse download threads on /b/–this isn't too bad, but in one of the en masse downloads I got the tails sonic babysitting comic, let's not go into more detail than that. So my first thought is, let's do a speculative search… and there are at least 4 copies of every image, so I have to do this for every image to cleanse this blight. Would there be a way to search duplicates for multiple files at once? I know the use case for this is a bit iffy, but when it comes into play it would be greatly appreciated if it's possible.
>>13144 Also, because I set a bunch up ahead of time and forgot, is it possible to have a 'remove from view' option that hits any files already deleted? I did a massive file size pass that killed quite a few massive fuck-off images from the db, mostly things I could see from just a thumbnail were horrible, and opening this older saved tab up to plow through these images, a good number of the largest ones are already removed but still showing as blank squares. While it's useful to see and know they were once there, an option to remove them all from view at once would be appreciated.
Thank you based dev.
Is there any way to search for specific duplicate types? I like to replace files with the best version available, but I fucked up and left a lot of "this is worse" versions around, and I'm not sure if I'm retarded or if new versions prevented generic ">0 this is worse" searches.
>>13144 Searching for multiple 'similar files' from right-click is probably doable. Sorting them at the other end is probably a hassle, but we'll leave that for later. I'm supposed to be extending system:hash to take multiples soon, so I can probably do them together.

>>13145 That's a good idea. I would say hit right-click->select->(something), where 'something' would be like 'remote' files or similar, and then hit remove in a second action, but I can't remember if there is a good option to select those files. I'll make sure that select menu is comprehensive and see if I can do a quick submenu that does remove. Maybe if no files are selected, the remove menu instead branches to a similar select-type submenu. I often do 'select->archived … remove', so it would be nice to do that kind of stuff in one action.
>>13152 Not precisely. I don't have the help written yet to explain this nicely, but duplicate files are now stored in a valueless group that has a single best 'king' file. In the coming weeks, I will add something like 'system:is_king'. I think if you combine something like [ 'system:num_duplicates: duplicates>0', 'system:is_not_king' ], that will let you search these non-king (i.e. worse) files while excluding the good ones, and you can then delete them en masse. Now that I have this simpler duplicates tech, I expect it to slowly appear in more locations. A nice plan is to have the main media viewer say 'hey, this file has a worse quality dupe', a little like how danbooru shows this info on a file page, and have a quick way to flick to that without interrupting the main 'list' of files the media viewer is looking at. See if the new search predicates help you do what you want here, and let me know what else would work for you.
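To make the group/king idea above concrete: a duplicate group is just a set of files with one member elected as the best 'king', so the 'worse' files are simply the non-king members of any group that has more than one file. A toy model of that filter in Python; the names are invented for illustration and are not hydrus's internals or its actual search predicates.

# toy model: each duplicate group is a set of file ids with one elected king
duplicate_groups = [
    {'king': 101, 'members': {101, 102, 103}},  # 102 and 103 are worse dupes
    {'king': 200, 'members': {200}},            # a group of one has no dupes
]

def worse_files(groups):
    # roughly 'system:num_duplicates>0' combined with 'system:is_not_king':
    # non-king members of any group that actually has duplicates
    results = set()
    for group in groups:
        if len(group['members']) > 1:
            results |= group['members'] - {group['king']}
    return results

print(sorted(worse_files(duplicate_groups)))  # [102, 103]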

