/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


Bugs Thread hydrus_dev 02/06/2019 (Wed) 01:35:09 Id: 2b149c No. 11542
BUGS THREAD
>>14972 "Exact match" doesn't necessarily mean that; it just means the files are very similar. It's just math. A censor bar might not make enough of a difference to push the numbers up far enough for the pair to stop counting as an exact match. "Reset potential duplicates" is meant to be used to reset matches when changing from a higher search distance to a lower one, so it doesn't make sense to search again for files that are already marked as not having potential duplicates.
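To illustrate the "it's just math" point, here is a minimal sketch of how a similarity hash can treat a lightly edited image as an exact match. It uses a toy average hash on a 4x4 grayscale grid and a Hamming distance; both are assumptions chosen for illustration, not a description of the algorithm hydrus actually uses, and all names here are hypothetical.

```python
def average_hash(pixels):
    """Toy perceptual hash: one bit per pixel, set when the pixel is brighter than the mean."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for value in flat:
        bits = (bits << 1) | (1 if value > mean else 0)
    return bits


def hamming_distance(hash_a, hash_b):
    """Number of bit positions in which the two hashes differ."""
    return bin(hash_a ^ hash_b).count("1")


# Hypothetical 4x4 grayscale image: two bright blocks, two dark blocks.
original = [
    [200, 200, 10, 10],
    [200, 200, 10, 10],
    [10, 10, 200, 200],
    [10, 10, 200, 200],
]

# A small edit: one bright pixel is partially darkened.
lightly_edited = [row[:] for row in original]
lightly_edited[0][0] = 120

# A stronger edit: the same pixel is blacked out completely.
heavily_edited = [row[:] for row in original]
heavily_edited[0][0] = 0

print(hamming_distance(average_hash(original), average_hash(lightly_edited)))
print(hamming_distance(average_hash(original), average_hash(heavily_edited)))
```

In this toy setup the light edit leaves every bit unchanged, so the two different files sit at distance 0, while the stronger edit flips a single bit. A search distance threshold then decides which of those pairs count as potential duplicates.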
Since (I think) the last version, v419, the namespace:* search doesn't bring up suggestions for every tag in that namespace anymore. It's just empty, with no suggestions now until I type a letter instead of *. I have the option set to on, but it still doesn't work. I'm running the Linux version.
I'm on version 420 and the twitter/nitter downloader currently fails on videos, saying "Could not find a file or post URL to download!". Are there plans to fix this? IMO one of the major advantages of hydrus is being able to download an artist's twitter. It's easy/simple to browse and find what you want on e621 or even fur affinity, where things have tags etc., but it is a nightmare on twitter having to just scroll through a lazy-loading timeline full of retweets and other shit. I'm guessing they're just streaming the video using DASH or something rather than an easy to grab URL. This is probably out of scope for a downloader, but youtube-dl can grab twitter videos and gets packaged with everything else. That might be a solution?
Exception
Problem while trying to fetch External IP:
Traceback (most recent call last):
  File "hydrus\client\gui\ClientGUIDialogs.py", line 373, in EventCopyExternalShareURL
    url = self._service.GetExternalShareURL( self._share_key )
  File "hydrus\client\ClientServices.py", line 473, in GetExternalShareURL
    host = HydrusNATPunch.GetExternalIP()
  File "hydrus\core\HydrusNATPunch.py", line 50, in GetExternalIP
    raise Exception( 'Problem while trying to fetch External IP:' + os.linesep * 2 + str( stderr ) )
Exception: Problem while trying to fetch External IP:

No IGD UPnP Device found on the network !
No IGD UPnP Device found on the network !

It gave me this error when I tried to set up a local booru with an external IP. How do I fix it?
>>14980 iirc the youtube-dl solution is planned, but in the meantime there is already another parser in hydrus that will fetch video through a third-party website. You just have to change your parser links, as it isn't linked by default.
>>12890 This doesn't exist anymore.
>>14974 Thank you, this is great. I am really sorry for the shit here, it looks like it was lagging twelve minutes. All of that time was spent doing one query, and it was not where I expected, so this profile helps a lot. It looks like there were 65,000 tags in the page that needed to be populated, with the lookup done twice, and 5ms per row is crazy slow, so something is going wrong here. I will examine this further, and I can think of a better way of doing it as well. Please let me know how 422 works for you.
>>14979 Thank you for this report. I am sorry, this got recently broken. I am going to fix it this week and write some unit tests for the new cache that is failing here so these logical typos do not slip through again.
>>14987 Hydrus uses a program in install_dir/bin to try to talk to your router to get UPnP information, which includes external IP. If your router does not respond to the UPnP request, or you are on a VPN of some sort so you can't talk to your gateway at all, this happens. I will brush up the English of this error.
>>14989 It is now on options->gui pages, right at the bottom, 'hide the bottom-left preview window'.
This file caused the memory usage to spike up to 11 gigs when I came across it in the archive/delete filter. https://gelbooru.com/index.php?page=post&s=view&id=5735585&tags=paintrfiend
>>14996 I was about 200 in of the 1100 total, if that's important.
>>14988 So I found the 'manage parsers' screen and I assume what I'm interested in is the "nitter tweet parser (video from koto.reisen)". I went to edit it but I'm not really sure what to do from here? Could you elaborate a bit? Also is there a way to sort by 'source time'? I imagine there must be because I can sort by everything from resolution ratio, to import time, to number of tags which all seem less useful than when a file was posted.
>>14998 That is the right parser to use, but you want to change to it in the 'manage url class links' menu.
(38.93 KB 961x483 sqlite.png)

>>14974 Hey, I looked into your problem today, and I cannot figure out why it is running quite so slow. I could explain a few seconds, even twenty, but not several minutes with 65,000 items. I can write a faster cache for the particular filtering job going on here, but I think I might be fixing the wrong problem in your case. Can you please, with your client shut down, go to your install_dir/db directory, run the sqlite3 console executable, and then paste these lines in? For the second query, you will have to substitute the XXX with a number that lines up with a table that exists in your database, which the first query will give. I have attached an image of what I get in my dev machine.
.open client.caches.db
select name from sqlite_master where name like "%siblings%";
explain query plan SELECT 1 FROM actual_tag_siblings_lookup_cache_XXX where bad_tag_id = 1 OR ideal_tag_id = 1;
.exit
If your database is missing the 'ideal_tag_id' indices somehow, or SQLite is unable to plan efficiently with them, that would explain the quadratic slowdown you are getting, and it means I have something to repair, not optimise.
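For readers who want to see what that query plan check is looking for, here is a minimal sketch using Python's built-in sqlite3 module. The table and index names are hypothetical stand-ins, not hydrus's real schema, and the exact plan wording varies between SQLite versions. It builds the same OR lookup against a table with and without an index on the second column and returns the plan text.

```python
import sqlite3


def query_plan(with_ideal_index):
    """Return SQLite's plan lines for the OR lookup, with or without the second index."""
    db = sqlite3.connect(":memory:")
    # Hypothetical stand-in for a siblings lookup table. bad_tag_id is
    # searchable through the integer primary key.
    db.execute(
        "CREATE TABLE siblings_lookup ( bad_tag_id INTEGER PRIMARY KEY, ideal_tag_id INTEGER )"
    )
    if with_ideal_index:
        db.execute(
            "CREATE INDEX siblings_lookup_ideal_index ON siblings_lookup ( ideal_tag_id )"
        )
    rows = db.execute(
        "EXPLAIN QUERY PLAN "
        "SELECT 1 FROM siblings_lookup WHERE bad_tag_id = 1 OR ideal_tag_id = 1"
    ).fetchall()
    db.close()
    # Each plan row has four columns; the last one is the readable detail text.
    return [row[3] for row in rows]


for line in query_plan(with_ideal_index=True):
    print("indexed:  ", line)
for line in query_plan(with_ideal_index=False):
    print("unindexed:", line)
```

With both columns searchable, the plan should consist of SEARCH steps, one per half of the OR. Without the second index, SQLite falls back to a SCAN of the whole table, which is the per-tag full-table iteration described in the post.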
>>14996 >>14997 These gigantic pngs are a pain. I am not really sure what to do about them. Unfortunately, it is one of my image libraries needing most of that space, likely OpenCV, possibly Pillow. A long time ago, some clever guys constructed large malicious pngs with extremely inefficient pixels/palettes/whatever, and they were able to crash people's machines as the decompression routine of loading the png suddenly ate up all the memory of the machine and crashed it. These were called 'decompression bombs'. We now get this effect naturally when patreon artists sperg out. The good news is memory usage goes down quickly once the image is rendered, although 11GB is larger than I have heard of before. Even then, just having that image at 100% in memory is a 750MB bitmap. Although I can detect the size of a png, being able to detect how efficiently it will unpack is not something I (or my libraries) seem to know how to do. Perhaps in future they will be handled better, or I'll find a new library or a new way of rendering just part of an image at a time so I don't need 100% to get the scaled version, but for now I do not have great answers. I will save that png though as a great example of something that will probably kill my 8GB dev machine. I am sure there is extra inefficiency in my loading routines, so I will be able to examine what is going on easily with it.
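As a rough illustration of the "I can detect the size of a png" part, here is a minimal sketch that reads the width and height from a PNG header without decoding any pixels and estimates the fully decoded bitmap size. The 4 bytes per pixel figure and the memory budget are assumptions, the function names are hypothetical, and this does not address the harder problem mentioned above of predicting how much working memory the decoder itself will need.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(data):
    """Read (width, height) from the IHDR chunk, which always follows the signature."""
    if data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        raise ValueError("not a PNG file")
    width, height = struct.unpack(">II", data[16:24])
    return width, height


def estimated_bitmap_bytes(width, height, bytes_per_pixel=4):
    """Size of the fully decoded image, assuming 8-bit RGBA (an assumption)."""
    return width * height * bytes_per_pixel


def exceeds_budget(data, budget_bytes):
    """True when the decoded bitmap alone would be larger than the given budget."""
    width, height = png_dimensions(data)
    return estimated_bitmap_bytes(width, height) > budget_bytes


# Hand-built header for a hypothetical 20000x12000 PNG: signature, then the
# IHDR chunk length (13), chunk type, width, height, and five one-byte fields.
header = PNG_SIGNATURE + struct.pack(
    ">I4sIIBBBBB", 13, b"IHDR", 20000, 12000, 8, 6, 0, 0, 0
)

print(png_dimensions(header))
print(estimated_bitmap_bytes(*png_dimensions(header)))
print(exceeds_budget(header, 512 * 1024 * 1024))
```

A check like this only needs the first 24 bytes of the file, so it could run before handing the image to a heavier library.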
>>15000 Follow-up: If the problem is your database does not have the required indices for some reason (my best guess here atm, either due to hard drive failure or a weird update logic problem), I have fixed this for 422. The database now automatically checks if those indices (and some others) are missing on every boot and recreates them if so.
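A boot-time check of this kind can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual 422 code: the table name, index name, and the REQUIRED_INDICES structure are all hypothetical.

```python
import sqlite3

# Hypothetical index name mapped to (table, column); real names will differ.
REQUIRED_INDICES = {
    "siblings_lookup_ideal_tag_id_index": ("siblings_lookup", "ideal_tag_id"),
}


def recreate_missing_indices(db):
    """Create any required index that sqlite_master does not list; return the names created."""
    existing = {
        name
        for (name,) in db.execute("SELECT name FROM sqlite_master WHERE type = 'index'")
    }
    created = []
    for index_name, (table, column) in REQUIRED_INDICES.items():
        if index_name not in existing:
            db.execute(f"CREATE INDEX {index_name} ON {table} ( {column} )")
            created.append(index_name)
    return created


db = sqlite3.connect(":memory:")
db.execute(
    "CREATE TABLE siblings_lookup ( bad_tag_id INTEGER PRIMARY KEY, ideal_tag_id INTEGER )"
)

first_boot = recreate_missing_indices(db)   # index is missing, so it gets created
second_boot = recreate_missing_indices(db)  # nothing left to do
print(first_boot, second_boot)
```

Because the check only reads sqlite_master when everything is in place, running it on every boot should be cheap.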
>>15003 My problem seems to be solved. The '%siblings%' output from my DB looks nothing like yours (sorry if I mess up the formatting):
sqlite> .open client.db
sqlite>
sqlite> select name from sqlite_master where name like '%siblings%';
tag_siblings
sqlite_autoindex_tag_siblings_1
tag_siblings_service_id_good_tag_id_index
sqlite>
So I started to suspect that I was missing some indexes/caches/etc., and once again ran everything under database→regenerate→* from the menu, except "clear service info cache". Since then autocompletion works perfectly. The '%siblings%' list, however, didn't change. I was also unable to run the explain query:
sqlite>
sqlite> explain query plan SELECT 1 FROM sqlite_autoindex_tag_siblings_1 where bad_tag_id = 1 or ideal_tag_id = 1;
Error: no such table: sqlite_autoindex_tag_siblings_1
sqlite>
sqlite> explain query plan SELECT 1 FROM tag_siblings_service_id_good_tag_id_index where bad_tag_id = 1 or ideal_tag_id = 1;
Error: no such table: tag_siblings_service_id_good_tag_id_index
sqlite>
sqlite> explain query plan SELECT 1 FROM tag_siblings where bad_tag_id = 1 or ideal_tag_id = 1;
Error: no such column: ideal_tag_id
sqlite>
For the record, here is my tables list:
sqlite> .tables
alternate_file_group_members       local_file_deletion_reasons
alternate_file_groups              local_ratings
analyze_timestamps                 options
client_files_locations             potential_duplicate_pairs
confirmed_alternate_pairs          recent_tags
current_files                      remote_thumbnails
deleted_files                      repository_updates_11
duplicate_false_positives          service_directories
duplicate_file_members             service_directory_file_map
duplicate_files                    service_filenames
file_inbox                         service_info
file_modified_timestamps           services
file_notes                         statuses
file_petitions                     tag_parent_application
file_transfers                     tag_parent_petitions
file_viewing_stats                 tag_parents
files_info                         tag_sibling_application
ideal_client_files_locations       tag_sibling_petitions
ideal_thumbnail_override_location  tag_siblings
json_dict                          url_map
json_dumps                         vacuum_timestamps
json_dumps_named                   version
last_shutdown_work_time            yaml_dumps
sqlite>
Once again thanks for your help.
>>15005 Sorry, it had to be client.caches.db, not client.db, when you did the first '.open' line. No worries. Since you ran the regenerate and that fixed it, I think that suggests this was an indexing issue, which will be fixed automatically for all other users going forward tomorrow. If you are interested, the slow bit for you was the 'is this tag involved in the siblings system?' method, which runs when the client needs to figure out which siblings your autocomplete results (generated from thumbs) need to have so it can swap in typed text for a real result based on sibling mappings. This thing basically says: 'for every tag, is it either a bad tag or is it an ideal tag in the fast(lmao) siblings table?'. To speed it up, I have both the 'bad' column and 'ideal' column indexed. For some reason that I am not sure about, you did not have the 'ideal' column indexed, which meant for every one of those 65,000 tags, rather than being able to quickly test the second half of the OR, a few iterations into a search tree, it instead had to iterate the entire table for every negative result. Assuming you sync with the PTR or a similar big-sibling-number service, that probably meant something like 65,000 * 100,000 row fetches, or several billion loops on a handful of kilobytes in memory (six minutes), rather than a couple hundred thousand (about 50ms or so). You may not be completely fixed, however. Since you have done the regenerate, your sibling tables were emptied, so if the problem was not in fact a missing index, you may see bad performance creep back in as your sync progresses under tags->sibling/parent sync->review. Let me know if this happens. If this has fixed you long-term, thank you for your feedback and help.
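The arithmetic behind those two figures can be worked through in a few lines. This is a back-of-envelope sketch, taking the 65,000 tags and roughly 100,000 sibling rows from the post above; the index fan-out of 100 entries per tree node is a hypothetical value picked only to make the depth concrete.

```python
def tree_depth(row_count, fan_out):
    """Levels a search tree needs so that fan_out ** depth covers row_count entries."""
    depth, capacity = 0, 1
    while capacity < row_count:
        capacity *= fan_out
        depth += 1
    return depth


tags_to_test = 65_000   # tags in the page, from the post
sibling_rows = 100_000  # rough size of the siblings table, from the post
fan_out = 100           # hypothetical entries per index node

# Missing index: every negative result walks the whole table.
full_scan_fetches = tags_to_test * sibling_rows

# Index present: every lookup descends a few levels of the tree.
indexed_steps = tags_to_test * tree_depth(sibling_rows, fan_out)

print(f"{full_scan_fetches:,} row fetches without the index")
print(f"{indexed_steps:,} tree steps with the index")
```

Under these assumptions the unindexed case lands in the billions and the indexed case at a couple hundred thousand, which lines up with the orders of magnitude given in the post.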
>>14999 So I changed the parser to the koto.reisen one in the manage class links screen for 'nitter timeline' and 'nitter tweet' but it doesn't seem to have done anything. Is there something else I need to change or is this just broken for now?
>>15009 Only the 'nitter tweet' URL class needs to be changed; the other nitter ones should stay on the default.
>>15011 I tried this and it got some of the videos. I think the failing posts are the ones with links that have since died. Either way, thank you for the help.
>>14955
>For pools, I also get the same problem. My guess is gelbooru changed their format. Thank you, I will check with the guys who make downloaders to see if we can figure out a fix.
Wanted to follow up on this: has the fix been made? I assume I am supposed to be looking here for it https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders/Gelbooru and since the last non-readme change was six months ago, I gather that it's not fixed.
>>14960 This problem still occurs with version 423, by the way. Hydrus launches, spawns no GUI, and shoots a CPU core to 100% indefinitely.
I'm having trouble. I go to services > review services > remote, and it shows the PTR is paused. I click to resume, and it starts processing requests. I leave it overnight, then come back and it shows a notification in Hydrus stating:
Failed to process updates for the public tag repository repository! The error follows:
DBException
TagSizeException: Received a zero-length tag!
Traceback (most recent call last):
  File "hydrus\client\ClientServices.py", line 1724, in _SyncProcessUpdates
    num_rows_done = HG.client_controller.WriteSynchronous( 'process_repository_definitions', self._service_key, definition_hash, iterator_dict, job_key, work_time )
  File "hydrus\core\HydrusController.py", line 852, in WriteSynchronous
    return self._Write( action, True, *args, **kwargs )
  File "hydrus\core\HydrusController.py", line 237, in _Write
    result = self.db.Write( action, synchronous, *args, **kwargs )
  File "hydrus\core\HydrusDB.py", line 1100, in Write
    if synchronous: return job.GetResult()
  File "hydrus\core\HydrusData.py", line 1843, in GetResult
    raise e
hydrus.core.HydrusExceptions.DBException: TagSizeException: Received a zero-length tag!
Database Traceback (most recent call last):
  File "hydrus\core\HydrusDB.py", line 679, in _ProcessJob
    result = self._Write( action, *args, **kwargs )
  File "hydrus\client\ClientDB.py", line 19772, in _Write
    elif action == 'process_repository_definitions': result = self._ProcessRepositoryDefinitions( *args, **kwargs )
  File "hydrus\client\ClientDB.py", line 14780, in _ProcessRepositoryDefinitions
    tag_id = self._GetTagId( tag )
  File "hydrus\client\ClientDB.py", line 11320, in _GetTagId
    HydrusTags.CheckTagNotEmpty( tag )
  File "hydrus\core\HydrusTags.py", line 181, in CheckTagNotEmpty
    raise HydrusExceptions.TagSizeException( 'Received a zero-length tag!' )
hydrus.core.HydrusExceptions.TagSizeException: Received a zero-length tag!

Database Traceback (most recent call last):
  File "hydrus\core\HydrusDB.py", line 679, in _ProcessJob
    result = self._Write( action, *args, **kwargs )
  File "hydrus\client\ClientDB.py", line 19772, in _Write
    elif action == 'process_repository_definitions': result = self._ProcessRepositoryDefinitions( *args, **kwargs )
  File "hydrus\client\ClientDB.py", line 14780, in _ProcessRepositoryDefinitions
    tag_id = self._GetTagId( tag )
  File "hydrus\client\ClientDB.py", line 11320, in _GetTagId
    HydrusTags.CheckTagNotEmpty( tag )
  File "hydrus\core\HydrusTags.py", line 181, in CheckTagNotEmpty
    raise HydrusExceptions.TagSizeException( 'Received a zero-length tag!' )
hydrus.core.HydrusExceptions.TagSizeException: Received a zero-length tag!

Then under services, it shows the PTR is paused again. Not sure what's going on. I will try updating to the latest version right now though. I'm currently on 420, network version 19. Windows 10.
>>15076 Btw, it looks like it did run through the repo processing because I reached the bandwidth limit. I am working on updating because my repo is outdated. I just updated my Hydrus though to the latest, I will let it run tomorrow after the bandwidth cap resets, and see if I still get any errors.
Pixiv artist lookup seems to be broken, as it uses the old URL style that no longer works.
>>15104 Try deleting the pixiv url classes, parsers, and GUGs, then add the defaults back, but only the ones with "api" in their names. After that, make sure the url class links are linked correctly.
Gelbooru tag parsing seems to be broken. It searches and downloads files like it should, but no tags are applied to files. This started over the past couple of days. I was using v420, and then updated to v425 to see if it fixed it, but it did not. All other boorus are working just fine.
>>15117 they changed some stuff on the site that broke the parser, replace the old parser with this one.
(48.43 KB 782x309 screenshot.png)

>>15006 Hi again. Indeed my problem is not solved. This is what I noticed after several weeks of repeated cache regenerations and version upgrades. Once I do database→regenerate→tag storage mappings cache, autocomplete becomes fast, but tags→sibling/parent sync→review tag/sibling maintenance→public tag repository becomes completely unsynchronized. I'm not sure whether it searches for siblings correctly, but it is lightning fast. Once I resync, autocomplete is slow again. So it seems the closer the parents and siblings cache gets to being fully synced, the slower autocomplete becomes. However, in two cases it still works fast:
1. With 'searching immediately' turned off (can you explain this?)
2. If you do a large search and quickly make a second one while the status bar still shows 'Loading… XXXX of YYYYY'.
Do you need any further feedback from me? If it helps, I can make DB profile debug logs with and without the sibling/parent cache synchronized. I also noticed that I ran the SQL query you asked for on the wrong database. Do you still need it?
Hey, I am sorry, the holiday ended up killing my schedule, and I couldn't give this thread proper attention. 8kun are now saying they expect to go off the clearnet in the near future and go TOR only. I have been wanting to move to a different board for a while, I was just procrastinating, so this seems like the time to do it. Endchan will be our primary board for the time being. I am going to clear this thread of newer posts, make an archive of it to post on Endchan, and then make a sticky announcing I will delete the board next week.
>>15054 Ah, I am sorry, I am not sure what the answer back was. I just talked to the guys again now. They aren't hosting copies of what I bundle by default, only downloaders that can do different sites or a whole bunch more. It appears pools now have a slightly different format for some users. Pools still work for those guys, but they are logged in, I am not. I will fix this myself this week, since I get the problem, and roll it into the update. Sorry for the trouble!
>>15057 Unfortunately the problem seems to be with PyInstaller, not my code. A couple of guys have been working on a different build method here: https://github.com/ReAnzu/hydrus/actions/runs/452850300 I hope to catch up with their work when it is finished and working and fold it into my build too.
>>15076 >>15080 Hey, I am sorry for the trouble here. Thank you for the report. This should be fixed in the latest versions. Please let me know if you still have any problems.
>>15117 >>15118 I will be folding this into 426 btw.
>>15125 Thank you for this report. I am sorry for the continuing trouble. If searches are slow when you are searching a page of results, but fast when you search an empty page (or one with 'searching immediately' off, which then searches the database, rather than the tags for the files in front of you), then the routine that is generating search results from thumbnails is slow. Some more feedback would be very helpful. Please run 'db profile mode' and do some slow autocomplete searches with files in front of you, and then pastebin or email me that profile log. It is the 'media_predicates' routine that is running slow for you. The >>15000 query on client.caches.db would be helpful. Let's see what SQLite wants to be doing here.
This thread is now archived at https://archive.is/cq5Tc . I will not post here any more, as the board will be deleted next Wednesday. Please move to the Endchan thread at https://endchan.org/hydrus/res/9.html , thank you! This plan has changed. The main imageboard location for hydrus is now a General on >>>/t/ here on 8chan.moe.
Edited last time by hydrus_dev on 01/20/2021 (Wed) 04:40:11.
>>14683
>If you do have it, is MPV the player under options->media? If your hydrus does not have libmpv access, or is failing to import it, it will fall back to my native viewer, which has no audio support.
Fixed. Thanks a lot. I'm new to Hydrus. Using Linux MX. Before, I was using Tagspaces, which is a sluggish piece of trash. Looking for a more suitable alternative, I bumped into Hydrus; so far I'm very pleased by its performance and rich settings.
>>15118 Is there a way to get the blacklist working again? There's usually a cookie for your blacklist on gelbooru, but for some reason the downloader doesn't use it, and it stopped working for me a month or so ago. I can't remember if it was something I did or if the site changed or what.

