/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


(65.56 KB 480x360 pE64C2RX3ns.jpg)

Version 317 hydrus_dev 08/08/2018 (Wed) 21:19:51 Id: 744617 No. 9608
https://www.youtube.com/watch?v=pE64C2RX3ns

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v317/Hydrus.Network.317.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v317/Hydrus.Network.317.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v317/Hydrus.Network.317.-.OS.X.-.App.dmg
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v317/Hydrus.Network.317.-.OS.X.-.Extract.only.tar.gz
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v317/Hydrus.Network.317.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v317.tar.gz

I had a great week. I polished some ongoing download stuff and caught up on a bunch of smaller jobs.

tag filter and tag import options

A while ago, I wrote a new 'tag filter' that applies blacklist and whitelist rules to a list of tags. It is more powerful than some of the old systems I was using at the time, but the ui was pretty ugly, so I didn't roll it out everywhere it could have been used. This week, it gets new responsibilities and a completely reworked ui and workflow.

The filter's edit panel (which you can get to under tag import options from the blacklist button or beside the 'get tags' checkboxes) now has simpler 'whitelist' and 'blacklist' sub-pages that let you just say 'just give me namespaces x, y, and z' or 'do not get the "tagme" tag' or 'do not download if you see "vore" or "diaper"' and so on, while still letting you make a very complicated filter if you need to. These simple pages are the default. They also offer namespace checkboxes (as compiled from all your current parsers) to make it simple to set up a character/creator/series whitelist in just a few clicks. All the surrounding help has been given a pass as well, so it should all be a better all-around workflow from now on!

The old list of namespace checkboxes on tag import options is now gone. If you have some tag import options that use these, they will be automatically converted to the new 'get tags' plus a tag filter with the same namespaces checked, so you shouldn't experience any change–your options are now just more powerful if you ever want to change them.

With this change, tag import options are now completely decoupled from the old downloader, which makes a bunch of things easier on my end. Other than some little tweaks here and there, I think I am done with tag import options for now, so if you have been waiting to see how it now works, please check out the new default options under network->downloaders->manage default tag import options.
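To give a rough idea of how the whitelist/blacklist logic behaves, here is a minimal sketch in python. This is illustrative rather than the client's actual code, and the rule syntax (a trailing ':' for a whole namespace, ':' alone for all namespaced tags, '' for all unnamespaced tags) is assumed just for the example:

[code]
# illustrative sketch of a blacklist/whitelist tag filter, not hydrus code
def rule_matches(tag, rule):
    if rule == ':':
        return ':' in tag          # all namespaced tags
    if rule == '':
        return ':' not in tag      # all unnamespaced tags
    if rule.endswith(':'):
        return tag.startswith(rule)  # a whole namespace, e.g. 'series:'
    return tag == rule             # an exact tag, e.g. 'tagme'

def apply_tag_filter(tags, blacklist, whitelist):
    # a tag survives if no blacklist rule hits it, or if a whitelist rule
    # explicitly punches a hole in the blacklist for it
    return [tag for tag in tags
            if not any(rule_matches(tag, r) for r in blacklist)
            or any(rule_matches(tag, r) for r in whitelist)]

# 'just give me namespaces x, y, and z': blacklist everything, then
# whitelist the namespaces you care about
tags = ['character:samus aran', 'series:metroid', 'tagme', 'blue sky']
print(apply_tag_filter(tags, blacklist=[':', ''],
                       whitelist=['character:', 'series:']))
# -> ['character:samus aran', 'series:metroid']
[/code]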
multi-downloader improvements

I made some quality-of-life improvements to the new multi-gallery and -watcher download pages. Loading a highlight should now be much, much faster (down from >3s in some cases to typically <50ms) and will filter out deleted files. And adding or removing a query/thread should update the list immediately, rather than after an annoying half-second delay. Also, if you prefer a new query or watcher to be immediately highlighted if there is not an existing highlight, there are now options for this under options->downloading.

Also, an issue where watchers were often still checking after a 404 is fixed. While I regret the problem, I am glad the new gallery log and related improvements are making it easier to identify and diagnose these 'gallery-level' problems when they happen.

And I have added some parsers for imgur and derpibooru, thanks chiefly to efforts from the community. You should now be able to drag-and-drop pretty much any imgur link onto the client and get everything, including large galleries and mp4 videos. Derpibooru doesn't have gallery support yet, but you can drag-and-drop single file page URLs and you should get tags and everything. Let me know if you discover any problems with these!

slower thumbnail scrolling

I've added an experimental new option to options->gui to change the rate of thumbnail scrolling. It starts at 1.0 for one thumbnail height per scroll tick (which for most OSes means three thumbs scrolled per mouse wheel click), but if you want to change it to 0.5 or 0.37 or whatever, it should work ok. I had to improve a bunch of the thumbnail drawing pipeline to get this to work, and I am glad I did, as some of it was real creaky. I expect to do a little more work here in the coming weeks to reduce some scroll jitter (and maybe some redundant CPU use) I noticed while working on this.
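In concrete terms, the option works out to something like this (the 150px thumb height is just an example; the real figure depends on your thumbnail dimensions):

[code]
# toy illustration of the scroll rate option (assumed names and values)
def pixels_per_wheel_click(rate, thumb_height_px=150, ticks_per_click=3):
    # each scroll tick moves rate * one thumbnail height; most OSes send
    # three ticks per physical wheel click
    return int(rate * thumb_height_px * ticks_per_click)

print(pixels_per_wheel_click(1.0))  # 450: three full thumb rows per click
print(pixels_per_wheel_click(0.5))  # 225: one and a half rows per click
[/code]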
full list

- completely overhauled the tag filter panel:
- the tag filter panel now has 'whitelist' and 'blacklist' pages beside the old 'advanced' sub-panel. these new simple pages are much more human-friendly for common workflows and provide easy-select checkboxes for namespace classes (which are compiled from all the namespaces your parsers can currently do)
- the tag filter rule entering workflow now stops you from creating overcomplicated rulesets: when adding a blacklist rule, it will now only add an explicit entry if it is not already blocked by a higher rule (otherwise it will just discard from the whitelist, if there)–and when adding a whitelist rule, it will now only add an explicit entry if it is already blocked by a higher blacklist rule (otherwise it will just discard from the blacklist, if there)
- tag filters now provide more human-friendly summary statements
- misc improvements to tag filter ui logic
- the various help texts surrounding the tag filter panel all got passes
- the tag filter panel now uses text-and-paste controls for mass-adding of tags
- namespace checkboxes have been completely removed from the tag import options panel and various other related places. any existing TIO with checked namespaces will be automatically updated to 'get tags' with an appropriate filter. this is an important step in the rewrite–everything is now handled in the new tag filter panel
- simplified and sped up the actual tag filtering code
- .
- numerous multi-importer improvements:
- the gallery and watcher page lists will now dynamically resize in height based on the number of entries, from roughly four rows to twenty-four. this relayout code somehow seems to work on all platforms
- sped up the 'results loading' step of gallery/watcher highlighting immensely–on a typical list of a couple hundred files, it should now be about 50ms total (before, depending on presentation rules, it could be 0.8-3s)
- added an additional db-skipping optimisation for calculating presentation status
- watcher and gallery highlights will now filter out trash and completely deleted files (the ones that appear with a dark default 'hydrus' icon) on reloads
- added two checkboxes to options->downloading for 'if nothing is highlighted when I add a new X, highlight that new X' for watchers and galleries
- adding or removing a query or watcher from the new multi-lists should now be reflected in the list ui instantly, rather than after a <=1s delay
- added url classes and parsers for imgur single and multiple urls–thanks to the community for providing some examples
- added url class and parser for derpibooru single file pages–again, thanks to the community. derpibooru hence now supports basic drag and drop import
- fixed an issue where the watcher was often still checking despite 404 status
- watchers and galleries use a little less CPU to update some of their ui
- added simple subsidiary page parsing support to file import objects (previously, this only worked in the gallery log)
- .
- gave the thumbnail scrolling code a pass–it is now a bit cleverer about drawing and uses a larger number of smaller 'tile' bmps rather than pages
- added an 'EXPERIMENTAL' option to options->gui to change the number of thumbnails each scroll tick scrolls. it defaults to 1.0, but you _should_ be able to set 0.5, 0.37, whatever. please report any bugs!
- added a thumbnail debug mode to help see the new thumbnail layout boundaries
- .
- misc:
- the max subscription file limits are now 10,000 for users in advanced mode
- the default subs initial/periodic limit is now 100/100 (bumped up from 100/50)
- the file import dialog now has a little cog icon to change whether human sort is applied on path addition events (e.g. if you want to add in some date order from an explorer window)
- humansort now sorts case-insensitively (a quick sketch of this kind of sort key is at the end of this post)
- by default, unmatched urls will no longer display in the top-right of the media viewer. see how you like this and let me know if you would like an option to put them back
- the speed text on the right side of the network job control now dynamically resizes to its min size, which gives the text on the left side (where it is often cut off, saying 'overriding bandwidth …') more space when available
- I think I fixed an issue where the popup frame could spam-resize in odd ways (such as growing a pixel wider every update tick)
- watchers will no longer include the '* ' highlight prefix in subject-based sort comparisons
- in prep for an eventual major code refactoring, the thumbnails' underlying media object now stores a faster db-based numeric file identifier
- 'duplicate' calls on the new listctrl will now insert the dupes in the current correct sort location, rather than tacking them on the end
- drag and drop imports to the new listctrl will also now insert like this
- caught up the edit subscriptions panel to the finalised common listctrl panel code, including the import/export/duplicate buttons
- the multiple checkboxlist selection dialog now sorts by label
- converted all old checkboxlist dialogs to the new panel system
- massively sped up certain kinds of parsing that were wasting time hitting a cache test way too often
- fixed an old hash filtering system
- moved to a simpler and more stable way of calculating certain text extents
- fixed an issue where the include directory (which has the original source, which isn't a big deal but is nice to have) wasn't being correctly copied into the linux build
- the os x .tar.gz build now has the include directory
- refactored some client tags code around
- misc cleanup

next week

I enjoyed catching up on things this week, and there is more to do. I'd also like to get some more gallery parsers done, fix some unusual problems in the new system (gelbooru is sometimes cutting queries off at page two, and there are some similar issues with other boorus), and start thinking about the final 'searcher' object for the downloader overhaul.
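And as a footnote to the humansort changelog line above, this is the general shape of a case-insensitive natural sort key–a sketch, and my actual implementation may differ:

[code]
import re

# natural-sort key: split digit runs out so '2' sorts before '10', and
# lowercase the text parts for case-insensitive comparison
def humansort_key(s):
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r'(\d+)', s)]

paths = ['File10.png', 'file2.png', 'FILE1.png']
print(sorted(paths, key=humansort_key))
# -> ['FILE1.png', 'file2.png', 'File10.png']
[/code]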
>>9610 Can you give me an example of one of these derpibooru gif/webm pages and the alternate download link? Or is this a setting inside derpi associated with your login? If I go here:
https://derpibooru.org/1803434
the download links still serve gif, but I notice the actual content embedded in the page has some unusual 'data-uris' values that point to webm/mp4. Or am I just not seeing the 'download as webm' button?
>>9611 Thanks, I will check this. I notice your derpi URLs are getting 'index' as their page_num, rather than 0, 1, 2. Maybe my code is messing up here.
>>9614 Sure, I will add a timestamp. Stuff like 'restart this search' will have to wait a bit for me to do the new 'searchers' object, as search initialisation is still on the legacy system.
>>9620 >>9624 >>9625 >>9626 Yeah, they done fucked us again. As far as I can tell, it looks like they changed some settings on the AWS bucket(s) and now you need some cookie authentication to see what's inside their CDN. It might be an attempt to stop people like us and e621 from getting the originals, or it might be a standard upgrade that tightened up the old hole. What a pain in the neck, I just wanted to play videogames, etc… I'll roll out a different parser this week that grabs the 1280px URLs again and then keep my eyes open for a new way in. Please let me know if you discover a new URL format that works.
>>9622 It looks like you have had a hard drive problem that has damaged your database. Please check out 'install_dir/db/help my db is broke.txt' for some background reading. You may have additional problems beyond this missing table. The missing table here is not very important, but it is not easy for you to fix. I will try to write an auto-healing routine for v318. Please check your hard drive is healthy in the meantime, roll back to a recent backup if you have one, and let me know if v318 lets you boot. (If you have a backup, recent or not, let me know, as we can do a better manual migration fix here.)
>>9623 You can try help->debug->report modes->network report mode, which will spam you with every job getting scheduled. I'd like to expose the per-downloader bandwidth managers at some point, for advanced mode users.

>>9621 I think the HF login system (which basically clicks through the "18+" landing page and then sets up a 'view all' filter) may be broken. I would like to replace it with a proper user/pass login once I am done with this overhaul.
>>9628 Please check the new 'cog' icon on the file import window. It should have an option to turn off the sorting, and it should remember what you last set it to. Let me know if it doesn't work for you!
>>9630 >>9631 I am not sure if I understand this. Can you describe the workflow you are going for here further, maybe with an example? Do you want to quickly add like 'ylyl' tag to a thread of results? Atm I recommend waiting for threads to be DEAD/404 before adding tags or other processing metadata, as it is usually easier to just go Ctrl+a->F3 on the final results before you dismiss the thread entry from the larger multi-watcher.
>>9640 It's a setting inside derpi: the little upper-right-hand horse, then settings, then 'local' and 'serve webm'. I have asked for a proper webm download link for gifs, but they seem uninterested/unwilling to do it, despite them having the files in place already.
>>9642 Thanks, and with search retry it's more of a down-the-road, 'I let this shit sit for several weeks/months' issue.

>>9643 Ok, now I'm curious: would it be possible to have both a 1280px and a raw grabber, and have the tumblr gallery do an auto dupe-detector pass to eliminate the lower quality version? That way, when it works, it gets rid of the lower quality version whenever it can get a higher quality one… there may be some other way to do it without the need for the dupe detector, but getting all the links for both the 1280px and raw versions, then discarding one of them when you see the higher quality one is there, should be possible.

>>9645 Ok, good to know. I have most of the artists I have been grabbing from in a txt file for easy re-grabbing, but goddamnit, it's ~480 artists big… either way, this was a long time coming and something I was going to do at some point with gallery multi mode; it just made it very easy to do now. On that note, hydrus does not like having 20-50 watchers dumped into it at once; it hangs quite a bit when I do that. I never really noticed this from 4chan, as the threads were so small after the initial dump that by the time I grabbed one board's images and went to another board to look for new threads, the first operation was usually done.
(119.96 KB 862x618 Untitled.png)

>>9644 I ran chkdsk and CrystalDiskInfo and the drive is fine. I did the integrity check with SQLite and both files came up ok. I tried cloning the database, and now I get this slightly different error, but I don't get the local_hashes error anymore. I unfortunately don't have a backup. Learned my lesson lol
Just went back to the 316 thread: >>9639. I used to have 600k-some images open at any given time. After the post I made, I went and culled through everything; now I have maybe 10k images at most open (a fuck ton of watchers, sure, but less than 10k images), and the program is eating 6.5GB of RAM. I'm going to go through the images and cull them once the hentai foundry download is done, restart the program, and see what's up. But honestly, even in a worst-case scenario, 1000 of the largest thumbnails in t00 only come to 53MB. With that math, my hydrus was eating up 15-16GB at times, and that's the equivalent of loading up 150k thumbnails, which I know the program was not doing. I think there is something bloating larger dbs, but I won't be able to confirm for a while.
I'm probably just retarded, but does anyone else have an issue where the gallery parses only the first page of a tag? I've only tried it with gelbooru so far, but it only ever gets the first 42 images and then stops. I can't seem to find a way to fix it, and looking through all the changes, my illiterate ass can't tell if this is now intended or not. Can anyone help or clarify?
Looks like I'm late to get a response from dev, but I'll post this anyway. Can you add (or if it's already available, tell me how to use it) a sort of filter like the archive/delete filter, except it either does or does not add a tag? Say I want to go through several hundred untagged files and add the tag "solo" to images with one character in them. I could activate this filter for the tag "solo" and breeze through all the images, tagging the ones that apply.
>>9653 If it's like ratings, you can make a tag hotkey, highlight all of the ones you want, and press the hotkey. Personally, I do this with ratings, as ratings are far easier to deal with than tags at the moment.
>>9652 Yeah, gel is stopping at 42 images. I have 2 searches from them in the gallery, both of which stopped at 42 despite having 6+ pages of, I think, 44 images each: jlullaby and deepthroat x-ray.
>>9647 Yeah, previously I would wait for threads to finish downloading from archives and then tag them, but it's a bit tedious, since I have to name each tab with the tags I want to put on all of the files for the thread, lest I forget, since I can't keep everything open in 4chan X after it's already dead; it lags too much with so many threads in the queue, even when it only shows threads for the current board. I was just pre-tagging each thread watcher and had a tab for each thread before the multi-watcher, so I guess I'll go back to just using one multi-watcher tab per thread. But it would be nice to have the thread subject auto-importable as a subject: namespace tag, since it's getting pulled anyways.
>>9646 Thank you very much. I didn't see the new icon; it works just as I wanted.
So the link that appears underneath my ratings on an image (e.g. Sankaku), is it possible to set those to open in a specific browser rather than my system default? Thanks
(231.08 KB 736x755 screenshot2.png)

(274.01 KB 735x756 screenshot1.png)

Since a few versions ago, some (not all) gifs on my systems aren't displaying correctly anymore. I am running Arch Linux, version 317, and it's still not fixed. When I attempt to display a gif, I get the following output in my console:
> (client.pyw:7): GStreamer-CRITICAL **: 15:56:22.841: gst_element_get_state: assertion 'GST_IS_ELEMENT (element)' failed
> Unable to stop the stream: Inappropriate ioctl for device
The first frame of the gif looks correct; all subsequent frames look glitchy. I attached screenshots. I think this regression occurred around 314 or 315. I hope the NSFW nature of the images isn't a problem.
>>9665 I'll attach the gif file here. Something else: apps that also rely on the gstreamer backend seem to have an issue displaying the file as well, or can't do it at all (the image also doesn't display correctly in ahoviewer).
>>9651 >1000 of the largest thumbnails in t00 only come to 53mb Those are compressed jpgs, most likely thumbnails are stored uncompressed in memory.
>>9676 Which would mean some program rewriting should be in order.
————
That said, hdev, got a question: now that the hentai foundry mass download is done, at least for now, I noticed that hf has 2 popups for each one, a scraps and a pictures one. Would it be possible to set up a search parameter that would hit everything at once, or at least an 'all the usual suspects' one? I just came across a few artists who have tumblrs, and because tumblr is fucked, I put their names through all the boorus that should turn something up. I don't see a reason something like this can't be implemented, but would it be user-doable? Like a custom gallery, and then, with checkboxes, all the searchable galleries that hydrus has to offer.
>>9677 I should clarify this part:
>like a custom gallery, and then with check boxes, all the searchable galleries that hydrus has to offer.
It could have a one-time search feature, but I mean more a ui element that would make building custom searches easier: you tick boxes and it makes the search and saves it for use later, selecting it like the program has now.
>>9648 Thanks. Please try the attached parser. Drag-and-drop it onto the network->downloader definitions->manage parsers dialog to import, and then change the derpi link under manage url class links in the same submenu. Give it a go for a bit and let me know how it works.

>>9649 The dupe issue is difficult and covers a lot of sites and situations, so I am leaning towards generalised solutions over specific ones (such as per-site solutions with their own baroque rules). I expect to work on a system that will present 'easy' dupe decisions en masse. Something like "Hey User, these hundred pairs of files look like near-pixel-exact resizes to me–do you want me to auto-merge them?". I am fairly confident there are several systems I can build to do this, and it would reduce the whole pain-in-the-neck issue of tumblr resize processing to a handful of clicks and a bit of wasted bandwidth.

And yeah, parsing is actually pretty CPU heavy. Giving the client dozens of jobs all at once will lock up your ui for a few seconds. It might be worth me writing a profile mode for it, in case I am messing up anywhere in particular–I'll write that down as a thought.
>>9650 Hey, it looks like that clone–or the subsequent file arrangement–did not work out well. That's an error that is like "Hey, I just tried to create a new database from scratch, but some parts of an existing one were already there." This suggests the clone either wiped out a really important table, or maybe client.db got removed from the db dir while you were making the clones and rearranging files? Please double-check you have the right db setup:
client.db
client.caches.db
client.mappings.db
client.master.db
If any of them are like 58KB and look like they were created when you attempted to boot and got that UNIQUE IntegrityError, I think maybe that file was missing and the client attempted to make a fresh file. The clone operation would ideally have been something like:
- make a clone of client.db called client_new.db
- move the old client.db out to a safe folder somewhere
- rename client_new.db to client.db
- attempt boot
And replace 'client.db' with 'client.caches.db' or any of the other files you attempted. Did the renaming step get missed?
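For reference, the rename dance for all four files looks something like this in python terms–a sketch with example paths, and the clone filename is whatever you actually created:

[code]
from pathlib import Path

db_dir = Path('install_dir/db')      # your actual db directory
safe_dir = Path('somewhere/safe')    # somewhere outside the db dir

for name in ('client.db', 'client.caches.db',
             'client.mappings.db', 'client.master.db'):
    original = db_dir / name
    clone = db_dir / (name + '.clone')   # e.g. the client_new.db you made
    if clone.exists():
        original.rename(safe_dir / name) # move the old file out, keep it!
        clone.rename(original)           # promote the clone into place
[/code]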
>>9651 >>9676 >>9677 Thanks. Thumbnail bmps are only loaded once they need to be drawn to screen, in the fading 'waterfall' effect. Old thumbs are unloaded as soon as the limit in options->speed and memory is hit. I think the default there is a couple hundred MB.

But every media object behind the thumb does take a little memory. I would very roughly estimate maybe a handful of KB per object for a really heavily tagged file? And there are all the lists and quick-access indices that go with the results. That was probably a decent whack when you had 100Ks of files open. Now you are pared down on the ui end, I guess you have some db stuff hanging about. For instance, there's a big quick-access list of all inbox file ids open at all times, which for a client with 100,000 in the inbox is usually no big thing, but if you have millions, I guess it would add up a bit, maybe 100MB? A few of those sorts of things in a row might add up to 6GB, but I am not sure.

I looked at adding a memory profiler a few weeks ago, but the one I found that did what I want crashed the whole program as soon as I ran anything on the ui end. I'll keep my eyes open for another. You might like to hit help->debug->data->print garbage and email/post/pastebin the big list it will dump to your log to me. If we discover 10K bitmaps or 10M lists or something, that'll point us to where things might be going wrong here.
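As a back-of-the-envelope version of that media object estimate (every number here is a guess):

[code]
media_objects = 600000   # files open in the ui at the old peak
kib_per_object = 5       # 'a handful of KB' per heavily tagged file
print(media_objects * kib_per_object / 1024, 'MB')  # ~2930 MB
# a few structures of this scale stacked up could plausibly reach 6GB
[/code]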
>>9652 >>9655 Thanks lads, this should be fixed in today's release. It turns out gelb randomly(?) do not include the '>' 'next page' link on some gallery pages, and since the new gallery parser relies on finding this, it was falsely assuming some results were finished. I now have a backup in place that attempts to just add '42' to the last gallery url for gelb. Please try these queries again in v318 and let me know how they work.
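If you are curious, the fallback is essentially this: gelb's gallery urls paginate with a 'pid' offset that advances 42 results per page, so when the 'next page' link is missing, you can guess the next gallery url by bumping pid yourself. A rough sketch, not the actual parser code:

[code]
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit

def guess_next_gallery_url(url, results_per_page=42):
    scheme, netloc, path, query, fragment = urlsplit(url)
    params = parse_qs(query, keep_blank_values=True)
    pid = int(params.get('pid', ['0'])[0])
    params['pid'] = [str(pid + results_per_page)]
    return urlunsplit((scheme, netloc, path,
                       urlencode(params, doseq=True), fragment))

url = 'https://gelbooru.com/index.php?page=post&s=list&tags=example&pid=42'
print(guess_next_gallery_url(url))  # ...&pid=84
[/code]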
>>9653 >>9654 Yeah, your best bet at the moment is to add some shortcuts to the 'media' set for the tags/ratings you would like to set or flip. Then use those shortcuts while you do a regular archive/delete filter or browse. At some point I'd love to generalise the archive/delete filter code so you could customise your own filters like left-click = 'add "sexy" tag' and move on right-click = do nothing and move on But I've still got a bunch to do just catching the existing code up to the new shortcut system.
>>9657 If you feel brave, you can dip into the network->downloader definitions->manage parsers dialog and edit the imageboard parsers to provide the subject as a tag. I recommend you duplicate the existing parsers and work on the dupes so you don't accidentally break the originals if you mess up.
>>9664 I was going to say: Yeah, try editing the launch path under options->files and trash. It should work on all urls the program launches. But then I realised the HyperlinkCtrls there don't obey my option–only the 'open known urls' bits off the right-click menus do. I will make a job to fix this.
>>9665 >>9666 Thank you for this report. That swirl looks like PIL's doing. I don't know if my PIL code is just shit or if they can't handle some palettes, but that's a long-time problem that I eventually fixed by moving to OpenCV as default. Please check options->media–do you have the 'PIL instead of OpenCV' bugfix setting checked for some reason? I think PIL also steps in as backup if OpenCV throws an error. I wonder if that GStreamer error is related. What PIL (Pillow) and OpenCV versions do you have under help->about? Yeah, I don't care about nsfw. EDIT: Yeah, when I turn on the PIL bugfix option, I get the same swirl as you in Windows. PIL is kicking in for some reason.
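If it is easier, you can also check those versions from the same python environment the client uses; a quick sketch:

[code]
import cv2
import PIL

# on older Pillow installs, the attribute is PIL.PILLOW_VERSION instead
print('Pillow:', getattr(PIL, '__version__',
                         getattr(PIL, 'PILLOW_VERSION', 'unknown')))
print('OpenCV:', cv2.__version__)
[/code]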
>>9677 >>9678 Yeah, I'd like to make the new 'searcher' object nestable, so you'll be able to say 'search for artist x', and it'll generate four separate queries on four different boorus. This will only work well, of course, for those boorus that use the exact same search strings for what you are looking for. Seems like gelb and danb would be good mergeable candidates for artist names, I think? The hentai foundry artist search has always secretly been those two streams, so this new system just exposes that, and things are actually a bit simpler as a result now.

I still have to think about this nesting as I build the searcher, though. Things like 'should a nestable searcher create multiple subscription queries or throw everything into the same gallery log'. I don't want to accidentally fuck everyone or let new users completely fuck themselves with infinite looping searches or anything.
>>9693 The way that I see it (and less so from dan than gel or other boorus), artists can have multiple names and get tagged that way. The thing is, with most boorus, if one has multiple names for an artist, you can bet that every one does. So let's say I had a rule for all the r34 pages and gel, and I want to search an artist's name, say:
Hereismyname
So I put it in there. Then I find out he has a second name:
Nothatassholeislyinghereismyname
So I do it again, and finally I find out they go under an alias for some art:
Fuckyouthisismyname
It's likely each one of those is going to be a hit, so instead of entering each of those on its own, once per site, hitting them all at once with each name would be a better option. The only one this wouldn't work for is derpibooru, as derpi does an annoying thing where to search an artist you have to type 'artist:' to search artists, rather than just their name; if you search by name alone, you are never finding them.
But on the topic of nesting: would it be possible to make a collapsible nest? Something kind of like this; sure, it's for folders, but this could work similarly.
'nvme boot' could be the artist search
'brother' could be an artist's name, which could be expanded to show its source
'nvme boot' could open all the images under it in one tab
'brother' would open all of its, and an inner one would open from a specific source
If you can do something like this, I think several issues would be immediately solvable. However, I don't know if it's possible.
>>9685 Ok, I have it and it's in, though I'm not 100% sure how to use it.
Ok, got it into the parser; it still returned a gif.
Ok, thinking there was a conflict, I deleted the original parser after exporting it. Now I get 'there are no parsers'.
Ok, import the original parser again. No parsers.
Delete the new one. No parser.
I think I did something stupid, and I'm not seeing how to fix it/enable a parser. Do I have to restart the program each time, or am I just doing something really wrong and don't realize it?
>>9686 The file sizes look fine and I have all 4. I did the cloning/renaming correctly, but I only did client.db; that's the only one mentioned in the help txt. Should I do the other 3?

