/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.

8chan.moe is a hobby project with no affiliation whatsoever to the administration of any other "8chan" site, past or present.

Hydrus Network General #8 Anonymous Board volunteer 09/27/2023 (Wed) 16:03:14 No. 20352
This is a thread for releases, bug reports, and other discussion for the hydrus network software. The hydrus network client is an application written for Anon and other internet-fluent media nerds who have large image/swf/webm collections. It browses with tags instead of folders, a little like a booru on your desktop. Users can choose to download and share tags through a Public Tag Repository that now has more than 2 billion tag mappings, and advanced users may set up their own repositories just for themselves and friends. Everything is free and privacy is the first concern. Releases are available for Windows, Linux, and macOS, and it is now easy to run the program straight from source. I am the hydrus developer. I am continually working on the software and try to put out a new release every Wednesday by 8pm EST. Past hydrus imageboard discussion, and these generals as they hit the post limit, are being archived at >>>/hydrus/ . Hydrus is a powerful and complicated program, and it is not for everyone. If you would like to learn more, please check out the extensive help and getting started guide here: https://hydrusnetwork.github.io/hydrus/ Previous thread >>>/hydrus/19641
Edited last time by hydrus_dev on 09/30/2023 (Sat) 17:42:52.
>>13211 Yeah, I saw the huge mess of booru.org GUGs. I did not have any luck modding an existing booru.org to work with bune.booru.org. I do need to learn how to do this eventually and will likely use naa.booru.org as a test base since it's so tiny. But both boorus were so small and niche it was easier and faster for me to just grab all the urls via jank than to use my brain properly.
https://www.youtube.com/watch?v=OLJhFROQg5A

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v545/Hydrus.Network.545.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v545/Hydrus.Network.545.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v545/Hydrus.Network.545.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v545/Hydrus.Network.545.-.Linux.-.Executable.tar.gz

I had a good week. I did some small things, and a user contributed some cool things!

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

blurhash

tl;dr: we have some special blurry thumbnails now, but you won't see them much yet.

Thanks to a user who did a bunch of research and work to get this going, the client now generates the 'blurhash' of all files that have a thumbnail. This hash is essentially a micro-thumbnail, only about 34 bytes, that shows a very blurry coloured impression of the general shape of the image or video. They are usually used as placeholders when it may take time to load something--you have probably seen something similar on slower news websites.

You won't see these in hydrus itself much, since thumbnails load fast enough that we don't have to worry about placeholders. In the advanced ways of seeing files you don't actually own, however, I will now show you any file's known blurhash rather than the default 'hydrus' thumb. Same deal for damaged/missing files--if I can't fetch a thumb, I'll now fall back to the blurhash.

The more important thing is these hashes are now available on the Client API, under the normal 'file_metadata' call. If you are implementing a browser or similar and have access to a fast blurhash library, try them out!

I've scheduled all users to generate the blurhashes for their existing files in the background, which will take a few weeks/months for users with hundreds of thousands of files (although, if you are working with this and want to hurry it along, remember the queue is manageable under database->file maintenance).

other highlights

The file history chart (help->view file history) now has a search panel, just like Mr Bones! You can search your import, archive, and delete history for creator x, filetype y, or any other query you can think of. Some of the math here got a bit weird, and I am sure I have missed out several 'cannot provide good numbers for this domain' situations, so let me know if you get any really stupid results or outright errors. There's more work to do here, like a button to hide the search panel, which I hope to push on in the near future.

Thanks to the same user above, we also have epub support! No 'num_words' yet, but it turns out epubs are just zip files with some html, so I think it'll be doable in future. Also, some rare incorrect jpeg rotations (for 'MPO' jpeg files) are fixed.

If you right-click on a selection of files, the 'open->similar files' menu now has a 'in a new duplicate filter page' command. This will initialise the filter page just looking at those files, hopefully making it simple to clear out the potential duplicates in any particular selection.

Unfortunately, I am retiring the Deviant Art artist search and login script. DA have been slowly killing their nice old API, and the artist search just went. Individual page URLs still seem to work, but I suspect they will be gone soon. Gallery-dl have a nice DA solution that works with the new API, so if you can't find the same content on a more open booru, that's my best recommendation for now.

next week

I want to take a very simple cleanup week. Nothing too exciting on the changelog front, but I'll refactor some of the worst code into something nicer to work with.
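For anyone poking at the new Client API field, here is a minimal sketch of pulling blurhashes out of a 'file_metadata' response and sanity-checking them. The response shape used here (a 'metadata' list whose entries carry 'hash' and 'blurhash' keys) is an assumption for illustration, so check it against the Client API help before relying on it. The length check follows the public blurhash format: the first base83 character encodes the component grid, and a hash with X by Y components is 4 + 2*X*Y characters long, so a 34-character hash would mean 15 components (for example a 5x3 grid).

```python
# Base83 alphabet used by the blurhash format.
BASE83 = (
    "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz#$%*+,-.:;=?@[]^_{|}~"
)


def component_counts(blurhash: str) -> tuple[int, int]:
    """Return (x_components, y_components) encoded in the first character."""
    size_flag = BASE83.index(blurhash[0])
    return size_flag % 9 + 1, size_flag // 9 + 1


def is_well_formed(blurhash: str) -> bool:
    """Check the alphabet and that the length matches the declared grid."""
    if len(blurhash) < 6 or any(c not in BASE83 for c in blurhash):
        return False
    x, y = component_counts(blurhash)
    return len(blurhash) == 4 + 2 * x * y


def blurhashes_from_response(response: dict) -> dict[str, str]:
    """Map file hash -> blurhash for entries that have a usable blurhash.

    ASSUMPTION: 'metadata', 'hash' and 'blurhash' are the key names; files
    without a thumbnail are assumed to have no blurhash (missing or None).
    """
    result = {}
    for entry in response.get("metadata", []):
        blurhash = entry.get("blurhash")
        if blurhash and is_well_formed(blurhash):
            result[entry["hash"]] = blurhash
    return result
```

Actually rendering the blurry image needs a blurhash decoder library; this only filters and validates the strings before handing them on.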
>>13210 >https://gitgud.io/prkc/hydrus-companion firefox <unsupported :( but I'll try installing chromium just for it maybe then >you can stop reading this post here and just disregard the effortpost you made just for me? no way! >no way in hell am I gonna manually open hundreds of tabs to import each post individually yup, same problem here though with the one I tried it'd "just" be 87 tabs >linkgopher >making a newline separate .txt with all the links darn, why didn't I think of something like that >excel ahem excuse me, it's GNU-calc here! jokes aside, what you're describing is something I once did with other stuff. Vim is probably a really good fit for this. You could easily duplicate a line 1407 times and then use some sort of bash script or a vim addon to add the ascending numbers at the rear. With this kind of method you'd even have a way to make it very hands-off because we're using bash scripting. And I don't know python (yet) but I am sure with python there's even a way to scrape links and put them directly into a text file or even hydrus itself! If I figure anything out I'll make sure to share with you in an attempt to repay your effort. I tried with the "url downloader" but found out some time later that I have to use the "simple downloader" for this kind of list. Just saying this for others reading along. >wouldn't normally recommend grabbing entire boorus yeah "happy about their new ability to download 500000 files and then go ahead and do it " I rember, didnt forgor :P Thought the same as for the size of the boards. >>13217 thanks for undertaking the effort of explaining those things to me in the current context. Funnily enough I knew about that book you linked and I read parts of it, but only after reading your post a few things started to make sense, so your effort was definitely not in vain, thanks!
The stash concept especially was strange to me but now it makes sense. I definitely agree with you on the "strange terminology" part, but I guess they just used the words that were available and if I use it often enough it'll become natural... (.|?|!) >>13211 Whoah, if it isn't the man/myth/legend himself :) Thanks, I'll be checking out that GUG functionality, looks like you did all that stuff I theorized above already.
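The "duplicate a line and append ascending numbers" idea above can also be sketched in a few lines of Python instead of vim or bash. The URL template here is a hypothetical booru.org-style post URL and the id range is made up, so swap in whatever the target site actually uses. The result is a newline-separated list that can be saved to a .txt file or pasted into a downloader page.

```python
def build_url_list(template: str, first_id: int, last_id: int) -> str:
    """Return one URL per line, with '{id}' replaced by each ascending id."""
    return "\n".join(
        template.format(id=post_id) for post_id in range(first_id, last_id + 1)
    )


if __name__ == "__main__":
    # ASSUMPTION: hypothetical site and URL pattern, for illustration only.
    template = "https://example.booru.org/index.php?page=post&s=view&id={id}"
    urls = build_url_list(template, 1, 87)
    # Show the first few lines; redirect stdout to a file to keep the list.
    print("\n".join(urls.split("\n")[:3]))
```

This only helps when post ids really are sequential; deleted posts will simply produce URLs that fail on import.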
>>20354 I'm really liking the file history search. It definitely cements the fact I will be eternally boned. I did notice that if you disable "show deleted" and then submit a search or refresh it, the line for deleted files comes back and the toggle is inverted.
>>20355 You know you can copy all open tabs urls in firefox right?
>>20355 Hydrus companion does actually work in Firefox (well, Librewolf). The issue is that release FF doesn't support unsigned extensions, but Librewolf and FF Nightly (I think) do. You have to run the build script, and then, and I have no clue where I found this information, but you need to open/extract the xpi, and go to manifest.json and change the "id" entry from "id": "hydruscompanion@hydruscompanion.hydruscompanion" to "id": "hydruscompanion@example.org"
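The manual xpi edit described above can be scripted, since an .xpi is an ordinary zip archive. This is a minimal sketch, not an official build step: it assumes manifest.json is plain JSON without comments and that the id sits under some key literally named "id" (the walker replaces any such key whose value matches the old id). The function works on bytes so the file names are left to you.

```python
import io
import json
import zipfile

OLD_ID = "hydruscompanion@hydruscompanion.hydruscompanion"
NEW_ID = "hydruscompanion@example.org"


def replace_id(node, old: str, new: str):
    """Recursively replace any "id" entry equal to `old` with `new`."""
    if isinstance(node, dict):
        return {
            key: new if key == "id" and value == old else replace_id(value, old, new)
            for key, value in node.items()
        }
    if isinstance(node, list):
        return [replace_id(item, old, new) for item in node]
    return node


def patch_xpi(xpi_bytes: bytes) -> bytes:
    """Return a copy of the archive with the id in manifest.json swapped."""
    out = io.BytesIO()
    with zipfile.ZipFile(io.BytesIO(xpi_bytes)) as src, zipfile.ZipFile(
        out, "w", zipfile.ZIP_DEFLATED
    ) as dst:
        for info in src.infolist():
            data = src.read(info.filename)
            if info.filename == "manifest.json":
                manifest = replace_id(json.loads(data), OLD_ID, NEW_ID)
                data = json.dumps(manifest, indent=2).encode("utf-8")
            # Every other member is copied across unchanged.
            dst.writestr(info.filename, data)
    return out.getvalue()
```

Usage would be reading the built .xpi with open(path, "rb"), passing the bytes through patch_xpi, and writing the result to a new file before installing it.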
Is there any way to search for only a certain tag excluding siblings? Say I have creator:Artist and creator:ArtistNSFW which is sibling'd to creator:Artist, is there any way to search exactly and only for creator:ArtistNSFW?
>>20355 > Firefox unsupported >>20358 To expand a bit on what this anon said: Hydrus Companion works on any non-mainline branch of Firefox and its various forks. It's a Firefox paranoia issue. They don't let you run unsigned extensions anymore. Used to be just a "do it anyway" button on install. Then they made you jump through hoops in about:config. Now you need to be on the nightly branch to enable unsigned extensions. Some forks, such as LibreWolf, do away with this nonsense and let you run unsigned extensions out of the box. No need to fiddle with settings or open up xpi files or anything. Fairly easy to migrate profiles over as well. >GNU-calc & Vim Gotta work with what ya know! >url downloader vs simple downloader Did you have trouble pasting the whole list into the url downloader? It's hidden in a tooltip popup on the paste button, but you can paste a massive list directly into the url downloader. Unfortunately it ONLY works by actually clicking the paste button. Not with ctrl+v or the text box. If you try to paste the list into the text box you'll only get one url in there. It's just a weird Hydrus quirk. Happens in a few other places as well, like the tagging window.
(18.19 KB 438x146 28-20:13:32.png)

(8.42 KB 1011x47 28-20:19:52.png)

Feature request: A "verbatim" sidecar note parser. yt-dlp writes descriptions to a separate .description file in the exact format the original post had. But since hydrus assumes descriptions are single lines you get something like pic 1 related. Yes, you can get a json file with yt-dlp --write-info-json, but it contains literally every single thing about the video, all on a single line (400k characters!). As you can see in pic 2, it's hellish. I assume a verbatim parser would be easier to implement than a fully fleshed out multi-line parser, but idk. Thoughts hydev?
(452.15 KB 294x342 eyebleach.gif)

>log in to sankaku >they now force you to disable adblock in order to do anything on the site >turn it off for a second just to see how bad it gets >on a single post, see six motherfucking rows of shitty 3DPD porn ads >a tab to a virus site automatically opens Good fucking god. At least the Hydrus downloader still works if you can get it a login token.
>>20362 You'd think there'd be some law about making and hosting malware ads at this point that would prevent this shit from ever happening, leaving only a flood of regular horrid ads.
>>20359 If there's a functional difference between the tags when searching, they shouldn't be siblings in the first place. You could make them parent and child instead. If you make creator:ArtistNSFW a parent of creator:Artist, then searching creator:Artist will only show stuff tagged creator:Artist, while searching creator:ArtistNSFW will show both. Alternatively, you could keep the sibling but have separate file domains for SFW and NSFW stuff. >>20361 I use --write-info-json for youtube descriptions. What's wrong with it? Hydrus doesn't care that it's all in one line.
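The parent suggestion above is easier to follow with a toy model. This sketch is not how hydrus stores or searches tags; it only assumes the usual parent rule, that a file with a child tag is also treated as having every parent of that tag, and shows why the two searches return different sets. The file names are made up.

```python
def display_tags(storage_tags: set[str], parents: dict[str, set[str]]) -> set[str]:
    """Expand a file's stored tags with all (transitive) parents."""
    result = set(storage_tags)
    stack = list(storage_tags)
    while stack:
        tag = stack.pop()
        for parent in parents.get(tag, ()):
            if parent not in result:
                result.add(parent)
                stack.append(parent)
    return result


def search(files: dict[str, set[str]], parents: dict[str, set[str]], tag: str) -> list[str]:
    """Return the names of files whose expanded tags include `tag`."""
    return sorted(
        name for name, tags in files.items() if tag in display_tags(tags, parents)
    )


# creator:ArtistNSFW is set as a parent of creator:Artist, as in the post.
PARENTS = {"creator:Artist": {"creator:ArtistNSFW"}}
FILES = {
    "sfw.png": {"creator:Artist"},
    "nsfw.png": {"creator:ArtistNSFW"},
}
```

With these definitions, searching creator:Artist matches only sfw.png, while searching creator:ArtistNSFW matches both files, which is the behaviour the post describes.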
>>20363 >some law Please, keep the bureaucracy out of our living space.
>>20357 N-not really? Thanks, I'll remember to keep in mind that that option exists. >>20358 >>20360 Thanks guys! Looks like I have to install LibreWolf, or Nightly if LibreWolf has hiccups - it's not as bad as I thought initially after all if that is all there is to it. I guess it's for the best to stop boomers getting scammed out of their life savings and the pain that bullshit causes. >>20360 >Did you have trouble pasting the whole list into the url downloader? As you allude to, it worked with the button. However, when I tried doing it with the button in url downloader the imports would fail, whereas they'd work without a hitch in simple downloader. I have a more serious problem: My tag counts are sometimes double what they actually are in terms of numbers of files actually tagged so. Also sometimes it just shows me a tag is present when the file is selected, but when I open the manual editing for the tags on that file the offending tag isn't on the list to remove it. Re-adding and then trying to remove the tag doesn't work either. Is the illness terminal?
>>20352 previous thread btw: https://archive.ph/M6heU >>13210 I just tried making that text file of links for the 87 file booru using vim and managed to figure it out. https://pastebin.com/raw/wS9hr2WM Damnit it's a nuisance that the distro didn't ship vim with the +clipboard option. What are they thinking. Hrrmph. The only problem I have is that apparently tags aren't imported... How do you do that?
>>20362 >if you can get it a login token Is the only way to do this through the browser addon? I've never messed with it but if I have to I guess I could try.
>>20366 >I have a more serious problem: >My tag counts are sometimes double what they actually are in terms of numbers of files actually tagged so. >Also sometimes it just shows me a tag is present when the file is selected but when I open the manual editing for the tags on that file the offending tag isn't on the list to remove it. >Re-adding and then trying to remove the tag doesn't work either. >Is the illness terminal? You know about parents and siblings, right?
>>20361 Maybe >>20308 can help you? Basically instead of using new lines as a separator, use a string like ||||.
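A toy illustration of why the separator matters. This is not hydrus's parser, just plain string splitting on a made-up description: with newline as the split string a multi-line description falls apart into several notes, while a separator that never occurs in the text keeps it in one piece.

```python
def split_notes(sidecar_text: str, separator: str) -> list[str]:
    """Split sidecar text into notes on `separator`, dropping empty pieces."""
    return [part.strip() for part in sidecar_text.split(separator) if part.strip()]


# Hypothetical multi-line description of the kind yt-dlp writes out.
DESCRIPTION = "Line one of the description.\n\nLinks:\nhttps://example.com"

fragments = split_notes(DESCRIPTION, "\n")    # three separate notes
verbatim = split_notes(DESCRIPTION, "||||")   # one note, line breaks kept
```

Any string works as the separator as long as the note text itself is guaranteed never to contain it.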
>>20370 Broken link, didn't realize the thread was already archived. >>>/hydrus/20308
>>20367 Without a proper GUG for that specific booru you won't get the tags, unless you have the PTR and the image has already been tagged. If you know how to make a GUG for that booru that's the proper way to do it, but alas, I cannot help with that yet. I'll get them all tagged in the PTR eventually, but the ride never ends. Pretty sure my backlog grows faster than I can tag it. In the meantime you'll have to tag it yourself or copy the tags from the booru manually. If you're familiar with tampermonkey and/or greasemonkey there is a very useful script called Booru Tag Parser at https://github.com/JetBoom/boorutagparser When on a supported booru post, it adds a button to copy all the tags to clipboard. You can also configure a shortcut key as well. The tags can be pasted directly into Hydrus Network (or anything else you have that accepts newline-separated tags). It's especially great for when a subscription grabs something from a booru that's tagged on the booru but not yet tagged on the PTR. Open the link, copy tags button, paste into Hydrus, done. >>20366 >paste button Weird. Works fine for me on Windows. But as long as the simple downloader works I guess it's no big deal. >tag insanity Probably just parents and siblings, just as >>20369 said. These can make 1 tag become multiple tags. The tag relationships are not obvious on the media viewer, but they are on the tagger window. Also the tagger window only shows tags for the active tagging service. So if you're on the My Files tab you won't see any PTR tags. And vice versa if you're on the PTR tab. You sure you're looking at the correct tag service? A mouse scroll too close to the tabs can sometimes put you on the wrong service without you realizing it. >>20362 Sankaku has never been good.
>>20372 >Without a proper GUG for that specific booru you won't get the tags GUGs don't get tags. Parsers do. And parsers require a url class. >>20366 >>20367 >However when I tried doing it with the button in url downloader the imports would fail whereas they'd work without a hitch in simple downloader. >The only problem I have is that apparently tags aren't imported... The simple downloader just does a check for images on the page and downloads them. You won't get any tags or anything from it. Don't use the simple downloader for boorus. The url downloader fails because by default hydrus doesn't recognize the urls. You have to tell hydrus about the urls. Hydrus already comes with parsers that work for booru.org, you just need to add the url classes. Add these to hydrus. Go to network > downloaders > import downloaders and drag and drop them. https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/Downloaders/Booru.org/booru.org_file_url_2018.09.20.png https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/Downloaders/Booru.org/booru.org_post_page_2018.09.20.png https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/Downloaders/Booru.org/booru.org_search_gallery_2018.09.20.png Then go to network > downloader components > manage url class links. You should see that the url class "booru.org post page" links to the parser "gelbooru 0.1.11 file page parser". If it doesn't, double click it and select it from the list. Now use the url downloader and you should get tags.
(8.64 KB 865x222 123.PNG)

>>20357 What the heck version of firefox do you have? That option isn't in settings. I use an extension to do it.
>>20368 You can do it with browser developer tools in the network tab if you know what you're looking for. >>20372 >Sankaku has never been good. This is true but that doesn't stop faggots from only uploading to it. Sankaku often has more art by a given artist than better sites, and they don't seem to give a shit about deleting paywalled uploads either.
Is there a way to disable the new blurhash thing completely? It is not for me.
>>20376 To elaborate: Having deleted newly downloaded images from my files, opening the same page gives me all the generated thumbs with a little trashcan symbol on top. Very confusing. Restarting hydrus, those thumbs are then converted into the new hashblurs or blurhashes which I, too, do not like. I do not want to see again what I have deleted in the past. I'll check the option panels just now to see if I can do anything about it myself.
>>20377 >I do not want to see again what I have deleted in the past. Trashcanned items are in your Hydrus trashbin, and will be automatically deleted over time during regular background maintenance. If you want to delete them immediately, just simply select any one file or group of trashcanned files and where the "delete" option was, there will be a "delete permanently now" button. Trashcanned files should not appear in search domains that exclude them, such as "my files". Don't know about other domains, I never use them.
>>20359 >>20364 Not yet, but I will add this. All normal file searches work in the 'display' domain, where siblings and parents are calculated. I am now planning to make a new system: predicate that does normal tag searches but with special (CPU expensive) tech such as 'including weird characters like "["', 'on this tag domain specifically', and as you say, 'on the storage domain'. >>20358 >>20360 As a side thing, I generally recommend people use throwaway site logins with hydrus (for several reasons), and this idea fits well into having a special weird sandboxed browser identity on the side with that separate set of logins and Hydrus Companion, rather than using your real logins and real browser. >>20361 Sorry for the confusion. As >>20370 says, change your .txt note parser to use something other than newline as the 'split' for its row-parsing. Choose something crazy like |||| or &&&&, which your actual note text won't contain, and it'll grab everything as one note. >>20366 >My tag counts are sometimes double what they actually are in terms of numbers of files actually tagged so. Can you give me a specific example, maybe with a screenshot, of exactly what is happening here? I'm not sure I totally understand. If you search for, say, 'skirt', does the tag autocomplete results dropdown say (200), but actually the search delivers ~100 files? Or does it say (100-200)? And, if you right-click on the offending tag, does it seem to have any siblings or parents? If you do not sync with the PTR, try running pic related (EDIT: Shit, I just realised my screen was on the wrong damn menu command. You want 'storage' regen for all, with siblings and parents deferred! The one above what my screenshot shows.) on 'all services'--it will recalculate all your tags and their counts from source, and fixes general miscount and ghost tag issues. If you sync with the PTR, do not run it as this job takes too long to run on a whim.
Edited last time by hydrus_dev on 09/30/2023 (Sat) 21:36:28.
BTW I screwed up PSD thumbnails the last week. The RGB channels are swapped to BGR due to me being stupid and fixing something else last minute. A proper fix is in place for next week! >>20368 >>20375 Yeah manual option is to use network->data->review session cookies and drag and drop a cookies.txt onto it. The only caveat is different browser-addons create different cookies.txt formats and not all will work. 'Export cookies.txt' add-on for Firefox has worked for me before. Or, in desperation, you can copy the one or two cookies across by creating them manually. >>20353 >>20372 >>20373 Oh yeah I don't think I pointed you at the downloader help. If it helps at all, this walks you through the basics of making a downloader. If you want to cobble together an altered clone of an existing booru.org downloader, it will be more helpful than my memory: https://hydrusnetwork.github.io/hydrus/downloader_intro.html >>20376 >>20377 >>20378 Sure, no worries, I'll add an option to not show them.
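On the point that different add-ons write different cookies.txt formats: one quick way to check an export before importing it is to see whether Python's standard library can read it as a Netscape-format cookie file. This is a sketch under the assumption that the Netscape format is what the importer expects; a file that passes here may still be rejected for other reasons, and the path is whatever your export is called.

```python
import http.cookiejar


def load_netscape_cookies(path: str) -> list[tuple[str, str, str]]:
    """Parse a Netscape-format cookies.txt into (domain, name, value) tuples."""
    jar = http.cookiejar.MozillaCookieJar()
    # Keep session and already-expired cookies so the whole file is visible.
    jar.load(path, ignore_discard=True, ignore_expires=True)
    return [(cookie.domain, cookie.name, cookie.value) for cookie in jar]


def looks_like_netscape_cookies(path: str) -> bool:
    """Return True if the file parses as a Netscape-format cookie file."""
    try:
        load_netscape_cookies(path)
        return True
    except (http.cookiejar.LoadError, OSError):
        return False
```

Listing the parsed tuples also makes it easy to see which one or two cookies you would need to recreate by hand if the import keeps failing.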
>>20380 >Sure, no worries, I'll add an option to not show them. Much appreciated, my dude.
>>20374 Hmm, maybe it's waterfox exclusive. I assumed it would also be in firefox, since waterfox barely has anything original and is just a copy without the bloat.
(7.46 KB 554x252 pass_hash.png)

Anyone have a workaround for this yet? I tried getting my cookies.txt in both chrome and firefox like in >>20380, but no luck. Not sure how to manually create the cookies.
>>20383 >but no luck. What are you not having luck with?
>>20384 I can't do gallery downloads with sankaku because it can't login I assume. I get that error when I try. I exported the cookies.txt from my browser and imported it into hydrus and i'm still getting the same message.
>>20380 kek you're not stupid anon Also, don't wanna pester you with stupid requests, but I find it amazing that there's support for xcf and psd files while we still can't import txt files. I think copypastas are something very imageboard-related too. Is it not supported due to the easy changeability of text files (and thus orphaning files in the database frequently leading to some kind of fatal db error)?
>>20369 I thought this wouldn't really apply to me since I only have local tags that I entered myself? >>20372 >Without a proper GUG Okay, thanks for explaining that to a doofus. >booru tag parser that's neat, thanks for the pointer. Maybe I should consider doing stuff with PTR, I was just afraid I would be causing more work than benefit because of tagging badly/inconsistently and not being sure I fully understand what kind of standards you guys are following in the PTR, because I don't tend to use hydrus for the things you guys tend to use it for
>>20387 PTR is great for images. There's a pretty decent chance if something came from a booru or image board someone has done at least some preliminary tagging. If you've ever looked at how boorus tag things, that's pretty much what you can expect with how the PTR tags things. Technically anything that you can toss into Hydrus can be tagged on the PTR, but I'd bet it's like 99% image tags. Don't worry too much about bad/inconsistent tags. The PTR has a massive amount of parent and sibling associations which covers many things from typos to namespacing. Helps keep things more consistent. As long as you're not purposely mistagging things you'll likely be fine. The tags are there to find things. Just tag what you would search for and you're good. There are also janitors that keep the worst of the garbage away. They'll review things before it gets pushed out to the whole world. There's plenty of tagging addicts such as myself who will petition corrections for the janitors' review. Is the PTR perfect? Fuck no. But it's pretty damn good. Don't get too caught up on making sure everything is perfectly tagged. Just tag what you want to be searchable and if you screw up, the janitors will let you know next time you sync with the PTR. Worst case if you just absolutely need to tag something in a way that's unacceptable to the PTR, that's what the default My Tags is for. >I don't tend to use hydrus for the things you guys tend to use it for Hydrus is for whatever you want it to be. As long as it works for what you need, then that's what Hydrus is for.
I had an ok week. I mostly cleaned code, so the changelog isn't too exciting, but there's fixes for bad PSD thumbnails and the new file history chart's inbox/archive lines. The release should be as normal tomorrow.
It'd be cool if hydrus had some option to apply lossless optimization to a file upon import. A booru app that I like does this when you upload to it, and the way that it preserves integrity is by recording the original sha-256 hash before optimization, and keeping that one along with the actual sha-256 of the optimized file, if any optimization is done. If Jpeg-XL support is added, I'd love for there to be an option to do lossless jpeg conversion to it, as part of a feature like this, if you're not against it.
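The bookkeeping described above (keep the original sha-256 alongside the hash of the optimized file) can be sketched as follows. This is only an illustration of the idea, not a hydrus feature: the optimizer is a placeholder callable standing in for a real lossless tool, and the record layout is invented.

```python
import hashlib
from typing import Callable


def import_with_optimization(data: bytes, optimize: Callable[[bytes], bytes]) -> dict:
    """Optimize `data` and record both the original and the stored hash.

    `optimize` is a hypothetical stand-in for a lossless optimizer; if it
    does not make the file smaller, the original bytes are stored instead.
    """
    original_sha256 = hashlib.sha256(data).hexdigest()
    candidate = optimize(data)
    stored = candidate if len(candidate) < len(data) else data
    return {
        "original_sha256": original_sha256,
        "stored_sha256": hashlib.sha256(stored).hexdigest(),
        "stored_bytes": stored,
    }
```

Keeping the original hash means a file that was optimized on import can still be matched against the hash the source site or another client knows it by.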
https://www.youtube.com/watch?v=aFx7woWkZbc

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v546/Hydrus.Network.546.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v546/Hydrus.Network.546.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v546/Hydrus.Network.546.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v546/Hydrus.Network.546.-.Linux.-.Executable.tar.gz

I had a simple week. I mostly cleaned code, so there isn't much to talk about except for some bug fixes to recent systems.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

misc

I accidentally screwed up PSD thumbnail colours at the last minute last week while fixing something else. This is fixed, and any borked PSD thumbnails should regenerate themselves soon.

The new file history chart's cancel button works better, and I fixed some of the counting logic in the archive/inbox lines when you have a search filtering the results. Let me know if you still see archive numbers that are way too high.

If you have run into the new 'blurhash' thumbnails and don't like them, hit the new checkbox in options->thumbnails to turn them off!

next week

I really got lost in code refactoring this week. It is always worthwhile work to do, reshaping old bad ideas and preparing for the future, but it is unexciting, and with a janked out codebase like mine it often feels like bailing out the ocean. For next week, I'd like to get back to my file storage system improvements, ideally getting background migration ready to go.
(75.56 KB 1920x1080 givemebirdearsorgivemedeath.jpg)

Thank you for the update. Two points I'd like to mention still: (1) Since v545, newly downloaded files from subscriptions (that are presented with the help of the little button on the bottom-right corner) will still show trashed files even after they're long gone. Refreshing the page (F5) makes them go away for good. Could the old function be restored? (2) Since the last QT update that restored older window sizes for some of the users including me, other windows blew up in size. Windows like pic-rel look rather funny now.
>>20392 Adding to that, since v545, subscription downloads seem to be sorted by file size instead of - i don't know what it was before so I'll guess - time imported. I'll look into the options if any of this can be reverted to the former defaults!
Since jumping from 534 to 544, I've had Hydrus stop responding until I force it to close fairly often. At least several times a week. It seems random, but it's most commonly happened when clicking on the search box with nothing in it. Updating to 546 now since I have to close and restart Hydrus anyways.
>>20392 >Since v545, newly downloaded files from subscriptions (that are presented with the help of the little button on the bottom-right corner) will still show trashed files even after they're long gone. Refreshing the page (F5) makes them go away for good. Could the old function be restored? I don't think there's any "old function". I'm in v538 and it happens here too. Pretty sure it's been that way for a long time.
>>20394 This morning Hydrus outright crashed right after I cleared the search box for a page with about 35 results.
My SSD that only contained the hydrus database and thumbnails has managed to crap out on me, but fortunately I had a backup from about two weeks ago. I understand that the tags and import order will be a mess, but is there anything I can do to get hydrus to recognize the files i've added since? I see a "clear orphan files" action, but I presume that would simply delete the files that had been newly added
>>20383 >>20385 I think the actual logic script for sank is broken, and it is now mandating a cookie that is no longer used. Hit up network->logins->manage logins and set the sank login you have setup to 'not active'. Then you might want to purge your sank cookies and do the cookies.txt import one more time. Hydrus will just start downloading as normal, not try to do its broken login script any more, and ideally sank will recognise your cookies and deliver the right content. >>20386 The .txt case is a funny one. A core predicate of hydrus's current design is that we can always determine a file's filetype from its content. A bunch of maintenance and recovery code relies on this principle. I don't care if your file is called image.jpg--if it is actually a zip, I'll see. And if I think something is a webm, it definitely is (so if I throw it on screen, I probably won't get a crazy error). The annoying thing about .txt, .rtf, and to a lesser extent, html, is you can't super easily figure out what they are just by looking at them. I'm going to have to rework my import and maintenance workflows to handle these situations, either escalating to the user to ask what they are at the import/maintenance stage or make educated guesses based on file extension or best-guess on content (this would be more doable for html). This will likely come hand in hand with arbitrary file support. I'll just have to deal with not knowing, or only being able to guess, at some filetypes, and have UI and workflows for users to override it when things need to be overridden for whatever reason. >>20390 I agree, especially as we move to JpegXL. We have to do a bunch of work before I am ready to entertain this thought, but I definitely think we are heading towards a singularity where we wash all our old formats into a generally agreed upon 'yeah, this format is inarguably better than jpeg and png, and it can do everything they do'. 
Given how fast things are moving, I expect we'll have AI do upscaling and HDR too as we break out of the sRGB colourspace. If you want to read more, please search the previous threads for me talking about an 'exe manager'/'executable manager'. I have a plan, but I need to improve duplicate systems and write an exe manager, and then we probably need to sit down and talk about which conversion workflows are sensible so we can prep these systems with reasonable templates and not have people wanging around with a hundred different ill-informed conversion protocols.
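To illustrate the "determine a file's filetype from its content" principle described above, here is a minimal sketch that checks a file's leading bytes against a few well-known signatures. This is not hydrus's actual detection code; the names `MAGIC_SIGNATURES`, `sniff_filetype`, and `sniff_path` are hypothetical, and the signature list is deliberately tiny.

```python
# Hypothetical sketch: guess a filetype from leading bytes, ignoring the filename.
# Not hydrus's real implementation; names and the signature list are assumptions.
MAGIC_SIGNATURES = [
    (b"\xff\xd8\xff", "jpeg"),
    (b"\x89PNG\r\n\x1a\n", "png"),
    (b"GIF87a", "gif"),
    (b"GIF89a", "gif"),
    (b"PK\x03\x04", "zip"),
    (b"\x1a\x45\xdf\xa3", "matroska/webm"),  # EBML header shared by mkv and webm
]


def sniff_filetype(header: bytes) -> str:
    """Return a filetype name for the given leading bytes, or 'unknown'."""
    for magic, name in MAGIC_SIGNATURES:
        if header.startswith(magic):
            return name
    # Plain text, rtf-like, and html content has no reliable signature,
    # which is the difficulty the post describes.
    return "unknown"


def sniff_path(path: str) -> str:
    """Read the first few bytes of a file on disk and sniff them."""
    with open(path, "rb") as f:
        return sniff_filetype(f.read(16))
```

A file named image.jpg whose bytes start with `PK\x03\x04` comes back as a zip here, while arbitrary text falls through to 'unknown', which is why text-like formats would need a guess or a user override.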
>>20392 >>20393 >>20395 Thank you for these reports. For (1), yeah I think I caused this a little while ago. I moved from search pages that take a list of file hashes and hide their search interface to a normal search page that starts with a 'system:hash' with the hashes prepped. This is keeping the trash after it has been deleted and nuking the sort. I will see what I can do about both. For (2), that's bonkers! Can you check options->gui and the 'frame locations' sub-panel, and then look for the 'regular_dialog' entry in the list? If you double-click on it, does it say anything odd like 'remember size'? That 'fallback' entry is supposed to size and position itself conservatively and not remember anything, but maybe it got set at some time. If it is like pic related, then let me know and I'll see if I can figure out some debug code or similar so we can investigate why your buttons want to be so huge. >>20394 >>20396 Thank you for this report. I am sorry for the trouble. Can you try turning off (if they aren't already) database->file maintenance->work file jobs at normal time and tags->sibling/parent sync->sync siblings and parents at normal time? Does that stop the hangs? If it does, please do some normal work with help->debug->profiling->profile mode on, and then try turning the work back on, with profile mode still on, and then send me the profiles. If turning those maintenance routines off doesn't fix the issue, can you do a profile anyway? Just do some normal work with it on, and then send me the profile. You can check, but there shouldn't be any private data, so you can pastebin it here, or you can email or discord DM it to me. >>20397 Yeah, check out the document at install_dir/db/help my media files are broke.txt. The concepts involved are not simple, and my English gets overwrought at times, so happy to help if you have more questions.
I tried to use similar file search with a clipboard image, but I got a popup error: "Sorry, seemed to be a problem: AttributeError("module 'hydrus.core.images.HydrusImageHandling' has no attribute 'StripOutAnyUselessAlphaChannel'")" I'm on version 546 and running on linux from source.
>>20400 Thank you, I messed this up in the rewrites last week. Fixed on master right now, if you want to git pull.
>>20399 >Does that stop the hangs? If it does, please do some normal work with help->debug->profiling->profile mode on, and then try turning the work back on, with profile mode still on, and then send me the profiles. Will do. See if I can go a week of normal work without hanging/crashing. >For (2), that's bonkers! funny, I thought those big buttons were intentional the whole time. Doesn't bother me. since I'm not going to be doing anything else but immediately selecting an option I've already decided on when I open that dialogue.
i noticed weird thing with ipv6 on client api. if i turn on "allow non-local connections", it will set interface on ipv4 to be "0.0.0.0", but on ipv6 side it will be on "::1" (loopback interface for ipv6). turning it off, ipv6 side will be "::" (allow anyone to connect to it over ipv6) i checked the code and this nonsense is over "clientcontroller.py" on line 1864 (https://github.com/hydrusnetwork/hydrus/blob/45ca3abd62bdc7bf62d3ca69fe8d5a296e5a93d7/hydrus/client/ClientController.py#L1864C32-L1864C59) should "interface" lines be swapped or something?
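For reference, the behaviour the poster expected can be sketched as below. This is not the code at the linked line; `choose_bind_interfaces` and `is_loopback_only` are hypothetical names, and the IPv4 loopback value used for the "local only" case is an assumption, since the post only states the "0.0.0.0" case.

```python
import ipaddress


def choose_bind_interfaces(allow_non_local: bool):
    """Return (ipv4_interface, ipv6_interface) for a listening service.

    Hypothetical helper, not hydrus's actual code. The point is that both
    address families should follow the same switch.
    """
    if allow_non_local:
        # Wildcard addresses: accept connections on any interface.
        return ("0.0.0.0", "::")
    # Loopback addresses: accept connections from this machine only.
    # "127.0.0.1" is an assumption; the post does not state the IPv4 local value.
    return ("127.0.0.1", "::1")


def is_loopback_only(interface: str) -> bool:
    """True if the given bind address is a loopback address."""
    return ipaddress.ip_address(interface).is_loopback
```

With the two branches swapped for IPv6 only, as the post describes, "allow non-local connections" would bind IPv6 to loopback and the local-only setting would bind it to the wildcard.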
>>20401 I just did, and it works now. Thanks for the quick fix!
>>20399 Looking at the relevant part of the text file: >* Files that are in storage, but the database does not know it * >~If you only had drive errors or you restored a backup that included both file storage and database files made at the same time, you can ignore this step. You probably have a couple of extra orphan files, but everyone does, it isn't a big problem.~ >If you restored an older file storage backup to a newer database, these would be files that were deleted after the backup was made. If you restored an older database backup to a newer file storage, then these would be files that were imported after the backup was made. In either case, they are files in your file structure that the database does not know about, and we want to collect them together to A) delete them or B) reimport them. >Run _database->db maintenance->clear orphan files_. Choose a location for the files to go to, and then wait for it to finish. Browse through them to verify what you are looking at, and then either delete them or reimport them. 'clear orphan files' seems to be under file maintenance, rather than db maintenance.
>>20405 On that note, is it possible to import in the order of created / modified time, rather than filename? That would make reimporting files much less messy
>>20406 It appears that turning off "sort path as they are added" does help in getting the file import order to match the explorer order.
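If the file manager's order is not the one wanted, one workaround is to build the path list in modified-time order yourself before handing it to the import dialog with path sorting turned off. A minimal sketch, assuming the files sit in ordinary folders; `sort_paths_by_mtime` is a hypothetical helper, not a hydrus function.

```python
import os


def sort_paths_by_mtime(paths):
    """Return paths ordered oldest-modified first.

    The path itself is used as a tie-breaker so that files with identical
    timestamps still come out in a stable, predictable order.
    """
    return sorted(paths, key=lambda p: (os.path.getmtime(p), p))
```

Swapping `os.path.getmtime` for `os.path.getctime` would sort by creation time on Windows; on Linux that call reports the last metadata change instead, so modified time is usually the safer choice.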
>>20375 >This is true but that doesn't stop faggots from only uploading to it. Sankaku often has more art by a given artist than better sites, and they don't seem to give a shit about deleting paywalled uploads either. Why is this? I just don't understand. I would like to move to a different site, but I can't find any with as much content as Sankaku. And they totally fucked searching to the point where you can't find shit anymore. You want incest, yuri, and loli tags all at once? Too bad!
>>20402 >>20399 Big butan go brrrt. The "regular dialog" entry had "start maximised" checked. Unchecking it fixed it. Thank you so much!
(111.66 KB 1353x915 hydrus dark mode implying.jpg)

(110.70 KB 480x852 my eyes.jpg)

>when you forget to renew your bypass so you lose your post tl;dr I've gotten a new harddrive so I'd like to give hydrus a go to really do some comfy image sorting stuff. But how do I get dark mode to work? My OS has a dark theme, is set to dark mode default and yet still only does this weird halfsie mode and as such is defiling the dim serenity of my lair with its unsightly glare.
>>20410 file -> options -> colours
>>20410 From what I have heard, darkmode is kind of wip. ( https://github.com/hydrusnetwork/hydrus/issues/756 ) On Linux + KDE, hydrus is able to pick the colors of your global dark theme quite well for the most part tho. (pic related) Because hydrus runs with Qt 6.5 now, more seamless Windows darkmode support might be coming in the future. https://www.qt.io/blog/dark-mode-on-windows-11-with-qt-6.5 But even that is still a fairly new development and supposedly has a few bugs here and there. Until then you can still customize colors in the options yourself.
While I'm processing my duplicates, I have encountered a pair of alternates. They are mostly the same. One is simply the colored version of the other. The uncolored pic has many tags and notes, while the colored one has none. What should I do here? Can I click "This is better, and delete the other" here? They are not really duplicates, but the tags of the uncolored pic also apply to the colored version. What happens when I "commit" my changes after processing, does hydrus tell the PTR that I marked these alternates as duplicates? Does hydrus automatically commit the merged tags of my "better" color version, after I finish with duplicate processing?
>>20413 >What should I do here? Can I click "This is better, and delete the other" here? I only do that if I don't want the uncolored version, which is up to you. Not sure how this affects syncing with the PTR though, since I don't use it. If you're worried about your decisions syncing to the PTR, you could just set them as alternates, copy all the tags over at once, sans any "black and white" tags, then delete the one you don't want afterwards from your own collection.
The issue with Hydrus selecting the wrong files via drag and drop when files share both the same filename tag and filetype extension was not entirely fixed. First pic related. In version 534, this fix was added, >when you drag and drop thumbnails out of the program while using an automatic pattern to rename them (options->gui), all the filenames are now unique, adding '(x)' after their name as needed for dedupe. previously, on duplicates, it was basically doing a weird spam-merge This only works to prevent exporting the wrong file if you are drag and dropping multiple files at once, and the first file included happens to be the file Hydrus would otherwise replace any other files with. I'll call the file Hydrus picks the "master file". If the master file is not included, or is not the first file in a selection of files being dragged and dropped, the first file will be replaced with the master file. Here are some examples. >2nd pic: Random non-master file. >3rd pic: Multiple random non-master files. >4th pic: Master file included, but not as the first file. As a side note, it appears Hydrus does some rearranging here. I don't like when it does this, but it doesn't always do it, and for the sake of this example it's fine since the rearranged order still did not place the master file first. 5th pic: Master file included as the first file in the selection. This is the only situation for which the fix prevents the weird spam-merge. I think drag and drop rearranged the files and caused something even weirder to happen. I no longer know the logic this error is operating under.
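The behaviour the report expects can be sketched as a dedupe pass in which every selected file keeps its own position and only the name changes. This is an illustration, not hydrus's export code; `dedupe_filenames` is a hypothetical helper, and the exact ' (x)' suffix format is an assumption based on the changelog wording.

```python
import os


def dedupe_filenames(filenames):
    """Return one unique export name per input name, in the same order.

    Hypothetical sketch: a repeated name gets ' (1)', ' (2)', ... inserted
    before the extension. Output position i always belongs to input i, so
    no file is ever substituted for another.
    """
    used = set()
    result = []
    for name in filenames:
        stem, ext = os.path.splitext(name)
        candidate = name
        counter = 1
        while candidate in used:
            candidate = f"{stem} ({counter}){ext}"
            counter += 1
        used.add(candidate)
        result.append(candidate)
    return result
```

Because the mapping is by position rather than by name, it does not matter whether the "master file" is first, last, or absent from the selection.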
>>20362 Yeah, Sankaku went really bad.
Is there a way to bind a shortcut to open media viewer? I found one to close media viewer, but I would like to add an additional bind (in addition to Enter) to open from the thumbnail page.
(615.98 KB 900x1123 900860-yukari_39.jpg)

1000 images (and video). Must be efficient way to tagging them manually!
(27.01 KB 605x123 Capture.JPG)

Panic!
(44.68 KB 1018x710 harmony has been achieved.jpg)

>>20411 >>20412 The answer wasn't in file>options>colours, that was where I looked to turn it to that weird half dark mode in the first place. For now the answer was deeper in the hydrus blog related to changing to a qss through options>styles. It is in fact the default hydrus qss that was causing this as it seems to control the foreground element whereas colours controls fields and such. Anyway, I just made my own custom qss and now it looks how it should. 3.6 Roentgens: not seamless, not terrible.
I have an exciting new error: http://sprunge.us/0Anc62 It basically freezes if I even as much as select a video file that opens with mpv in the file viewer. On wayland of course. With X11 there is no issue. Fugg The problem is on Wayland other stuff I need works. Please save my shitposting. Also: is there anything I could have improved about this attempt to report a bug/error/problem?
>>20398 >The .txt case is a funny one. A core predicate of hydrus's current design is that we can always determine a file's filetype from its content... Thanks for spoonfeeding me anon. mlem~
(83.30 KB 370x378 VSA3gOQ.jpeg)

>>20398 >This will likely come hand in hand with arbitrary file support. THIS!!! for the first stage. Custom thumbnails for the second, if you don't mind.
>>20418 >1000 images (and video). Must be efficient way to tagging them manually! Use something like this - https://github.com/Garbevoir/wd-e621-hydrus-tagger
>>20418 If you use what >>20424 suggested, keep in mind it only works on images, it won't work on videos, nor on animated gifs.
>>20365 >Please, keep the bureaucracy out of our living space. t.malware ad programmer
>>20424 Different anon and your post seems relevant to what I was just about to ask. I've been using the clipboard scanning download mode and it makes my pp the big pp. I can go through a chan or booru and just ctrl-x and forget hydrus is even there until I'm ready to fiddle with the database, which is so much more ergonomic than I ever even assumed the program would be... BUTttt while I absolutely understand needing to manually tag anything I get from non-booru sources, I guess I was under the impression that, with this being (in effect) a /local booru/, when grabbing a link from an /online booru/ it would grab the tags of the file from the page it was downloading it from. Have I done something wrong like forgetting to turn on an option, or is an addon like what you've posted required?
(71.42 KB 693x785 10-17:49:12.png)

>>20427 Not him but you need to set the default import options, take gelbooru for instance. If you go to network>downloaders>manage default import options>gelbooru file page>edit, you need to change "Use the default tag import options" to "Set custom tag import options" and then you can specify tag rules for every tag domain, see pic. If you just want all tags, and don't want to mess with tag domains, just enable unnamespaced and namespaced tags for "my tags". I like to keep "my tags" free for more specific tags (reaction image, meme, technical tags for keeping track of where this file came from, etc) so I have "parsed tags" for tags downloaded from the site. If you want fully uniform behavior for all sites you can change the "default for file posts/watchable urls" to whatever and then anything using the default will run with those tag rules.
I had a good week. I fixed a bunch of bugs and optimised some code. Some mpv-related crashes should be completely gone, and there's also support for 'djvu' files. The release should be as normal tomorrow. >>20403 Thanks for this, and well done noticing--it is fixed for tomorrow. It was me being stupid, I just messed up the logic and never had a good IPv6 test situation. Thankfully there's a second check, inside each request, for the non-local IP, so I don't think this exposed anything, but it did break non-local IPv6 requests when they were desired. If you keep poking around here, let me know how you get on--IPv6 is all greek to me.
>>20428 Thanks a bunch. While it's super annoying to not have it on by default I also am very annoyed by the fact boorus have some full retard tags like "fingers_sensually_close_to_navel" which is clearly the taggers weird niche fetish which is only incidentally in an otherwise nice image so it's good to also have these whitelist/blacklist tag controls. So my immediate followup question is... if you've already downloaded files and got them without tags (meaning the program won't download them again because it's either in the DB as a file or a previously deleted file) is there an option for "hey, remember how you have the booru link to this image? Can you 'phone home' real quick and grab the tags this time?". That could also be very useful in the future when I set up the scraping tool because I'm sure tags on newly updated files will update over time too.
>>20429 >Some mpv-related crashes should be completely gone, and there's also support for 'djvu' files. Cool, I'll tell you if my MPV woes >>20421 disappeared Thanks for your work.
(42.50 KB 562x506 11-11:51:36.png)

(30.87 KB 660x144 11-11:53:03.png)

>>20430 You should be able to just put them through the downloader again and it should grab the tags. If that doesn't work you can change the "import options" in the gallery downloader, see pic related. You can disable the checks to force a redownload but make sure to re-enable them before doing any more downloading, since not checking is really slow. Make sure to hover over the check hashes and urls lines to read the tooltip, it's very detailed. If you didn't download the images through a gallery search you can simply select all the images, right click>known urls>copy files' {X}booru urls, then you can paste those into the same or another url import page.
https://www.youtube.com/watch?v=Cjkjba-qbhE windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v547/Hydrus.Network.547.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v547/Hydrus.Network.547.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v547/Hydrus.Network.547.-.macOS.-.App.dmg linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v547/Hydrus.Network.547.-.Linux.-.Executable.tar.gz I had a good week mostly fixing bugs. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights Most importantly, I banged my head against a persistent knot of mpv crashiness this week, and I actually got somewhere. A variety of situations where mpv could fail to load a weird file, which would then lead to hydrus program instability, are now much much better, usually no crash worries at all. Also, there's a little more error handling, stuff like a placeholder image to show instead of the broken file, and an error popup to tell you what happened. I cannot promise that this improves mpv on any system that already had trouble just getting it working (e.g. macOS, some Linux), but I think that heart-skipping moment of 'will it crash?' lag when an mpv load fails should be solved. Let me know how you get on! Thanks to a user, we have support for 'djvu' files, which are basically an open source pdf format. If you have a client with hundreds of search pages open, please let me know if your session is any less laggy, particularly if you get bumps of lag every five minutes or so. I rewrote some ugly code and think I eliminated some background CPU overhead for your situation. There's a handful of smaller fixes and tweaks too, nothing super important, but check the changelog if you are interested. next week I'm sorry to say my past couple of months have been pretty low productivity overall. Just life stuff cutting at my time and energy, mostly. 
I'm going to keep my head down and not change anything drastic, but it has been on my mind, and I regret not being able to push on larger projects. We'll see how the new year goes, I think. Anyway, I expect I'll clear out some more jobs like this next week. I've also been dabbling with the idea of a new 'hash sort' to preserve the sort of a 'system:hash' or downloader highlight in various situations, so I might explore that.
Thanks for the update!
(78.16 KB 450x713 about.png)

>>20433 One thing, or bug I noticed with this v547. While importing files from my hard drive (137 files), 3 of them got attached namespaces from files imported 2 years ago. I mean the following: Those 3 files have namespaces for the file names, so they should be tagged as: First file = file name:020,189 - fish Second file = file name:020,202 - many seals Third file= file name:020,284 - some trees But instead they got tagged with the proper actual file name PLUS the namespaces from random files tagged long ago, kinda there is a weird cross-pollination in the database. As follows: First file = file name:020,189 - fish file name:000,574 - samefagging again <--- it belongs to another file Second file = file name:020,202 - many seals file name:001,245 - anon is faggot <--- it belongs to another file Third file= file name:020,284 - some trees file name:004,231 - pol is always right <--- it belongs to another file I have no screenshot this time but as soon as it happens again I will post it. BTW, I took a look at the import files' log, but in the list none was recognized as a file already in the database, so the hash mechanism is working fine,
I get this error when I try to start hydrus: Traceback (most recent call last): File "hydrus/client/ClientController.py", line 2190, in THREADBootEverything File "hydrus/client/ClientController.py", line 1068, in InitModel File "hydrus/client/ClientController.py", line 1002, in InitClientFilesManager File "hydrus/client/ClientFiles.py", line 360, in init File "hydrus/client/ClientFiles.py", line 1097, in _Reinit TypeError: '<' not supported between instances of 'FilesStorageBaseLocation' and 'FilesStorageBaseLocation' How do I fix it? I assume it has to do with me moving my files to a different drive the other day, but I added the new location to hydrus and it was working fine.
>>20436 >not supported between instances >between instances Just guessing, but I think you launched Hydrus while another instance was already running.
(88.82 KB 788x500 20twio.jpg)

>>20436 >>20437 Never mind. I'm retarded and forgot to decrypt and mount the drive with my files.
>>20405 Thanks, fixed the bad link. >>20406 >>20407 If this works, great. >>20408 I'm not intimately involved in the inner workings of the different site management here, but I think this 'they are one of the few who host it' issue is part of the problem--by hosting spicy content, they attract lots of user traffic who are desperate for it because it is so rare. Other sites, even if they are philosophically comfortable with the content, are less enthusiastic about hosting because they then get blatted by the same wave of traffic. If everyone in the ecosystem agreed to openly host spicy stuff without needing login, the expense would be shared, but I guess the initial pressure of various activist groups not liking the content itself got the ball rolling, and we got this pressure-cooker situation that isolates the sites that proactively like it and want it. I remember when all this stuff used to be available on the clearnet, without login. Seems like the bigger sites eventually went to 'you have to login/set this technical cookie to see it', which I presume was to stop automated crawlers and advertisers' 'you have been naughty' detectors of various sorts, and it probably also smooths out the bandwidth/curve of interest from the userbase and allows for default-blacklist rulesets to reduce forum drama. The next stage, as we've seen over and over, seems to be, after admin infiltration by activists and/or becoming a legit business who needs happy advertisers and payment providers, simply banning it to avoid the drama and hosting costs. Pixiv just went through this, and while I'm not read up enough on the current news to talk too confidently, I'll be interested to see what ultimately happens, since the Japanese communities are a completely different beast. >>20413 I would click 'they are related alternates'. I personally reserve the 'this is better/they are the same' actions for true duplicates, like resizes or higher/lower quality jpegs of the same master. 
If something is a WIP or a costume change, it is an 'alternate' for me. >What happens when I "commit" my changes after processing, does hydrus tell the PTR that I marked these alternates as duplicates? There's no automatic remote network stuff here. It is only your local database that learns about file relationships. If you have merge rules set to copy PTR tags from a worse to a better file, then those new tags will get pended and eventually committed to the PTR, but the PTR isn't aware of any file relationships. To check your merge rules, hit up the 'edit default duplicate metadata merge options' button at the top of the normal duplicates page in the main window. Also, not much happens when you set files as 'alternates'. It is mostly just a 'landing zone' where we can collect similar-but-not-duplicate files until I put a lot of work into writing a proper 'alternate' file relationship system that'll let us categorise them into richer groups with labels and stuff (like WIP, or messy/clean).
>>20415 Thanks. This sounds like some complicated interactions. I will investigate this and see what I can reproduce and what I can easily improve. First step, I think, I can probably figure out some better sorting here. It sounds like the rearranging you are talking about is me throwing files with the same filename into a random set and pulling them out pseudorandomly. I know how to fix stuff like that, but I'll need to look closer to try and figure out what is going on. I can't promise great results here, but I'll see what I can do. >>20417 I don't think you can! I'll see what I can do. >>20419 This may have been a mishandled CloudFlare error response (or another CDN). 403 means forbidden, which these days is as often about not solving a CloudFlare 'I'm not a robot' click-through as much as about login problems. These issues usually happen in waves--does it still happen now, a few days later? If so, can you check your 'install_dir/db' folder for 'client - date.log' and ctrl+f for some of that text? Did I print like the first 4KB or so of it to the log? Is there any more info about what the error was in the HTML (e.g. a captcha note)? You can also send it to me or pastebin post it here if there is none of your login info on it, and I'll have a look. >>20421 Damn. If I try to load mpv in macOS, I get 100% CPU. That traceback is interesting and seems like something I can catch, and especially with my recent error-handling code I may be able to recover your situation at least gracefully. I'll work on this, >>20431 please do, now and in future. >>20423 Yeah, absolutely. We'll want custom thumb tech for CBR and it'd serve for things like swf too I expect. One day it'll happen. >>20430 >>20432 Yes, but you actually want the 'tag import options' setting in pic related, which works more efficiently. Be careful with these settings, they shouldn't be on all the time (they waste CPU, time, and bandwidth). 
Just set up a one-time redownload page/job with custom 'tag import options' and clear it when you are done.
>>20435 Ah, damn, this sort of thing can be evidence of database damage. Basically some ids getting deleted and then reused later on because it seemed like they didn't exist. Have you ever had a 'malformed database' warning and had to go through the 'help my db is broke.txt' document in 'install_dir/db'? For now, I recommend checking that document just as very vague background reading and then running the 'PRAGMA integrity_check;' stuff it talks about to make sure your four db files are healthy. If they are, boot the client and run database->regenerate->local hash cache, and if you do NOT sync with the PTR, ->tag storage mappings cache (all, with deferred parents and siblings calc). It is possible the bad tags will actually fix themselves, but I worry things may be more complicated. Fingers crossed, this is a one-time weird problem, but if it keeps happening, let me know and I think we'll probably want to email or discord back and forth a bit to investigate this more and maybe run some manual fixes on your database with some custom SQL. Let me know how you get on regardless! >>20436 >>20438 Thanks for this report. You are all good now, right? I got a related report from someone else. I'll see what I can do here--the error should be nicer, at the very least. >>20437 Nah, just python talking about two different 'instances' of an object. I replaced my path-tracking code a few weeks ago with nicer objects instead of raw strings, and something is fucked here while they are trying to sort or something.
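For anyone puzzled by the "'<' not supported between instances" message, it is what Python raises when it tries to sort objects whose class defines no ordering. A minimal sketch of the failure and one common fix follows; `PlainLocation` and `OrderedLocation` are hypothetical stand-ins, not hydrus's FilesStorageBaseLocation, and the fix shown is only one way such a class could be made sortable.

```python
from functools import total_ordering


class PlainLocation:
    """Wraps a path but defines no ordering, so sorted() cannot compare it."""

    def __init__(self, path):
        self.path = path


@total_ordering
class OrderedLocation:
    """Same idea, but comparable: ordering is delegated to the path string."""

    def __init__(self, path):
        self.path = path

    def __eq__(self, other):
        if not isinstance(other, OrderedLocation):
            return NotImplemented
        return self.path == other.path

    def __lt__(self, other):
        if not isinstance(other, OrderedLocation):
            return NotImplemented
        return self.path < other.path

    def __hash__(self):
        return hash(self.path)


def sorted_paths(locations):
    """Sort location objects and return their paths."""
    return [loc.path for loc in sorted(locations)]
```

Calling `sorted_paths` on two or more `PlainLocation` objects raises the same kind of TypeError as in the traceback, while `OrderedLocation` objects sort by path. Passing `key=lambda loc: loc.path` to `sorted()` would be an alternative that needs no comparison methods at all.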
>>20441 >For now, I recommend checking that document just as very vague background reading and then running the 'PRAGMA integrity_check;' stuff it talks about to make sure your four db files are healthy. If they are, boot the client and run database->regenerate->local hash cache, and if you do NOT sync with the PTR, ->tag storage mappings cache (all, with deferred parents and siblings calc). It is possible the bad tags will actually fix themselves, but I worry things may be more complicated. I read the 'help my db is broke.txt' document and I followed your instructions to the letter as shown in the pics. >if you do NOT sync with the PTR This Hydrus Client is exclusively used with off-line tasks and has connected to the internet no more than a dozen times in a 3-year period to download a Twitter video once in a while, that's it. According to the 'help my db is broke.txt' document, the database anomaly might be caused by hardware malfunction. The drive where the DB is located is a 4TB unit, so to scan for bad sectors may take more than 20 or 30 hours, so I'm postponing the scan for a while, or perhaps until a new db anomaly pops up. Anyway, I always backup the db folder (except the "client_files" folder which is humongous) before each update. Also I keep in a 8TB unit a mirror of all the files in the db, but with their original file names and custom sidecar notes in case a db disaster might happen. In this way I might be able to recreate the db again by re-importing the files, but sadly losing all notes which are in a multi-line format not compatible at this time with the import mechanism. Reeeeeeeeeeeeeeeeee. It is a trade-off until I can get another 8TB unit or bigger for the much needed expansion. I really appreciate your attention devanon and I'll keep you posted on any news.
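The integrity check discussed above can also be scripted with Python's built-in sqlite3 module rather than typed into the sqlite shell. A minimal sketch, assuming the client is fully closed and that you pass in the paths of your own db files; `integrity_check` and `is_healthy` are hypothetical helper names, and the help document remains the authority on the actual procedure.

```python
import sqlite3


def integrity_check(db_path):
    """Run 'PRAGMA integrity_check;' on one SQLite file and return its messages.

    A healthy file yields the single message 'ok'. Note that sqlite3.connect
    creates an empty file if db_path does not exist, so check the path first.
    """
    con = sqlite3.connect(db_path)
    try:
        rows = con.execute("PRAGMA integrity_check;").fetchall()
    finally:
        con.close()
    return [row[0] for row in rows]


def is_healthy(db_path):
    """True if the integrity check reports nothing but 'ok'."""
    return integrity_check(db_path) == ["ok"]
```

Looping `is_healthy` over each db file in the db folder gives a quick pass/fail per file. On large files the check can take a long time, and running it against a backup copy avoids touching the live database.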
>>20399 >Thank you for this report. I am sorry for the trouble. Can you try turning off (if they aren't already) database->file maintenance->work file jobs at normal time and tags->sibling/parent sync->sync siblings and parents at normal time? Does that stop the hangs? If it does, please do some normal work with help->debug->profiling->profile mode on, and then try turning the work back on, with profile mode still on, and then send me the profiles. >If turning those maintenance routines off doesn't fix the issue, can you do a profile anyway? Just do some normal work with it on, and then send me the profile. You can check, but there shouldn't be any private data, so you can pastebin it here, or you can email or discord DM it to me. I think you got it right. It's probably hanging for extended periods of time/crashing because it's working file jobs and parent syncing when I try to run a search. Normally, I'd recall seeing a little message in the bottom right corner that says "db write locked" or some such, which I assumed was this. It never hanged more than a split second longer when I tried to do anything while this message was up before, and the message is always only up for a very short time, so I thought it was breaking down the work into very small chunks. After I turned off the normal time maintenance, the hanging/crashing seemed to stop. But then it happened once later. So I turned on profiling mode, but then it seemed to stop again for days. I only just now got the severe hang to happen again by clicking the search box first thing in the morning, so I'm assuming Hydrus thought I was idling and was still performing maintenance when I clicked. Could this be because I have too many files/tags now while running off an HDD instead of an SSD? I have nearly 20,000 undeleted files and pic related for tags. No PTR, all manually entered. I hope not. 
It's annoying having parent sync be so delayed, as when I'm messing with a decent web of parent/children relationships, I can't see them immediately update in the tag manager or by right clicking, and instead have to go into the parent/child manager. How do I find the profiling data to send you?
(5.27 KB 571x116 Capture.PNG)

>>20443 >Could this be because I have too many files/tags now while running off an HDD instead of an SSD? just chiming in to say i doubt this is the issue, personally i have pic related on an HDD >It's annoying having parent sync be so delayed, as when I'm messing with a decent web of parent/children relationships, I can't see them immediately update in the tag manager or by right clicking, and instead have to go into the parent/child manager. you can force parent/sibling sync to update all at once under tags > sibling/parent sync > review current sync. but this can lock up your client for a while if you have a lot.
(96.58 KB 1200x675 Thinking.jpg)

>>20442 >Also I keep in a 8TB unit a mirror of all the files in the db, but with their original file names and custom sidecar notes in case a db disaster might happen. In this way I might be able to recreate the db again by re-importing the files, but sadly losing all notes which are in a multi-line format not compatible at this time with the import mechanism. Same anon here. Is it possible to export only the tag sidecars while skipping the files? In my case, it doesn't make sense to export terabytes of files (which are already mirrored on another drive) when I only need a fresh batch of tag sidecars for backup and tagging update purposes.
>>20421 >>20431 > work. It's nothing, dev-kun is the real astronomican here >>20440 >Damn. If I try to load mpv in macOS […] Just for completeness I re-tested: I can confirm that I still get the same error on v547. Probably nothing is different but in case you see something I don't that might point to a solution: V546 error: http://sprunge.us/0Anc62 V547 error: http://sprunge.us/WUnEKM Maybe it helps if I say in addition: <the instance of mpv opens even if I so much as select (not double-click to open) the video file <the instance of mpv refuses to be closed <even after quitting/crashing the instance is still there and if I select another video file in the viewer the cycle repeats <I think CPU usage does ramp up here on linSHIT too
Hello friends. I got two quick questions regarding deleted files. 1. When I import a file, put some tags and notes on it, then delete the file without saving a deletion record and import the file again, the file will have the tags and notes again. I assume that is intended behaviour? The "all deleted files" domain has the deleted files still in it and I guess that is where the file gets the information (tag/note) from again. I thought that deleting the deletion record would also delete this information. Is there a way to clear the deleted files from "all deleted files" too so when I put the same file into hydrus, it starts as "fresh"? "Clear deleted files record" from "services -> review services" does not do the trick. 2. So what is the "deleted from all deleted files" domain/location exactly for? When do files get into this domain and if they get there, is there a way to delete them too?
>>20419 I avoided a 403 by changing VPN
Looks like Cloudflare is fucking everything up for everyone AGAIN. I feel this is relevant to the collectors here. https://boards.4channel.org/g/thread/96739002
(128.19 KB 600x521 1697409649765317.png)

>>20449 What a fucking faggotry.
I had a good week. I overhauled the thumbnail shortcuts system, fixed some bugs, and improved some quality of life. The release should be as normal tomorrow.
https://www.youtube.com/watch?v=Tf8GG3gwam8 windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v548/Hydrus.Network.548.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v548/Hydrus.Network.548.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v548/Hydrus.Network.548.-.macOS.-.App.dmg linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v548/Hydrus.Network.548.-.Linux.-.Executable.tar.gz I had a good week. Thumbnail shortcuts are now customisable. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights First off, I cleared out a ton of bad thumbnail keyboard shortcut code and moved it all to the newer system under file->shortcuts. There is a new shortcut set there, called 'thumbnails', that governs opening the media viewer, the archive/delete filter, selecting files, and moving the thumbnail focus. Everything is now customisable (although I'd stay away from the thumbnail focus stuff unless you really want to wade through a big list), and I have added some new commands for 'select inbox', 'select trash' and similar, for more file selection options. This work clears out some really ancient code, and it should be much easier to extend thumbnail shortcuts in future, including hooking mouse-clicks into the customisable system and adding ctrl+selection logic. Thanks to a user, we now have renderable krita files! We've got it set like PSDs for now, where you'll get 'open externally' on the preview and the full image in the normal viewer. As always, if you have files that don't work or render crazy, please send them in! An issue with file export drag-and-drops sometimes giving the wrong file when those files share the same export filenames should be fixed! Just a small thing, but a couple of places that were unintentionally sorting service lists in a random way now sort them alphabetically, as intended. 
The F9 new page picker was one of these, so if you have multiple local file domains, you may need to learn some new muscle memory! next week With luck I'll have some file storage stuff done, and I'd also like to clear a 'thumbnail fill' rewrite I've been planning.
>An issue with file export drag-and-drops sometimes giving the wrong file when those files share the same export filenames should be fixed! Thanks! Did some testing with my "node.jpeg" files and everything seems to be working perfectly now. This saves a lot of time making sure things don't have the same filename. As a side note, it seems the file order still shifts randomly now and then, but this might be an issue with 8moe's quick reply receiving multiple files simultaneously and not Hydrus shifting them around.
>>20450 Thank you for this. Helped somewhat but not entirely. I think the FAQs were not updated since the "all deleted files" domain was added. The tutorial (see image first half) did not work for that domain. It works for the "all known files (with tags)" domain, since after you delete the tags, the files won't show there anymore, but the "all deleted files" domain still has every file ever deleted. I have some old deleted files in there, that don't have any tags, notes or thumbnail (only showing red hydrus thumbnail). I guess only the SHA256 and file_id is what is left of them. Since the Blurhash was introduced, the deleted files will have the blurred thumbnail shown there too. If I understand correctly, Hydrus dev will allow deleting everything from this domain in the future too (see image second half)? Would be nice if Dev could answer. Secondly the following question remains: So what is the "deleted from all deleted files" domain/location exactly for? When/how do files get into this domain and if they get there, is there a way to delete them too? I have not encountered any files there yet. Also I encountered a description error in the thumbnail viewer right click menu. If you go to "select -> local/not local" and hover over those entries for the description to appear, it says it "removes" the files, even though it obviously only selects them. Thanks in advance!
>>20454 would you ever consider a special type of compound parser where instead of concatenating multiple parses of the same content, the parser operates on the previous parser's output? Certain tumblr posts the parser can't get contain html code as a list item in a json file, and this seems to be the easiest way to solve the problem to me. please also just tell me if i'm a retard and you can already do this because i'd love to know how if so
For some reason if I update past v537 hydrus opens by default in my right monitor instead of left and I get this error message any time I open the image viewer or any windows. Cannot figure out what causes this.
>>20458 I just click the little arrow and learn to live with the rectangle full of 20+ "rescued from apparent off screen" errors. Have fun. >>>/hydrus/20317 >>>/hydrus/20326 >>>/hydrus/20347
>>20457 i believe you want subsidiary page parsers for this. they let you go between html and json using the separation formula. https://hydrusnetwork.github.io/hydrus/downloader_parsers_page_parsers.html#subsidiary_page_parsers here's an example of a subsidiary page parser i made a while back. it will get all the images in an entire thread of tweets on nitter and give them page tags in order. it uses subsidiary page parsers to split the thread into individual tweets and then again into individual images.
speaking of parsers, i would really like it if they could have metadata. it would be very helpful when debugging parser issues. imagine if each parser knew the date it was last modified and the version of hydrus it was created in. perhaps this would be stored for each content parser, and the entire parser would display the aggregate of this metadata. and parsers could also have extra info like an optional version number and creator name.
>>20461 I know that at least versioning is a planned feature for downloaders, but as far as I know, no work has been done on that yet.
Does the "related alternates" button in the duplicate filter mark those files somehow? Is there a way that I could view all the files that I have marked as alternates together?
>>20463 Yes. Right click a file -> Manage -> File relationships to see all alternates of a particular file. If you want to run a search for files with =, <, or > "x" alternates, select the search box, then double click "file relationships".
Anyone know why Kemonoparty downloaders 404?
>>20417 >>20454 I am now able to bind a new key to open the media viewer. Thank you very much for adding this.
(1.20 MB 1500x2000 hydrus_error.png)

>>20454 Unfortunately the wayland issue persists. I might just try reinstalling the entire OS to make sure it's not some sort of issue with a wonky installation on my end. I am in fact using V548. I run from source. Updates were done just using "git pull" (perhaps I have to do some other git operation in addition?). Either way, don't worry too much about me, if it's too complicated I'll just have to suck it up and use X11 again. I don't wanna stress you out. Have a nice day!
(109.79 KB 965x382 Screenshot_20231020_192131.png)

Using v547 from source on Linux. While importing around 300 files, one of them refuses to import (I've tried many times already and it keeps failing). It is the bigger one with a size of 4.5GB. The log complains that there is not enough space, but the 2 drives involved have terabytes of empty space. Pic of the log attached.
Is there a way/addon to designate a typical booru "pool" for images? I have lots of images where the images are just variations of each other and it'd be nice to have them grouped without doing a hack like pool:000001/pool:000002 which would be really hard to keep track of. I've been exploring options but I'm not seeing anything like that at a thorough glance. >>20467 >Also using hydrus to catalog all his 2hus and 2hu accessories What a funky! Anyway, yeah this wayland/x11 shit has both been a long time coming and yet entirely unprepared for... and I keep seeing a lot of drama and issues about it even though we've all known wayland is going to replace x11 and blah blah blah. I haven't started using hydrus on my linux boot yet (wanted to start the DB on the windows boot in a format neutral drive and have both boots pull from it) but, not having tried it myself, maybe try running it under xwayland? https://wayland.freedesktop.org/docs/html/ch05.html I've been seeing a lot of people suggest this for other programs amid the crossfire so it might help you without too much system-level tinkering.
>>20469 >I have lots of images where the images are just variations of each other And they're not alternates?
>>20470 Alternate forms of otherwise the same image? Some of them are. Some of them are panels/pages of things. As in "Infographic on quick linux configs that are handy for new people 1/3, 2/3, 3/3" could be a pool. The rest of them are intrinsically related in other ways that a tag doesn't quite encompass a la "a collection of works by a particular artist that were presented as a set for a particular event", ie: an artist drew up 5 special images for the 10th anniversary of their deviantart / an artist made a small themed dump of content for cirno day 2023, etc. etc. So is there a way to like, when viewing an image get that popup "This image is the parent/child/in a pool with other images" thing? Some boorus show you a mini gallery at the top of the page in a pool, others just link you to the pool, but it's very handy when images are very very tightly linked. I could in theory have a massive library of tags like Pool:Artistman5thAnniversary but that would get unwieldy as I'd have to keep track of what the tags-acting-like-pools are.
>>20471 >Some of them are panels/pages of things. As in "Infographic on quick linux configs that are handy for new people 1/3, 2/3, 3/3" could be a pool. >The rest of them are intrinsically related in other ways that a tag doesn't quite encompass a la "a collection of works by a particular artist that were presented as a set for a particular event", ie: an artist drew up 5 special images for the 10th anniversary of their deviantart / an artist made a small themed dump of content for cirno day 2023, etc. etc. Hydrus has options for ordered pages, but isn't well equipped for them, and I never use it. Pools on boorus are mostly just custom sets of images made by users to suit their own tastes. In your case, what I'd do is just make a "set:*" tag and apply it to all images in the set you want. There's really no effective difference between a pool on a booru and a custom tag on a booru other than that you can't search pools and tags together because boorus don't like custom tags with minimal applications. For example, >an artist made a small themed dump of content for cirno day 2023 There's multiple ways tags can easily handle this >creator:* + character:cirno + date:2023-x-x >creator:* + set:cirno day + date:2023 >creator:* + set:cirno day 2023 >set:creator cirno day 2023 Now the example given has enough information to be handled without any set tags. I often encounter situations without such conveniences, and usually a generalized creator:* + set:* is more than enough to handle it for me. Even if you have multiple works by multiple creators, you can just make a set tag. For ordered sets, I retain filenames as a filename:* tag when importing, as most ordered sets I get already have ordered filenames, then "sort by filename:*", though I tend to import in order so "sort by import time: oldest first" tends to also cover this. Really, just make a pool:* tag for whatever things you'd want "pooled" together and put it on whatever you want. 
Don't underestimate the versatility of tags.
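For anyone who wants to see the set:*/pool:* idea in miniature, here is a rough sketch in plain Python. It is not hydrus code and does not touch a hydrus database: the FILES table, the file names, and both helper functions are made up for illustration. It only shows that a namespace wildcard gives you the same grouping a booru pool would, and that you can combine it with ordinary tags in one search.

```python
from collections import defaultdict
from fnmatch import fnmatchcase

# Hypothetical stand-in for a page of files: file name -> set of tags.
FILES = {
    "cirno_01.png": {"creator:artistman", "character:cirno", "set:cirno day 2023"},
    "cirno_02.png": {"creator:artistman", "character:cirno", "set:cirno day 2023"},
    "linux_tips_1.png": {"set:quick linux configs", "page:1"},
    "linux_tips_2.png": {"set:quick linux configs", "page:2"},
    "unrelated.jpg": {"character:reimu"},
}


def search(files, *patterns):
    """Return names of files that have at least one tag matching every pattern."""
    return sorted(
        name
        for name, tags in files.items()
        if all(any(fnmatchcase(tag, pattern) for tag in tags) for pattern in patterns)
    )


def group_by_namespace(files, namespace):
    """Map each tag in the given namespace to the files carrying it."""
    prefix = namespace + ":"
    groups = defaultdict(list)
    for name, tags in sorted(files.items()):
        for tag in tags:
            if tag.startswith(prefix):
                groups[tag].append(name)
    return dict(groups)


if __name__ == "__main__":
    print(search(FILES, "set:*"))                     # everything that is in any set
    print(search(FILES, "set:*", "character:cirno"))  # sets narrowed by a normal tag
    print(group_by_namespace(FILES, "set"))           # the "pool listing"
```

The point of the second search is the one made above: because a set is just a tag, it can be mixed with creator or character tags in the same query, which a separate pool system would not allow.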
>>20471 >>20472 Ah and of course you want to quickly know if your images are in a pool. To check without jumping into the tag manager or media viewer, just type "pool:*" or "pool" depending on your settings in the search box and all "pool:*" tags on any files in the current page will pop up in the autocomplete.
(86.05 KB 850x850 beri beri berigoo.jpg)

>>20472 >Don't underestimate the versatility of tags. >examples given I had considered some things like this (though yours are better than mine) but what has stopped me is that people who run boorus themselves, like you said, tend to frown upon "this tag has and will only ever have 5 entries", which I am assuming is to stop database bloat. If that is a best practice for boorus is it not a best practice for a personal collection? I will say I only have experience as a user/uploader/downloader so I have no real concept of the backend other than what the people who manage them tell me as rules. I've been using this a while but certainly not enough to have run into any "wow I've used hydrus for 5 years and have 400,000 images in it and holy fuck if I had only known that..." issues so I am somewhat wary towards futureproofing as I intend to use it for a long time now that I've gotten into it. >>20473 >pool:*" tags on any files in the current page will popup in the autocomplete. That's actually a really brilliant point on usability though. I hadn't considered using two searches like that at all but it will indeed basically do the same thing. Thanks a bunch.
>>20474 >If that is a best practice for boorus is it not a best practice for a personal collection? No, because you're tailoring it to yourself, not a general audience. Thousands of people's personal "I like these pictures" tags aren't useful to people in general, which is why they're set aside as pools instead of tags, but sets you want to group together are useful to you. You don't have to memorize the sets like you would tags either, because, as I said, you can just search "set:*" or "pool:*" or what have you. I haven't been big on future proofing, as then I wouldn't make much progress. I just fix things as I go, and what can't be fixed quickly I put on a backlog to fix en masse when I finally catch up with all the files I already have. > I hadn't considered using two searches like that at all but it will indeed basically do the same thing. If you didn't already know, you can just enter "namespace:*" or "namespace:" and hit enter, and all files with tags under that namespace will appear as well. So you can pull up all sets, or narrow down a half completed search to just the files within it that are part of a set.
>>20474 Also I'd like to note, there's not much reason to make a set:* tag for a set of alternates unless some non-alternates are also included, since the alternates function already groups these together. Though you could save yourself a step in checking for both sets and alternates if you redundantly made all alternate sets have their own set:* tag.
>>20443 Sounds good. Let me know if you have any more trouble. The PTR warning is just because the database tables are huge and a full regen takes like 20 hours. You can do multiple multi-line notes in sidecars, but you have to set a different 'split' than newline. Try '||||' (search this thread for '||||' to see me talking a bit more about it). As long as the separator phrase you use is obscure enough not to be in your notes, they'll all stay nicely separated. >>20445 Not yet. I do plan to slowly expand the sidecar system (generalised metadata migration system) in future, and straight-up 'just import/export sidecars with hash/file_id filenames' sounds like a great idea. I think you might want to check out the Client API. That will let you rip data in any format you like. It is more of a palaver to figure out, and more so to save data back through the API than easier sidecars, and you do need to know a little about scripting and making network connections, but if you just pulled this request https://hydrusnetwork.github.io/hydrus/developer_api.html#get_files_file_metadata for every hash you own and saved it to a .json file, you'd have everything. You can find out every hash you own with this https://hydrusnetwork.github.io/hydrus/developer_api.html#get_files_search_files and 'system:everything'. >>20443 Thanks. This must be frustrating. >>20444 is right, you are fine managing 20k files on an HDD. It does matter having 8ms latency instead of 0.1ms, but only when you are pushing much higher numbers. The profiling data should be in 'install_dir/db/client profile date.log'. Don't run it for too long or you'll get a 300MB log! It zips very efficiently though, if you want to catbox it etc. I do break these 'run in normal time' jobs into smaller pieces, but I think we have two issues here: 1) Sometimes I can't break a job any smaller, and it is going to gonk you for a while of CPU/db time. 
PTR is super bad at this--a couple of sibling calculations on there take like 45 seconds even on a fast machine. 2) Even with broken-up jobs, I think with enough different jobs working at once, the whole program can get into a traffic jam and suddenly access latency spikes. So, I need better controls here, more options for people with different computers to smooth out the computation spikes, and better auto-throttling to deal with competition better in any case, and I also know there is a deadlock somewhere in the repository sync/processing code that can get triggered by a busy db, which is also what appears to cause a memory explosion some users have seen. This last thing isn't relevant to you, but if there's one dumb deadlock in my maintenance threading code, there may well be others. I will keep pushing on all this.
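For >>20445, the Client API route described above (search 'system:everything' for every hash, then pull file metadata for those hashes and save it as .json) could look roughly like the sketch below. Treat it as a starting point under assumptions, not a tested exporter: the address and port, the placeholder access key, the batch size of 64, and the helper names are all my own choices, and it assumes the Client API service is enabled with a key that has search permission. Check the two developer_api.html links above for the exact parameters your version supports.

```python
import json
import urllib.parse
import urllib.request

API_URL = "http://127.0.0.1:45869"           # assumption: a local client on the usual port
ACCESS_KEY = "replace-with-your-access-key"  # assumption: a key with search permission


def build_url(base, endpoint, **params):
    """Build a GET URL; non-string values (lists, booleans) are JSON-encoded first."""
    encoded = {
        key: value if isinstance(value, str) else json.dumps(value, separators=(",", ":"))
        for key, value in params.items()
    }
    return f"{base}{endpoint}?{urllib.parse.urlencode(encoded)}"


def chunks(items, size):
    """Yield consecutive slices of at most `size` items."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def api_get(url):
    """Send one GET request with the access key header and decode the JSON reply."""
    request = urllib.request.Request(url, headers={"Hydrus-Client-API-Access-Key": ACCESS_KEY})
    with urllib.request.urlopen(request) as response:
        return json.load(response)


def export_all_metadata(out_path, batch_size=64):
    """Fetch metadata for every file matching system:everything and save it as JSON."""
    search = api_get(build_url(API_URL, "/get_files/search_files",
                               tags=["system:everything"], return_hashes=True))
    metadata = []
    # Ask for metadata in batches so no single request URL grows too long.
    for batch in chunks(search["hashes"], batch_size):
        reply = api_get(build_url(API_URL, "/get_files/file_metadata", hashes=batch))
        metadata.extend(reply["metadata"])
    with open(out_path, "w", encoding="utf-8") as handle:
        json.dump(metadata, handle)
    return len(metadata)
```

Calling export_all_metadata("hydrus_metadata.json") with the client running would write one big file; on a large client you may prefer to write one file per batch instead.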
>>20446 >>20467 Jesus! Well, at least the error is caught a bit better now. 'git pull' is correct, that's all you need to do. Yeah, I think this may be beyond my ability to fix now. It is quite possible that my jank code is loading and hooking the mpv window in a way Wayland doesn't like, but I'm not expert enough in this to figure out what would be the correct way, and it seems there is true voodoo going on if you get those artifacts after the failure. I'd personally not bet on reinstalling, unless other things have broken for you and you have suspicions that some GPU driver or .so file is actually broken. I think this is just the case of too many saddlebags on the camel--we've got Linux, Qt, PySide6, python, libmpv, python-mpv, my shitfaced media viewer tech, my duct-taped mpv widget implementation, and then Wayland. My fingers are crossed that whoever is responsible for the linkage that is truly sperging out here can figure out a fix in a future patch. At my safe pythonic level, I usually can't create that sort of graphical artifacting even if I want to. PySide6 did a couple of updates just recently, which I'm planning to roll out 'future' test releases for in the next month or two: https://pypi.org/project/PySide6/#history You might like to play around with these, and ones in future if they don't help--EDIT: I was going to walk you through how to activate your venv and switch out the PySide6 you are using, but I realised I should alter how my setup_venv.sh script works to allow you to put in a specific version. I'll try and roll it out next week, and then you'll be able to just reinstall your venv and put in '6.6.0' and it'll all figure it out for you easy. >>20449 Shame. I really need to get on my automatic duplicate resolution tech, this situation is only going to get worse. 
>>20455 Yeah, I had a look at the file order thing and I feel pretty good that I am loading the file objects into the drag and drop object in the correct order, so I think there might be an OS thing or a browser thing or a site javascript thing that's shuffling it. Can't be sure, but yeah, if the site code that calculates the thumbnails and stuff for the drag and drop is working asynchronously on several threads and delivers results as they are done, that might be it. Could possibly test this by dropping pairs of (small file, large file) in AB, BA order, and seeing if the small (fast to calculate) file regularly gets switched to the first position. Not sure. >>20456 >The tutorial (see image first half) did not work for that domain. It works for the "all known files (with tags)" domain, since after you delete the tags, the files won't show there anymore, but the "all deleted files" domain still has every file ever deleted. I have some old deleted files in there, that don't have any tags, notes or thumbnail (only showing red hydrus thumbnail). Yeah, sorry for the confusion. That tutorial in your pic is just about how to remove/decouple tags from files when they are deleted and difficult to get to. This does not remove the file from hydrus's knowledge itself. I was just about to tell you, 'right-click on a file in "all deleted files" and you can select "clear the delete record" and it won't turn up there again', but then I tried it and it didn't work, so something there is messing up. I'll make sure to fix that. Note that even if we can purge the file from all your file domains, the master hash definition will remain in client.master.db, as the second half of your pic talks about. I hacked this domain together a while ago to handle some odd jobs, and I never integrated it properly. It basically just holds a union of all the normal file domains' deleted files. 
"deleted from all deleted files" is a mistake, an entry being automatically generated--thanks for mentioning it, I'll remove it! And thanks, I'll fix that description error!
>>20457 >>20460 Yeah, hacking with the subsidiary parser is the (very) ugly fix for now. I know exactly what you mean about a compound parser that flips between json and html better, and I keep meaning to make it and it gets away from me. There should be a much nicer solution here. >>20458 >>20459 Yeah, sorry for the trouble. The new Qt version, while neatly solving several other problems we had, screwed with a handful of Windows multi-monitor users. The coordinate translations are all fucked up for some reason. Your menus are all probably appearing on the wrong monitor too. Please switch to a 'running from source' user and select the '(m)iddle' Qt6 choice in your advanced setup. >>>/hydrus/20347 Let me know if you still have any trouble with it. >>20461 >>20462 Yeah, this is a dream of mine. I keep scheduling it for 'the next big downloader engine iteration', but I can't say when that will be. I want proper versioning and metanaming and unique IDs for downloaders and hassle-free updating from central repos. It should all just work, the user shouldn't ever have to care other than 'yeah, looks good, update' on a prompt, just like you'd see in any modern vidya mod manager. >>20463 Yeah, 'alternates' is just a landing zone at the moment for 'related files that aren't dupes'. When I do the next large expansion of the file relationships system, we'll get the ability to handle these properly, organising them with labels (e.g. WIP, clean/messy, costume change) and file orders (for 4-page mini-mangas) and stuff. It is just 'I can't handle this now, and it isn't a dupe, collect them in a group over there now please'. >>20468 Sorry for the trouble. I'm not an expert, but we've seen this before. This is when I copy the file from its source to your temporary folder, which I do during import for convenient technical reasons. 
Many Linux flavours have a kind of separate partition or ramdisk or something for their temp folder and have a secret limited disk size (it can be as low as 750MB, I guess on a per-process or per-file basis?). I thought I checked this explicitly these days, but the tech is odd and maybe my 'free space on disk check' is slipping through the actual logic. Please check here: https://hydrusnetwork.github.io/hydrus/launch_arguments.html#--temp_dir_temp_dir Run the program with --temp_dir="/path/to/good/ssd/place" and you should be good! Let me know if you still have any trouble, and I'll see if I can detect that error better and report what to do to the user.
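If you want to check whether your temp folder is the bottleneck before reaching for --temp_dir, a quick look from Python is enough. This is only a sketch of the idea, not the check hydrus itself runs: the function name is made up, and as noted above a plain free-space reading may still miss a per-process or per-file limit on some setups, so treat a 'True' as a hint rather than a guarantee.

```python
import shutil
import tempfile


def temp_dir_has_room(file_size_bytes, temp_dir=None):
    """Return (fits, free_bytes) for the given temp dir, or the system default one."""
    temp_dir = temp_dir or tempfile.gettempdir()
    free_bytes = shutil.disk_usage(temp_dir).free
    return free_bytes >= file_size_bytes, free_bytes


if __name__ == "__main__":
    big_file = int(4.5 * 1024 ** 3)  # roughly the 4.5GB file from the report
    fits, free_bytes = temp_dir_has_room(big_file)
    print(f"temp dir: {tempfile.gettempdir()}")
    print(f"free: {free_bytes / 1024 ** 3:.1f} GiB, 4.5GB file fits: {fits}")
```

If the default temp dir turns out to be a small tmpfs, pass the folder you intend to give to --temp_dir as the second argument to confirm it has the space.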
>>20478 >Jesus! Well, at least the error is caught a bit better now. I am glad if I could ever so slightly help the development. Thanks that you'd make changes just for me. <3 Also: I started Hydrus on X11, everything works just fine. With this opportunity it also has to be said that I greatly appreciate the fact that sound now works for preview. It's great :) Hypothesis: <X11 -I can just select the thumbnail of a video and it starts playing in the preview window -I can doubleclick the thumbnail of a video and it opens the viewer <Wayland -I can select the thumbnail of a video and immediately a new window opens in which mpv is playing the file -I can double click the thumbnail of a video but it's already too late to discern differences due to the selection having already spawned a new window Maybe all that is required is some kind of different way of calling it? In the file >hydrus/hydrus/client/gui/canvas/ClientGUIMPV.py It says on lines 242/243: ># this makes black screen for audio (rather than transparent) >self._player.force_window = True Perhaps this is the reason that things go awry? Maybe there is some kind of option to just make it display, I dunno, "black square png" in instances like this? >>20469 I'll look, I didn't even think of the possibility that hydrus is _actually_ running in wayland because beforehand it would automatically run in Xwayland without me doing anything. I guess the default launch options were changed. But it's about damn time. Linux has to get this improved protocol.
>>20477 >Don't run it for too long or you'll get a 300MB log! It zips very efficiently though Yep. I have two logs about 240-300MB. Wew. I zipped one down to about 12MB, but I'm not sure it will be much help if I send you a very huge log to dig through.
>>20478 >Could possibly test this by dropping pairs of (small file, large file) in AB, BA order, and seeing if the small (fast to calculate) file regularly gets switched to the first position Good thought, and I got some really weird behavior. Smaller files maintained their order while the one large file I dropped in was placed first every time. Oh well, it is a mystery. Too many variables.
(119.58 KB 1121x623 Screenshot_20231021_202403.png)

(81.92 KB 465x723 Screenshot_20231021_200449.png)

(382.66 KB 1366x768 Screenshot_20231021_202237.png)

(10.27 KB 370x236 Anonfilly - Praise.png)

>>20479 >https://hydrusnetwork.github.io/hydrus/launch_arguments.html#--temp_dir_temp_dir >Run the program with --temp_dir="/path/to/good/ssd/place" and you should be good! SUCCESS!!!! Cool thing learning code stuff while troubleshooting. Thank you so much OP. /)
Can we get an option in the duplicate filter to only show pairs where the image dimensions are different? These are usually easy decisions of one is better than the other so it would be nice to be able to just do them first to get it over with and free up some disk space.
(7.61 KB 512x154 kemono-updated.png)

>>20465 Kemono appears to have changed how their api works (v1 now). They also migrated to kemono.su from kemono.party. I fixed both issues. You're welcome.
does anyone have a good solution for downloading images with tags/notes to hydrus from social media sites (e.g. twitter/tumblr) that isn't just using a gallery-dl batch script and hydrus watch folders? i rarely download stuff from boorus and mainly use hydrus to archive images from social media sites, but with all of the main sites cracking down on api downloading lately i haven't been able to do that. i would use something like hydownloader but it's a pain in the ass to set up and i would rather use something much simpler. thanks in advance!
Is Sankaku broken for anyone else? I'm getting "This URL appeared to be a "sankaku chan gallery page", which is not a File or Post URL!". It's worked fine for ages and the url schemes don't seem to have changed. From what I can glean from the parsers they fetch correctly too. I have my Sankaku cookies in Hydrus.
>>20485 Thank you very much.
>>20489 Same problem here. Also, when importing/pasting Sankaku URLs, a “?page=1” gets added to the end of them for some reason, which never happened before.
>>20485 NTA but thanks for this, anon. A related question: is it possible to add a redirect or something to the "known urls" system, so that if I already have kemono.party/foo.png it will recognize kemono.su/foo.png as "already downloaded"?
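To make the question concrete: what is being asked for amounts to normalising both URLs to one canonical domain before comparing them. The sketch below shows that idea in plain Python. It is not an existing hydrus feature or setting; the alias table and both function names are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

# Hypothetical alias table: old domain -> domain it moved to.
DOMAIN_ALIASES = {"kemono.party": "kemono.su"}


def canonical_url(url):
    """Rewrite the host through the alias table, leaving the rest of the URL alone."""
    parts = urlsplit(url)
    host = parts.netloc.lower()
    bare_host = host[4:] if host.startswith("www.") else host
    host = DOMAIN_ALIASES.get(bare_host, host)
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


def already_known(url, known_urls):
    """True if the URL matches a known URL once both sides are canonicalised."""
    known = {canonical_url(known_url) for known_url in known_urls}
    return canonical_url(url) in known


if __name__ == "__main__":
    known = ["https://kemono.party/foo.png"]
    print(already_known("https://kemono.su/foo.png", known))
    print(already_known("https://kemono.su/bar.png", known))
```

This only works if the path stays the same across the domain move, as in the foo.png example; if the site also changed its path layout, a plain domain swap would not be enough.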
I noticed a subtle thing when selecting many files at once. If you have no files selected and press Ctrl + A, it selects everything instantly. But if you already have some files selected and press Ctrl + A, there's a decent bit of lag. It gets pretty noticeable if there's a lot of files in the page, say 5000 or so. It's especially noticeable if you already have a lot of files selected. I got a huge lagspike (~20 seconds) by selecting half of the files on a page with 20000 files and then pressing Ctrl + A.
>>20449 Expanding on this, it looks like Hydrus strips any URL parameter that's added and will always download the compressed image. Is it possible to change this behavior, or even automate adding a random parameter when a 4chan URL is detected? Right now, in order to save non-compressed images, you have to manually add the parameter, download the image through the browser, and then import it to Hydrus.
Speaking of importing issues... pixiv offers translations for most of their tags. Is there a way for me to import the translations of a tag instead of the original tag? I'm down for a mod or extension if need be. Alternatively/additionally is there a way to whitelist/blacklist entire (linguistic) scripts from being imported? IE: hiragana/cyrillic/hanzi/greek/whatever. From my own exploration of the program it seems I could create a pixiv specific series of rules with only whitelisted tags (pixiv is super guilty about having a lot of random garbage I don't give a fuck about in the tags lol) with english parent/siblings and have them be automatically replaced but I'd have to set up parent/sibling relationships on a tag by tag basis and since pixiv already has done a lot of this work on their end it seems like just grabbing that data would be easiest across thousands of images. Though I'm open to ideas I haven't considered.
>>20489 >>20487 Same here Sankaku broken yet again I got this x 300 I hate that trash site so god damn much Any way to fix it?
How necessary is it to check db integrity after a crash? I just had a pretty bad crash while Hydrus was running. Thankfully I'm on Linux which has very resilient filesystems in my experience, but I still want to be safe, I've put a lot of work into my db, and I do often so even my weekly backups could end up missing quite a lot of files.
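For anyone who wants to run the check by hand, here is a minimal sketch using SQLite's built-in PRAGMA integrity_check through Python's standard sqlite3 module. The function name is made up for illustration, and the path you pass in is whatever database file you want to test; this is not hydrus's own maintenance code. Run it only while the client is closed, and ideally on a copy of the file.

```python
import sqlite3
from pathlib import Path


def check_integrity(db_path):
    """Return the rows reported by SQLite's PRAGMA integrity_check.

    A healthy database returns the single row "ok".
    check_integrity is a hypothetical helper name, not part of hydrus.
    """
    # Open read-only so the check itself cannot modify the database.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        return [row[0] for row in con.execute("PRAGMA integrity_check")]
    finally:
        con.close()
```

Anything other than a lone "ok" row is a sign that the file needs attention, at which point the 'help my db is broke.txt' document in the install's db directory is the thing to read.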
(4.26 KB 512x126 sankaku-updated.png)

>>20489 >>20487 >>20494 Sankaku appears to have changed its URL format for some stupid reason. I fixed it too, same as Kemono.
>>20496 Very nice, it works again (for now), thank you anon.
I had a good week. I ended up working on technical improvements, fixing some issues with deleted file tracking and exposing multiple maintenance timings for customisation. There is also a variety of optimisation that should slightly de-lag very heavy clients. The release should be as normal tomorrow. >>20496 Thank you for figuring this out! I will test tomorrow and try to roll this into the release.
>>20496 >>20497 So all I need to do is update hydrus?
https://www.youtube.com/watch?v=VR9oZZcdYNU

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v549/Hydrus.Network.549.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v549/Hydrus.Network.549.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v549/Hydrus.Network.549.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v549/Hydrus.Network.549.-.Linux.-.Executable.tar.gz

I had a good week, improving performance and adding some advanced tools for future testing.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

I fixed some deleted file records in the database. When you update, it'll take a moment to recalculate some things.

The options->maintenance and processing page has a ton of new settings to tune how hard the various background maintenance systems work and rest. These settings are mostly only for debugging, but if you are an advanced user and feel brave, you can play with them. If you have the 'memory explosion on PTR processing' problem, please check the changelog for instructions.

I optimised the taglist sort and update code a bit. Very large pages should be a little less laggy.

If you run from source, the 'easy setup' scripts have a couple new choices for setting up Qt, including choosing to type a specific version in. If you had the issue where your menus were in the wrong position, there is now a new (t)est Qt version (6.6.0) to try out, too.

next week

I got caught up in this 'background' work this week. I'd like to do something more visible and fun next week, but we'll see!

>>20499 Yep just get this and it'll roll >>20496 in for you.
(7.51 KB 466x228 1.PNG)

(39.15 KB 677x966 2.PNG)

(37.56 KB 1008x984 3.PNG)

>>20493
>Is there a way for me to import the translations of a tag instead of the original tag?
I believe the default pixiv parser is capable of this, you just need to set up an http header. Go to network > data > http headers, and click add. Then fill out the window like pic related 1 and click apply on all the windows. Pixiv pages should now give you both translated and untranslated tags.

If you want only translated tags, you could delete the thing that produces the untranslated tags. Go to network > downloader components > manage parsers. Find "pixiv file page api parser" and double click to edit it. Go to the content parsers tab. You should see a screen like pic related 2. Delete the entry called "tags" and leave the entry called "tags (translation)" alone. Then click apply on all the windows. But keep in mind that if a tag doesn't have a translation, you won't get anything for it.

>Alternatively/additionally is there a way to whitelist/blacklist entire (linguistic) scripts from being imported? IE: hiragana/cyrillic/hanzi/greek/whatever.
No, but now that you mention it I want it too.

>>20492
>Is it possible to change this behavior, or even automate adding a random parameter when a 4chan URL is detected?
You can edit the file url class to do this. Go to network > downloader components > manage url classes. Double click on "4chan file" to edit it. Under the "parameters" heading, click add. Set the key to something; set the match type to any characters (because it doesn't matter); and set the optional default value to something. You should see something like pic related 3. Click apply on all the windows. Now when you give hydrus a 4chan file url it will automatically add that parameter.
(32.26 KB 416x160 26-12:11:41.png)

>>20501
>Now when you give hydrus a 4chan file url it will automatically add that parameter.
Nice! Seems to work. One thing to mention: according to the 4chan thread discussing the compression, parameters have to be fully original. You can add ?penis to an image to get the original, but if someone else also adds ?penis it serves them the compressed one. I'm probably going to set my parameter to the hash of some file to ensure it's unique.
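If you ever need to do the same thing outside hydrus (for a one-off script, say), a fresh random value per request is one way to keep the parameter unique. A minimal sketch with only the standard library; the function name and the "nocache" key are arbitrary choices for illustration, and whether the server really behaves as described is only what that thread claims.

```python
import secrets
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def add_unique_param(url, key="nocache"):
    """Append key=<random hex> to url, keeping any existing query parameters.

    The key name is arbitrary; only the uniqueness of the value matters here.
    """
    parts = urlsplit(url)
    query = parse_qsl(parts.query, keep_blank_values=True)
    # 8 random bytes -> 16 hex characters, different on every call.
    query.append((key, secrets.token_hex(8)))
    return urlunsplit(parts._replace(query=urlencode(query)))
```

Note the difference from the url class approach: the url class adds one fixed default value that you pick once, while this generates a new value every time it is called.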
(88.22 KB 582x359 c.jpg)

>>20503 It's not really easy to tell but I was trying to indicate that the parameter setting was giving me different files, which is intentional. One's 448kb and the other's 468kb.
If I use a watcher page in hydrus in order to follow an artist on a booru site, what happens if I close the watcher page? Does hydrus remember my watched urls or do I have to manually re-add my watchers in a new watcher page? Should I keep my watcher page always open?
(24.98 KB 1253x44 26-19:53:12.png)

>>20505
>I use a watcher page in Hydrus to follow an artist on a booru site
I can't believe I just learned about this. It will come in handy for boorus where I want every single file.

>Does hydrus remember my watched urls or do I have to manually re-add my watchers in a new watcher page?
It's not saved, you'd have to re-add them.

>Should I keep my watcher page always open?
Yes, but you should probably switch to subscriptions. Under Network>manage subscriptions>add you can add subscription queries that basically function like regular tag searches and run on a schedule, e.g.
gelbooru tag search: huge_breasts red_hair green_eyes
danbooru tag search: krekkov femboy
These won't get removed since they aren't inside a page. The downside is that they require a tag to search, so you can't set up a subscription to scrape every new file from a booru. Also, if you have a lot of subscriptions they do a lot of work and can slow down Hydrus. Pic related: I had to switch to hydownloader simply so that the DB locks would let me use Hydrus. If you don't have that many subscriptions it'll be fine, and you won't notice them running after the first fetch. I believe the first run grabs all old files and that may end up taking some time, though I don't recall for certain.
I've noticed that after archiving a few files, I no longer have to search or de-search the inbox to refresh the inbox/archive number. Nice.
>>20485 Did I fuck something up when adding this? It's not adding clickable urls in the top right like the previous downloader did.
A bit unrelated, but do you know if you can still use RSS for sankakuchannel? I just realized it stopped working. I used to have it like https://chan.sankakucomplex.com/post/atom?tags=tag but it seems dead now.
(39.92 KB 724x441 Capture.PNG)

>>20506
>nitter - 49 working 1 dead - 17,000 items
dude... 49 queries in one subscription? i think you're supposed to split those up into individual subscriptions. no wonder you're getting slowdowns. how are there even 17,000 items in there?
Typo in the subscriptions section of the hydrus guide on the git >One the edit subscription panel, the 'presentation' options let you publish files to a page. "One" should be "on".
I've been going through boorus manually picking out what I like from particular artists, and I'd like to start using subscriptions to stay updated on any of the artists I grabbed lots of files from. I have a couple questions though.

I have one artist who has different pictures uploaded to different boorus inconsistently, with lots of overlap. I can add more tags to a subscription for a single site, but I don't see any way to search multiple sites for a single artist tag. I'm not fond of the idea of making 5 or so subscriptions manually per artist for dozens of artists. Is there any solution to this?

I just recently manually crawled most of these artists. Setting a low initial first check is close to what I want, but ideally I'd like to set a check that only goes back to a certain date: the date of my most recently saved file from that artist, which I'd enter manually because not everything is in Hydrus yet. The second most ideal initial check would stop as soon as it runs into the first file with a hash already in Hydrus, but it seems subscriptions check urls, not files, so I don't think that's possible, and it would exclude artists I've yet to import. My last option, which I may start doing soon, is to do one last manual check of each of the artists and then immediately make the subscription for them with an initial check limit of 1 file.
>>20512 Just tried making subscriptions for one artist. It really is cumbersome setting it up with all the same settings multiple times per artist. I also have to specify the page it's sent to: because the subscriptions are separated by site, naming them by artist causes each sub to get artist (1), artist (2), and so on for their name, resulting in a different page for each subscription.
Is there a way to make presets for this? Seems pointless and cumbersome to have a dropdown for something that could be a checkbox. Also selecting that just so I can pick a file domain every time is really annoying.
can I make hydrus save stuff from a particular tag over a different location somehow?
Is there any way to show all related (like alternates) of multiple selected files? The manage > file relationships menu only works for a single file.
Christ man, just go to the discord and ask all of your questions instead of shitting up the thread. >>20514 >>20515 >>20516
>>20517 >Recommending anyone use dicksword, ever >>20516 I'd also like to know this. You can narrow a search down to just files that have an alternate relation to other files, but a way to take a selection of files with alternates not present in the current search and open a new page with all the alternates from that selection would be nice.
>>20517 I'm not going to use discord Answer
>>20480 I can't say for sure, but I assume the general problem here is the part where it says--

wid = str( int( self.winId() ) )

--in the main mpv window constructor. What this essentially does is say to the python-mpv library, 'hey, please create a new mpv window and then assign it to this window id'. That winId is the Python Qt MPVWidget that I create on my side of things. The python-mpv module then appears to spawn the hardware-accelerated mpv window and tries to assign it that window id I gave it (which in Windows is called variants of hWnd) as its parent. It seems that Wayland will allow the mpv window to spawn as its own entity but will not accept the window-reparenting process. I've seen this before, including a million years ago when my tag autocomplete dropdowns wouldn't position correctly and on some Linux Window managers got the full window frame with minimise and close buttons in the top corner.

I'm afraid it is beyond my expertise to talk too cleverly. The magic operation is happening deeper in the library and not in my code, and then it is Linux where I'm even less knowledgeable, and then there's the issue that we are in python and the 'convert to C++' wrapper that actually talks to the libmpv.so file obscures things even more. For instance, this 'wid' argument is being passed straight on to the .so file in a way I can't track natively in my IDE, so I can't see how the actual guts of this assigns things. Maybe there's another flag here that I can call that makes it just work on Wayland, but I suspect it is just a bug or missing parameter somewhere in the chain of five things between my code and the actual Wayland API call.

BTW, here's the actual entry for that 'wid' param in the master mpv manual: https://mpv.io/manual/master/#options-wid It talks about X11, but not Wayland.

There's a ton of issues here, maybe one is related: https://github.com/mpv-player/mpv/issues?q=is%3Aissue+is%3Aopen+wayland

Unfortunately, my best strategy here is to wait for things to update naturally and hope a fix magically appears. Since Wayland seems to have several of these issues going on, I hope someone who is responsible for these interactions can figure it out!

>>20481 Big logs are fine, I know how to regex ctrl+f my way through to where I want.

>>20483 Great!

>>20484 Yeah, we don't have good tech for this yet. For technical reasons it is actually difficult to add comparisons between each pair (rather than A is in this search, B is in this search), but I agree it would be nice to have at least some hardcoded checkboxes for common differences like this. Although it is dumb to say, you may be able to emulate this by simply having two searches, one that is >=1080p and one <1080p. You'll get a lot of garbage downsized B files that are easy to process.
>>20491 Thanks. I tweaked this code (the taglist is the source of the lag) last week, but I know I can do more. I have to basically write a bunch of conditions that predict if it will be faster to edit the existing taglist or just wipe it clean and regenerate everything from zero. I'll play around with it. Maybe I can/should just make the whole damn thing asynchronous.

>>20495 If the crash is the program stopping because the code messed up and the process suddenly stops, there should be no problems at all--SQLite is very robust in this situation. If the 'crash' is due to a rough power cut that knocks out the whole computer or a hardware failure like your drive screeching and disconnecting, then yes, you might want to check your integrity. The 'install_dir/db/help my db is broke.txt' document is good background reading here, if you are interested. In all my time working on hydrus, I have never seen a 'malformed' error come from anything but hardware failure of one sort or another.

>>20493 >>20501
>>Alternatively/additionally is there a way to whitelist/blacklist entire (linguistic) scripts from being imported? IE: hiragana/cyrillic/hanzi/greek/whatever.
>No, but now that you mention it I want it too.
Interesting idea! The python unicode tools might be able to do these tests.

>>20510 Actually you are good with lots. I know users with 400,000+ items in 1,000+ queries in one subscription. This all used to be super monolithic and caused 'manage subs' to take five minutes to load, but I rewrote it all into objects that are loaded as needed. It still takes a few seconds to load a particular subscription within manage subscriptions, but don't worry too much about spamming things here. Do what is convenient for you. Unfortunately, having lots of downloaders working at once can really blat the program and particularly lag out user media access latency (e.g. you viewing a file in the media viewer). I have some plans to relieve this in the nearish future.

>>20511 Thank you!
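On the script-filter idea, a rough sketch of what such a test could look like with only the standard library. Python's unicodedata module has no direct 'script' property, so this leans on the first word of each character's official Unicode name (LATIN, CYRILLIC, HIRAGANA, CJK, ...), which is a heuristic rather than a proper script lookup. The function names and the default block list are invented for illustration; none of this is hydrus code.

```python
import unicodedata


def scripts_in(tag):
    """Return the set of script-like name prefixes used by the letters in tag."""
    scripts = set()
    for ch in tag:
        if not ch.isalpha():
            continue  # ignore digits, spaces, punctuation, underscores
        name = unicodedata.name(ch, "")
        if name:
            # 'CJK UNIFIED IDEOGRAPH-6771' -> 'CJK', 'CYRILLIC SMALL LETTER DE' -> 'CYRILLIC'
            scripts.add(name.replace("-", " ").split()[0])
    return scripts


# Hypothetical default block list; adjust to taste.
BLOCKED = frozenset({"CJK", "HIRAGANA", "KATAKANA", "HANGUL", "CYRILLIC", "GREEK"})


def tag_allowed(tag, blocked=BLOCKED):
    """True if none of the tag's letters belong to a blocked script."""
    return scripts_in(tag).isdisjoint(blocked)
```

A whitelist works the same way with the test flipped: require scripts_in(tag) to be a subset of the allowed set. Edge cases like fullwidth Latin letters get their own prefix under this heuristic, so a real implementation would want something more careful.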
>>20512 Yeah, it is kind of a hydrus thing to see it as sites-first rather than query first. It helps with various technical calculations and timings. I recommend you make some subscriptions like this:

danbooru artists
gelbooru artists
safebooru artists

And then paste the (mostly) same artists into each. The good news is that big sites like these offer the 'md5' hash of their files, so even though you are seeing the same files for the same artists the first time you sync each separate site, hydrus can realise it already has the file and skip the actual file download.

There is a way to make a subscription that checks multiple sites in one subscription object (this is called an NGUG, nested gallery url generator), but it can waste bandwidth by overchecking or miss files by underchecking. Best to keep things separate.

To handle your processing difficulties here, since you have an advanced setup, I recommend you merge things a little more, perhaps collecting your subscriptions into categories, let's say 'danbooru anime babe artists' vs 'danbooru mecha artists', and then have all the anime babe subs attach the same processing tag or publish to one page called 'anime babes'. Making fifty or a hundred publishing page destinations tends to just get overwhelming, and you spend more time trying to manage the categorisation than you do keeping up with the structure, and then you find you have a hundred and fifty pages. I personally don't publish to any pages any more, I just process my inbox via their normal tags at a future date, makes for a clean session and, with it all out of sight, relieves my anxiety over not keeping up with my inbox.

>>20514 No, I want to add a 'favourites' system though so you can save certain common profiles either to load up or use as a quick-load template to edit.

>>20515 No, sorry! Maybe in future.

>>20516 >>20518 Not yet. The 'alternates' system is just a holding area for files I don't have tech to manage properly yet. In future I will be radically expanding the 'file relationships' system, and we'll get proper viewing and grouping/sorting UI capability.
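On the md5 point in the subscriptions reply above, the idea of 'the site tells you the hash, so you can skip the download' can be sketched in a few lines. This is only an illustration of the concept with made-up function names; it does not reflect how hydrus actually stores or looks up hashes.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the md5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def should_download(site_md5, known_md5s):
    """True if the md5 a site advertises is not already in our known set.

    known_md5s is assumed to be a set of lowercase hex digests.
    """
    return site_md5.lower() not in known_md5s
```

The saving is that only the small post metadata has to be fetched for files you already have; the file itself is never requested.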
(66.08 KB 777x279 Untitled.jpg)


>>20520
>window-reparenting process
It sounds (I'm not a coder) like a library's smart pointer not being smart enough.
>Wayland will allow the mpv window to spawn as its own entity but will not accept the window-reparenting process.
Again, I'm not a coder, but I see that kind of issue all the time. Linux library maintainers rewrite the libraries left and right, all the time, and regressions and odd behavior are the norm. The net result is what is properly named The Dependency Hell. REEEEEEEEEEEEEEEEEEEEEE. And BTW, OP did nothing wrong.
(13.76 MB 320x436 olderlog.jpg)

(11.12 MB 320x436 newerlog.jpg)

>>20520
>Big logs are fine, I know how to regex ctrl+f my way through to where I want.
Alright then. Have a couple logs. One of them should end in a crash, probably the newer one. Files are in image archive format with hidden 7z files, because the site still has archive file uploads disabled "because people could hide illegal things in them", for no reason as demonstrated by me circumventing this block right now.

>>20522
>Yeah, it is kind of a hydrus thing to see it as sites-first rather than query first.
Not a huge fan of those suggestions. I'd much rather just send each artist to their own page, and when the pages get to be too many I'll group them up into pages of pages under a master subscriptions page, also with a "set:temp" tag applied to all that I'll keep blacklisted so as to keep my "subscriptions inbox" and my "manual import" inbox separate, since it's much easier to process my manual imports in chunks separate from my subscription imports. Actually, I'll apply the set:temp tag to my manual imports until they're archived and just add it to my inbox searches, as it will be far less repetitive.
>>20520 >>20524 I'd also like to note, that while I still get small hangs now and then, I no longer get permanent hangs or crashing since the last update, so far.
>>20520 >Wayland I remember when Dev didn't know what Wayland was until I mentioned it. Good times, good times.
Feature request: Number of matching URLs or URLs as a number. I found this case in e621 [Warning furry, rape, NSFW]: Pool: https://e621.net/pools/19078 1st post: https://e621.net/posts/3760015 2nd post: https://e621.net/posts/3764122 3rd post: https://e621.net/posts/3765969 2nd,3rd source's links to 1st, hydrus will download the first post, skip the others and add an extra "e621 posts page" to the 1st.
>>20525 >>20524 I spoke too soon. Just got a permanent hang.
is there a way to apply hydrus-dd tagging to images already in one's database you think?
>>20528 Also got a permanent hang. Not sure if it's a memory leak, but its memory seems to climb after extended usage, from 3k to 6-7k before I got a crash. Just doing gallery downloads. New bug? Or known? I have all of my credentials input correctly.
I added the pixiv cookies to hydrus but it's still refusing to download anything R18. What the fuck am I supposed to do?
>>20529 That's what it's designed to do. Is it not working for you or something?
>>20531 add user agent header as well
(20.17 KB 758x311 cap.PNG)

>>20527 >2nd,3rd source's links to 1st, hydrus will download the first post, skip the others and add an extra "e621 posts page" to the 1st. Works on my machine. Ensure you have pic related checked in your import options.
>>20534 Weird, it didn't work for me.
>>20496 Added, but I'm still getting "The parser found nothing in the document, nor did it seem to be an importable file!" when using a url, though the gallery tag search works. Can anyone point me in the right direction?
Had a weird subscriptions error. I accidentally made a subscription that searched the tag "artist_name", but it failed because the site used "artistname" for the tag. I went back and fixed it and reset the subscription to do another initial check, yet it failed again. I checked the log and it showed that the url it searched was still for "artist_name" instead of "artistname" (without the underscore in the middle), so I checked the query a third time to make sure I had corrected it. I had to delete and remake the subscription to fix this.
>>20520 >Unfortunately, my best strategy here is to wait for things to update naturally and hope a fix magically appears. Don't worry, I can absolutely understand that conclusion given the conundrum that's being spun there with all those layers making such a wobbly tower.
Could we look into adding support for https://aibooru.online? If not, how would I go about ripping from this site?
>>20539 Someone made one a few threads ago >>>/hydrus/18454 and an update for saving the notes in the last thread >>>/hydrus/20001 but I'm not a fan of the implementation as it modifies the danbooru parser. The note is also added in a not nice looking format where each table cell is on its own line with a lot of extra lines per table row. Since I now have some experience with downloaders, I could try making my own version that's a full separate package with a better note support, where it puts the actual webui compatible prompt in it, like the one you get by clicking the copy for webui button. I could also add some extra namespaced tags from the note itself like seed, sampler, model hash etc., like this hydrus import script does: https://github.com/space-nuko/sd-webui-utilities/blob/master/import_to_hydrus.py but I'm not sure if people would want that.
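For the 'extra namespaced tags from the note' idea, a small sketch of the parsing step. It assumes the note ends with a webui-style settings line of comma-separated "Key: value" pairs (for example "Steps: 20, Sampler: Euler a, Seed: 1234"); that format, the function name, and the default list of wanted keys are assumptions for illustration, not what the linked import script or any hydrus parser actually does.

```python
def params_to_tags(params_line, wanted=("seed", "sampler", "model hash")):
    """Turn 'Key: value' pairs from a settings line into namespace:value tags.

    Assumes fields are separated by ', ' and keys from values by ': '.
    Values that themselves contain ', ' would be split incorrectly.
    """
    tags = []
    for field in params_line.split(", "):
        key, sep, value = field.partition(": ")
        key = key.strip().lower()
        if sep and key in wanted:
            # Lowercase the value too, since hydrus tags are lowercase.
            tags.append(f"{key}:{value.strip().lower()}")
    return tags
```

Whether tags like these are worth having is the open question from the post above; the sketch only shows that pulling them out is cheap once the settings line is available.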
>>20539 >>20540 Here you go. Use one or the other, depending on what you want. The one with the extra tags also adds the AI metadata table as tags, except the first row (prompt). I didn't do the login script, because I don't know how it works. One thing I don't understand though: why did it merge the gallery parser into the danbooru one despite me giving it a unique name? Is it because they both do the exact same thing?
Sankaku changed something FUCKING AGAIN and hydrus is yet again broken for it. Why are people still uploading to this place the most? Why?
>>20542 Only site that doesn't give a fuck about dmca or loli/shota, other than panda. I think in general that's why most nsfw ends up there. sfw is very very consistent on other sites.
>>20543 What about gelbooru and allthefallen?
>>20544 Gelbooru is inconsistent, atf is toddlercon majority
>>20545 >Gelbooru is inconsistent In terms of lolishota or dmca? >atf is toddlercon majority <14k toddler tags <68k shota tags <400k loli tags How? Tagging is bad often, but not so bad that these numbers wouldn't be strong indicators, let alone entirely wrong.
I had a good week. I fixed several bugs, some important, and reworked 'scale to fill' thumbnails. The release should be as normal tomorrow. >>20524 I am sorry, when I download these, either with the quick download click, or opening the image and then save-as, or with hydrus itself (I was wondering if my browser was doing it), I only get ~100KB image files, despite the metadata on the post saying ~11-13MB. I presume 8chan strips the excess metadata. Any chance you can catbox? You can also email me using a throwaway account (to hydrus.admin@gmail.com), or if you hit me up on discord on a Saturday we can sync live and croc it over. >>20542 Yep, sorry, this is due to last week's update. They wanged their URLs to a new format and I rolled >>20496 in. Your subs are going to go bananas a bit, but they'll refigure themselves. You can dismiss those popup messages.
>>20546 DMCA Literally go order by score it's all toddler. Their definition is clearly loose.
>>20544 >gelbooru Buncha artist have half the sankaku entries >>20544 >allthefallen Literal who and SOMEHOW more censored than sankaku being a literal pedo community
>>20547 >Your subs are going to go bananas a bit, but they'll refigure themselves. You can dismiss those popup messages. They just started redownloading as initial downloads and got 999 files each
>>20548 >Literally go order by score <order:rank for highly scored recent files I think <Check first 5 pages/100 files Not sure it's 100, I think the sorting is wonky and there are repeats <Only two images that could arguably be close to toddler <order:score for all time scored files + toddler <20th top toddler file is scored 397 <order:score <Scores of 397 are at the bottom of the fifth page, so toddler files are 20 of the top 100 <Including some gay montage videos near the top ranks that are 99% loli There's disproportionately many totcons rating files than there are totcon files, but the site is still in the vast majority a loli site. Top rated files are highly unlikely to be undertagged, so it's probably accurate too that there's no untagged toddler files in the top 100 all time rated files. I do have to express my disappointment though that the top two files are a couple of those terrible self-hypno vids. I don't know how you got the impression you did, especially if you sorted by score, but it's clearly wrong.
>>20549 >Buncha artist have half the sankaku entries But are the sankaku entries just full of dupes? >SOMEHOW more censored than sankaku Elaborate.
>>20551 Yeah, this is bait, here's your last (you)
>>20554 Projection. The tot claim is bait, as is this post. It was made quickly and without proof garner a longer response. Now let's stop shitting up the hydrus thread.
(2.35 KB 296x110 capture 1.PNG)

(1.57 KB 297x87 capture 2.PNG)

If you have a subscription popup that contains some files and then delete them all, the next time you click on the popup it will turn into an empty popup. I assume this is related to the fix for (1) in >>20399 which makes it so the subscription popup is updated when a file in it is deleted.
I tried moving my thumbnail folder to a different location and it failed to move some of the files, I then tried to move it back, and it failed again. Now I'm missing a ton of thumbnails and getting subsequent errors. Is there a way to regenerate the thumbnails?
>>20558 >>20557 Following up: it seems to have been an error with my external drive. I unplugged and replugged it and it doesn't seem to be erroring anymore. All the numbers in CrystalDiskInfo look fine, so I'm not sure.
>>20517 what is the point of this thread if not to ask questions and bother the dev without having to join discord?
>>20490 >is it possible to add a redirect or something to the "known urls" system, so that if I already have kemono.party/foo.png it will recognize kemono.su/foo.png as "already downloaded"? I would also like to know this. It'd be very useful now.
Possible suggestion, I'm not sure how github documentation works, but would it be possible to ship a pdf file (or something else) with the portable installation including all of the current documentation? Or an even more complicated solution, one built into the software. Particularly offline accessible. For us datahoarder types..
>>20562 isn't that stuff in the 'help' directory?
>>20490 >>20561 You could try exporting the files with url sidecars, do some multifile search/replace using notepad++ and import back.
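If notepad++ isn't handy, the same search/replace can be scripted. A minimal sketch, assuming the exported sidecars are plain .txt files sitting next to the files with one url per line; the function names, the glob pattern, and the default domains are placeholders to adjust for your own export. Work on a copy of the export folder.

```python
from pathlib import Path
from urllib.parse import urlsplit, urlunsplit


def swap_host(url, old, new):
    """Return url with its host changed from old to new; other urls are untouched."""
    parts = urlsplit(url)
    if parts.netloc.lower() == old:
        return urlunsplit(parts._replace(netloc=new))
    return url


def rewrite_sidecars(folder, old="kemono.party", new="kemono.su", pattern="*.txt"):
    """Rewrite matching urls in every sidecar in folder; return how many files changed."""
    changed = 0
    for sidecar in Path(folder).glob(pattern):
        text = sidecar.read_text(encoding="utf-8")
        lines = [swap_host(line, old, new) for line in text.splitlines()]
        updated = "\n".join(lines) + ("\n" if text.endswith("\n") else "")
        if updated != text:
            sidecar.write_text(updated, encoding="utf-8")
            changed += 1
    return changed
```

Matching on the parsed host rather than doing a blind text replace avoids touching urls that merely contain the old domain as a substring.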
>>20563 Ah it is! Thanks. This is perfect.
https://www.youtube.com/watch?v=yj2g8l1JqWI

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v550/Hydrus.Network.550.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v550/Hydrus.Network.550.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v550/Hydrus.Network.550.-.macOS.-.App.dmg
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v550/Hydrus.Network.550.-.Linux.-.Executable.tar.gz

🎉 Merry 550 🎉

I had a good week. I fixed some important bugs and reworked 'scale to fill' thumbnails.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

highlights

I fixed an issue with the login system where, for certain scripts, a job that was supposed to be waiting on an ongoing login process could nonetheless start early. All jobs are more patient about waiting for their specific login process to completely finish, no exceptions.

I added a maintenance system to explicitly close and clear persistent network connections. I had underestimated how aggressive my underlying libraries were at closing these no-longer-used 'keep-alive' network connections, and a half-dead object was occupying your OS's network slots for longer than desired. Network connections now close promptly five minutes after the last use.

I fixed some thumbnail display bugs, and, particularly, 'scale to fill' thumbnails are much more sensible behind the scenes. Previously, the thumbnail on disk was of the cropped visible region (which was a little complicated to figure out and maintain); now, things are simple--it just stores a full-image thumbnail at the appropriate larger scale and then draws the cropped area of that on demand. There are several technical benefits to this, and it saves some CPU in the long run, fixes some bugs for things like blurhash which need the full image to work properly, and it means an external process like the Client API can now ask for a thumbnail and get the full image rather than an arbitrary crop.

If you are set to 'scale to fill', your thumbnails will regenerate as you browse. Fingers crossed, you will not notice any visual difference. Let me know if the regeneration is annoyingly laggy--I didn't want to schedule a ton of regen CPU for you all this update, but I can figure a 'catch up in the background' thing out if it is a problem.

next week

More small work and bug fixing, I hope. If I have time and energy, some file storage overhaul work.

>>20553 Thanks, got them.
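For anyone curious what 'store the full image at the larger scale, then draw the cropped area' means in numbers, here is a small arithmetic sketch. The function names are made up, and the centred crop is an assumption for illustration; the actual hydrus code may pick the visible region differently.

```python
def stored_thumbnail_size(image_size, thumb_size):
    """Size of a full-image thumbnail scaled so it covers the whole thumbnail box."""
    iw, ih = image_size
    tw, th = thumb_size
    # 'Fill' uses the larger of the two ratios, so both dimensions reach the box.
    scale = max(tw / iw, th / ih)
    return (max(tw, round(iw * scale)), max(th, round(ih * scale)))


def visible_crop(stored_size, thumb_size):
    """Centred (left, top, right, bottom) region of the stored thumbnail to draw."""
    sw, sh = stored_size
    tw, th = thumb_size
    left = (sw - tw) // 2
    top = (sh - th) // 2
    return (left, top, left + tw, top + th)


# A 1000x500 image in a 200x200 box: scale is max(0.2, 0.4) = 0.4,
# so the stored thumbnail is 400x200 and the middle 200x200 is drawn.
stored = stored_thumbnail_size((1000, 500), (200, 200))
crop = visible_crop(stored, (200, 200))
```

Because the stored thumbnail still contains the whole picture, anything that needs the full image (blurhash, or an external tool asking for a thumbnail) can use it directly, and only the drawing step needs to know about the crop.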
(106.95 KB 750x750 17baf.gif)

>>20566 >video
Heads up. Exhentai has changed the full image path: it is now https://exhentai.org/fullimg/ instead of https://exhentai.org/fullimg.php. I have been downloading low res for some time.
Is there a way to do tag processing? Ideally this would be just a fullscreen view of the image, and customizable tags tied to shortcuts/gui buttons so I can rapidly go through and tag things as male/female or go through tagging brown/black/blonde/red hair. Is there something like this?
>>20569 Under file > shortcuts, go in the media actions category, and you can add shortcuts for tagging. But these shortcuts will work anywhere in the program, not just in a specific view.
>>20569 >>20570 Actually, I just remembered shortcut sets exist. These are probably closer to what you're looking for. Under file > shortcuts, under the "custom user sets" heading, you can add a custom set of shortcuts (which includes tagging shortcuts). Then once you're in the media viewer, you can click the keyboard icon at the top of the screen to enable or disable these shortcut sets.
I have updated the Visuabusters post parser again. It now gets creator tags from the file uploader; obviously, that assumption only works for a site where the uploader is the creator. It's pretty hacky but hey, it works. It also contains my previous note addition >>>/hydrus/19716. I reiterate, if anyone wants to send it to the Hydrus Presets and Scripts repo, go ahead.
>>20569
>so I can rapidly go through and tag things as male/female or go through tagging brown/black/blonde/red hair.
If you mean similar to duplicate processing, I wouldn't recommend it. These seem like things you can tell from the thumbs pretty easily, and when you can't, you open the preview or media viewer real quick without losing a selection. Best to just highlight large groups of files at once with Shift+arrow keys and skip over any necessary files with Ctrl+left click. Tag all at once, then use tag exclusion to narrow your files to what still needs more of the current namespace you're working on. My Xgirls and Xboys tags have female and male tags as parents, so my workflow looks a little like this,
>tag all 1girls
>+1girl
>tag all solo
>-solo
>tag all 1boys
>+1boy
>tag all solo
>+1boy + 1girl
>tag all character count:two
>-solo -two
>tag all two girls
And so on until my current inbox has only a few files with large numbers of characters such that group tagging is no longer efficient. Hair color is a little trickier, but I make use of having first done character count tags and follow a similar process with some negative OR searches. Shortcuts can be useful, but I have so many tags I process this way, that it doesn't seem worth the effort of setting this up.
Even if a process similar to duplicate processing could be set up, it couldn't work the same. Duplicate processing saves negative relationships when you say something isn't a dupe or alt. Negative tag relationships don't exist though, so a tag processor would have no way to distinguish between processed files that shouldn't have the tag and unprocessed files that don't have the tag.
v547, 2023/11/02 07:07:30: 45869 GET /add_urls/get_url_files 200 in 14.0 milliseconds
v547, 2023/11/02 07:07:30: 45869 GET /add_urls/get_url_files 200 in 12.0 milliseconds
v547, 2023/11/02 07:07:30: waiting for workers to exit
v547, 2023/11/02 07:07:30: waiting for services to exit
v547, 2023/11/02 07:07:30: stopping services…
v547, 2023/11/02 07:07:30: shutting down db…
v547, 2023/11/02 07:07:30: saving and exiting objects
v547, 2023/11/02 07:07:31: cleaning up…
v547, 2023/11/02 07:07:31: shutting down controller…
v547, 2023/11/02 07:07:31: hydrus client shut down
v550, 2023/11/02 07:09:34: hydrus client started
v550, 2023/11/02 07:09:36: booting controller…
v550, 2023/11/02 07:09:36: booting db…
v550, 2023/11/02 07:09:37: checking database
v550, 2023/11/02 07:09:39: updating db to v548
v550, 2023/11/02 07:09:40: updated db to v548
v550, 2023/11/02 07:09:42: updating db to v549
v550, 2023/11/02 07:09:42:
v550, 2023/11/02 07:09:42: Exception:
v550, 2023/11/02 07:09:42: TypeError: bad operand type for unary -: 'NoneType'
Traceback (most recent call last):
File "hydrus\client\db\ClientDB.py", line 9816, in _UpdateDB
File "hydrus\client\db\ClientDB.py", line 8388, in _SyncCombinedDeletedFiles
File "hydrus\client\db\ClientDB.py", line 1505, in _DeleteFiles
TypeError: bad operand type for unary -: 'NoneType'
File "threading.py", line 973, in _bootstrap
File "threading.py", line 1016, in _bootstrap_inner
File "hydrus\core\HydrusThreading.py", line 427, in run
File "hydrus\client\ClientController.py", line 2183, in THREADBootEverything
File "hydrus\client\ClientController.py", line 1018, in InitModel
File "hydrus\core\HydrusController.py", line 564, in InitModel
File "hydrus\client\ClientController.py", line 229, in _InitDB
File "hydrus\client\db\ClientDB.py", line 246, in init
File "hydrus\core\HydrusDB.py", line 266, in init
File "hydrus\client\db\ClientDB.py", line 9820, in _UpdateDB
File "hydrus\core\HydrusData.py", line 872, in PrintException
File "hydrus\core\HydrusData.py", line 901, in PrintExceptionTuple
v550, 2023/11/02 07:09:46: updated db to v549
v550, 2023/11/02 07:09:48: updating db to v550
v550, 2023/11/02 07:09:48: updated db to v550
v550, 2023/11/02 07:09:48: initialising managers
v550, 2023/11/02 07:09:48: booting gui…
v550, 2023/11/02 07:09:49: The client has updated to version 550!
v550, 2023/11/02 07:09:49: Trying to resynchronise the deleted file cache failed! This is not super important, but hydev would be interested in seeing the error that was printed to the log.
v550, 2023/11/02 07:09:49: starting services…
v550, 2023/11/02 07:09:49: Running "local booru" on port 40666.
v550, 2023/11/02 07:09:49: Running "client api" on port 45869.
v550, 2023/11/02 07:09:49: services started
v550, 2023/11/02 07:09:49: 45869 GET /add_urls/get_url_files 200 in 10.0 milliseconds
>>20532 Sorry for the late response. I guess I'm misunderstanding how I specifically hook it up to Hydrus. I've tinkered with it to the point of finding out how to tag things outside of the database and have used it to sidecar tags when importing, but I don't know how to do it for already imported items.
>>20575 DeepDanbooru is quite outdated, use the wd14 tagger: https://github.com/Garbevoir/wd-e621-hydrus-tagger This one is a modified version by an anon from here, the original appears to be gone. Anyway, if you read the original readme, the last method of tagging involves you using a txt file containing hashes, so select some files in Hydrus, copy the hashes and paste them into the txt, then run the command and it should send tags straight into Hydrus.
Is there a way to actually search through the contents of notes? I have some notes that have urls to alternate versions of the image, but they're not actually visible from the booru itself. It would be great to be able to search for "https://*" or something in these notes but I don't see a way to do that.
>>20570 >>20571 Exactly what I needed anon, thank you. >>20573 Do not doubt my autistic power.
>>20578 >Do not doubt my autistic power. I'm not. I'm telling you there's more efficient ways to be autistic.
did the Sankaku log in script change? getting a "network error 402" that goes away before i can read anything
>>20573 keyboard shortcuts for small but essential namespaces are efficient even with that methodology. You highlight a large group, and then instead of needing to open up the tag editor, you just hit a single button that applies rating:safe or gender:female or whatever
>>20581 For those sorts of ratings, I suppose. I use a "lewdness:" namespace instead of rating. Rating may be the standard term on boorus, but I don't like how the word can mean a variety of things other than how lascivious or explicit something is. Manual male and female tagging seems redundant if you're going to tag the number of men and women, as male/female ought to be parent tags for xboys and xgirls. Do you not do that? Also, I do many things by mass tagging and shortcuts would help with all of them and I'd quickly end up with far too many.
>Eros:non-erotic, semi-erotic, erotic
>Lewdness: Safe, Lewd, Explicit is a parent tag of whatever explicit body part is shown
>Character count:solo, two-ten, 1-10girls/boys/futas, many girls/boys/futas, many, none
>Hair:long, short, styles, colors
>Eye color wouldn't be good for shortcuts because I have to view files too closely too often
>Clothes color:
>Clothes:alternate outfit, nude
>Body:trueflat, very small/small/large/very large breasts, small/large/very large cock
>Frame:face, upper body, lower body, face cropped out, multipanel
After the 550 update, I always get this error when going to the mr bones screen
v550, 2023/11/04 15:18:13: Uncaught exception:
v550, 2023/11/04 15:18:13: TypeError arguments did not match any overloaded call:
File "/opt/hydrus/hydrus/client/gui/widgets/ClientGUICommon.py", line 764, in paintEvent
self._Draw( painter )
File "/opt/hydrus/hydrus/client/gui/widgets/ClientGUICommon.py", line 798, in _Draw
painter.drawPixmap( x_offset, y_offset, self._pixmap )
drawPixmap(self, targetRect: QRectF, pixmap: QPixmap, sourceRect: QRectF): argument 1 has unexpected type 'float'
drawPixmap(self, targetRect: QRect, pixmap: QPixmap, sourceRect: QRect): argument 1 has unexpected type 'float'
drawPixmap(self, p: QPointF, pm: QPixmap): argument 1 has unexpected type 'float'
drawPixmap(self, p: QPoint, pm: QPixmap): argument 1 has unexpected type 'float'
drawPixmap(self, r: QRect, pm: QPixmap): argument 1 has unexpected type 'float'
drawPixmap(self, x: int, y: int, pm: QPixmap): argument 1 has unexpected type 'float'
drawPixmap(self, x: int, y: int, w: int, h: int, pm: QPixmap): argument 1 has unexpected type 'float'
drawPixmap(self, x: int, y: int, w: int, h: int, pm: QPixmap, sx: int, sy: int, sw: int, sh: int): argument 1 has unexpected type 'float'
drawPixmap(self, x: int, y: int, pm: QPixmap, sx: int, sy: int, sw: int, sh: int): argument 1 has unexpected type 'float'
drawPixmap(self, p: QPointF, pm: QPixmap, sr: QRectF): argument 1 has unexpected type 'float'
drawPixmap(self, p: QPoint, pm: QPixmap, sr: QRect): argument 1 has unexpected type 'float'
v550, 2023/11/04 15:18:14: QBackingStore::endPaint() called with active painter; did you forget to destroy it or call QPainter::end() on it?
>>20524 >>20525 >>20528 Just as a side update here, I haven't looked through your logs properly, but I was told by another user with hangs/crashes during heavy PTR work that increasing the new settings under options->maintenance and processing has relieved his problems. I want to make the program do this stuff automatically, but for now, if you haven't yet, you might like to try increasing the idle/very idle 'repository processing' and 'sibling/parent sync processing' 'rest time percentages' to 300%, and reduce the max work time to 5 seconds or so.
>>20527 >>20534 >>20535 Thanks, yeah, 'system:num_urls' would be great. My url searching tech is still slow in several ways, I have some plans to optimise it so we can do more advanced search like this. That 'neighbour-spam' option does its best to recognise bad situations, but the logic here can get tricky. It tends to work better over time, when a second subscription goes over the same area for whatever reason. There aren't excellent solutions to this problem, unfortunately, but it is thankfully rare.
>>20537 Thanks, that's weird. It sounds like the subscription saved out of order somehow, like the old one that just ran was still hanging around and saved over the top of the edited one once the dialog exited and freed up the system. I'd normally feel good about that code, and I haven't heard of this before. Let's chalk this up to a weird situation, maybe just a second typo that you didn't notice, or something similarly odd. Let me know if it happens again or you notice anything else about this.
>>20550 Yeah, this is one of the reasons I always recommend subscriptions stay lean with small initial file limits. Something can always go wrong. If you can spend the time to edit them, you can go into a subscription query's file log and basically do 'select all'->'skip' to manually fix them.
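To put numbers on the 'rest time percentage' suggestion at the top of this post, here is a small worked example. It assumes the percentage means "rest for this fraction of the time just spent working", which is my reading of the setting and not confirmed in the post; the function names are made up for illustration.

```python
def rest_seconds(work_seconds, rest_percentage):
    """Rest time after a work burst, as a percentage of the time just worked (assumed meaning)."""
    return work_seconds * rest_percentage / 100


def duty_cycle(rest_percentage):
    """Fraction of wall-clock time spent working under this rest percentage."""
    return 100 / (100 + rest_percentage)


print(rest_seconds(5, 300))  # 15.0
print(duty_cycle(300))       # 0.25
```

Under that assumption, a 5 second work limit with 300% rest means roughly 5 seconds of processing followed by 15 seconds of breathing room, so the job occupies about a quarter of the time.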
>>20556 Lol, thanks, I'll fix it. >>20557 >>20558 >>20559 Thank you for this report. Jesus, that printed ugly. I will catch and handle this error more gracefully in future. The file storage work I'm planning should help here, too, letting giant jobs like this cancel and pause on a per-file basis rather than having to work for so long and then run into such large trouble if there is a problem somewhere. WinError 17 as here https://learn.microsoft.com/en-us/windows/win32/debug/system-error-codes--0-499- looks like the helpfully titled 'ERROR_NOT_SAME_DEVICE - The system cannot move the file to a different disk drive.'. Maybe it is something that jumps in during very rare defragging time or something. Or maybe just the USB drive had some funky disconnection or hibernation event. In any case, I'll try to catch it better and promote a nicer error to the user in future. >>20490 >>20561 >>20564 Not yet, but with more of our sites doing this (I failed to add a thread watcher the other day, forgetting that we have 8chan.moe URL classes but not 8chan.se), I think it is a good idea. Ultimately, I think we'll move to URL Classes that just have multiple domains. Then I can have a checkbox that says 'these domains are all the same site', and the comparison logic can take that into account. >>20574 Thank you, I will investigate this! When you update to v551, please run database->regenerate->all deleted files and let me know if there are any more problems. Looking at the code now, you probably also want to run database->regenerate->service info numbers several times, selecting your different file services and 'trash' too.
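For the WinError 17 case, this is the general shape of "catch it better", as a sketch only and not the actual hydrus fix. The helper names are hypothetical. It treats both the POSIX cross-device errno and the Windows error code quoted above as the same condition and falls back to copy-then-delete.

```python
import errno
import os
import shutil

ERROR_NOT_SAME_DEVICE = 17  # Windows system error code quoted in the post above


def is_cross_device_error(exc):
    """True if an OSError means source and destination are on different devices."""
    # On Windows the code lives in exc.winerror; elsewhere that attribute is absent.
    return exc.errno == errno.EXDEV or getattr(exc, "winerror", None) == ERROR_NOT_SAME_DEVICE


def move_file(source, dest):
    """Rename if possible; on a cross-device error, copy then delete the source."""
    try:
        os.rename(source, dest)
    except OSError as exc:
        if not is_cross_device_error(exc):
            raise
        shutil.copy2(source, dest)
        os.remove(source)
```

Python's own shutil.move already does a similar rename-then-copy fallback, so this is mostly useful as an illustration of where a nicer per-file error could be raised.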
>>20577 Not yet. iirc I have the database tech all set up--the note text is already entered into a special 'fast text search' zone, but I have to write the system: predicate and actual search code for it all. Please keep reminding me if this feature seems to slip. >>20578 BTW I know a guy who does this on a tablet that VNCs to his client computer. He has a special button overlay that hooks into AutoHotKey somehow and he just has to press buttons to add tags and do archive/inbox. He can process his files and add simple tags while sitting on the couch. >>20580 I think I recently removed the sank login script for new users. It broke and my system couldn't support whatever fix was needed. I recommend Hydrus Companion (which copies cookies and User-Agent from your browser to hydrus) for all complicated logins: https://gitgud.io/prkc/hydrus-companion >>20583 That's not good--that 'endPaint' error actually introduces program instability! I can reproduce it--it seems to be PyQt6 only. I'll fix it, sorry for the trouble!
>>20585 Thanks for the error info on the hard drive, a little scary to see a hard drive tell you 'oops the files aren't here lol'. Yeah. Quite a messy set of events. If there isn't a way to regenerate thumbnails yet, that may be something to look into, as I imagine they're generated purely off the files themselves. Though obviously this would be a hefty job. What ended up happening was 30% would be in the previous place, and 70% in the other, and obviously the 30 was unreadable.
>>20587 Ah, yeah, if you want to regen your thumbs, you can queue it up. Normally, if a thumbnail is missing when called for, hydrus will silently generate it then and there, but this can add annoying lag, so there is a maintenance routine to get it done early. BTW, if you still have access to the 30%, you can just drag and drop them manually in to the correct location. Just shut the client down and move/merge the remains of the old 't1b' subfolder from the bad location to the good one, and then when you boot the client again, it'll just find the thumbs again nice and fast. You can actually move all your subfolders this way, while the client is down, and when you boot it, it'll give you a repair dialog to find the correct locations, or in simple cases it'll give you a popup dialog saying it figured it out for you, and you don't have to bother with the migrate database dialog so much. If you don't have access to the missing thumbs and/or simply want to regen, hit up database->file maintenance->manage scheduled jobs and then, in the 'add new work' tab, do a search for 'system:everything' and hit 'run this search', and then, at the bottom, select the 'regenerate thumbnail if incorrect size' job and click 'add job'. Your client will try to load every thumb for every file you have in the background over the next days/weeks and if it is the wrong size (or missing), it'll regenerate it so it loads fast in future.
>>20588 That will possibly be good to know in the future, once I reconnected the external and ran the move files it fixed itself, but thank you.
>>20586 Regarding the 'endPaint' error: the client will eventually crash the longer you keep Mr Bones open. If you open an image in a media page, the error is also present and crashes the client about 3 times out of 10.
>>20586 The 'endPaint' error also happens if you click once on a flash or pdf file in a file search page. Other "application" file types might also be affected by this.
Ctrl+double clicking swaps between inclusion and exclusion of tags, and reverses some binary system predicates like "has audio" and "no audio". But for some reason, it doesn't swap between "has duration" and "no duration". Can this be amended?
>>20586 >I think I recently removed the sank login script for new users. When can we expect it to be removed for current users?
sankaku decided to start redownloading everything from my 500 entries AGAIN like I just added them, this already happened 2 weeks ago, What the fuck is wrong with this shit site
>>20594 Reminder that they actively don't want people ripping from their site, it's by design
>>20394 >>20396 >>20443 >>20553 Hey, I am looking through this now. I regret you've had this trouble. It looks like you are enduring some super super slow jobs here, and I am sorry to say it looks like it is your HDD after all. There are many cases here where reading particular components out of your client.master.db or client.mappings.db files is taking ~700ms, just as part of a small batch of files pull (e.g. the 256-file batches when you load a file search). This kind of delay can happen on the first request on an HDD (where normal latency is ~8ms, and hundreds of files means hundreds of sometimes-overlapping disk requests), but normally your disk cache steps in and smooths things out fairly quickly. In these logs, you are getting persistent delay throughout use--I'm sure you are familiar with it.
Although you only have 20k files, do you have a very large database, like multiple GB? Let's say one that used to sync to the PTR, but does no longer? I can see some of this delay coming from just having a gigantic old 9GB client.master.db to wade through. But even having said that, I see that something like writing your most recent tags (which just fiddles with a few rows in the small client.db database) is taking 1.7 seconds total. This should normally be a few milliseconds, which suggests the disk is being choked by something other than the hydrus process.
So, while I know the hanging and freezing is still a problem and can write some auto-throttles to relieve the situation, I think you are particularly getting the issue because there is some kind of problem with your hard drive access. Many things that should be done in a blink are taking hundreds of milliseconds, which you'd normally see, say, if a defragger was running on the drive at the same time. If you check out your drive under Task Manager and Resource Monitor, does it say like '700ms average response time' and/or the dreaded 'always 100% usage' problem? Any chance the 'dirty bit' is set on the drive? (chkdsk can help detect/clear this iirc)
If the drive seems to all be working well, I can't explain this. I haven't run a client off an HDD in some years, but a 20k file db with some clever tag relationships, let's say a total database size of less than 500MB, should get well cached into your OS disk cache within a few minutes of use and run fairly well. I think something is not working correctly.
The sucky answer, of course, and the easy one for me to say, is, 'get an SSD'. If there is any reasonable way you can wangle it--I even know some guys who store hydrus on a $80 500GB USB SSD--I think it would be worth trying to see if things suddenly clear up. Or if you are on an 8TB NAS HDD or something with unusual write caching tech, maybe try it, just for a little while, on a different HDD. If it helps at all, there's some help here if you have the hardware and want to try putting your database on your SSD but your files on a HDD: https://hydrusnetwork.github.io/hydrus/database_migration.html
Let me know how you get on!
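As a back-of-envelope check on the latency figures in this post, here is a tiny sketch. It assumes the pessimistic case where every file in a batch costs one uncached seek and the seeks happen one after another; real access overlaps and gets cached, so treat it as an upper bound. The function name is made up.

```python
def worst_case_batch_seconds(num_files, latency_ms):
    """Upper bound if every file in a batch costs one uncached, sequential disk seek."""
    return num_files * latency_ms / 1000


print(worst_case_batch_seconds(256, 8))  # 2.048
print(700 / 8)                           # 87.5
```

So a cold 256-file batch at a normal ~8ms latency tops out at about two seconds, while a single ~700ms read is worth nearly ninety ordinary seeks on its own, which is why it looks like the drive is being held up by something else.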
>>20596 >do you have a very large database, like multiple GB? 55GB and counting. 42GB of that is files with duration, spread over 3,200+ files averaging 13MB. I love the new "review your fate" info. >Let's say one that used to sync to the PTR, but does no longer? I attempted once at the very beginning, but stopped because I don't have an SSD and PTR tags are too messy for my autism, so now I personally tag everything. Pretty sure I did whatever was necessary to clear the partial PTR sync out of my client. >does it say like '700ms average response time' Don't know where to look for that. >and/or the dreaded 'always 100% usage' problem? CPU is around 20-50%. Memory is around 50-60%. Top CPU processes jump around, but nothing really jumps out except for Ungoogled Chromium, which I have a bunch of tabs on. UnChromium is consistently the top memory using process at about 800-900MB, followed by Hydrus at 300MB, then not much of note besides Wangblows Antivirus at 100MB. Disk usage is usually at 0-2%. I tried doing some 1000 file multitag searches and minor test tagging. CPU and Memory usage had no noticeable changes, and Disk usage was regularly at 30-40%, even on processes I can get to consistently take some time, like entering new tags for the first time since opening Hydrus. > Any chance the 'dirty bit' is set on the drive? (chkdsk can help detect/clear this iirc) Too tech illiterate for this. >The sucky answer, of course, and the easy one for me to say, is, 'get an SSD' So long as crashes are rare, this will eventually be my solution.
Potential idea for auto tagging features in the future, or an add-on. Something that pulls a title of an item, searches for it on IMDB or assorted JAV or Hentai databases to generate tags.
I had a great week fixing a ton of bugs. As well as a variety of small stuff, there's also a fix for the Mr Bones and top-right media viewer hover crashes that PyQt users have seen this past week. The release should be as normal tomorrow.
(9.56 KB 201x79 08-03:35:00.png)

>>20597 >55gb and counting ... spread over 3,200+ files averaging 13MB. I think hydev meant the actual database's file. It's the 'client.db' file in File>Open>Database Directory. You could also go to database>db maintenance>review vacuum data, which shows all databases. This is what my vacuum pane looks like with PTR syncing (the externals).
Is it more optimal to search for dupes at increasing distances (run a search at 0, then 2, then 4, then 8) or just run one long job at 8?
>>20596 >>20600 Ah, I'm retarded. Here is the correct info.
>>20601 It's probably optimal to do one long job. I think the default was 8, but I was missing a lot that way. 12 gets 99% of my dupes, but that also means there's a lot of false positives.
>>20601 It depends on how many total duplicates there are to work though. If it's a large number I prefer to do at increasing distances starting at 0. If it's less than 200 or so I just do it all at distance 10. >>20603 Your sanity is either made of the strongest diamonds or fractured beyond repair. I can't even imagine running dupes at 12. Running at 10 is already about 99% setting things as "not related". At 10 hydrus detects things as dupes that are so inconceivably dissimilar and unrelated I don't even know how to describe it. But rarely it will turn up something to mark as related, so I'll do it. But holy fuck, 12 would probably consider the majority of my database as dupes of itself. No way am I gonna mark a couple hundred thousand things as "not related".
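On the original question of staged searches versus one search at 8: assuming the search distance is a Hamming distance between perceptual hashes (my understanding, not stated in this thread), every pair found at a small distance is also found at a larger one. A brute-force sketch with made-up 8-bit hashes:

```python
def hamming_distance(phash_a, phash_b):
    """Number of differing bits between two integer perceptual hashes."""
    return bin(phash_a ^ phash_b).count("1")


def potential_pairs(phashes, max_distance):
    """All index pairs whose hashes differ by at most max_distance bits (brute force)."""
    pairs = set()
    for i in range(len(phashes)):
        for j in range(i + 1, len(phashes)):
            if hamming_distance(phashes[i], phashes[j]) <= max_distance:
                pairs.add((i, j))
    return pairs


# Toy 8-bit hashes for illustration only.
hashes = [0b0000_0000, 0b0000_0011, 0b0011_1111, 0b1111_1111]
pairs_at_2 = potential_pairs(hashes, 2)
pairs_at_8 = potential_pairs(hashes, 8)
print(sorted(pairs_at_2))
print(pairs_at_2 <= pairs_at_8)
```

In other words, the final set of potential pairs is the same either way. Going 0, 2, 4, 8 only changes the order you review them in, with the most likely true dupes first, which matches the advice above about staging when there is a large backlog.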
https://www.youtube.com/watch?v=J5I00p3KqvE windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v551a/Hydrus.Network.551a.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v551a/Hydrus.Network.551a.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v551a/Hydrus.Network.551a.-.macOS.-.App.dmg linux tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v551a/Hydrus.Network.551a.-.Linux.-.Executable.tar.gz I had a great week mostly fixing bugs. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights First off, thanks to a user, you can now turn off the thumbnail 'fade' transitions under options->thumbnails. This will just make any thumbnail appearance or change happen instantly, in one frame. Secondly, users who are on PyQt (macOS, and some who run from source), had some bad crashing issues last week. I changed one tiny thing, and it turns out PyQt had a big problem with it--Mr Bones and 'open externally' panels and the ratings hover in the media viewer all displayed wrong and caused program instability. Thanks for your reports, sorry for the trouble, and should be all fixed now! Other than that, I just fixed a bunch of different little things. Some transparent image handling, a weird timezone calculation, database counting issues, and a ton of file move/copy failure handling. Check the changelog for details. next week I'd like to do something new, so maybe 'system:has transparency'.
>>20605 Could we have an option to view dupes side by side for faster filtering?
>>20604 >a couple hundred thousand things as "not related". I assure you it's not that bad, as I tried 14, and that's when it became unmanageable and I had to reset potential dupe relationships. At 12, when I process 200-300 new files being added to my current collection of 17k image files, I get at most about 50 obvious false positives, usually around 20-30, and I can process them really fast. I'd tell you how many not related files I've gone through, but for some reason, Hydrus seems to think I have only one. >>20607 It would be best to have a hotkey for this, as when two files are similar, being able to swap between them while maintaining location and zoom lets you compare quality or look for minor differences in alternates more efficiently. Side-by-side viewing would definitely help with every false positive, and there's far more of those in my case.
>>20608 Or just an option. The ideal situation is that it shows the images side by side, with stats for each image in the middle-ish, then just left/right arrow to pick (up/down for alternate options possibly, with options to add or change hotkeys). Mostly because half of the dupes are just low res or alt versions.
>>20609 >Mostly cause half of the dupes are just low res or alt versions. I don't get a lot of those, but I've also only just barely started using subs.
>>20610 I've had previous art collections I've now imported into the system (about 400k items), then I basically ripped sort:score order:score from gelbooru of almost every series, then some genres. Art collections contained multiple alternate versions of an image, one with cum, one with pubes, one with both, etc. Gelbooru and Danbooru caught a lot of low res dupes of stuff I already had at higher resolutions. I'm sitting at 800k files now, with, as seen in >>20605, 114k potential dupes. Yes, I'm aware it warned me to not do exactly what I did, rip everything from everywhere. Now I have a couple subs set up, but haven't nearly imported my full list of artists to watch. (This doesn't even cover the idol sankaku ripping I've done.) I have a big job ahead of me, and the duplicate filtering GUI seems like a great example of a coder making a GUI, kek. No hate, great work with this software, everyone that has contributed and especially the developer. This is what I've needed for upwards of 7 years. Now I have to move everything into it.
>>20611 Additionally, a pretty good number on untagged, only about 55k! The PTR is a lifesaver.
>>20611 >about 400k items) then I basically ripped sort:score order:score from gelbooru of almost every series, then some genres. Well there's your problem right there. Everything I have on my drive is handpicked and hand tagged except for artist subs, 99% of which I delete immediately. There's too much porn out there for me to hoard it en masse, so I have to be picky. I try to keep quality high and only save things that just give me that special feeling that "this is right, and I don't ever want to lose this". After over a year I'm about 80% done tagging my files that I've accumulated from the last decade, which is 20k out of 25k. Only issue is, with Hydrus making sorting easy, I've begun downloading far more art than I had in the past for the last few months.
>>20613 Well, I wasn't stupid about it, I set limits according to the amount in the genre to grab the cream of the crop. Problem is a bit of a relative word, one of my goals is collecting the best of fanart and porn, general ecchi explicit etc. But I'm only sitting at 3tb, I've been very smart about my collections. Just the initial maintenance and upkeep that has built up for 7 years is a bit of a large step. It's been about 3 weeks of consistent work just getting the initial built, then need to dedupe and hatate or iqdb tag. I care about my files, and my archive. (Size matters)
>>20574 >>20585 Ran the DB>regen last week when you posted this, no warnings/errors on upgrade this week.
(6.61 KB 245x122 Clipboard01.jpg)

(5.16 KB 232x73 Clipboard02.jpg)

>>20605 Probably safe to say we're both insane.
(8.28 KB 256x63 10-01:02:01.png)

(8.90 KB 273x68 10-01:02:53.png)

>>20616 >>20605 I envy what little work you have to do
What are your future solutions to the PTR local problem? Are there ways we can anonymize the requests enough so that you're requesting a bulk check of one area of hashes rather than one hash specifically? So no one can tell what you're looking for, but the requests are small enough as to not stress the server? Just curious about the thread's and the developer's general thoughts. I only understand the hash system and PTR on a basic level.
(35.41 KB 471x540 10087329._SY540_.jpg)

>>20617 >1.8 million potential pairs Off the charts autism.
(25.83 KB 2146x1259 10-08:26:22.png)

>>20619 Trust me, I don't have the autism for dealing with it, I run through like 100 exact dupe pairs before getting tired. I prefer to just accoomulate.
(75.19 KB 1506x892 Screenshot.PNG)

How come the file history chart is showing all the way back to when I first started using hydrus when I search by import time? What date is the file history chart actually using to draw the lines? I assumed it was import time, but I guess not.
>>20618 Not an expert, so anyone please correct me if I'm wrong, but my understanding is that there's only 2 reasons the PTR connects to the internet. 1) Downloading/updating itself. 2) When you manually decide to submit new tags. The reason the PTR is so huge is you just download the entirety of the PTR locally. It does NOT ask the internet for tags for any particular file. You just get the whole damn thing. There's no PTR lookups that occur outside of your system, it's all processed locally. No one can tell what you search your local hydrus files for unless that person is you, looking over your shoulder, or your system is compromised. Submissions to the PTR are anonymized as far as I'm aware, but there can be no guarantees if you're submitting tags to the public. The server could get hijacked by the FBI or some leet hackers or some shit. Don't submit to the PTR any tags for illegal shit, personally identifiable things, or anything you wouldn't want the entire internet to know. Keep those to your own private local tag services.
AI tagging has come a long way in a short time. It's not perfect by any means yet, but at least for myself, it's good enough most of the time. It's much faster to sanity check AI tags than to manually tag myself. I expect (and hope) that one day the PTR will be entirely replaced by AI tagging built into Hydrus directly. That would be amazing.
>>20617 You have my sympathy! But you'll get there eventually! Probably! Maybe? HydrusDev has mentioned before improving the dupe filter to make differences between images more apparent. I hope that will come sooner rather than later cause it would be a huge boon to us poor souls with way too many dupes.
1. Download an image that has an origin tag that is not "artist". https://www.derpibooru.org/images/2648692 has "artist:catachromatic" and "generator:thisponydoesnotexist" tags. 2. The "artist:" tag is changed to "creator", which is intended. The "generator:" tag is left out, because the only line that collects data-tag-category=origin tags only matches "artist:" tags. It seems that this regexp excludes "artist:" tags and includes "artistother:" tags: ^(?!artist:)[a-z]*:
How do I make it default to "all my files"? I thought no new duplicates were appearing after I created local file domains.
Btrfs and XFS support fast copying of a file that uses no additional space until one of the copies is modified. "cp --reflink=auto" tries to do that. This way you can add or export files to a Hydrus db on the same filesystem and not have to delete or deduplicate them later.
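A minimal sketch of what that could look like from Python, assuming GNU coreutils `cp` is on the PATH. The function names are made up for illustration and none of this is hydrus code. With `--reflink=auto`, cp attempts a copy-on-write clone and falls back to a normal copy where the filesystem cannot do it.

```python
import subprocess


def reflink_copy_command(src, dst):
    """Build the GNU cp command line for a copy-on-write clone where supported.

    Hypothetical helper for illustration; not part of hydrus.
    """
    # "--" stops option parsing, so filenames starting with "-" are safe.
    return ["cp", "--reflink=auto", "--", str(src), str(dst)]


def reflink_copy(src, dst):
    """Run the copy; raises CalledProcessError if cp exits non-zero."""
    subprocess.run(reflink_copy_command(src, dst), check=True)
```

Calling `reflink_copy("in.png", "/path/on/same/filesystem/in.png")` would then share data blocks on Btrfs or XFS until one copy is modified.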
In manage tags, tabs with tags should stand out. Maybe with a different color. The selected tab's title could be bold instead.
Mention in the documentation what F3 does in tag manager. I would like to be able to move files from some services only to some other services, so the menu is shorter. I don't need to move images between a SFW pony service and an NSFW non-pony service.
How do I make known urls more prominent? I want to know where I got a file from. Is there a way to make a downloader always put tags into the tag service for the correspondent site, even if it downloaded in a "url import" tab? Maybe the tag list on the left could show the tag services where the tag is?
Does "side-by-side" only affect the "my tags" tab? I have favorites for the PTR, too. I'd like to be able to see "related" tags "for all my files" in tabs other than PTR. And I'd like to have two accounts. Is there a way?
Try using Zstandard instead of gzip for releases. It seems to take several times less usr time to unpack.
How do I search for "filename:_*", names starting with an underscore?
>>20623 Currently yes, but that can't be a permanent solution, which is exactly why I made my post. The PTR still misses so many things and is already pushing 60 gb. One of the stated concerns is being anonymous, but if you had them split into subsections you could request full sets from, no one could know which file you were asking for. I trust hatate far more over AI tagging right now.
>>20633 >can't be a permanent solution. ~60GB is trivial as far as storage goes. Until it starts taking up multiple TBs of space, there's no reason it can't be a permanent solution. >PTR misses so many things If the PTR is missing something you want, tag it yourself. It's a collaborative effort. Start collaborating. >One of the stated concerns is being anonymous, but if you had them split into subsections you could request full sets from, no one could know which file you were asking for. ..... wat? Splitting the PTR into sections would create more work for HydrusDev and the janitors for no benefit. If you're so concerned about being anonymous then it sounds like the PTR is not for you. >trust hatate over AI While I can appreciate the reasons for being paranoid about the PTR, why would you trust hatate over a local AI? Sending god knows how many images/hashes to websites that could very easily be monitoring exactly what you are searching? You're okay with that, but locally processing those same images/hashes offline is a problem? Sounds like your paranoia is maybe misplaced.
>>can't be a permanent solution. > ~60GB is trivial as far as storage goes. It's on SSD and cannot be used on HDD. >>PTR misses so many things >If the PTR is missing something you want, tag it yourself. He means it's growing. >>One of the stated concerns is being anonymous, but if you had them split into subsections you could request full sets from, no one could know which file you were asking for. >..... wat? Splitting the PTR into sections would create more work for HydrusDev and the janitors for no benefit. > the janitors I think he is talking about making the database consist of parts based on, say, the beginning of the file's hash, so Hydrus can request tags for all the files sharing the same prefix as the file to be tagged. The shorter the prefix, the bigger the section, the more anonymous the request, the less likely it is to save space or bandwidth. More work for HydrusDev, but not for the janitors.
>>20634 >..... wat? Splitting the PTR into sections would create more work for HydrusDev and the janitors for no benefit. If you're so concerned about being anonymous then it sounds like the PTR is not for you. Being anonymous is literally mentioned in the stated goals of PTR. >If the PTR is missing something you want, tag it yourself. It's a collaborative effort. Start collaborating. You can't just jump in and do so, from what I understand every user is read only until allowed more. >While I can appreciate the reasons for being paranoid about the PTR, why would you trust hatate over a local AI? Sending god knows how many images/hashes to websites that could very easily be monitoring exactly what you are searching? >You're okay with that, but locally processing those same images/hashes offline is a problem? Sounds like your paranoia is maybe misplaced. what is a vpn >~60GB is trivial as far as storage goes. Until it starts taking up multiple TBs of space, there's no reason it can't be a permanent solution. 60gb, when you're barely covering even a fraction of the internet, is a lot. Please shut the fuck up if you don't bother even reading the wiki.
I'm confused. Why did one of these die and the other didn't? Why is it proclaiming it hasn't found a file in 180 days, when it both found a file yesterday, and was created yesterday?
>>20637 Mm, have you checked the tag on the site? Is it identical or does he have an alternate name or tag on that site?
>>20638 Identical. Also, it wouldn't have got the first file if it was wrong in the first place. >>20637 I think I figured it out. It actually checks the date a file was uploaded to the booru. If the latest file it gets is over 6 months old, and it doesn't find anything new on the next check, it dies.
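A rough sketch of the rule as described in this post, not hydrus's actual checker code. The 180-day period comes from the message quoted above; the function and parameter names are invented for illustration.

```python
from datetime import datetime, timedelta

# The "180 days" from the status message; treated here as a fixed constant.
DEATH_PERIOD = timedelta(days=180)


def query_is_dead(newest_post_time, found_new_files, now):
    """Return True if a query would be marked dead under the described rule.

    newest_post_time: upload date of the newest file, as reported by the site.
    found_new_files: whether the latest check found anything new.
    """
    if found_new_files:
        return False
    return now - newest_post_time > DEATH_PERIOD
```

Under this reading, a subscription created yesterday can still die on its second check if the newest post on the site is older than the death period.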
>>20636 >Every user is read only until allowed more Nope, you can pend stuff immediately after you make an access key. >60gb when you're barely covering a fraction of the internet is a lot. I don't think the PTR's goal is to cover the entire internet. In addition, the reason it's so large, as I understand it, is that it keeps "history". It knows every action that's ever been done on it, every pend and every petition. If it didn't have that it would be much smaller, it's just text connected to an image hash after all. >what is a vpn Not bulletproof, that requires putting trust in the vpn provider.
>>20640 >I don't think the PTR's goal is to cover the entire internet. In addition, the reason it's so large, as I understand it, is that it keeps "history". It knows every action that's ever been done on it, every pend and every petition. If it didn't have that it would be much smaller, it's just text connected to an image hash after all. way to run away with my point, that's why I'm bringing it up, to have a discussion about the future of the PTR, because what it does currently won't be enough well into the future. Could we integrate danbooru md5 hash api into building more of the PTR? Auto apply tags to certain hashes after their tags have settled? Or have a system that slowly crossreferences? >Not bulletproof, that requires putting trust in the vpn provider. Plenty of good quality ones that won't fuck you for a decent price, unless you're a kiddie diddler or something, they don't give a fuck.
>>20641 >Plenty of good quality ones that won't fuck you for a decent price, I can't believe there are people who pay for a VPN. >unless you're a kiddie diddler or something, they don't give a fuck. You have absolutely no proof of that, which is why he said >that requires putting trust in the vpn provider.
>>20642 You do realize there are VPNs that get a court audit to prove they do no logging, right? If you're that autistic that you think you're still getting fucked, there's no saving you. >oh no the paid service knows i jerk off to hentai and download pirated media im headed to jail tomorrow
Is the Baraag downloader on the git deprecated? It's 2 years old and doesn't seem to be working for me.
>>20624 >The "generator:" tag is left out, because the only line that collects data-tag-category=origin tags only matches "artist:" tags. sounds like you know the problem well enough to fix it yourself. >>20628 >Mention in the documentation what F3 does in tag manager. what do you mean, the fact that F3 defaults to opening the tag manager? it's right here in the docs. https://hydrusnetwork.github.io/hydrus/getting_started_tags.html >I would like to be able to move files from some services only to some other services, so the menu is shorter. I don't need to move images between a SFW pony service and an NSFW non-pony service. if you're often moving files between services, you can set up shortcuts. under file > shortcuts, then "media actions, either thumbnails or the viewer", click add and choose the command to be a "local file command". >>20629 >How do I make known urls more prominent? I want to know where I got a file from. they're always shown in the upper right of the media viewer though... you can control what urls are displayed there under network > downloaders > manage downloader and url display.
>>20624 >The "generator:" tag is left out, because the only line that collects data-tag-category=origin tags only matches "artist:" tags. sounds like you know the problem well enough to fix it yourself. I fixed it for myself, but the defaults still have it. >>20628 >Mention in the documentation what F3 does in tag manager. what do you mean, the fact that F3 defaults to opening the tag manager? The fact that it closes it and saves.
>>20641 >Could we integrate danbooru md5 hash api into building more of the PTR? Auto apply tags to certain hashes after their tags have settled? Or have a system that slowly crossreferences? What does any of this mean?
>>20646 >I fixed it for myself, but the defaults still have it. so share the fix then!
>>20643 >You do realize there are VPNs that get a court audit to prove they do no logging, right? Prove it. >oh no the paid service knows i jerk off to hentai I don't need a VPN to jerk off to hentai. >and download pirated media I don't care if a VPN provider knows I download pirated media. >im headed to jail tomorrow Nope.
>>20649 https://securitytech.org/vpn/best/no-logs-vpn/ Simple google search shows this, though I imagine you'll say it's all lying, so I shouldn't bother taking the bait, nonetheless, here's your (you)
Help, I swear using hydrus companion to send files to hydrus (from 4chan) used to send the filename as a tag automatically but it doesn't do it anymore and I'm too smoothbrained to figure out why
>>20636 >60gb when you're barely covering even a fraction of the internet, is a lot. >>20640 >I don't think the PTR's goal is to cover the entire internet. In addition, the reason it's so large, as I understand it, is that it keeps "history". It knows every action that's ever been done on it, every pend and every petition. If it didn't have that it would be much smaller, it's just text connected to an image hash after all. The internet is too big for anything to index the entirety of it, even Google. And if what you say is true it sounds like there's lots of room for storage optimization for the PTR. Improvements would always be welcome! But as it is, ~60GB isn't that much these days. That's like one AAA video game. It's not even 1/10 of my hydrus's database size. Storage capacity is always growing and getting cheaper. A 1TB SSD is like $50, which is more than enough storage to house the PTR for ages, even at its current growth rate. It's also possible to store the database on an SSD and the files on a higher capacity HDD. It's pretty easy to do, and gives more room for the PTR on an SSD. But it does add a little bit of complexity to making backups (but only a little). It just doesn't seem like that big of an issue to me. I can see how the PTR might be annoyingly large if you have a small collection and the PTR is like 10x bigger than your collection. But the time saved in tagging would probably make up for it imo. >Could we integrate danbooru md5 hash api into building more of the PTR? Auto apply tags to certain hashes after their tags have settled? Or have a system that slowly crossreferences? Don't know about danbooru, but you can search gelbooru by md5 hash. Gelbooru and danbooru typically have a pretty good overlap of content. Just make a list of search urls with the md5 as a search term.
Like this: https://gelbooru.com/index.php?page=post&s=list&tags=md5:9d91e7a6da57fe1a09b4f219c9f2a8e6 Can use hydrus to grab a chunk of md5s, make a big list of urls like the above, and toss them in the hydrus url downloader. It's no API integration and a bit unconventional, but it would get the job done. Might be possible in danbooru as well, but I'm too lazy to check right now. >>20644 Yup, Baraag downloader is broken for me too. They must have changed something and broke it.
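For the "big list of urls" step, a small sketch that turns md5s into gelbooru search URLs of the shape shown above. The helper name is made up, and how you get the md5s out of hydrus is left open here.

```python
import re
from urllib.parse import urlencode

GELBOORU_SEARCH = "https://gelbooru.com/index.php"


def md5_search_url(md5_hex):
    """Build a gelbooru search URL for one md5, matching the example above."""
    md5_hex = md5_hex.strip().lower()
    if not re.fullmatch(r"[0-9a-f]{32}", md5_hex):
        raise ValueError(f"not an md5 hex digest: {md5_hex!r}")
    # safe=":" keeps the colon in "md5:..." unescaped, as in the example URL.
    query = urlencode({"page": "post", "s": "list", "tags": f"md5:{md5_hex}"}, safe=":")
    return f"{GELBOORU_SEARCH}?{query}"


def md5_search_urls(md5s):
    """One URL per md5, ready to paste into a url import page."""
    return [md5_search_url(m) for m in md5s]
```

Printing `"\n".join(md5_search_urls(my_md5s))` gives a newline-separated block that can be pasted into the url downloader.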
>>20652 Ah was under the impression it was danbooru's API that allowed that. This would be a really neat way to start indexing and adding to the PTR more organically. Or pull tag data of new files, though I don't know how we can avoid mistagging or how soon jannies go in to clean up the tags on average. It would at least more realistically cover the hentai side of things. 60 gb in a relative sense is a lot, considering how little the PTR currently covers, if I'd guess we're lucky if we've indexed 1-2% of the most popular images and such. Especially considering how small the user base of hydrus is. All of that is sort of an irrelevant tangent. I was really just curious to see what improvements could be brainstormed, I like what >>20635 added to my idea. >I think he is talking about making the database consist of parts based on, say, the beginning of the file's hash, so Hydrus can request tags for all the files sharing the same prefix as the file to be tagged. The shorter the prefix, the bigger the section, the more anonymous the request, the less likely it is to save space or bandwidth. More work for HydrusDev, but not for the janitors. Is this a realistic solution for the future or is this too far fetched? I'm too front end oriented to know much..
>>20648 >>>20646 >>I fixed it for myself, but the defaults still have it. >so share the fix then! Import the following from clipboard into content parsers on the content parser tab of "derpibooru.org file page parser" and other philomena file page parsers? [30, 7, ["tags origin (everything else)", 0, [27, 7, [[26, 3, [[2, [62, 3, [0, "span", {"data-tag-category": "origin"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 0, "data-tag-name", [84, 1, [26, 3, [[2, [51, 1, [2, "^(?!artist:)[a-z]*:", null, null, "artistjg:siber"]]], [2, [55, 1, [[], ""]]]]]]]], null]]
>>20654 Well, no, sorry, that regexp is wrong, it does not support digits and other non-alphabetic characters. Maybe I should have used just ".*" instead of "[a-z]*".
>>20655 This should match everything not matched by the parser that matches "artist:". [30, 7, ["tags origin (everything else)", 0, [27, 7, [[26, 3, [[2, [62, 3, [0, "span", {"data-tag-category": "origin"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 0, "data-tag-name", [84, 1, [26, 3, [[2, [51, 1, [2, "^(?!artist:)", null, null, "ab?cd8*-56+=6g:-abc*4:44"]]]]]]]], null]]
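A quick way to see the difference between the two regexps, using Python's `re` as a stand-in. This only checks the patterns themselves; how hydrus applies them inside a string match step is not shown here.

```python
import re

# First attempt: negative lookahead plus lowercase letters up to a colon.
OLD = re.compile(r"^(?!artist:)[a-z]*:")
# Second attempt: only the negative lookahead, so any other namespace passes.
NEW = re.compile(r"^(?!artist:)")


def matches(pattern, tag):
    """True if the pattern matches at the start of the tag."""
    return pattern.search(tag) is not None


for tag in [
    "artist:catachromatic",
    "generator:thisponydoesnotexist",
    "artistjg:siber",
    "ab?cd8*-56+=6g:-abc*4:44",
]:
    print(tag, matches(OLD, tag), matches(NEW, tag))
```

Both patterns reject "artist:" tags and accept "generator:" and "artistjg:"; only the second accepts a namespace containing digits or punctuation, such as the last example string.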
Hey I messed something up in the directory copy/merge code last week. If you get an error trying to migrate or make a backup, don't panic, it is just a dumb typo that blocks the action and is fixed for next week. If you run from source, it is fixed on master already so just git pull and you'll be sorted. >>20590 >>20591 Thanks, also hits the rating controls in the top-right hover of the media viewer. Sorry again, should be fixed now in v551. >>20592 Thanks, I saw this on Tuesday night. Should be sorted in v551, let me know if it doesn't work! >>20593 I don't like to remove things for existing users too often, just in case they have modified the script or still have some sideway use for it that I didn't think of. You can delete it yourself under network->logins->manage login scripts. >>20597 >>20602 No worries. Yeah, you have some largish files, which is probably adding some HDD latency--however, note that in that vacuum thing, most of your files are empty data! (this would have been PTR stuff that was later deleted) Let's run the vacuum on them all (from that panel), and see what happens. Looks like your external mappings is going to go from 5.4GB to 20MB, which should fit into your OS disk cache real neat. >>20598 I think the future versions of the Client API might facilitate this sort of thing. I want many more ways to suggest tags from outside sources (AI is a particularly exciting avenue for this).
Edited last time by hydrus_dev on 11/11/2023 (Sat) 22:15:00.
>>20601 >>20603 >>20604 >>20605 My two cents here is that the biggest bottleneck in the duplicate system is human processing time. Searching takes a while, but it can find pairs far faster than you can process them, so I recommend searching 0 first. These are easy dupes to figure out. If and when you actually get to the end of your 0 queue, then you can think about making the job more difficult. >>20607 I did have a system that did 'rating derivation' that used side by side UI before. I could never make it work well, and I retired the whole system. I like the overlapping back and forth of the current dupe filter since it highlights little changes, but I think you are right and we should have a side by side mode too, particularly when we integrate videos into the filter. It'll take a bit of work unfortunately--there's a lot of hacky hardcoded stuff in the current media viewer. >>20615 Great, thanks for letting me know. >>20618 >>20633 Yep >>20623 has it. We download everything right now, and per-hash or 'block of hashes' fetching is not feasible technically or for privacy. There is more on the general operation here: https://hydrusnetwork.github.io/hydrus/privacy.html#downloading That whole document, which goes into too much detail tbh, was my attempt to completely specify every scenario I could think of regarding the PTR and anon/privacy, so check it all out if you are interested. I agree that we have a wall of ice headed towards us. The PTR is too huge, and still it only grows. We need some stopgaps and exit ramps. 
I won't go into exhaustive detail, but in general, my current plans are a mix of:
- Add recycle tech to the PTR to wash out old content
- Add better filtering tech to get rid of spam and mistakes more efficiently
- Add better management and filtering tech so we can have multiple PTRs that cover different and generally non-overlapping content domains
- Brace for general purpose AI tagging to wash its blessed waters over us (and indeed use our existing PTR tag data to help train it, which is already happening) and mean we no longer have to share most nouns like 'blonde hair' and 'skirt' on the PTR
>>20621 Hmm, I'm not sure. I bet it has an erroneous record there, maybe something that was imported in one domain at one timestamp but is in 'all my files' at another. That is unless I just set it to a blanket value based on your earliest ever import or something, I forget what I do. Thanks for the report--I'll give a look at the code, see what could be going on.
Here are two url classes to redirect direct derpibooru links to file pages. Direct download links seemingly cannot be redirected like that because of recursion. [26, 3, [[2, [50, "derpibooru image direct (no \"view\")", 12, ["857af513fdbcc93abffdc0bcf6d2d77c53846a8af983d99d2de66da35e1dc2c9", 0, "https", "derpicdn.net", [false, false, true, false, false, false, true, false], [[[51, 1, [0, "img", null, null, "img"]], "images"], [[51, 1, [1, 2, 4, 4, "2022"]], null], [[51, 1, [1, 2, 1, 2, "01"]], null], [[51, 1, [1, 2, 1, 2, "12"]], null], [[51, 1, [3, "", 4, null, "example string"]], null]], [], false, [51, 1, [3, "", null, null, "example string"]], [], [55, 1, [[[9, ["https://derpicdn.net/img/20[0-9][0-9]/\\d\\d?/\\d\\d?/(\\d+).*", "https://derpibooru.org/images/\\1"]]], "https://derpicdn.net/img/2023/11/11/3238600"]], 0, [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], null, null, 1, "https://derpicdn.net/img/2023/11/11/3238600/thumb.png"]]], [2, [50, "derpibooru image direct (with \"view\")", 12, ["fb0b69588b3604c003dbeaf4161958b9729d7625151df92c3903df4a41a25cbd", 0, "https", "derpicdn.net", [false, false, true, false, false, false, true, false], [[[51, 1, [0, "img", null, null, "img"]], "images"], [[51, 1, [0, "view", null, null, "view"]], null], [[51, 1, [1, 2, 4, 4, "2022"]], null], [[51, 1, [1, 2, 1, 2, "01"]], null], [[51, 1, [1, 2, 1, 2, "12"]], null], [[51, 1, [3, "", 4, null, "example string"]], null]], [], false, [51, 1, [3, "", null, null, "example string"]], [], [55, 1, [[[9, ["https://derpicdn.net/img/view/20[0-9][0-9]/\\d\\d?/\\d\\d?/(\\d+)[\\._].*", "https://derpibooru.org/images/\\1"]]], "https://derpicdn.net/img/view/2023/11/11/3238596safe_artist-colon-22tjones_edit_philomena_princess+celestia_storm+king_twilight+sparkle_alicorn_my+little+pony-colon-+the+movie_accessory_canterlot+cas.jpg"]], 0, [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], null, null, 1, 
"https://derpicdn.net/img/view/2023/11/11/3238596safe_artist-colon-22tjones_edit_philomena_princess+celestia_storm+king_twilight+sparkle_alicorn_my+little+pony-colon-+the+movie_accessory_canterlot+cas.jpg"]]]]]
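To check what the two url classes rewrite, here are the same regex substitutions pulled out of the JSON and run through Python's `re.sub`. This only exercises the patterns, not hydrus's url class machinery. The second test URL is a shortened, made-up "view" link for illustration.

```python
import re

# (pattern, replacement) pairs copied from the two url classes above,
# with the JSON escaping ("\\d") undone.
DIRECT = (
    r"https://derpicdn.net/img/20[0-9][0-9]/\d\d?/\d\d?/(\d+).*",
    r"https://derpibooru.org/images/\1",
)
DIRECT_VIEW = (
    r"https://derpicdn.net/img/view/20[0-9][0-9]/\d\d?/\d\d?/(\d+)[\._].*",
    r"https://derpibooru.org/images/\1",
)


def to_file_page(url):
    """Apply both substitutions; URLs that match neither come back unchanged."""
    for pattern, replacement in (DIRECT, DIRECT_VIEW):
        url = re.sub(pattern, replacement, url)
    return url


print(to_file_page("https://derpicdn.net/img/2023/11/11/3238600/thumb.png"))
# Hypothetical shortened "view" URL, not taken from the post above.
print(to_file_page("https://derpicdn.net/img/view/2023/11/11/3238596.png"))
```

Both calls end up at a https://derpibooru.org/images/<id> file page URL, which an existing file page parser can then handle.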
>>20623 Also, the silver bullet with dupe filtering will be, I hope, automatic duplicate resolution. I've said this before in earlier threads if you want to search for better explanations, but I'm going to figure out easy pixel perfect resolution tech (best example is if you have a png and a jpeg pixel perfect dupe, then always keep the jpeg and delete the png, trust me it makes sense), and then expand the system to cover more and more easy cases. Human time should only be spent on the most difficult cases here. All auto-resolution tech will be optional and highly configurable. People have very different ideas of how they want dupes handled. >>20624 Thanks, I will figure something out for the defaults. This is AI stuff, right? I will slam that to 'creator:' on our side, and note that when I update this, it will overwrite whatever is called 'derpibooru.org file page parser' in your client, so make a dupe if you have other edits you want to use. >>20625 I think the duplicates page defaults to the domain under options->search, the 'default/fallback local file search location' value. You may or may not want to actually change this, but 'all my files' is probably safe tbh, and appropriate for a client with multiple local file domains. >>20626 I am sorry to say that is a little low level and specific for me to support. I'm not a Linux guy normally, and I can't guarantee good maintenance in future, either. I wonder if there is a library like speedcopy that does similar here? Then, if you are running from source, you could just get it in your venv and it'd patch automatically (maybe with one or two lines from me, like with speedcopy)? https://pypi.org/project/speedcopy/ >>20627 Great idea! >>20628 Thanks, I'll add that F3 closes and saves. Having preferences for which services appear from which services is probably too specific for me to spend time on, but I am always battling to keep menus short, so thanks for reminding me.
>>20629 >Is there a way to make a downloader always put tags into the tag service for the correspondent site, even if it downloaded in a "url import" tab? Yes, hit up network->downloaders->manage default import options. You can set up different tag import options for each 'url class' hydrus understands, which will override the defaults you have set for 'file posts' or 'watchable urls'. >Maybe the tag list on the left could show the tag services where the tag is? I'm thinking I may add separate colour settings for different tag services in future. I'd like my 'my tags' to stand out more compared to my downloader/PTR tags.
>>20631 Thanks. I'm no Linux expert, but here's the magic line we use, I think: https://github.com/hydrusnetwork/hydrus/blob/master/.github/workflows/release.yml#L129 What would you recommend instead? Is it going to be available on a normal Ubuntu out of the box, as in github cloud? >>20630 The 'side-by-side' should apply to all the tag services in manage tags, as long as there is stuff to show in them. There isn't good tech for cross-pollinating 'related' tags across services yet. You can set up two accounts for the PTR, but I strongly recommend against it. You will not gain any abilities that a normal user would want, and it will double your database size to like 180GB. >>20633 >>20634 >>20635 >>20636 Unfortunately, btw, this idea of only fetching the subsets of hashes that begin with a certain prefix or similar, while it sounds good (I've had the same thought), it isn't technically feasible. This is because your hashes are effectively random too, so if you have 300,000 files to ask the PTR about, you are going to ask for enough segments that cover 300,000 random sample spaces in the entire hash space. It is very likely that we cover the whole hash space uniformly, so to save time we'd need to have more segments than a typical user has files. How many segments should the PTR split itself into in order to make that efficient? Remember the PTR has something on the order of tens of millions of individual files, so either we have 600k segments and end up requesting half of the total data or we go for 20 million segments and are effectively just asking for hashes anyway. Also, you are now asking the PTR for 300,000 separate network/db hits, and so is every other user, which would kill the server. And then, since we are probably no longer an archiving system but an on-demand one, we'll probably need to keep asking 300k times again every n time units, and for any new file we get, which compounds the spamming problem. The juice isn't worth the squeeze.
Unfortunately, the cheapest and most-cheapening-in-the-future PC component right now is hard drive space. This doesn't mean the PTR isn't headed to 'man it is just way too big m8' territory, but it is the best current technical solution to our problem. >>20637 >>20639 Yeah, if the parser is able to pull actual post dates, it uses them instead. So if you hit up a query with only 6-year-old posts, it will recognise that nothing has been posted recently and stop there.
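The segment numbers above can be sanity-checked with a back-of-envelope model. This is a sketch that assumes hashes are uniformly random and segments are equal-sized hash-prefix buckets. The function names are made up, and it says nothing about how the PTR actually stores data.

```python
def expected_fraction_of_segments(num_files, num_segments):
    """Expected share of segments a client must fetch to cover all its files.

    Each file lands in a uniformly random segment, so a given segment is
    missed by every file with probability (1 - 1/num_segments) ** num_files.
    """
    return 1.0 - (1.0 - 1.0 / num_segments) ** num_files


def expected_requests(num_files, num_segments):
    """Expected number of distinct segment requests for one client."""
    return num_segments * expected_fraction_of_segments(num_files, num_segments)


for segments in (600_000, 20_000_000):
    frac = expected_fraction_of_segments(300_000, segments)
    reqs = expected_requests(300_000, segments)
    print(f"{segments} segments: {frac:.1%} of the data, about {reqs:.0f} requests")
```

Under these assumptions, 600k segments means a 300,000-file client pulls a large share of the whole PTR anyway (about 40%, the same ballpark as "half"), and 20 million segments means nearly one request per file, which is the "just asking for hashes" case.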
>>20660 >>20624 > Thanks, I will figure something out for the defaults. > This is AI stuff, right? I will slam that to 'creator:' on our side, With AI stuff, it can be "generator" - "stable diffusion" is a neural network, but there are also "bing" and "bing image creator", which may just be user interfaces. "prompter" is the person who used the interface. If the generated image was edited and the artist has an artist tag, will it have four creators? >>20625 > I think the duplicates page defaults to the domain under options->search, the 'default/fallback local file search location' value. Thanks, it does. >>20626 > I am sorry to say that is a little low level and specific for me to support. I'm not a Linux guy normally, and I can't guarantee good maintenance in future, either. I wonder if there is a library like speedcopy that does similar here? All I know is cp --reflink. That's not low-level. They've been discussing its support (or, at first, non-support) for years https://github.com/python/cpython/issues/81338 >Having preferences for which services appear from which services is probably too specific for me to spend time on, but I am always battling to keep menus short, so thanks for reminding me. How do I increase the font size? I always get lost in the parser editor because I don't read the text. >>20645 >>>20629 >>How do I make known urls more prominent? I want to know where I got a file from. >they're always shown in the upper right of the media viewer though... you can control what urls are displayed there under network > downloaders > manage downloader and url display. Thanks. The default color there is too dark for black background and small font. >>20629 >Is there a way to make a downloader always put tags into the tag service for the correspondent site, even if it downloaded in a "url import" tab? Yes, hit up network->downloaders->manage default import options. Oh.
I had too many selected for some importers and too few for others, so I customized the settings in my url import tabs and missed the good settings.
>>20661 > >>20631 > Thanks. I'm no Linux expert, but here's the magic line we use, I think: https://github.com/hydrusnetwork/hydrus/blob/master/.github/workflows/release.yml#L129 > What would you recommend instead? Is it going to be available on a normal Ubuntu out of the box, as in github cloud? The compressor command is called "zstd". I think the command is "tar --zstd -cvf". I think it compressed to about the same size by default. It should be available since 2018. >>>20630 >The 'side-by-side' should apply to all the tag services in manage tags, as long as there is stuff to show in them. "favourites" is a column only in my renamed "my tags" tab. It is also duplicated as a tab under the tags column. >There isn't good tech for cross-pollinating 'related' tags across services yet. What if I replace my local tag services with hydrus tag services? That seems to work. >You can set up two accounts for the PTR, but I strongly recommend against it. You will not gain any abilities that a normal user would want, Tagging some public files is not very anonymous. They could be watched by just tens of people. >and it will double your database size to like 180GB. I tried just changing the key in my normal PTR, but got errors, I think.
When I export a certain parser, it contains an example string it extracted for a tag or a note. I changed the example URL and reparsed the example data, but that string is still being exported.
>>20662 >How do I increase the font size? I always get lost in the parser editor because I don't read the text. >The default color there is too dark for black background and small font. you can change some program colours under file > options, in the "colours" section. for some other colors and for font size, you have to modify it using .qss stylesheets. you select them in the "style" section of options. the stylesheets are found in the /static/qss folder. https://github.com/hydrusnetwork/hydrus/tree/master/static/qss check out "Fixed Font Size Example.qss" for an example of a stylesheet to change font size. make a copy of it and change the font size to whatever you'd like. if you add a stylesheet, you have to close and reopen options to have it appear in the dropdown. >>20664 do you mean when exporting to clipboard? just directly edit the string to something else before posting. probably won't break the parser. you could also try going in to the offending content parser, going to the string processing steps, and typing something else into the "single example string" box. i think that might work.
Just learned that booru parsers like gelbooru don't grab the ? tag... fucked up if you ask me
Things I can't understand except as technical quirks. To move an archived file to a more permanent file service, I have to inbox it, kind of putting it at risk. It is easier to delete a file without a reason than to trash it without a reason.
I have thousands of files from pastebin websites. Most of the filenames can be easily converted to URLs. What is the best way to add URLs to them? Some are mirrors of other sites, where the original URL can be derived from the mirror URL.
(333.51 KB 609x457 1459370706589.png)

Found an apng file on Gelbooru that doesn't load. Hydrus told me to report it. https://gelbooru.com/index.php?id=9219021&page=post&s=view
Just out of curiosity after seeing >>20669's post, I decided to download all files tagged APNG from Gelbooru, quite a few within the last year were broken in Hydrus, they were also broken in mpv, so I think it's a file issue unfortunately. Since they're all (currently) from the last year maybe some program got a bad update that only causes issues with programs that are sticklers for the APNG spec? In addition I found this one with broken timing, also broken in mpv, but seems to work in the file upload box here? I'm running Gentoo Linux kernel 6.5.7-gentoo with mpv 0.36.0-r1, ffmpeg 6.0-r10, and hydrus 551, if it matters. Here's a pastebin of all broken files I currently have, these APNGs are large and download slowly and I've got things to do so I'll upload another paste if more show up. https://pastebin.com/0NkmjJYN As I was writing this 4 more showed up with busted timing: https://gelbooru.com/index.php?id=2996639&page=post&s=view https://gelbooru.com/index.php?id=2999274&page=post&s=view https://gelbooru.com/index.php?id=3055038&page=post&s=view https://gelbooru.com/index.php?id=2365850&page=post&s=view
>>20658 Speaking on better filtering and management of certain tags, I don't see why certain tags can't be applied automatically, IE medium:video medium:has audio medium:webm etc. Maybe add a button when tagging that auto adds meta tags that hydrus already knows?
(712.94 KB 514x512 timing.webm)

>>20670 Okay, that didn't take nearly as long as I expected, so here's the full list: https://pastebin.com/mS7XKhsx More timing issues showed up as the images got older. Probably some bad encoding at the time if I had to guess. While some are kinda funny they're definitely not supposed to be this fast.
>>20671 >IE medium:video medium:has audio medium:webm etc. Maybe add a button when tagging that auto adds meta tags that hydrus already knows? I'm very confused here. Can you not already do "has duration", "has audio", and pic related?
>>20671 Seconding this, it would be pretty nice to be able to add filetype tags. >>20673 I think he means having them added automatically. E.g. import a webm with sound and it gets given medium:webm, medium:animated, medium:has audio, NOT system:filetype is webm, system:has audio, system:animated. A lot of boorus have something similar but the system tags are kind of removed from "real" tags.
>>20674 System tags are immutable though. If you turn them into regular tags, that introduces the possibility of errors in tagging those things since they can be added manually in addition to being parents of system tags.
>>20667 I stumble upon the archived block in the duplicate filter every time.
It would be helpful if "import options" indicated whether or not the options on all of the tabs in there are "(is default)".
It'd be cool if you could add "frames per second" as one of the duplicate comparison factors. I know that Hydrus doesn't currently scan for video duplicates, but the user-made Hydrus video deduplicator has been working pretty well for me so far, so having some extra things to compare by for videos would be helpful, and it'll be useful when Hydrus finally supports video de-duping natively anyway.
One of my files failed to export on Linux due to a long filename. I think it contained Cyrillic characters. Maybe the length was counted in characters, but the limit is 255 bytes.
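If the limit really is being checked in characters rather than bytes, a fix could look something like the sketch below. It is not hydrus's actual export code; the function name and the choice to keep the extension are assumptions for illustration. It trims a filename until its UTF-8 encoding fits in 255 bytes, without cutting a multi-byte character in half.

```python
import os

MAX_FILENAME_BYTES = 255  # per-filename limit on most Linux filesystems


def truncate_filename_to_bytes(filename: str, max_bytes: int = MAX_FILENAME_BYTES) -> str:
    """Shorten filename so its UTF-8 encoding fits in max_bytes, keeping the extension.

    Hypothetical helper, not part of hydrus.
    """
    if len(filename.encode("utf-8")) <= max_bytes:
        return filename

    stem, ext = os.path.splitext(filename)
    budget = max_bytes - len(ext.encode("utf-8"))
    if budget <= 0:
        raise ValueError("extension alone exceeds the byte limit")

    # Cut the encoded stem at the byte budget, then drop any partial
    # multi-byte character left dangling at the end.
    truncated_stem = stem.encode("utf-8")[:budget].decode("utf-8", errors="ignore")
    return truncated_stem + ext
```

A name of 200 Cyrillic letters plus ".png" is only 204 characters but 404 bytes, which is why a character count can pass while the filesystem still refuses the name.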
>>20673 >>20675 Correct, but that data has utility to janitors and the general use of the PTR, I'm assuming that's why they're tracked on the PTR. These tags would only be for more efficient tagging for the repository.
Careful, this needs bandwidth limits for *.wikia.nocookie.net. The default 1 request per second seems too fast. It downloads images only from pages without ":" or "/", so no Gallery pages (those have way too many images, so better don't change that). puts the image caption into a note "mlp.fandom.com caption" and into a "description:" tag. Most of the images are screencaps, so I'm not sure if that should be "title:". puts the page title into a "site:mlp.fandom.com:" tag. [58, "mlp.fandom.com page parser", 2, ["mlp.fandom.com page parser", "cf5c0601f799c48c01088e88098a4a94be3cd53c8a023b62f32e4969c03dfd26", [55, 1, [[], "example string"]], [[[27, 7, [[26, 3, [[2, [62, 3, [0, "figure", {"class": "thumb"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 2, "href", [84, 1, [26, 3, []]]]], [58, "thumbnail to image", 2, ["thumbnail to image", "7dc6909dd3c4733b22cc17b11ffe6ea36a5f709cec1930afa40b0e2fa6c4654f", [55, 1, [[], "example string"]], [], [26, 3, [[2, [30, 7, ["get caption to note", 18, [27, 7, [[26, 3, [[2, [62, 3, [0, "figcaption", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "p", {"class": "caption"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "p", [84, 1, [26, 3, []]]]], "mlp.fandom.com caption"]]], [2, [30, 7, ["get caption to tag", 0, [27, 7, [[26, 3, [[2, [62, 3, [0, "figcaption", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "p", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "p", [84, 1, [26, 3, [[2, [55, 1, [[[2, "description:"]], "The main characters visiting the Crystal Empire for the first time."]]]]]]]], null]]], [2, [30, 7, ["get url", 7, [27, 7, [[26, 3, [[2, [62, 3, [0, "a", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 0, "href", [84, 1, [26, 3, []]]]], [7, 50]]]]]], [], {}]]]], [26, 3, [[2, [30, 7, ["page title to tag", 0, [27, 7, [[26, 3, [[2, 
[62, 3, [0, "h1", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "span", {"class": "mw-page-title-main"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "href", [84, 1, [26, 3, [[2, [55, 1, [[[2, "site:mlp.fandom.com:"]], "\t\t\t\t\tCrystal Empire\t\t\t\t"]]]]]]]], null]]]]], ["https://mlp.fandom.com/wiki/Crystal_Empire"], {"url": "https://mlp.fandom.com/wiki/Crystal_Empire", "post_index": "12"}]]
(4.80 KB 512x117 4chan parser png.png)

>>20681 Just a heads up, you can export downloader objects as pngs which are easier to share. This is the default 4chan parser as an example. You can also export multiple types (parser, gug, url classes, login scripts, and even bandwidth rules) in 1 png from network>downloaders>export downloaders.
>>20682 Hydrus and my browser are run as different users, so I would have to save the image into a directory accessible to both and make sure it is readable to Hydrus, and I have no idea how to see what the png actually contains and where it is from.
>>20683 I see, fair enough.
>>20683 Besides, I am not allowed to post files here with the only access method that works at all. Here are the URL classes: [50, "mlp.fandom.com page", 12, ["8ac1e89b19b909545b00dbc7f7d9b5221fa78ef51cfe1c4dc984fec8e9c96b2e", 0, "https", "mlp.fandom.com", [false, false, true, true, true, true, true, false], [[[51, 1, [0, "wiki", null, null, "wiki"]], null], [[51, 1, [2, "[^:].*", null, null, "Crystal_Empire"]], null]], [], false, [51, 1, [3, "", null, null, "example string"]], [], [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], 0, [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], null, null, 1, "https://mlp.fandom.com/wiki/Crystal_Empire"]] [50, "mlp.fandom.com image url (wikia)", 12, ["287e123cfcc4fb595028de14954ae55743b5507f7273317a945bada1c4ae4878", 2, "https", "static.wikia.nocookie.net", [false, false, true, false, false, false, true, false], [[[51, 1, [3, "", null, null, "mlp"]], null], [[51, 1, [0, "images", null, null, "images"]], null], [[51, 1, [1, 1, null, null, "8b"]], null], [[51, 1, [1, 1, null, null, "88b"]], null], [[51, 1, [3, "", null, null, "Future_Diamond_Tiara_ID_Gameloft.png"]], null]], [], false, [51, 1, [3, "", null, null, "example string"]], [], [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], 0, [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], null, null, 1, "https://static.wikia.nocookie.net/mlp/images/8/88/Future_Diamond_Tiara_ID_Gameloft.png/revision/latest?cb=20200109011459"]]
I'm looking at the skeb.jp artist page parser from the Presets-and-Scripts repo. In short, it grabs an <img with the "is-thumbnail" attribute, then goes back up to grab the work page href, "/@artist/works/###". I'm no parser or html expert, but it looks like the thumbnails don't even show up in Hydrus' parse. I thought it might be due to Javascript, but even with Javascript disabled in uBlock I can still see links to the "/@artist/works/##" pages in the page source. Anyone smarterer than me know what's going on here? The good news is skeb does seem to work with hydownloader, but I'd much prefer to have it in Hydrus if at all possible.
>>20681 Added support for the infobox image. [58, "mlp.fandom.com page parser", 2, ["mlp.fandom.com page parser", "cf5c0601f799c48c01088e88098a4a94be3cd53c8a023b62f32e4969c03dfd26", [55, 1, [[], "example string"]], [[[27, 7, [[26, 3, [[2, [62, 3, [0, "table", {"class": "infobox"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "td", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "a", {"class": "image"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [1, "td", null, null, 1, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 2, "src", [84, 1, [26, 3, []]]]], [58, "infobox to image", 2, ["infobox to image", "64fbf7c4b62b077c397666f015c66d01c7ec2edc812507a3cd66c2db9d219d3e", [55, 1, [[], "example string"]], [], [26, 3, [[2, [30, 7, ["get alt text", 18, [27, 7, [[26, 3, [[2, [62, 3, [0, "a", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "img", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 0, "alt", [84, 1, [26, 3, []]]]], "alt text"]]], [2, [30, 7, ["get caption to note", 18, [27, 7, [[26, 3, []], 1, "p", [84, 1, [26, 3, []]]]], "mlp.fandom.com caption"]]], [2, [30, 7, ["get caption to tag", 0, [27, 7, [[26, 3, []], 1, "p", [84, 1, [26, 3, [[2, [55, 1, [[[2, "description:"]], "The main characters visiting the Crystal Empire for the first time."]]]]]]]], null]]], [2, [30, 7, ["get url", 7, [27, 7, [[26, 3, [[2, [62, 3, [0, "a", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "img", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 0, "src", [84, 1, [26, 3, []]]]], [7, 50]]]]]], [], {"url": "https://mlp.fandom.com/wiki/Lena_Hall", "post_index": "1"}]]], [[27, 7, [[26, 3, [[2, [62, 3, [0, "figure", {"class": "thumb"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 2, "href", [84, 1, [26, 3, 
[]]]]], [58, "thumbnail to image", 2, ["thumbnail to image", "7dc6909dd3c4733b22cc17b11ffe6ea36a5f709cec1930afa40b0e2fa6c4654f", [55, 1, [[], "example string"]], [], [26, 3, [[2, [30, 7, ["get caption to note", 18, [27, 7, [[26, 3, [[2, [62, 3, [0, "figcaption", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "p", {"class": "caption"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "p", [84, 1, [26, 3, []]]]], "mlp.fandom.com caption"]]], [2, [30, 7, ["get caption to tag", 0, [27, 7, [[26, 3, [[2, [62, 3, [0, "figcaption", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "p", {}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "p", [84, 1, [26, 3, [[2, [55, 1, [[[2, "description:"]], "The main characters visiting the Crystal Empire for the first time."]]]]]]]], null]]], [2, [30, 7, ["get url", 7, [27, 7, [[26, 3, [[2, [62, 3, [0, "a", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 0, "href", [84, 1, [26, 3, []]]]], [7, 50]]]]]], [], {}]]]], [26, 3, [[2, [30, 7, ["page title to tag", 0, [27, 7, [[26, 3, [[2, [62, 3, [0, "h1", {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]], [2, [62, 3, [0, "span", {"class": "mw-page-title-main"}, null, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "href", [84, 1, [26, 3, [[2, [55, 1, [[[2, "site:mlp.fandom.com:"]], "\t\t\t\t\tCrystal Empire\t\t\t\t"]]]]]]]], null]]]]], ["https://mlp.fandom.com/wiki/Crystal_Empire"], {"url": "https://mlp.fandom.com/wiki/Lena_Hall", "post_index": "1"}]]
>>20681 >It downloads images only from pages without ":" That will need to be fixed. The problem is that File: pages were downloaded when a limit was missing somewhere in the parser.
I had a great week. I fixed some bugs and added 'system:has transparency' search. The release should be as normal tomorrow.
edit import folder should have an option to skip recently modified files, because they may not be fully downloaded yet.
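A rough sketch of what such an option could do, assuming a simple modified-time cutoff. The function names and the 60 second default are made up for illustration and are not taken from hydrus.

```python
import os
import time
from typing import List, Optional


def is_recently_modified(path: str, min_age_seconds: float = 60.0,
                         now: Optional[float] = None) -> bool:
    """Return True if the file was modified less than min_age_seconds ago."""
    if now is None:
        now = time.time()
    return (now - os.stat(path).st_mtime) < min_age_seconds


def files_ready_for_import(folder: str, min_age_seconds: float = 60.0) -> List[str]:
    """List regular files in folder whose last modification is old enough.

    Hypothetical helper: files touched within the cutoff are left for a later
    scan, since a download may still be writing to them.
    """
    ready = []
    for entry in sorted(os.scandir(folder), key=lambda e: e.name):
        if entry.is_file() and not is_recently_modified(entry.path, min_age_seconds):
            ready.append(entry.path)
    return ready
```

Skipped files are not lost, they just get picked up on the next scan once they have stopped changing.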
>>20689 Can watching clipboard be an option for windows users for url import?
https://www.youtube.com/watch?v=VE2NCPnULA0 windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v552a/Hydrus.Network.552a.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v552a/Hydrus.Network.552a.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v552a/Hydrus.Network.552a.-.macOS.-.App.dmg linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v552a/Hydrus.Network.552a.-.Linux.-.Executable.tar.zst I had a great week. There's a bunch of small fixes and improvements, and the addition of 'system:has transparency' search. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html has transparency So, the database can now remember if a file has transparency. This might mean a file with fully transparent pixels, or it might just be an area that is semi-transparent. You can now search it under the new 'system:file properties' entry, selecting 'system:has/no transparency' (or you can just type it). Like 'has exif' and others before it, 'has transparency' will be correct for all new files instantly, but figuring it out for your existing files will take a bit of background maintenance work. It will take some time for the full results to populate. You can review how much it has still to do under database->file maintenance->manage scheduled jobs. Note that this is not a 'this file is RGBA' test. There are a bunch of files out there with an alpha channel that is all 100% opaque, so my 'has transparency' check actually looks for impactful transparency information in the image. I'm also keeping it simple to start with, so we are scanning all images except jpegs and animated gifs. No Krita or PSD or anything yet--I'm not sure if our various rendering hacks are even able to pass along accurate alpha info. As always, I'm interested in seeing any unusual files that fail the test. 
However, while doing this work, I encountered several files that looked normal but still got the 'has transparency' label. When I inspected them closer, I discovered they had a single 98% opaque pixel, or a border with a slight fade, or just one invisible corner pixel. Maybe some of these pixels are secret artist markers, or perhaps they are tool errors or drawing tablet smudges. They probably aren't really what we are interested in finding with a 'system:has transparency', so be on the watch for them and let me know how bad the problem is in an IRL client. Maybe I can fine-tune this system to say 'image is at least 0.3% transparent'. other highlights I fixed a stupid logical typo in the folder move/copy tech changes from last week. If you couldn't run an internal backup or migrate your hydrus folders around, sorry for the trouble! Should all work correct again. I fixed the 'open externally' thumbnails position if your thumbnail supersampling is >100%, and made it so files without thumbs (e.g. zip, epub) will show their filetype thumb or the hydrus icon. 'system:number of character tags > 4' now parses if you type it in (previously, this didn't support the namespace filter). 'unnamespaced' should work too. 'system:has audio' and the 'system:embedded metadata' stuff, which never had good homes, are now all merged under that new 'system:file properties' entry. If you can't find something weird, check there! next week Everyone around me is sick, and I can feel myself going, so it may be up in the air! In any case, I think I'd like to take a simple code cleanup week. Nothing too clever or amazing, but should be some small fixes and QoL.
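To make the 'image is at least 0.3% transparent' idea concrete, here is a minimal sketch. It assumes the threshold means "fraction of pixels that are not fully opaque", and it works on a plain sequence of 0-255 alpha values rather than on hydrus's real image objects. The function names are hypothetical.

```python
from typing import Iterable


def transparent_fraction(alpha_values: Iterable[int]) -> float:
    """Fraction of pixels whose alpha is below 255 (not fully opaque)."""
    total = 0
    non_opaque = 0
    for alpha in alpha_values:
        total += 1
        if alpha < 255:
            non_opaque += 1
    return non_opaque / total if total else 0.0


def has_impactful_transparency(alpha_values: Iterable[int],
                               min_fraction: float = 0.003) -> bool:
    """True if at least min_fraction of pixels are not fully opaque.

    0.003 mirrors the 0.3% figure floated in the post; it is a guess, not a
    tested value.
    """
    return transparent_fraction(alpha_values) >= min_fraction
```

With this rule, a 100x100 image with a single 98% opaque pixel comes out at 0.01% and would not count, while a real cutout with a transparent background would pass easily.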
>>20692 >has transparency I have a lot of those PNG files. Thanks
>>20692 > 'generator' and 'prompter' namespaces (converting both to the hydrus-appropriate 'creator') I think some prompters and real creators might be unhappy with that. If a generator gives you something really funny, but you would never create that yourself because you don't like it or it is low quality or illegal, but you are still a prompter, you don't want to be called a creator of it. If it generates something too similar to an existing work, the artist could think you are a plagiarist.
https://hydrusnetwork.github.io/hydrus/getting_started_searching.html#the_dropdown_controls > Just searching 'pending' tags is useful if you want to scan what you have pending to go up to the PTR--just turn off 'current' tags and search system:num tags > 0 system:has tags and choose the appropriate tag service.
>>20695 How do I filter only by the stored pending tags, not siblings and parents?
Can't search for files without a rating. Can't export ratings. The only way to get ratings out is to export sets of rated files and all files, then deduplicate.
>>20679 It happens with sidecar files with Cyrillic or Japanese characters.
>>20698 and images themselves, too
System.Windows.Threading.DispatcherUnhandledExceptionEventArgs The remote server returned an error: (500) Internal Server Error. System System.Collections.ListDictionaryInternal System.Net.WebException: The remote server returned an error: (500) Internal Server Error. at System.Net.HttpWebRequest.GetResponse() at Hatate.HydrusApi.DownloadThumbnail(HydrusMetadata hydrusMetadata, String thumbsDir) at Hatate.HydrusApi.<>cDisplayClass10_0.<DownloadThumbnailAsync>b0() at System.Threading.Tasks.Task`1.InnerInvoke() at System.Threading.Tasks.Task.Execute() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Hatate.MainWindow.<MenuItem_QueryHydrus_Click>d__125.MoveNext() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.AsyncMethodBuilderCore.<>c.<ThrowAsync>b__6_0(Object state) at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs) at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler) at System.Net.HttpWebRequest.GetResponse() at Hatate.HydrusApi.DownloadThumbnail(HydrusMetadata hydrusMetadata, String thumbsDir) at Hatate.HydrusApi.<>cDisplayClass10_0.<DownloadThumbnailAsync>b0() at System.Threading.Tasks.Task`1.InnerInvoke() at System.Threading.Tasks.Task.Execute() --- End of stack trace from previous location where exception was thrown --- at System.Runtime.CompilerServices.TaskAwaiter.ThrowForNonSuccess(Task task) at System.Runtime.CompilerServices.TaskAwaiter.HandleNonSuccessAndDebuggerNotification(Task task) at Hatate.MainWindow.<MenuItem_QueryHydrus_Click>d__125.MoveNext() --- End of stack trace from previous location where exception was thrown --- at 
System.Runtime.CompilerServices.AsyncMethodBuilderCore.<>c.<ThrowAsync>b__6_0(Object state) at System.Windows.Threading.ExceptionWrapper.InternalRealCall(Delegate callback, Object args, Int32 numArgs) at System.Windows.Threading.ExceptionWrapper.TryCatchWhen(Object source, Delegate callback, Object args, Int32 numArgs, Delegate catchHandler) Hatate keeps crashing with this error on 551, explain what's wrong to a noob, guys?
Got a serious-looking warning when running the new transparency job; it seems gifs with incorrect endings cause the job to trip up. I've rerun a full hash scan etc. and no issues were encountered, so I don't think a hard drive issue is at fault here. The gifs themselves play fine in Hydrus, and in other image viewers they have visibly missing bottom sections on the last frame. No doubt a download gone wrong.
v552: OSError: image file is truncated (135 bytes not processed)
Traceback (most recent call last):
File "hydrus\client\ClientFiles.py", line 2752, in _RunJob
File "hydrus\client\ClientFiles.py", line 2432, in _HasTransparency
File "hydrus\client\ClientFiles.py", line 1899, in HasTransparency
File "hydrus\client\ClientVideoHandling.py", line 248, in read_frame
File "hydrus\client\ClientVideoHandling.py", line 219, in _RenderCurrentFrame
File "hydrus\client\ClientVideoHandling.py", line 105, in _GetCurrentFrame
File "PIL\Image.py", line 1702, in paste
File "PIL\ImageFile.py", line 266, in load
OSError: image file is truncated (135 bytes not processed)
File "threading.py", line 973, in _bootstrap
File "threading.py", line 1016, in _bootstrap_inner
File "hydrus\core\HydrusThreading.py", line 427, in run
File "hydrus\client\ClientFiles.py", line 3004, in ForceMaintenance
File "hydrus\client\ClientFiles.py", line 2853, in _RunJob
File "hydrus\core\HydrusData.py", line 884, in PrintException
File "hydrus\core\HydrusData.py", line 913, in PrintExceptionTuple
v552: Hey, while performing file maintenance task "determine if the file has transparency" on file [hash], the client ran into a serious I/O Error! This is a significant hard drive problem, and as such you should shut the client down and check your hard drive health immediately. No more file maintenance jobs will be run this boot, and a full traceback has been written to the log.
>>20692 >>20694 And it seems from the list that it still ignores any unknown origin tags.
>>20692 > It will take some time for the full results to populate. You can review how much it has still to do under database->file maintenance->manage scheduled jobs. Any way this can be added to shutdown jobs? Would be nice to just get it all out of the way at once when I'm done with Hydrus for the day.
>>20701 I just got something like that, too.
It would be nice to see somewhere in the file log pages which parser a url is ultimately being parsed with. I spent almost an hour trying to debug a downloader that wasn't getting any files, going through the API redirector and the url normalizer and all that, only to finally realize that the url was actually just somehow bound to the wrong parser.
>>20703 Reiterating on this, as I left Hydrus running all night and it only went through 800 of the 10000 files it needs to check for transparency during idle time.
>>20703 I'm surprised that Hydrus doesn't allow all of the file jobs to just also be shutdown jobs. It feels like an obvious time to do the work.
Anyone else have these issues with wayland? The manage tags window pops up in the middle of the image, and the mouse cursor disappears when I move it over the recent tag list to select multiple tags.
>>20681 >>20687 >>20688 It still doesn't get some infobox images and images that are not figures, but are in lists/galleries. Errors probably caused by throttling are still happening. Lists may require custom parsers. For the rest, maybe there is an existing parser for MediaWiki. Unless the captions or filenames need to be copied to PTR, I won't edit it.
>>20692 I am using the most recent version and sadly still no dice on wayland but there seems to be some error with python too. http://sprunge.us/PP9VJp I'd like to "solve" the python import issues so I can be sure it's just still the same wayland/constructor problem we'd been talking about a few days ago... Thanks for your attention. :)
>>20692 >>20701 I am getting the same thing. File causing issues for me is here: https://e621.net/posts/883567 I tried deleting and re-importing the file and the same thing happens presumably because you're running the same transparency check when files are imported. Now I just have a missing file I can't re-import. My error was a bit different and very serious looking (my harddrives appear fine and there is nothing in dmesg). Hey, while performing file maintenance task "determine if the file has transparency" on file <hash>, the client ran into a serious I/O Error! This is a significant hard drive problem, and as such you should shut the client down and check your hard drive health immediately. No more file maintenance jobs will be run this boot, and a full traceback has been written to the log.
Hey, if you get "serious I/O Error! This is a significant hard drive problem" popups during the maintenance work this week, this is a false positive! You can ignore it completely. Sorry for the trouble, I will silence them this week. >>20701 >>20704 >>20711 Hey, sorry about this! I got several of these too and got a heart attack. It looks like PIL (Python Imaging Library) is giving a way too serious error here (OSError) when a file is slightly broken. There are truncated files all over the internet, and normally you'd get 'BustedFileError', but I guess it is sperging for whatever reason here, thinking the truncation was due to I/O. Maybe they changed it in a newer version of PIL. That error is propagating up to my maintenance system, which is then panicking. I will silence this, sorry for the trouble.
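For the curious, one way a maintenance loop could tell these two cases apart is sketched below. This is not the actual fix going into hydrus. It assumes the only signal available is the exception type plus the "image file is truncated" message seen in the tracebacks posted above, and the function names and return labels are invented for illustration.

```python
def looks_like_truncated_image_error(exc: BaseException) -> bool:
    """True for the OSError raised on a cut-off image, judged by its message."""
    return isinstance(exc, OSError) and "image file is truncated" in str(exc)


def classify_maintenance_error(exc: BaseException) -> str:
    """Sort an exception into a rough severity bucket.

    Hypothetical helper: a truncated image is a damaged file, not a sign of a
    failing drive, so it should not halt all further maintenance jobs.
    """
    if looks_like_truncated_image_error(exc):
        return "damaged file"
    if isinstance(exc, OSError):
        return "possible I/O problem"
    return "other error"
```

Matching on the message text is fragile, since the wording could change between library versions, so treat this as a stopgap idea rather than a robust check.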
What's the meta for adding (and importantly tagging) AI generated images? I found https://aibooru.online/ but it's seriously lacking. If I use hydrus companion to add an image from a DALL-E thread on 4chan can I use a preset to auto tag it with "ai generated"?
>>20713 just took a closer look at the options, yes there is a way to do it. Too bad there's no comprehensive booru from what i can tell
Now I know I make mistakes with parents and children now and then, but I think there's some error going on where a group of random tags from a particular file are all suddenly made child tags of some other tag on that file. There's no way I'm accidentally adding a whole bunch of tags from one particular file as child tags to a parent, right? Is there even a way to easily do that? I have some Rivet (Ratchet and Clank: Rift Apart) files. Every once in a while, I'll be tagging something, and notice for some reason the "ip:ratchet & clank series" and "developer:insomniac games" tags have been applied through a child parent relationship. When I go to check what tags "ip:r&c" for short has as children, I notice it seems to be a bunch of tags I'd likely find on any file with Rivet in it. This has happened multiple times now on different versions of Hydrus, and I am currently on the latest version. As far as I know, you have to manually type out and select tags in the parent manager, and the only way to select a bunch of disparate tags at once would be to select them outside of it and open the parent manager. But even then, you'd see a long list of tags on the left, which is something I very rarely ever see when applying new parent child relationships and is always something I do intentionally when I do see it. I just cannot fathom accidentally doing that somehow, multiple times, and just with the same ip:r&c tag as the parent tag receiving bad children tags. I haven't tagged any new Rivet files in over 3 weeks, though I could see myself not noticing the incorrect tags for a time.
>>20662 >If the generated image was edited and the artist has an artist tag, will it have four creators? Yeah. I'm not totally up on the latest formats, but I think the popular anime boorus were going in this broad 'creator:stable diffusion' kind of direction, and maybe the PTR siblings are too. I forget it now, but I think we have 'medium:ai generated' populating too, which is nice as a catch-all. I'm fine with multiple creators; even if the prompter didn't paint the thing, tags are for searching, so they'll let you find who makes AI images in style x or y nice and easy, and also let you exclude anything by ai easy too. That said, a lot of these AI guys make 17 images and disappear, so maybe we'll rethink, or maybe it doesn't matter. I'm in flux about a lot of this stuff atm. >They've been discussing its support (or, at first, non-support) for years https://github.com/python/cpython/issues/81338 Yeah, I think this is python's job to handle. I don't want to get involved in the nitty gritty of filesystems I can't even test on. >>20663 Thanks, did it! Also got to be named .tar.zst, which threw github for a loop when I tried to rename it, ha ha ha. >"favourites" is a column only in my renamed "my tags" tab. It is also duplicated as a tab under the tags column. Ah, you can set different favourites for your different tag services. Hit up options->tag suggestions and then there should be a dropdown under the 'favourites' tab to change the service. Once you set some favourites for the PTR, that column should appear in manage tags. There is also, in a stunningly bad design decision that I still regret and will eventually rework, a second favourite tags system that sits under the autocomplete input under the 'favourites' tab there, under options->tags. >>20666 Ah, I think there's a bit of regex in some of the older downloaders that skips +, -, and ? characters since several boorus have that somewhere in the tag HTML, like they'll have: + - ? 
skirt And those characters are links to 'include in search', 'exclude from search', and 'go to tag definition page', and since the HTML there is difficult to parse, I think we just ignore that start of the string. We have more parsing tools now and there may be a better solution, if this is indeed what is going on. I'll make a job to check up on it. >>20667 >>20676 >To move an archived file to a more permanent file service, I have to inbox it, kind of putting it at risk. Can you talk about this more? Is it with the archive delete lock on? Damn, I'll fix that. And for trash without a reason, yeah, sorry. I'd say just leave it as the default like 'deleted from a media page' thing. Can you say why you'd like to trash without a reason? Just so I can understand the workflow and problem here better. >>20668 No great answer yet, if you want to pull directly from the filenames in hydrus, but if you can do some scripting, then generate 'filename.txt' sidecar files that include the URLs and then import the files and this URL metadata using the sidecar system. I'd like to add more metadata (beyond just tags) to filename parsing, I'm planning an expansion of the whole system. The new sidecar tech basically allows me to convert whatever to whatever, so I just have to put the work in to update it all. >>20669 >>20670 >>20672 Thank you, this is a useful list! With luck, mpv will have better handling for borked files in future. I'll see if I can improve some of the error handling on my side of things.
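For the filename-to-URL case, the scripting step could look something like this sketch. Everything specific in it is an assumption: the paste.example.com URL pattern is a placeholder for whatever your filenames really map to, and the sidecar name (the full filename with ".txt" appended) should be changed to match whatever your sidecar import settings expect.

```python
import os
from typing import Callable, Optional


def filename_to_url(filename: str) -> Optional[str]:
    """Hypothetical mapping from a filename to its source URL.

    Replace this with the real rule for your site; paste.example.com is a
    placeholder, not a real service.
    """
    stem, _ext = os.path.splitext(filename)
    if not stem:
        return None
    return f"https://paste.example.com/{stem}"


def write_url_sidecars(folder: str,
                       to_url: Callable[[str], Optional[str]] = filename_to_url,
                       sidecar_suffix: str = ".txt") -> int:
    """Write one sidecar per file in folder containing its URL; return the count."""
    written = 0
    for entry in sorted(os.scandir(folder), key=lambda e: e.name):
        # Skip directories and sidecars left over from an earlier run.
        if not entry.is_file() or entry.name.endswith(sidecar_suffix):
            continue
        url = to_url(entry.name)
        if url is None:
            continue
        with open(entry.path + sidecar_suffix, "w", encoding="utf-8") as f:
            f.write(url + "\n")
        written += 1
    return written
```

Run it on a copy of a few files first and check the import preview before pointing it at thousands of files. Mirror-to-original rewriting can go in the same mapping function.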
>>20673 >>20674 >>20675 >>20680 I don't like these tags myself, since the system tags are better, but I know some users like them nonetheless. I'd say don't put them on the PTR purposefully, but accidentally is fine, and we expect to filter them out once the filtering tools are better. Since some users do like them, and we'll have lots of other uses for this tech, I expect to eventually unveil a 'if file meets this metadata condition, apply this content update to it' capability in future. Then you'll be able to say 'if it has duration, add "has duration"', or whatever you like, to your personal tags. >>20677 Great idea! >>20678 Great idea. I keep meaning to completely overhaul the comparison statement system, but it isn't happening and hardcoding another line is ok when it is so useful. >>20679 >>20698 >>20699 Can you post a traceback of one of these errors? I'll bet there is a simple fix here to add the unicode or whatever support, but I'm just missing it because I'm not a Linux guy. >>20690 At the moment, I think the various scan-for-files-to-be-imported routines do a thing where they see if a file is currently being used by another process. This generally catches files being downloaded by browsers and stuff, but have you noticed it failing for you? Can you talk more about the scenario, if so? A modified-time check might be a good solution, if there is a problem, but maybe something else would too. >>20691 Can you say more about this? Doesn't network->downloaders->watch clipboard->other recognised urls work? I've always wanted to add a column to the URL Class definitions somewhere to say whether they count for this, but if you just want all URLs to be added, I think that works?
>>20717 > Can you say more about this? Doesn't network->downloaders->watch clipboard->other recognised urls work? I've always wanted to add a column to the URL Class definitions somewhere to say whether they count for this, but if you just want all URLs to be added, I think that works? ah this works, thank you
>>20697 To search for files without a rating (you mean hydrus ratings, right, the little squares and stars and stuff you can click?), hit up system:rating and then click 'is' and leave the rating icon set to the default grey. I'll add some text to say this in UI. I'd like to add ratings to the sidecar import/export system in future! >>20700 Looks like a thumbnail download failed. Can you check your hydrus log--if it is hydrus's Client API giving a 500, it should be logged at the same timestamp. Should be in (install_dir/db/client - date.log). Might be it is Hatate's code breaking though, and I'm afraid I can't help there. >>20703 >>20706 >>20707 My maintenance code is a huuuuuge mess, I'm afraid. I made an attempt a couple years ago to write a whole new system that would allow all jobs to report when they could run and give status updates and stuff, no matter if they were in idle, or being forced to work, or happening in shutdown, but the effort petered out. Mostly I've actually been moving away from the shutdown and more to the 'just work whenever in tiny bits in the background', since it is technically simple and causes less 'you can't use hydrus now' time. If you want to hurry this processing, hit up database->file maintenance->manage scheduled jobs and then click the 'do all work' button. It'll run the list hard. I recommend giving it manual breaks every 10k or 100k files, depending on how powerful your computer is. >>20705 Sorry for the trouble. Try turning on help->debug->report modes->network report mode. It makes a ton of popups and should say which parser is lined up with the URL being worked on. Let me know if it doesn't say enough for you!
>>20708 Sorry, Wayland still seems to have some trouble with our Python Qt. You can ctrl+f this page for more 'wayland' mpv problems, and we've had issues in the previous threads too. You might have some success with the 'frame locations' list under options->gui. Hit up 'manage_tags_dialog' (thumbnails) and 'manage_tags_frame' (media viewer), and try changing how it decides to initialise its size and position. Maybe telling it to remember the old size and position will work better for you? >>20710 Ah, it looks like you don't have any PySide or PyQt installed in your venv! You need one of PyQt5, PyQt6, PySide2, or PySide6 to run hydrus. I default to PySide6 for most purposes, which is Qt6. There's more info here: https://hydrusnetwork.github.io/hydrus/running_from_source.html Forgive me, I'm not sure what your situation is, but if you are trying to put a 'running from source' situation together, you'll need to sort this out, maybe in a clever way (maybe Wayland won't install PySide6 off the normal requirements.txt, let's say), but if you are using something like the Arch AUR thing, maybe that's similarly failing to automatically set your venv up. I can help here, but can you talk more about your exact situation? Have you been trying to run from source? Did this once work, but now it fails? >>20713 >>20714 For now, you probably want to do that stuff mostly manually with ctrl+a->F3->ai generated. Everything AI is still in flux, we just don't know where we are settling, but have a play yourself, see what works for you, and as AI boorus become real and popular, I'm sure we'll get some dedicated downloaders. >>20715 >Is there even a way to easily do that? No, shouldn't be. Sorry for the trouble here. Is this for tags on the PTR, or is this like 'my tags', where it is all local and you are the sole user? If this is all yours, this sounds like a bug or a confusion due to bad UI/help. 
I can't imagine how tags would spontaneously be set as children, so this could be a UI failure (copying/displaying in the wrong way), or a logical bug, or even something like database corruption pointing tags to the wrong place (probably not though, it wouldn't be related tags most likely). If it happens again, can you write down the exact tags involved (or rename them, I don't care about the words I just need the relationships) and tell me what happened, when (and how) you last changed them yourself, and how you expected it to look? If I know that tag A is appearing in place x instead of y, we'll be able to narrow down on what is going on here. Oh, another side thing, actually, how is your sync under tags->sibling/parent sync->review? All caught up? Ultimately, I think siblings and parents need a complete visual overhaul. I wish we had nice auto-generated tree graphs.
>>20720 My tags, all synced. Very rarely do I make mass relationship edits that require longer syncing. I'll make sure not to immediately fix the tags if it happens again.
Is there a way to have two different parsers for a single url class that I can freely decide which one to use? The only thing I can think of is manually reassigning them in the manage url class links dialog before I paste the url.
>>20713 >>20714 What exactly are you trying to tag? Stuff you see or what is in the prompt? For visual tags you can use an ai tagger. There's one that works with hydrus: https://github.com/Garbevoir/wd-e621-hydrus-tagger If you want gen metadata, there's this script: https://github.com/space-nuko/sd-webui-utilities/blob/master/import_to_hydrus.py but it only works for stable diffusion gens and pngs, and of course you'd have to get the image from catbox or something, because 4chan strips metadata. I also made an aibooru downloader a few weeks ago, that also puts the prompt from the page in a note, if it's available >>20541
>>20713 At derpibooru, I've seen "machine learning generated" earlier, and "ai generated" now. The AI is specified as "generator" and the human as "prompter".
>>>20716 >>>"favourites" is a column only in my renamed "my tags" tab. It is also duplicated as a tab under the tags column. >Ah, you can set different favourites for your different tag services. Hit up options->tag suggestions and then there should be a dropdown under the 'favourites' tab to change the service. Once you set some favourites for the PTR, that column should appear in manage tags. It works, but the tab is very confusing. It has two tag service selectors, the upper of which looks like a menu item out of context (not by color). What are the bottom two tabs for?
>>>20716 >>>>20667 >>>20676 >>To move an archived file to a more permanent file service, I have to inbox it, kind of putting it at risk. >Can you talk about this more? Is it with the archive delete lock on? Damn, I'll fix that. Yes, that. > And for trash without a reason, yeah, sorry. I'd say just leave it as the default like 'deleted from a media page' thing. Oh, I thought "media page" was the source page on a website, so somebody might want to delete it if it was deleted, for some reason, or it could be deleted automatically. What does it mean? > Can you say why you'd like to trash without a reason? Just so I can understand the workflow and problem here better. I don't need the file and I don't want it to be deleted automatically if I redownload it, but I don't want to delete it completely immediately.
>>20716 >then generate 'filename.txt' sidecar files It is good that the suffix is customizable, because there are txt files. 14000 files in a directory is a lot though.
>>>20717 >>>>20679 >>>20698 >>>20699 >Can you post a traceback of one of these errors? I'll bet there is a simple fix here to add the unicode or whatever support, but I'm just missing it because I'm not a Linux guy.
v552, 2023/11/18 17:41:18: The export folder "[name]" encountered an error! Please check the folder's settings and maybe report to hydrus dev if the error is complicated! The error follows:
v552, 2023/11/18 17:41:18: Exception:
v552, 2023/11/18 17:41:18: OSError [Errno 36] File name too long: '[path]/[basename]'
Traceback (most recent call last):
  File "hydrus/client/exporting/ClientExportingFiles.py", line 787, in DoWork
  File "hydrus/client/exporting/ClientExportingFiles.py", line 624, in _DoExport
  File "hydrus/client/metadata/ClientMetadataMigration.py", line 193, in Work
  File "hydrus/client/metadata/ClientMetadataMigrationExporters.py", line 656, in Export
OSError: [Errno 36] File name too long: '[path]/[basename]'
The base file name is 223 characters and takes about 261 bytes, as `echo -n "basename"|wc -c` shows.
>>20717 >>20728 a name to test with:
$ echo "ディートリヒ・ブクステフーデは、17世紀北ドイツおよびバルト海沿岸地域を代表する作曲家・オルガニストである。声楽作品 においては、バロック期ドイツの教会カンタータの形成に貢献する一方"|wc -m
91
$ echo "ディートリヒ・ブクステフーデは、17世紀北ドイツおよびバルト海沿岸地域を代表する作曲家・オルガニストである。声楽作品 においては、バロック期ドイツの教会カンタータの形成に貢献する一方"|wc -c
267
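The gap between the character count and the byte count above is the crux: many Linux filesystems limit a single file name to 255 bytes, not 255 characters, so a name made of 3-byte UTF-8 characters hits the limit early. Below is a minimal sketch of clipping a name by encoded size. `clip_filename` is a hypothetical helper, not hydrus code, and the 255-byte default is an assumption (on POSIX systems the real per-directory limit can usually be read with `os.pathconf(directory, "PC_NAME_MAX")`).

```python
import os


def clip_filename(name: str, max_bytes: int = 255) -> str:
    """Clip a file name so its UTF-8 encoding fits in max_bytes, keeping the extension."""
    stem, ext = os.path.splitext(name)
    budget = max_bytes - len(ext.encode("utf-8"))
    if budget <= 0:
        raise ValueError("extension alone exceeds the byte limit")
    clipped_bytes = stem.encode("utf-8")[:budget]
    # Slicing bytes can cut a multi-byte character in half;
    # errors="ignore" drops that trailing partial character.
    clipped_stem = clipped_bytes.decode("utf-8", errors="ignore")
    return clipped_stem + ext


long_name = "デ" * 100 + ".txt"  # 100 characters of 3 bytes each, plus extension
safe_name = clip_filename(long_name)
print(len(long_name.encode("utf-8")), "->", len(safe_name.encode("utf-8")))
```

Counting bytes rather than characters is the design point here; clipping to a fixed character count would still overflow on names like the test string above.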
>>>20719 >>>>20697 >To search for files without a rating (you mean hydrus ratings, right, the little squares and stars and stuff you can click?), hit up system:rating and then click 'is' and leave the rating icon set to the default grey. I'll add some text to say this in UI. Oh, thanks. The "unrated" state is totally undocumented: https://hydrusnetwork.github.io/hydrus/getting_started_ratings.html#using_ratings > You can set then them with a left- or right-click. Like/dislike and numerical have slightly different click behaviour, so have a play with them to get their feel.
>>>20716 >>There is also, in a stunningly bad design decision that I still regret and will eventually rework, a second favourite tags system that sits under the autocomplete input under the 'favourites' tab there, under options->tags. When I right-click a tag anywhere and click add or remove under 'favourites", it always affects the bottom favourites tab, even if I clicked it in the column.
I was playing around with a downloader for an API that, on each post, has a bunch of links that go to a bunch of qualities for the same content, pretty randomly. I'm using priority to get them from best to worst, but sometimes (that API's kinda shit) the best quality's link exists, but leads to a bad file that Hydrus can't import (because of the file, not because of Hydrus). It's failing the whole post even though there are other, worse qualities to try, so here's my question: Is there a way for downloaders to fall back to lower priority URLs when higher URLs fail to import for any reason? As right now I believe it's only using lower priorities when a higher one does not exist at all.
There seems to be no actual way to see which of the selected files is focused when setting a relationship.
Deleting a file currently in trash from the media viewer undeletes it.
>>20733 >There seems to be no actual way to see which of the selected files is focused when setting a relationship. Why would there be?
>>20735 "Are you sure you want to set the focused file as better than the other 2 files in the selection?"
>>20736 The file you're right clicking is the focused one.
I noticed that if I edit a file, say in Microsoft paint or GIMP for a quick fix, the thumbnails and full media viewer will not reflect this. I thought that background file maintenance would eventually catch the discrepancy and fix it, but I wanted to speed it up and tried pressing some buttons. Apparently, regenerating file integrity will cause it to notice the discrepancy in hashes, but then punt the file to an invalid files folder outside of my hydrus files for reimporting, losing all tags in the process. Do these file integrity checks happen regularly in the background and am I at risk of losing all the tags for any files I edit after importing them to hydrus? How can I get Hydrus to properly acknowledge a file has been modified without shunting it from my files and clearing all tags? >>20736 Ah, I thought you meant tag relationships.
>>20737 I thought it was the last one selected.
>>20739 Technically yeah, when you right click on the selection, the focus switches to the file that's under your right click. >>20738 When you edit a file, the hash changes, but still has the old hash filename. Hydrus has no way of telling the file changes and still thinks the old one exists and doesn't know there's a new one. I guess it would have to constantly keep rescanning your whole image folder to notice changes and that would take too much resources. That's at least what I think is the case. Even the help page says somewhere that you should 'save as' to another location and then reimport. After that you have to copy all your stuff from the old file to the new. It's pretty annoying, but there's not much else you can do.
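A small illustration of the point about hashes, assuming the file identity is a SHA-256 digest of the file's bytes (the byte strings below are made-up stand-ins for real image data). Any change to the content gives a completely different digest, so a file stored under its old hash name no longer matches what the database expects.

```python
import hashlib


def content_hash(data: bytes) -> str:
    """Return the SHA-256 hex digest of some file content."""
    return hashlib.sha256(data).hexdigest()


original = b"pixel data before the edit"
edited = b"pixel data after the edit"

stored_name = content_hash(original)  # the name the file was stored under
print("unchanged file still matches:", stored_name == content_hash(original))
print("edited file still matches:   ", stored_name == content_hash(edited))
```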
>>20740 There really should be an option for "I just modified this file, please update its display in all media viewers and hash while maintaining tags".
>>20582 it's simple I'm not that autistic. i would never search for half of those name spaces. I never search for a specific number of girls or boys, just gender as a general concept. i use number of character tags to approximate number of characters. lewd vs erotic doesn't matter to me because i'm okay flipping past the occasional random non-erotic comic with sexual dialogue. explicit can't just be a parent tag because the only namespaces i aim for completeness on are creator, character, rating, gender you can't let yourself get bogged down in this kind of bullshit or else tagging becomes a full-time job. shit, even with my sparse tagging it's still neverending, but there's a theoretical end. (i do tag other things but it's inconsistent)
>>20742 >I never search for a specific number of girls or boys Not even 1 girl +multiple men or 1 guy +multiple girls? >you can't let yourself get bogged down in this kind of bullshit or else tagging becomes a full-time job I enjoy the autism. And I do have a theoretical end. It's just months away until I finish going through artists I like and setting up subscriptions for the ones that are worth it. I've tagged to my autistic standards over 20k files in the last year and am outpacing how quickly I add new files.
If anyone here uses hyextract, do you know what issues I'm running into? I set it up and added the API key to the config file, then I tried to run it, but I just keep getting Error: spawn {path to "7z" binary} ENOENT when I try to use it to extract files. I don't even know what the error means, except that it's somehow not seeing "7z", but I don't know how to fix it. I tried going to the file and it's there, so I don't know what's wrong. I'm on Linux so is that it? Does hyextract not work on Linux? If anyone knows what's going on, I'd appreciate the help.
>>13775 so true
>>20678 I would also suggest showing if they have different lengths. It's a useful way to show that one might be a slightly extended version of the other and other things like that.
>>>20717 >>>>20690 >At the moment, I think the various scan-for-files-to-be-imported do a thing where they see if a file is currently being used by another process. This generally catches files being downloaded by browsers and stuff, but have you noticed it failing for you? Can you talk more about the scenario, if so? A modified check might be a good solution, if there is a problem, but maybe something else would too. I haven't tested it, but if you didn't explicitly do anything about it, it won't notice anything, because files being downloaded are usually readable.
>>20744 ENOENT means "Error NO ENTry", i.e. "no such file or directory". That issue caused me to give up on hyextract.
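For anyone debugging the same thing, here is a rough sketch of how to check what the error code means and whether a binary can actually be found and executed. This is plain Python, not hyextract's own code, and `find_binary` is a hypothetical helper name. It only checks that the path or command name resolves to an executable file; a file that exists but lacks the execute bit, or a configured path containing stray quotes or an unexpanded `~`, would also come back as not found.

```python
import errno
import os
import shutil
from typing import Optional


def find_binary(name_or_path: str) -> Optional[str]:
    """Return the resolved path of an executable, or None if it cannot be found or run."""
    # shutil.which searches PATH for bare names and checks explicit paths directly.
    return shutil.which(name_or_path)


print(errno.errorcode[errno.ENOENT], "=", os.strerror(errno.ENOENT))
print("7z resolves to:", find_binary("7z"))
```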
Is it frowned upon to commit my downloader tags to the PTR if I've only been using reputable sites? i.e. sankaku idol, danbooru, gelbooru.. or is it more harm than good?
>>20749 That should be fine. Just be aware that any tags which may contain sankaku URLs could get you temp banned from the PTR (the url: site: and booru: namespaces are the most likely tags to cause issues). Sankaku url spam would get out of control otherwise.
Is there a reason to veto if no file is found like the example in the manual? It's not like it can assign any tags to something that doesn't exist. Or does it affect performance? I guess it makes the parser tester easier to read, but whatever.
>>20751 vetos are useful in situations where the user might not be logged in and the high res link will 404. for example, if the user forgot to add their login cookies, you might be able to check that so that the file log will say "VETO: not logged in" instead of "404: file not found". they're also useful if you can tell that the link would lead to a generic 404 image. if there's actually just no file produced and you don't feel the need to tell the user something specific, then you don't need to have a veto.
>>20747 >I haven't tested it, but [conjecture] Very helpful post. If you have no testable examples, why even bring this up?
>>20753 Calm down. The code has hardly any comments and doesn't seem to mention anything related, so I just tested it and of course it imports the incomplete file that is being copied from a FUSE device (a network file system). Why would it not? The file is readable, and even if a process is writing into it, hydrus doesn't care, because it only needs to read it.
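One way a "modified check" could look, as a sketch only: treat a file as still in flight if it was written to recently. `looks_settled` and the 60-second quiet period are assumptions for illustration, not anything hydrus does. A slow or stalled copy can go quiet for longer than any fixed window, so this reduces the problem rather than eliminating it.

```python
import os
import tempfile
import time
from typing import Optional


def looks_settled(path: str, quiet_seconds: float = 60.0, now: Optional[float] = None) -> bool:
    """Return True if the file has not been modified for at least quiet_seconds."""
    current_time = time.time() if now is None else now
    last_modified = os.stat(path).st_mtime
    return (current_time - last_modified) >= quiet_seconds


# Demo: a file that was just written is not settled; the same file two minutes later is.
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"partial data")
    demo_path = handle.name

fresh = looks_settled(demo_path)
later = looks_settled(demo_path, now=time.time() + 120)
os.remove(demo_path)
print(fresh, later)
```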
>>20752 Oh, so it's also used to inform the user about an error. I thought a veto is just supposed to prevent a download if the usual download source isn't what it's supposed to be and other similar rules that would result in something unwanted.
Is it possible to add the "select this file" in the tag selection for a search? Here's what I mean, I have this picture, which has the tag "shark costume". I can obviously scroll through the entire list of tags below the search box but it's easy to miss a specific tag, especially if the list is really long. But with "searching immediately" enabled "shark" brings up "shark costume" in picrel 2 but I can't right click>select>"files with 'shark'" despite the fact it only shows tags of the current page. Thoughts?
I had a great week. I fixed a bunch of bugs--including the recent false-positive I/O error we saw when checking certain gifs for transparency--improved some quality of life, and made the slideshow work much better with videos. The release should be as normal tomorrow.
Anyone else getting this error when trying to run a Sankaku gallery download? NetworkException('422: The server\'s error text was too long to display. The first part follows, while a larger chunk has been written to the log.\r\n<!DOCTYPE html>\n<html lang="en">\n <head>\n <title>Sankaku Channel - Anime, manga & game related images & videos</title>\n <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" />\n <meta name="viewport" content="width=device-width,initi')… (Copy note to see full error) Traceback (most recent call last): File "hydrus\client\importing\ClientImportGallerySeeds.py", line 445, in WorkOnURL File "hydrus\client\networking\ClientNetworkingJobs.py", line 1977, in WaitUntilDone hydrus.core.HydrusExceptions.NetworkException: 422: The server's error text was too long to display. The first part follows, while a larger chunk has been written to the log. <!DOCTYPE html> <html lang="en"> <head> <title>Sankaku Channel - Anime, manga & game related images & videos</title> <meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /> <meta name="viewport" content="width=device-width,initi I've never seen it before. What does it mean?
https://www.youtube.com/watch?v=FhGziRJY730 windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v553/Hydrus.Network.553.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v553/Hydrus.Network.553.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v553/Hydrus.Network.553.-.macOS.-.App.dmg linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v553/Hydrus.Network.553.-.Linux.-.Executable.tar.zst I had a great week. The issues with gifs are fixed, and slideshows with videos are smoother. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights I was happy with the 'has transparency' work last week, but unfortunately some damaged animated gifs (usually where one frame was borked) were raising a serious error when they were inspected. This error, which was interpreted as a hard drive fault, was a false positive! If you got it (should have been a popup talking about an 'I/O Error', and that maintenance work would halt until the next boot), do not worry--it wasn't a big deal. I have fixed up the error handling and a bunch of other gif issues. Sorry for the trouble! There are several new slideshow timing options under options->media. You don't need to tweak them unless you really care, but now, when you do a slideshow that involves video or audio, the transitions should happen earlier or a little later in order to line up with where the video naturally ends. Check the changelog and tooltips of the new widgets for full details. The various 'import options' on downloaders now say '(all default)', '(some set)', and '(all set)' on their labels, so you can quickly, at a glance, see what you have set where. The thumbnail manage->regenerate menu is now called maintenance, and it now lists all the possible file maintenance commands. Should make it easier to fix weird problems. next week I have some server work to do. 
Hopefully clear out some github bug reports too. I also just realised we only have about four more releases before the year is done--it feels to me like the second half of this year has flown by. I'll try and fit in one more medium-sized job before then.
How do you guys tag characters dressed as another character? Just use both character tags and a cosplay tag or something?
>>20760 Or also tag the real character normally, and the one they're dressed up as "name (cosplay)"
>>20760 I would definitely do something to indicate who is being cosplayed and who is cosplaying. I think the PTR's system is good. Take this image for example, it's 9S cosplaying Astolfo. The tags are "character:yorha no.9 type s", "character:astolfo (fate) (cosplay)" and that 'astolfo (fate) (cosplay)' tag has a parent of "character:astolfo, rider of black". That makes it easy to find pics where someone's cosplaying as a character vs a character cosplaying as someone else.
>>20762 I forgot to mention, "character:astolfo (fate) (cosplay)" also has "cosplay" as a parent tag.
I recently moved my hydrus database to an external drive. There have been a couple times when I've accidentally launched hydrus without the external drive plugged in, and when that happens hydrus doesn't do anything at all. I expected it to try launching and then say "hey the path to the db doesn't exist" but instead nothing pops up and the program just sticks around. I can see it in Task Manager but it doesn't do anything unless I plug in the drive, in which case it starts launching normally. It's a little strange that it works like that, silently hanging around and waiting.
>>20764 when i do that it notices immediately, sends a warning that files are missing, and refuses to start normally
>>20760 I think the danbooru guidelines for this are: >character:real character >cosplayed character (cosplay) So I just do that.
>>20764 What a shy little thing!
My client.mappings.db disappeared, so I replaced it with one from yesterday. db integrity check found 0 errors. Should I run check and repair operations?
>>20768 "repopulate truncated mappings tables" was fast. "database->regenerate->tag display mappings cache (just pending tags, instant recalculation)" and "tag storage mappings cache (just pending tags, instant recalculation)" open a dialogue which says "If you have millions of tags, pending or current, it can take a long time, during which the gui may hang.", which is not about instant recalculation. I only added/deleted tags and files after that backup, so there may be some orphan files. Can they be imported again without tags?
Is it possible to generate tags out of a file URL without a page?
>>20770 The only thing you could really get is the filename. You need the post page because that's where Hydrus fetches the tags from. It then passes those tags to the resulting file that's downloaded from the file page. If you only have the file page, there's nowhere to get the tags from.
>>20771 It would be useful to parse URLs to get tags and construct other URLs, for pages which are too hard to parse because they require JavaScript.
>>20772 If the url has all the information you can construct a post url from, then you can probably do it using api redirect inside a url class. Though that would most likely create a feedback loop. As an alternative you could export all the files with url sidecars and do a regex find/replace across all files using notepad++ and import back, then copy the links from the manage url menu when you have all your files selected and paste those into a url downloader. If the files are from a booru, you can do an md5 hash gallery search. You just need to do a bunch of things for it to work, but I forgot what they were. If you want to know, I can look into it.
>>20772 you can do stuff with the url inside a parser by getting it from the context variable. https://hydrusnetwork.github.io/hydrus/downloader_parsers_formulae.html#context_variable_formula for example if the url looks like "example.com/artistname/image.jpeg" you could use a regex to extract "artistname" from that. post the website and maybe someone can help.
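To make the regex idea concrete, here is a sketch in plain Python using the example URL shape from the post. The pattern and the `artist_from_url` helper are illustrative assumptions; inside a hydrus parser you would put an equivalent pattern into the formula's regex step rather than run code like this.

```python
import re
from typing import Optional

# Matches "example.com/<artist>/<file>" and captures the artist path component.
ARTIST_PATTERN = re.compile(r"^https?://example\.com/([^/]+)/[^/]+$")


def artist_from_url(url: str) -> Optional[str]:
    """Return the artist component of the URL, or None if the URL has a different shape."""
    match = ARTIST_PATTERN.match(url)
    return match.group(1) if match else None


print(artist_from_url("https://example.com/artistname/image.jpeg"))
```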
>>20760 I do both characters. I have "clothes:[character] outfit" tags all set child to "clothes:alternate character" which is set child to "clothes:alternate outfit". I save the action:cosplay tag for scenarios including real people and settings where the characters being dressed up as are fictional. I don't consider, for instance, something like two characters swapping outfits in the same image to be them "cosplaying" each other. Generally, cases where a character is mentioned, but not present, are rare among my images where characters matter, so I don't have any lack of presence tag, and it would be cumbersome to add a lack of presence tag for every character it happens to when that's not something I usually look for and is a rare case that doesn't require blacklisting. Characters dressing up as other characters is something I love though. >>20719 >If you want to hurry this processing, hit up database->file maintenance->manage scheduled jobs and then click the 'do all work' button. Ah, I found it. Seems like this window is needlessly long and I didn't notice the scroll bar hiding the buttons. I can't get rid of the scroll bar without extending the window border beyond my screen.
>>20692 >They probably aren't really what we are interested in finding with a 'system:has transparency', so be on the watch for them and let me know how bad the problem is in an IRL client. The new system:has transparency function has helped me find about 80 files I missed manually tagging out of about 480 transparent files. Its accuracy rate for positive identification is a little over 50%, with about 450 false positives. With one special exception of about 390 very particular files, it didn't miss anything I had tagged as transparent myself, and helped me identify 3 files I mistakenly tagged as transparent. The reason I group these 390 or so files it missed together is because they're all files I made. A ton of apngs I made using the program apngasm out of asset files from the game Bug Fables. Here are some examples as the first four files. I don't have a lot of apngs otherwise, and I could only find one I didn't make that had transparency, and Hydrus missed it too. Fifth pic related. Either Hydrus just doesn't, or can't, check apngs for transparency.
Couldn't find any info on this... hope this isn't one of those questions asked too often. I have a large image collection that I'm planning to tag before importing into hydrus. Since there is a vast amount of diverse images, I'd rather not manually tag it all, so I'm running AI classifiers to scan and add exif tags. So my question is: >Can hydrus import tags from exif data while adding images? Also, second question: >is there a way to import tags like in the first question, but after the image is already added (since I may scan more tags later)
Let's try this again
I'm getting a TypeError when booting hydrus. From what I can gather (not much) it's due to one of my automated watchers, but I did a check all imports and no error happened.
v553, 2023/11/24 21:52:30: services started
v553, 2023/11/24 21:52:46: Exception:
v553, 2023/11/24 21:52:46: TypeError unsupported operand type(s) for -: 'NoneType' and 'int'
Traceback (most recent call last):
  File "/home/path/to/hydrus/hydrus/client/gui/QtPorting.py", line 1302, in eventFilter
    event.Execute()
  File "/home/path/to/hydrus/hydrus/client/gui/QtPorting.py", line 1275, in Execute
    self._fn( *self._args, **self._kwargs )
  File "/home/path/to/hydrus/hydrus/client/ClientThreading.py", line 678, in qt_code
    self.Work()
  File "/home/path/to/hydrus/hydrus/core/HydrusThreading.py", line 963, in Work
    SchedulableJob.Work( self )
  File "/home/path/to/hydrus/hydrus/core/HydrusThreading.py", line 929, in Work
    self._work_callable()
  File "/home/path/to/hydrus/hydrus/core/HydrusData.py", line 1232, in __call__
    self._func( *self._args, **self._kwargs )
  File "/home/path/to/hydrus/hydrus/client/gui/pages/ClientGUIManagementPanels.py", line 3225, in Start
    self._multiple_watcher_import.Start( self._page_key )
  File "/home/path/to/hydrus/hydrus/client/importing/ClientImportWatchers.py", line 628, in Start
    watcher.Start( page_key, publish_to_page )
  File "/home/path/to/hydrus/hydrus/client/importing/ClientImportWatchers.py", line 1796, in Start
    self._UpdateNextCheckTime()
  File "/home/path/to/hydrus/hydrus/client/importing/ClientImportWatchers.py", line 1040, in _UpdateNextCheckTime
    self._next_check_time = self._checker_options.GetNextCheckTime( self._file_seed_cache, self._last_check_time, previous_next_check_time )
                            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/path/to/hydrus/hydrus/client/importing/options/ClientImportOptions.py", line 127, in GetNextCheckTime
    if HydrusTime.TimeHasPassed( previous_next_check_time - 5 ):
                                 ~~~~~~~~~~~~~~~~~~~~~~~~~^~~
TypeError: unsupported operand type(s) for -: 'NoneType' and 'int'
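The last frame shows what is going wrong: `previous_next_check_time` is `None` for at least one watcher, and `None - 5` is not a valid operation. A minimal sketch of the usual guard for that pattern is below. `time_has_passed` and `should_recalculate` are stand-in names written for this example, not the actual hydrus functions, and treating "no previous time" as "due now" is an assumption about the desired behaviour.

```python
import time
from typing import Optional


def time_has_passed(timestamp: float) -> bool:
    """Return True if the given epoch timestamp is in the past."""
    return time.time() > timestamp


def should_recalculate(previous_next_check_time: Optional[float]) -> bool:
    """Decide whether a new check time is needed, tolerating a missing previous value."""
    if previous_next_check_time is None:
        # Nothing recorded yet, so there is nothing to subtract from.
        return True
    return time_has_passed(previous_next_check_time - 5)


print(should_recalculate(None))
```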
Not sure if anyone else would want this, but would it be hard to implement smooth scrolling? It's really easy for me to lose my position as I scroll, plus it would be easier for the eyes maybe. Also some kind of paging function for system:limit would be nice. My idea would be to give the predicate an extra integer parameter "page", and when you would want to go to the next page, you could easily shift + double click the predicate and increment the number.
>>20779 >paging for system:limit That would be really nice actually, I don't think I'd personally use smooth scrolling though.
>>20778 Oh, forgot to mention, this only started after the latest update, though, if it's an issue with a specific file maybe it's just a new file.
Is there a way to treat a url with fragment as a different url class than one without it? I'm trying to modify the 8chan.moe parser to also save post urls for every file, but even if my parser has associable url spliced together from context url (of the thread) and a post number, it simply won't save it, because it thinks it's the thread url, which doesn't have the fragment. If I enable keep fragment for this class, it still won't save the post urls. Even if I make two url classes where one has it and the second doesn't, they still get treated as the same url class anyway. I also made a similar parser for 4chan archives and it works there, because the fragment is after a '/', which becomes another url component and it gets treated as a different url class. Basically what I'm trying to do is save both https://8chan.moe/t/res/13212.html and https://8chan.moe/t/res/13212.html#13219 (for that one file) when I enter https://8chan.moe/t/res/13212.html into the downloader.
>>20782 Check the option "keep fragment when normalising" and it'll save the part after the #. That's all you need to do. You don't need multiple url classes.
>>20776 APNGs are indeed not checked for transparency yet.
>>20722 Sorry, it isn't that clever yet. If there is a way to programmatically determine which parser to use (let's say there's one or another media type, or an HTML tag you can look for), then your best shot, if you are feeling brave, is to combine both parsers together into one, putting each into its own 'subsidiary parser', and then using a 'veto' ContentParser to veto whichever one is not appropriate. If this is just a human decision though, sorry, no good way atm. >>20725 Thanks for mentioning it. I'll brush up the UI. The top service selector sets which service you are changing your favourites for. The control underneath is the normal tag autocomplete widget you see around the program, which has the tag and file domain selectors just for helping find tags. So, to set tags for the PTR, set the top dropdown to 'PTR', and then search up some tags and hit enter on each, they should go into the list. >>20726 'Media Page' is a name for the thumbnail grid in the hydrus client. 'Media Viewer' is the fullscreen/large viewer you get when you double-click a file. The two simple reasons assigned are just to say 'you deleted this yourself from the thumbnail grid' as opposed to like the duplicate filter. >I don't need the file and I don't want it to be deleted automatically if I redownload it, but I don't want to delete it completely immediately. This sounds quite complicated to me, so I think you want to make a new custom delete reason under options->files and trash that refers to this class of files and then assign that in these cases. >>20728 >>20729 Aha, thank you! I'll see if I can get the filename to clip to a safer length. Might even be some system call I can make to actually find out what the limit for your filesystem is.
>>20730 Thanks, I will write something up. In general, try clicking a like rating twice, or right-click a numerical, and it should reset to 'null'. >>20731 Yeah, I should come up with better names for these and then have nicer menus that say explicitly where stuff is going. The current system is just confusing atm. >>20732 I am sorry to say there is not. I would like to eventually add this--we had a similar problem with Deviant Art, for a while--but at the moment my importer objects can only handle one primary URL. A fallback on 403/404 or whatever, and the parsing pipeline updates to handle that, would be great. >>20733 >>20735 >>20736 >>20737 >>20739 Sorry for the confusion. The 'focused' file should be the one you currently see in the preview viewer. I wonder what is a better way of saying that? Ideally I'd show the file preview in the dialog or something, and that's something I plan for the eventual file relationship expansion, but I just don't have that pretty tech yet. >>20734 I am sorry to say I cannot reproduce this. Can you say more how you get it? If I load a trashed file into the media viewer and hit the delete key, it says 'Permanently delete one trashed file?'. If you do the same, do you get the dialog to undelete? Or are you clicking a button to delete? By default, I think shift+delete does undelete, so is there any chance you have shift pressed in some way? >>20738 >>20740 >>20741 Yes, I am sorry, at the moment hydrus works under the assumption that all of its files are read only. It is an archive for confirmed good things, not a place for files still being edited. Having a button to say 'this file changed, shunt all its metadata to a new file record etc...' is a good idea, and something I can see happening in the future, but I don't have excellent metadata merging tech right now. 
I hope to have better tech for this when I do the file relationships expansion, and that'll allow various other ideas like 'take all these bloated files and convert them to jpeg XL', which will be cool in various ways, but also something we have to be careful with. When you change a file hash, it becomes untraceable to other users, which eliminates the ability to share tags. Better duplicate tech all around will ease this problem, but I'm not ready yet. To answer your actual problem right now, I think you'd want to export the file (drag and drop to your desktop or something), then edit it there, then import the file. If you like, you can put it beside the old version and then select both and hit the menu and manage->file relationships->set relationship->set this file as better than the other, which should merge according to your current dupe settings in F9->special->duplicates processing. Not great, but it will work and the database won't sperg out and think something was corrupted.
>>20746 Great idea. >>20755 A veto can do some conditional stuff with 'subsidiary parsers' too, basically saying something like 'if the user is logged in, use these content parsers, but if they aren't, then use these'. But it isn't important for normal cases; you are correct that a missing file url link kills that document parse naturally anyway, no worries. >>20747 >>20754 Thanks, this is interesting. The test I do is basically 'rename this file from "a" to "a"' (i.e. not actually changing anything), which is a little trick you can generally use to tell if a file is in use by another process. It doesn't change anything, but it'll raise an error for a file in use. I will investigate why this is failing here. Maybe I am not rigorous enough in the import folder scanning, or maybe that trick doesn't work on a NAS or similar (these system-level calls and their locking prerequisites are often faked over a network connection). Having a behaviour to skip based on modified time would be a good extra solution. >>20756 Great idea, I'll see what I can do. >>20758 I don't remember seeing 422 before! That stands for "422 Unprocessable Content", which seems to be the server saying 'No. Don't send that again.' Sankaku have a history of various custom blocking tech, so I expect it is something like that. Although it is frustrating, the best solution is to see if you can find what you want on a different booru. >>20764 Thank you for this report. Can you talk a little more about this? The splash screen (the little window with hydrus icon where it says like 'booting database...') doesn't turn up at all? It just sits as a hydrus_client.exe process in Task Manager forever, until the drive appears? As >>20765 says, the normal behaviour here is that it says 'hey, x location doesn't exist, fugg' with a nice popup or something, but now I think about it, while I have a lot of tech that governs missing media storage, I do not have a lot for the db itself being missing. I will investigate this. 
I would have thought the OS calls I make for like 'is this location read only?' and so on would fail, but perhaps it is more complicated. I wouldn't think Windows would have a ghost entry and an infinitely delayed OS call on a known removable location, but who knows, that sounds like what is going on here.
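The 'rename "a" to "a"' test described above can be sketched as follows. This is an illustrative version only; the function name `path_is_free` is an assumption, and the platform notes in the comments are general OS behaviour as far as I know, not a statement about hydrus's actual import folder code.

```python
import os
import tempfile


def path_is_free(path: str) -> bool:
    """Probe whether a file can be renamed onto itself without an OS error."""
    try:
        # Renaming a path to itself changes nothing on disk. On Windows this
        # typically fails while another process holds the file open.
        os.rename(path, path)
    except OSError:
        return False
    return True


# Usage: a closed temporary file should report as free.
handle = tempfile.NamedTemporaryFile(delete=False)
handle.close()
print(path_is_free(handle.name))
os.unlink(handle.name)
```

On Linux there is generally no mandatory locking, so a rename like this tends to succeed even while another program is still writing the file. A modified-time check, as suggested above, is one way to cover that case.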
>>20768 >>20769 Your 'repopulate truncated mappings tables' call is probably all you have to do. You don't have to run the 'regenerate mappings cache' calls--they regen stuff in client.caches.db from stuff in client.mappings.db, whereas the 'repopulate truncated' call does the reverse. You don't have to worry about orphan files, only some missing tag mappings. If you sync with the PTR, you might have lost one or two mappings, but nothing for any files you have right now, and it probably isn't a big deal. Keep an eye on things, see if any autocomplete tag results have any completely whack counts. Most important, I'd say, is figure out how the client.mappings.db disappeared. I assume you know what happened, some storage system issue you have a handle on, but if you don't, then it is time to check your hard drive health. Hydrus will never delete its own db files. >>20776 Thank you, this is useful info, and the transparent apng examples are great. My native apng rendering solution works through an RGB ffmpeg bridge right now, so I just can't detect an alpha channel yet, but I'll play around with these and see what I can do. >>20777 Hydrus can view EXIF manually, but it cannot parse from them yet. This will probably happen in the medium term future. When we do, we'll support retroactive parsing for files you already have, absolutely (there are ways to hack this anyway, no worries). That said, in general, it is not in the hydrus style to alter files before you import them. If these files are personal photos or something, that's fine, since you are probably the only person who is tagging them, but if these are standard booru anime babes, then changing the files' EXIF changes the file content and thus their file hashes, which means hydrus cannot match them up with copies that already exist on the web, and thus a lot of tag-matching tech doesn't work. 
Same deal for various file resizing and 'optimising' routines one can run--although they might save 5% hard drive space, it isn't worth the new hashes. You might want to try this tech out with a hundred files and see if it works for you, depending on how much you want to download similar files from other sites online, and how many annoying duplicates you end up with if so. Any chance you can get the AI to spam the tags to a sidecar 'filename.txt' file? We have pretty good sidecar import tech that will let you parse these tags straight into hydrus right now. Help here: https://hydrusnetwork.github.io/hydrus/advanced_sidecars.html >>20778 >>20781 Sorry, I screwed something up. This shouldn't super break anything, but your watcher page might not check some pages. You might like to try hitting 'check now' on all your watchers and then restarting the client. I have it fixed for v554 in any case. Thank you for the report.
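A small sketch of why editing metadata breaks hash matching: any change to the file's bytes yields a different digest. SHA-256 is used here as an assumed example hash, and the byte strings are stand-ins rather than real image data.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Return the hex SHA-256 digest of some file content."""
    return hashlib.sha256(data).hexdigest()


# Stand-in bytes for a file as downloaded, and the same file after a tool
# has written extra metadata into it.
original = b"\xff\xd8 pretend image bytes"
edited = original + b" extra metadata written by a tagging tool"

hashes_differ = sha256_hex(original) != sha256_hex(edited)
print(sha256_hex(original))
print(sha256_hex(edited))
```

The pixels may be identical, but as far as a hash lookup is concerned these are two unrelated files.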
>>20779 >>20780 Thanks. I've talked with a couple guys before about smooth scrolling. I'm sorry to say that when I last looked at it, I didn't see a super nice way to do it in Qt, so I'd have to write the animation myself. This isn't impossible, but that particular area of code is full of old hacks and needs a hell of a lot of cleaning before I can write a pretty extension to it, so I can't promise this in any reasonable amount of time. That said, I know what you mean about losing your place. You might like to play with the 'EXPERIMENTAL: ... per scroll tick' setting under options->thumbnails. Try setting it to 0.5 or 0.25, and it'll simply make your scrolling more granular (and slower) in a hacky way. Having pagination work that way for system:limit is pretty smart! I'll think about this seriously, as I can hack that into the db tech really quite simply. The UI would be the most tricky part.
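The paging idea for system:limit maps naturally onto SQL's LIMIT and OFFSET. A minimal sketch with an in-memory SQLite database follows; the `files` table, the `file_id` column, and `fetch_page` are made up for illustration and are not the hydrus schema.

```python
import sqlite3


def fetch_page(conn: sqlite3.Connection, page: int, page_size: int) -> list:
    """Return one zero-indexed page of file ids in a stable order."""
    offset = page * page_size
    rows = conn.execute(
        "SELECT file_id FROM files ORDER BY file_id LIMIT ? OFFSET ?",
        (page_size, offset),
    ).fetchall()
    return [row[0] for row in rows]


# Hypothetical table with ten files.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE files (file_id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO files VALUES (?)", [(i,) for i in range(1, 11)])

print(fetch_page(conn, 0, 4))
print(fetch_page(conn, 2, 4))
```

Paging only makes sense with a fixed sort order, so the query sorts before applying the limit. Incrementing the "page" parameter on the predicate would then just bump the offset.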
>>20783 Well except it won't, as I actually stated. It will only keep the fragment if it's the url you're pasting into the downloader, not for associable urls given by the parser, which means every file it downloads will have the same fragment.
>>20782 >>20786 >The 'focused' file should be the one you currently see in the preview viewer. I wonder what is a better way of saying that? Highlighting it differently. It is also not easy to notice that a file is selected if the aspect ratio is similar to that of the thumbnail.
>>20787 >The splash screen (the little window with hydrus icon where is says like 'booting database...') doesn't turn up at all? That's correct. >It just sits as a hydrus_client.exe process in Task Manager forever, until the drive appears? That's correct. It also uses a lot of CPU while it's waiting, for some reason.
>>20788 >Most important, I'd say, is figure out how the client.mappings.db disappeared. I assume you know what happened, some storage system issue you have a handle on, but if you don't, then it is time to check your hard drive health. Hydrus will never delete its own db files. On the day preceding the update, I made a copy with reflink. That day I also added many tags using the wd ai. Next day, I made another copy, and tried starting either with 552 (or 552a) or a source code 553, but something went wrong. It said 64 tables (or something like that) were deleted. Starting 552 also failed because of a problem with a library or something. Later I found that client.mappings.db is small. The last copy had either no client.mappings.db, or a 0-byte one. Currently a copy with "broken" in its name has only a log file, which could be due to some mistake of mine.
>>>20747 >>>20754 >Thanks, this is interesting. The test I do is basically 'rename this file from "a" to "a"' (i.e. not actually changing anything), which is a little trick you can generally use to tell if a file is in use by another process. What if it's owned by a different user? > or maybe that trick doesn't work on a NAS or similar (these system-level calls and their locking prerequisites are often faked over a network connection). It's on Linux. The file is being copied by a file manager copying the file onto the local fs where the import folder is.
>>20786 >To answer your actual problem right now, I think you'd want to export the file (drag and drop to your desktop or something), then edit it there, then import the file. If you like, you can put it beside the old version and the select both and hit the menu and manage->file relationships->set relationship->set this file as better than the other, which should merge according to your current dupe settings in F9->special->duplicates processing. Not great, but it will work and the database won't sperg out and think something was corrupted. Hopefully internal handling of edited files comes in the next few years so I can fix typos in things like copypastas and writefaggotry I happen to notice. Same with handling of txt files, which I currently just hide inside archive files and then tag as meta:text file.
Is there a list of sites with notes about whether the files are worth tagging in the PTR if you don't redistribute them?
>>20785 >If this is just a human decision though, sorry, no good way atm. Yeah, it's that. For context, they're 4/8chan parsers, one downloads images normally, the other one searches for catbox links and images with catbox script compatible filenames. >>20786 >Sorry for the confusion. The 'focused' file should be the one you currently see in the preview viewer. I wonder what is a better way of saying that? As >>20786 said, a different highlight color probably. Also having the highlight be a slight overlay, instead of background color behind the thumb, would be nice too. >>20787 >A veto can do some conditional stuff with 'subsidiary parsers' too, basically saying something like 'if the user is logged in, use these content parsers, but if they aren't, then use these'. How do you actually do that? Is it by putting a veto in both subsidiary parsers that exclude each other? Like both having the same condition, but one triggering when match found and the other when match not found? >>20789 >You might like to play with the 'EXPERIMENTAL: ... per scroll tick' setting under options->thumbnails. Try setting it to 0.5 or 0.25, and it'll simply make your scrolling more granular (and slower) in a hacky way. That feels a bit better, thanks.
(177.13 KB 824x155 26-19:05:18.png)

(17.29 KB 1031x115 26-19:06:52.png)

>>20797 You can set a border for file selection so you can always tell what's selected, I set mine to green, and the background to pink, I never miss a selected file anymore. It's under options>colours
>>20798 Good idea, thanks doc.
>>20798 can you post that teraurge file on /b/? I wanna know what it says and how it related to teraurge.
>>20798 Thanks. Maybe that pane should open with the tab of the current colourset selected.
>>20786 >By default, I think shift+delete does undelete, so is there any chance you have shift pressed in some way? That could be it.
>>20800 I can post it here, it's sfw. It just happened to get posted in a thread about Teraurge and I'm lazy and usually tag entire threads with the franchise, at least until I notice it.
Ever since I updated to 553, none of my Pixiv subscriptions work. They're in a constant state of "waiting to start". Yet, when I manually download a Pixiv post, it downloads the image(s) just fine. What gives?
>>20804 I'm fairly positive that I haven't touched anything that would cause it to permanently wait to start.
>>20805 Is nobody else having this issue?
(16.86 KB 298x418 1700849803622655.jpg)

v553 still can't start on wayland troonbros, waylandoids, not like this http://sprunge.us/gF7eky
Not even a fresh install works. Subscriptions are broken. I've ported my cookies. If I manually use Hydrus Companion to download an image from a site, i can do that no problem. Every subscription I have no longer works.
>>20774 >context variable Thanks. Sadly, it cannot download from a URL it can parse.
>>20809 or parse a URL it cannot download, so there is no workaround.
I can't be the only one having this problem. I've tried everything I can think of; I even reset my modem.
Why is it that the 'title:' namespace seems to be used for meme captions and other text in the image?
Okay, for anyone else having the issue I had: you have to manually change the checker options by selecting a different preset. It will not let you uncheck "just check from a static interval"; when you do so and apply, it ignores you and resets it to checked. You have to select a different preset, and then it will work.
I had an excellent week. I fixed some important bugs--including a subscription/watcher start bug, and even more damaged-file rendering issues--figured out transparency for APNGs, and wrote basic recognition and thumbnail generation for CBZ and Ugoira files. The release should be as normal tomorrow.
>>20814 CBZ support would actually be a game changer for me, can't wait!!
I think a cool feature to add to Hydrus, given the downloader-centric nature, would be a sort of "tag health" center where Hydrus could give you an overview of the tags in your db, and maybe do things like show you tags that have a low count in your db and might be misspellings, or cases of 2 tags having a very high co-occurrence rate, so they might be siblings, and things like that. I don't have many firm ideas of how it would actually look or work, but just some kind of tag health and management facility would be cool.
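A rough sketch of the two checks suggested above, run over a plain mapping of file to tags. Everything here is hypothetical: the data layout, the function names, and the thresholds are assumptions for illustration, and a real database with millions of tags would need something far more efficient.

```python
from collections import Counter
from itertools import combinations


def tag_counts(file_tags: dict) -> Counter:
    """Count how many files carry each tag."""
    counts = Counter()
    for tags in file_tags.values():
        counts.update(tags)
    return counts


def rare_tags(file_tags: dict, max_count: int = 1) -> list:
    """Tags used on very few files, which may be misspellings."""
    return sorted(t for t, n in tag_counts(file_tags).items() if n <= max_count)


def cooccurring_pairs(file_tags: dict, min_ratio: float = 0.9, min_count: int = 2) -> list:
    """Tag pairs that almost always appear together, as sibling candidates."""
    counts = tag_counts(file_tags)
    pair_counts = Counter()
    for tags in file_tags.values():
        pair_counts.update(combinations(sorted(tags), 2))
    result = []
    for (a, b), both in pair_counts.items():
        # Ratio of files with both tags to files with either tag.
        either = counts[a] + counts[b] - both
        if both >= min_count and both / either >= min_ratio:
            result.append((a, b))
    return sorted(result)


# Made-up sample data.
sample = {
    "f1": {"samus aran", "metroid"},
    "f2": {"samus aran", "metroid"},
    "f3": {"samus aran", "metroid", "blonde hair"},
    "f4": {"samus arn"},
}
print(rare_tags(sample))
print(cooccurring_pairs(sample))
```

A low count on its own is weak evidence, so a real tool would probably pair this with a string-similarity check against common tags before calling something a misspelling.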
>>20814 Sorry that I have to ask you again. I can't even run Hydrus under X11 anymore. Clean install doesn't seem to help either. Here's the error I get upon trying to setup venv sh
Collecting python-mpv==1.0.3 (from -r requirements.txt (line 26))
  Using cached python_mpv-1.0.3-py3-none-any.whl (44 kB)
Collecting requests==2.31.0 (from -r requirements.txt (line 27))
  Using cached requests-2.31.0-py3-none-any.whl.metadata (4.6 kB)
Collecting QtPy==2.3.1 (from -r requirements.txt (line 29))
  Using cached QtPy-2.3.1-py3-none-any.whl (84 kB)
ERROR: Ignored the following versions that require a different python version: 1.21.2 Requires-Python >=3.7,<3.11; 1.21.3 Requires-Python >=3.7,<3.11; 1.21.4 Requires-Python >=3.7,<3.11; 1.21.5 Requires-Python >=3.7,<3.11; 1.21.6 Requires-Python >=3.7,<3.11; 6.0.0 Requires-Python >=3.6, <3.10; 6.0.0a1.dev1606911628 Requires-Python >=3.6, <3.10; 6.0.1 Requires-Python >=3.6, <3.10; 6.0.2 Requires-Python >=3.6, <3.10; 6.0.3 Requires-Python>=3.6, <3.10; 6.0.4 Requires-Python >=3.6, <3.10; 6.1.0 Requires-Python >=3.6, <3.10; 6.1.1 Requires-Python >=3.6, <3.10; 6.1.2 Requires-Python >=3.6, <3.10; 6.1.3 Requires-Python >=3.6, <3.10; 6.2.0 Requires-Python >=3.6, <3.11; 6.2.1 Requires-Python >=3.6, <3.11; 6.2.2 Requires-Python >=3.6, <3.11; 6.2.2.1 Requires-Python >=3.6, <3.11; 6.2.3 Requires-Python >=3.6, <3.11; 6.2.4 Requires-Python >=3.6, <3.11; 6.3.0 Requires-Python <3.11,>=3.6; 6.3.1 Requires-Python <3.11,>=3.6; 6.3.2 Requires-Python <3.11,>=3.6; 6.4.0 Requires-Python <3.11,>=3.6; 6.4.0.1 Requires-Python <3.12,>=3.7; 6.4.1 Requires-Python <3.12,>=3.7; 6.4.2 Requires-Python <3.12,>=3.7; 6.4.3 Requires-Python <3.12,>=3.7; 6.5.0 Requires-Python <3.12,>=3.7; 6.5.1 Requires-Python <3.12,>=3.7; 6.5.1.1 Requires-Python <3.12,>=3.7; 6.5.2 Requires-Python <3.12,>=3.7; 6.5.3 Requires-Python <3.12,>=3.7
ERROR: Could not find a version that satisfies the requirement PySide6==6.5.2 (from versions: 6.6.0)
ERROR: No matching distribution found for PySide6==6.5.2
>>20817 Had the same error when upgrading to Fedora 39; was your default python version changed, recently? Fedora 39 upgrades the python and python3 executables to Python 3.12 for reference, which doesn't work with Hydrus yet, I believe. I use python3.9 for everything Hydrus now, which seems to be working well; you could try "python3.9 -m pip install -r requirements.txt" in this case. If this works for you, don't forget to update your start scripts to launch Hydrus with python3.9 as well! From the error message you shared, it might also work with python3.10, but I have not tested it.
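The pip output above lists PySide6 6.5.2 as "Requires-Python <3.12,>=3.7". A quick sketch for checking whether a given interpreter falls inside that range before building the venv; the function name is made up, and the bounds are taken only from that error text.

```python
import sys


def pyside6_652_supported(version_info=sys.version_info) -> bool:
    """True if the interpreter version satisfies >=3.7,<3.12."""
    major_minor = tuple(version_info[:2])
    return (3, 7) <= major_minor < (3, 12)


print(sys.version.split()[0], pyside6_652_supported())
```

Running this with each candidate interpreter (for example python3.9 or python3.12) shows which one pip would accept for the pinned requirement.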
how fucked am I?
(2.90 KB 569x130 Screenshot 1.PNG)

(11.80 KB 971x245 Screenshot 2.PNG)

>>20809 >>20810 oh, you're right. hmmm.... for now, one workaround might be to use gallery-dl to batch download the files, then use an import folder with sidecars to parse the url. gallery-dl can do direct image links. you just have to add a bit to the config so that it puts the url in the metadata. for example: screenshot 1 is my gallery-dl.conf file. i run "gallery-dl --write-metadata https://hydrusnetwork.github.io/hydrus/images/example_client.png". screenshot 2 is the json file that gallery-dl creates that can be processed with sidecars.
>>20818 okay, I think I'm just gonna wait then kek. bullseye on your stipulation btw i might still try it if i get impatient enough
https://www.youtube.com/watch?v=HUVtoWIeyp8

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v554/Hydrus.Network.554.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v554/Hydrus.Network.554.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v554/Hydrus.Network.554.-.macOS.-.App.dmg
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v554/Hydrus.Network.554.-.Linux.-.Executable.tar.zst

I had an excellent week. Some important bugs are fixed, and we have some basic support for CBZ and Ugoira files.

Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html

bugs

I screwed up two important calls last week, in fixed-period checker timers and job statuses. If you had watchers not start, subscriptions with incorrect check times, and/or popup messages that threw display errors or wouldn't auto-dismiss, I am sorry! I didn't plan the changes here properly, and several things slipped through my tests. All the affected systems have been given proper rework this week and have some unit tests to make sure this doesn't happen again. Please let me know if you still have any problems.

animations

I fixed up more of the 'while checking for transparency, this file produced x error!' issues. Checking GIFs for transparency should be a bit faster and more fault tolerant too, now.

Also, the native GIF renderer has much improved transparency support. The multilayered 'noclip' artifacts should be gone, and damaged GIFs will recover better. GIFs also now get thumbnails that are x% in with proper transparency.

Also, APNGs now get transparency: when rendering in the native renderer; in their thumbnails (which are also now x% in); and for 'has transparency' checks.

CBZ and Ugoira

I am adding basic recognition and thumbnails for these filetypes today. Behind the scenes, both formats are essentially just zips with a list of images, so all your zips will be scanned, and if they look like a CBZ or Ugoira, their filetype will change and they will get a thumbnail. Ugoira thumbnails will be x% in, like for other video. Also, a user is working on true Ugoira rendering now, so I hope we will be able to finally roll this out in the medium term.

Unfortunately, neither format has a particularly definitive/unique specification, so while I have tried to be careful, my tests here are imperfect. We can expect a few incorrect determinations one way or the other. If you get an outrageous false positive or false negative here (e.g. something you know is a Ugoira that stays as a ZIP, or a ZIP of misc files that detects as a CBZ), please send in the details, and I'll see if I can tweak my tests.

next week

I put in extra time this week to figure out CBZ, so I'll let things breathe and just catch up on simple, small work. I have a bunch of interesting quality of life UI items in my immediate todo, so I'll probably focus on that!
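Since both formats are described as zips holding a list of images, a naive "does this zip look like a CBZ" check can be sketched with Python's standard `zipfile` module. This is a guess at the general shape of such a heuristic, not the actual hydrus test; the extension list and function names are assumptions, and it makes no attempt to tell a CBZ from a Ugoira.

```python
import io
import zipfile

# Assumed set of image extensions for this sketch.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def list_files(zip_source) -> list:
    """Return the names of all non-directory members of a zip."""
    with zipfile.ZipFile(zip_source) as zf:
        return [info.filename for info in zf.infolist() if not info.is_dir()]


def looks_like_cbz(zip_source) -> bool:
    """Naive heuristic: a non-empty zip whose members are all images."""
    names = list_files(zip_source)
    return bool(names) and all(n.lower().endswith(IMAGE_EXTENSIONS) for n in names)


def thumbnail_candidate(zip_source):
    """Pick the first image by name order as a possible thumbnail source."""
    images = sorted(n for n in list_files(zip_source) if n.lower().endswith(IMAGE_EXTENSIONS))
    return images[0] if images else None


# Usage with a small in-memory zip holding placeholder bytes.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("002.png", b"placeholder")
    zf.writestr("001.jpg", b"placeholder")
print(looks_like_cbz(buffer), thumbnail_candidate(buffer))
```

A strict "all members are images" rule would produce the kind of false negatives mentioned above (for example an archive with a stray text file), which is why any real test needs some tolerance.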
>>20822 >Behind the scenes, both formats are essentially just zips with a list of images, so all your zips will be scanned >or a ZIP of misc files that detects as a CBZ Does this mean any .zip files I have that are just images can get the first image turned into a thumbnail now? Or is this just an accident that may only affect some zips at random? I've been waiting for archive thumbs like this for a while. All galleries downloaded from the sadpanda come in zips, and if I can get them to display the first page as thumbnails, Hydrus will become useable for importing doujins en masse.
>>20823 This whole thing is more of an experiment, this week. Try importing some of these files--I'm pretty sure they will be detected as CBZ and get thumbs. I wouldn't suggest moving completely off ComicRack or Calibre or whatever you currently use just yet, as hydrus still doesn't have proper internal browsing tech or anything to actually read the thing, but I'll be interested to hear how this overall goes for you, particularly if any turn up as Ugoira by false positive. I'll keep working on this. I've already got a note to add user-forced filetype override here for users who don't want file A to be a CBZ, and want to force it to stay as a zip. I've also seen some false negative 'Ugoiras' that also have a .gif conversion inside--they technically aren't Ugoiras, but I'll write a loophole to allow them.
>>20822 v554 told me to send some files to you due to DamagedOrUnusualFileException very nsfw apngs https://gelbooru.com/index.php?id=7789734&page=post&s=view https://gelbooru.com/index.php?id=8167700&page=post&s=view
>>20824 >as hydrus still doesn't have proper internal browsing tech or anything to actually read the thing Well there's competing programs and I had to look around for one that fit my preferences, eventually settling on CDisplayX. I don't think it would be too bad for you to just have it ask "open in external program" if you try to open one. Before I start importing any I want to figure out about making a new local domain/tag service to separate them and their tags, which should be simple. I'll also need to figure out how to make the sadpanda downloader apply those tags to the large number of zips I already have instead of redownloading them all and burning through currency. It's something I'm putting off until I'm done tagging the rest of my images sometime in the first half of next year.
>>20822 >Also, APNGs now get transparency: when rendering in the native renderer; in their thumbnails (which are also now x% in); and for 'has transparency' checks. I still have black backgrounds in the thumbnails. Also can you make the display transparency checkerboard option work on animated images too please? It displays it in the preview/viewer despite me having it turned off.
>>20827 Oh, I just had to regen the thumbnails. Stupid me, should have tried it before posting.
Some of you guys said you were having problems with the program being slow. I had the same problem. Vacuuming your data will clear it up. It defrags your database.
>>20829 Keeps freezing up. Let them all run for over 2 hours earlier before deciding to close Hydrus. Then ran the short ones individually very quickly without issue. tried External Master, and it seemed to freeze up, I only ran it for half an hour before getting impatient that time. I have nearly 400GB of free space.
>>20352 Can I query import time with human friendly strings? In https://hydrus.app I have the default tag search set to: >system:import time ~= 2023-12-02 Which returns all of my files that are approximately close to today's date going back about a month or so. I'd like to not have to update this though and be able to do queries like: >system:import time ~= today Or >system:import time = yesterday Or if you're feeling adventurous, something like: >system:import time ~= last thursday at 2pm How difficult would it be to support something like that? There's probably some python modules out there that can parse dates like that?
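For the simplest cases in the request above ("today", "yesterday", or an ISO date), a standard-library sketch is enough to show the idea. The function name `resolve_day` is hypothetical, and phrases like "last thursday at 2pm" would need a proper date-parsing library rather than this.

```python
from datetime import date, timedelta
from typing import Optional


def resolve_day(text: str, today: Optional[date] = None) -> date:
    """Turn 'today', 'yesterday', or 'YYYY-MM-DD' into a concrete date."""
    today = today or date.today()
    word = text.strip().lower()
    if word == "today":
        return today
    if word == "yesterday":
        return today - timedelta(days=1)
    # Anything else is assumed to be an ISO date; invalid input raises ValueError.
    return date.fromisoformat(word)


print(resolve_day("yesterday", today=date(2023, 12, 2)))
```

The key point is that the keyword is resolved at search time, so a saved default search would not need editing every day.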
>>20792 Thanks. Should be 'fixed' now, in that it'll cancel the boot and just close the process if the desired db folder does not exist. I think it'll write a 'crash log' on your desktop too, which sucks, but at least the error text in it is better now. I'm going to figure out a way of booting the UI despite a cancelled-boot situation so I can present these very early errors to the user visually in future. >>20794 Thanks. I think and hope these 'file is in use' tests work better in v554, and there's also the modified date filter like you suggested. Let me know if you have any more trouble with it! >>20796 I don't think so. The PTR help as written by the current administration of users is here: https://hydrusnetwork.github.io/hydrus/PTR.html I would generally say that if the files are very rare and not likely to be shared amongst our sort of weeb userbase, then no, don't tag them on the PTR. Anything private or so rare and obscure that it isn't worth sharing, just stick it on 'my tags' or another local tag domain you make. >>20797 >As >>20786 said, a different highlight color probably. Also having the highlight be a slight overlay, instead of background color behind the thumb, would be nice too. Thanks. I will plan to add this when I eventually get to heavy user customisation of thumbnail colours and icons and stuff. >How do you actually do that? Is it by putting a veto in both subsidiary parsers that exclude each other? Like both having the same condition, but one triggering when match found and the other when match not found? Yeah. If a subsidiary veto fires, then none of that subsidiary's basket of content parsers is elevated to the final list of results. I can't see an immediate excellent example of this in the defaults, but check the '4chan thread api' and 'Pixiv file page api' parsers for examples of this tech in general. 
You don't always have to veto, since if there is no file result, that usually nullifies the post too (although I don't remember, maybe other metadata gets promoted in some situations). I'll reiterate that this tech is ugly, and I don't like it. It solved a couple of tricky situations long ago, but it is now bodged into several downloaders it shouldn't be. Venture at your own risk. >>20804 >>20805 >>20806 (and >>20808 probably) I am sorry for this trouble. This hit a couple of users last week, specifically if you have fixed-check-period subs that haven't run in a while. Please get v554 if you haven't yet, you should be fixed. Let me know if you have any more problems.
>>20812 'title:' has always had a weird place. I originally wrote it for Hentai Foundry posts, but people started using it for filenames and other descriptions. I have come to regret it. Tags are for searching not describing, and if we want to store titles or other short descriptions, that should be in a different metadata system. Something longer than tags but shorter than notes. I am still thinking about this. >>20816 Great idea. Although hydrus copies many booru-typical features, one thing I have never implemented properly is a tag wiki, a la: https://safebooru.donmai.us/tags This is a surprisingly complicated system, and I'm sure this would be feature slim for a long time, but I probably should put a couple months aside in the coming years to get something working here. Sibling and Parent data would slot in very naturally, too. And UI like tag clouds would benefit from having the core system here to hang off. I'll mention in a btw, though, that in general hydrus tags are overwhelming. It is very easy to run into the hundreds of thousands or millions of unique tags, and trying to manage that with human eyes is tricky. Still, we have no way to 'review' tags right now other than crazy '*' searches in the autocomplete, so I should figure something out and we can iterate. >>20817 >>20818 Man, moving to Py 3.12 already seems ambitious. I'm still worrying about ironing out 3.11 bugs. I think your best bet is to try the 'advanced' setup_venv path and then select a different Qt version. I'm no expert on this, but it looks like that error (and the page here https://pypi.org/project/PySide6/6.6.0/) suggest 6.6.0 is the first PySide6 release that supports Py 3.12, and the new '(t)est' Qt is 6.6.0, so just follow the advanced setup and select (t) Qt, and I think it'll work! I can't promise that you won't get a different error down the line with OpenCV or something though, so this might need a couple attempts. Maybe just try (t)est/(n)ew for all the available options. 
>>20819 No big worries, this happens in some weird situations. Try running the 'reset downloading->do a full metadata resync' job on that same review services panel. Then maybe hit 'refresh account' to jog it back alive. Your client will do a bunch of resyncing and fingers crossed it'll fix itself. If not, there's a hacky solution that I know has worked for a couple of users: - hit help->advanced mode - open a new file search page, change the 'file domain' from 'my files' to 'repository updates' - search system:everything. you will see a bunch of hydrus-icon files - ctrl+a, delete everything - restart the client - do the metadata resync again Let me know how you get on! I have tried to fix this thing automatically about six times now and I still can't figure out the last case that that 'delete and redownload all updates' hack represents. >>20825 Thanks! You can delete those files if you like, or hang on to them and hope mpv and ffmpeg patch and fix their rendering. >>20826 Yeah, I'm hoping one benefit of simply recognising CBZs (and renaming the .cbz) is now you can super easy link to the external program for them and launch them into there, whereas before it was obviously all zips mixed together, opening in 7z or whatever. >>20827 >>20828 No worries. I should make thumbs auto regen after a 'has transparency' switch anyway. I'm afraid I can't turn off the checkerboard if you are playing the file in mpv--that's built-in in some way. Maybe the mpv.conf has a way to turn it off, although I can't find an appropriate entry in here https://mpv.io/manual/master/ . >>20829 >>20830 Thanks, this is useful info. In the old days, I used to hit this like every 60 days, but as the vacuum time grew with our database sizes, I eventually stopped all automatic work. You can imagine how long it takes to do a 50GB client.mappings.db. Maybe I should highlight this more, when the date becomes 'you haven't done a vacuum in a year', and prompt the user to check it out. 
Can you say if you are on an SSD or HDD? My '90 seconds to an hour' estimate was obviously guessing too fast in your case. >>20831 Not yet, but we have exactly the library to handle this in the download parsing system, I just need to pipe it into the system predicate parser!
>>20833 >Can you say if you are on an SSD or HDD? My '90 seconds to an hour' estimate was obviously guessing too fast in your case. HDD, but the guesses on the shorter ones were correct. I think something is causing the big one to hang and cease doing any work. It'll probably fix itself once I get an SSD and more RAM.
Here's an idea I'm sure has come up before: tag definitions. Sometimes I forget exactly what a tag is supposed to mean when dealing with edge cases and have to pull up some external wiki on a booru page to try and figure out what it should really be used on. It would be nice if I could write my own 'definitions' so that I could just like right click on a tag and hit 'definition' and a box with what I wrote pops up. Maybe it could integrate with the PTR as well so they can clarify what tags are good for what images.
Is there a way to mass delete all files under a tag rather than actually rendering it? Large deletions seem problematic due to hydrus rendering all of the media.
>>20836 you could always use system:limit and just delete in batches
>>20837 Still going to take a while with this process. I feel as if we should have a mass delete process under manage scheduled jobs
>>20835 >tag definitions >Sometimes I forget exactly what a tag is supposed to mean when dealing with edge cases Then namespaces might be what you are looking for. For example, "Brown" has many meanings, but with namespaces its definition is explicit: color:brown name:brown individual:brown code:brown condition:brown character:brown See: https://hydrusnetwork.github.io/hydrus/faq.html#namespaces
>>20839 Things can get far more autistic and specific than just colors. Like names of various sex positions which could be confused for other sex positions by the uninformed.
>>20833 >>20829 >Can you say if you are on an SSD or HDD? My '90 seconds to an hour' estimate was obviously guessing too fast in your case. Mine's on an HD. It was starting to run horribly slow, so I vacuumed. It had been about 9 months since my last vacuum. It took about 8 hours for the biggest db, and I got a "not responding" on hydrus for that time. But afterwards, it ran MUCH faster, and the lesser db's only took a few minutes each to vacuum. My Hydrus Database install is about 40GB now. Around 2 Million files.
>>20840 >Like names of various sex positions I don't think I want to know what Sex Position: Brown is about. :p
":" in tags affects autocomplete, so at first I thought tags were hierarchical and created many tags with colons, together with siblings and parents. Now I want to change the colons to "-", and both the autocomplete difference and the time needed to open the siblings window are making it difficult.
>>20841 >It took about 8 hours for the biggest db Damn. I keep forgetting to run it overnight.
v553, 2023/11/30 15:10:08: Attempting to download an update for public tag repository resulted in a network error: v553, 2023/11/30 15:10:08: 404: {"error": "This update hash does not exist on this service!", "exception_type": "NotFoundException", "status_code": 404, "version": 56, "hydrus_version": 553} so should I just export my local tags and reset the PTR entirely or can i somehow "roll back" to a valid hash?
>>20843 >the time needed to open the siblings window are making it difficult This is a big annoyance of mine too. I wish there was a faster way to apply tag relationships, or that these parent and sibling windows did some kind of "lazy loading" so that if all you want to do is make a new relationship, it can be done quickly.
>>20833 >>>20812 > 'title:' has always had a weird place. I originally wrote it for Hentai Foundry posts, but people started using it for filenames and other descriptions. I have come to regret it. Tags are for searching not describing, I think text within speech bubbles should be in "text" and "meme". This should help you find reaction images. It is just a title that it often does not seem like.
>>20839 namespace bloat does not aid understanding, it just makes things more complicated. more spinning plates that you have to keep consistent.
>>20840 >Things can get far more autistic and specific than just colors. I know, some of my namespaces are 4 words long and the tags might get 10 words long. shoe in mouth:brown nylon bounded:brown interlocked legs:red, orange, and brown >>20848 >namespace bloat does not aid understanding Yeah, it can get messy fast.
(56.43 KB 264x310 confused pumpkin.webm)

>>20849 Can you explain the logic behind those namespaces? And especially behind the last one and its tags?
>>20850 >Can you explain the logic behind those namespaces? The logic is totally autistic and makes sense only to the autistic author. That said, autistic psychoanalysis is not under this thread scope. >And especially behind the last one and its tags? Referring to the tag; it is meant to pop up when searching for red, or for orange, or for brown.
First cbz problem potentially found? Not sure what's causing this. I transferred them from cbr to cbz using comicrack, so they should all be correct. The rest registered correctly.
>>20851 Oh, so you just made those up, or found them in the PTR? Nevermind.
(118.10 KB 600x394 rttyy7fa.png)

>>20853 That's rite. Totally made up autistic inspiration.
I don't know if this would be possible but it would be cool to have an export as option, giving you the ability to export as an alternate filetype. i.e. png>jpeg mp4>webm m4a>mp3 etc.
(73.55 KB 498x306 1701796210.mp4)

>>20833 >I'm afraid I can't turn off the checkerboard if you are playing the file in mpv--that's built-in in some way. Maybe the mpv.conf has a way to turn it off, although I can't find an appropriate entry in here https://mpv.io/manual/master I found the --alpha argument. I tried putting "alpha=yes" into mpv.conf and it doesn't seem to be working (makes the bg black though), so instead I changed it to "alpha=blend" and added "background='#343434'" (the dark theme media bg color), so it looks like it's transparent.
>>20856 Thanks for finding this. While we're talking about mpv, is it possible to make the scanbar be below the frame instead of overlapping it?
When you compare 2 files in the duplicate filter and mark them as same quality, then later one of the files gets compared and marked as worse quality and deleted, does the other same quality file also get deleted?
>>20857 Don't know about that, probably not. But you can make the bar disappear if your mouse isn't hovering over the video in options > media and at the bottom check "no, hide it" near "animation scanbar height while mouse away".
It looks like that comic archive detection might be faulty. Is it supposed to detect zips with video files in it as being comic archives?
When doing a "number of tags" search. Is there a way to exclude namespaces from being counted? I see that you can search for the number of tags in a specific namespace, but I want the opposite. I want the number of tags, except for a few namespaces.
So I've been using the Kemono.su downloader someone posted earlier in this thread (or maybe the last one), since the default one stopped working when the site changed to .su from party. I don't get clickable URLs like I do with pretty much everything else, though. Anyone know why, or got a solution?
>>20862 Go to network > downloaders > manage downloader and url display, switch to the media viewer urls tab and look for kemono.su post and flip it to yes. If you don't have kemono.su there, you'll have to import that downloader again.
(18.50 KB 448x419 kemono su.png)

>>20863 Post url is already set to yes.
Right now, we can download multiple tags from one site. Could you make a Gallery Download option where, instead of supplying multiple tags, we could supply multiple sites? Then, instead of specifying one site to download from, we would specify one tag to download. Kind of a reverse of the normal Gallery Downloader. Sometimes I want to download just one tag across about 10 different sites, but I have to set it up 10 times, once per site. It would be nice if I could just set the tag and then supply a list of sites (downloaders).
>>20865 See, >>20512 >>20522 Also, you would often run into issues with sites using slightly different tags, with tags only matching across sites for the most generic of tags which you ought not want to grab en masse.
I had a good week. There's some cbz/Ugoira follow-up, nicer system:time parsing, much better boot error-handling, and some improved UI quality of life. The release should be as normal tomorrow.
Hi, I've got 2 questions. 1) Is it possible to make Hydrus list all files with pending tags? 2) Does it matter, if I never commit my pending tags? Will syncing with the PTR work, even if you have thousands of pending tags in your local Hydrus? In the duplicate processor many PTR tags get copied to the better files. Most tags are probably useful, but a lot of tags are "meta:tagme" trash, that I would like to delete before committing.
I don't think ripping from sankaku idol works anymore without a login. Fixes?
Trying to drag a video file to a new tab so I can select it and its duplicate and mark one as better, but it won't let me because it thinks I'm trying to export it. This apparently happens for all files that I haven't given a filename: tag because the hash is too long. Might need to set a default subscription import option so everything gets a random filename attached.
>>20870 This is also going to be a problem when I actually do want to export said files for posting.
https://www.youtube.com/watch?v=BE4ptZRUJZo windows zip: https://github.com/hydrusnetwork/hydrus/releases/download/v555/Hydrus.Network.555.-.Windows.-.Extract.only.zip exe: https://github.com/hydrusnetwork/hydrus/releases/download/v555/Hydrus.Network.555.-.Windows.-.Installer.exe macOS app: https://github.com/hydrusnetwork/hydrus/releases/download/v555/Hydrus.Network.555.-.macOS.-.App.dmg linux tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v555/Hydrus.Network.555.-.Linux.-.Executable.tar.zst GET I had a good week. There's some cbz/Ugoira follow-up, nicer system:time parsing, and much better boot error-handling. Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html highlights The CBZ/Ugoira stuff last week went ok! We had a few too many false positive Ugoiras, so that test is tightened up, and those files should soon become CBZs. For false positive CBZs (i.e. just a zip of CG images or similar), we don't have a good automatic solution, but I still plan to roll out 'force this file to be this filetype' tech in future so we can, amongst other things, assign these fixes manually. The various 'system:archived since xxx' predicates' parsing is now plugged into the same excellent date parser we are using in the downloader system. If you type them in, they'll now take all sorts of date phrasing. Try 'since 01/05/2016' or 'before june 2022' into the autocomplete--you may even be able to use your own language! There is still a little work to do with the 'since/before x time units ago' variants, though. In some unclear cases (including foreign languages), the before/since may be flipped to what you type. Also, I have decided to soon migrate these predicates to just store days and hours (no more year/month). 
You can still enter '1 year ago' with this new parser, but on my end, trying to calculate leap years and weird month durations has caused too many problems, so I am going to simply pull back over the near future and let you put 365 or 30 in yourself! In any case, give this stuff a go and let me know how you get on. When the hydrus client fails to boot really early on, before the main UI system is live, it should now nonetheless pop up a dialog saying what happened! The only way this will fail is if the problem with the boot is the Qt UI library, lol. In either case, the 'hydrus_crash.log' file is still made on your desktop. After talking it out with users, I have decided to move towards dropping the image library OpenCV from the program. It has served us well, but it is often difficult to install and a bloat, and our flexible alternative, Pillow, works extremely well these days. I'm not ready to flick the switch yet, but we have done work here and there, and if you would like to help me test this out, please hit the 'IN TESTING: Load images with PIL' checkbox under options->media and let me know if you have any images that suddenly load incorrectly. The String Splitter and Joiner objects in the parsing system now accept \n and \t for newline and tab. If you need to split or join by \, use \\. To not break any existing parsers, existing objects that have a \ have been updated to have \\. next week I only have two more weeks in the year, so I'll just do some cleanup and little jobs.
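The \n, \t, and \\ handling described for the String Splitter and Joiner can be sketched roughly like this. This is only an illustration in Python, not the actual hydrus code; the function names `decode_separator` and `upgrade_legacy_separator` are hypothetical, and keeping unknown escapes literally is an assumption.

```python
def decode_separator(raw: str) -> str:
    """Turn the typed escapes \\n, \\t, and \\\\ into the real characters."""
    escapes = {"n": "\n", "t": "\t", "\\": "\\"}
    out = []
    i = 0
    while i < len(raw):
        ch = raw[i]
        if ch == "\\" and i + 1 < len(raw) and raw[i + 1] in escapes:
            out.append(escapes[raw[i + 1]])
            i += 2
        else:
            # Assumption: unknown escapes and a lone trailing backslash stay literal.
            out.append(ch)
            i += 1
    return "".join(out)


def upgrade_legacy_separator(raw: str) -> str:
    """Double every backslash so a pre-existing separator keeps its old literal meaning."""
    return raw.replace("\\", "\\\\")
```

For example, `"a\tb".split(decode_separator(r"\t"))` splits on a real tab, while a separator that was upgraded first still decodes back to the literal text it had before.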
>>20829 Let my vacuum run overnight. Freezes still happen. Just got to get a real computer. My browser was hanging as well just before bed, as I used it a little right after starting the vacuum, which isn't normal. Both normalized by morning.
>>20872 May the blessings of Hans Camenzind and ideally Santa be with you, my dear dev-kun.
>>20818 my meme addiction withdrawal symptoms were gnawing on my mind too much, so I managed to somehow get it working, even though I don't really understand how multiple python versions are stored and even though there's still only a dir called "python3.12" somewhere in the hydrus venv dir. Well, I just edited hydrus_client.sh and now it works under X11 again. I hope that I can just use python 3.12 again somewhere in the first half of 2024. :) Merry holidays and take it easy my dudes. And thanks very much for taking the time to instruct me on how to get it running again.
Hi. I'm having trouble making a parser. Can you help? 1. https://www.listal.com/viewimage/22700074 2. https://www.listal.com/viewimage/22700074h Link 1 is a random listal.com image post. If you click the image, you are taken to a similar page with the higher-res version of the image (link 2). What I want to do is download the high-res image only. As well as this, I want to associate both urls with the image. What I thought I could do was use the referral URL function. My thinking was that, when I add link 1 to hydrus, it would make a 'referral URL' with the 'h' appended, request the modified URL and scrape the high-res image link, download the image, and associate both the original and modified URLs with the image. However, so far it only downloads the lower resolution image and doesn't appear to try the high-res link. Did I completely misunderstand referral URLs? If so, what exactly does the referral URL function do? And how do I get link 1 to automatically download the image from link 2?
>>20864 problem is the parser, the associate URL wasn't updated for the /api/v1 change.
Can you explain? I don't understand :(
>>20877 >>20878 Oops sorry. Didn't realize at first that you were talking to someone else.
>>20876 I think referrals are for when you need to tell a web page that you are visiting from somewhere else, not for redirecting. Anyway, what I would try doing is making two post url classes, one with the h and the other without, then set an api redirect from the lowres url class to the highres one by appending an h to the end, then link the highres one to your parser and leave the other unlinked. It's technically not an api, but I don't think hydrus has a way of knowing. In the parser try adding two associable urls using the context variable formula type (this pulls the current url) with a regex that removes the h at the end while the other adds it.
>>20876 Thanks
A watcher page has two "check now" buttons. The error message "all checkers are paused! network->pause to resume!" appears near the bottom one, which is likely to be hidden if there are about ten watchers.
I'm trying to update my client from v531 to the new v555, but I'm running into some kind of wayland issue. the output is: v555, 2023/12/09 15:52:02: hydrus client started ./hydrus_client: symbol lookup error: Hydrus Network/libQt6WaylandClient.so.6: undefined symbol: wl_proxy_marshal_flags I tried messing around with that file and copied the old one from v531, but a new error from a different file appears. Even after doing the same with that file, I just get a segmentation fault. v531 works fine so not sure what happened between those versions that broke this. Anything I can try to fix this?
>>20834 >>20841 Thanks, interesting. >>20835 Yeah. Have a look at my answer to >>20816 here >>20833 . I'd totally like a tag wiki. What features would you like most--just the text description? Anything else? >>20836 >>20837 >>20838 Yeah, the system:limit hack is the way right now. I hope and expect to write something like tags->migrate tags for arbitrary metadata update. You'd put a file search in and then set up the command and it'd all go through nice and asynchronously. It'll take some backend and UI work, but the various tools here will be useful for all sorts of other jobs, so I do think it'll happen naturally in time. Other option, if we are talking five million files, is using the Client API. There you can do whatever you like. >>20843 >>20846 When I have a bit of time and energy, getting manage siblings and parents to load their data incrementally, on demand, is going to be a high priority. I want those dialogs loading quickly and, if I can figure it simply, staying open while you use the rest of the program. >>20845 Damn, I keep seeing this and I still haven't figured it out. Try hitting 'reset downloading->do a full metadata resync' on your PTR review services page. But I think you'll probably have to do the dumb hack to fix this, which is:
- turn on help->advanced mode
- open new search, change 'my files' to 'repository updates'
- do system:everything, you should see a bunch of hydrus icon files
- ctrl+a, delete
- restart your client, hit 'refresh account' on your PTR review services page, it should redownload stuff and hopefully 404 problem is gone
>>20845 Oh yeah and let me know if it does/doesn't fix this! >>20852 Can you send me either those files (email or discord DM is fine), or post a screenshot of the internal filenames? My zip vs cbz scanner basically says 'are all the images named similar to each other, and fewer than X% of the files non-images?', so I'm guessing you either have random filenames or a whole bunch of xml and stuff (maybe that's normal for some cbr? or comicrack?). >>20855 Yeah. One day, I think we'll have this. If you'd like to read more, search the thread archive on >>>/hydrus/ for 'exe manager' or 'executable manager', where I talk about this more. >>20856 Awesome, thanks! >>20857 >>20859 Sorry, this UI is mine and it is in limbo a bit. I had to hack a bunch of things when we figured out the mpv player, and I am still in the midst of smoothing out some of the surrounding UI layout tech. There are some hacky options to change behaviour here and there, but I hope to eventually have it more customisable and less ugly in future. >>20858 No. This retroactive action should be possible in the far future, though. >>20860 Like for >>20852, can you send me screenshots of the file lists? I should probably write a detector that assumes any video file extension excludes it from being cbz, but seeing some examples would help.
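To make the zip vs cbz test above concrete, here is a rough sketch of that kind of check in Python. It is not the real hydrus scanner: the extension lists, the 20% default for the 'X%' threshold, the digit-stripping idea of 'named similar', and the names `name_shape` and `looks_like_cbz` are all assumptions for illustration. It also includes the proposed rule that any video file rules out cbz.

```python
import os
import re

# Assumed extension lists, for illustration only.
IMAGE_EXTS = {".jpg", ".jpeg", ".png", ".gif", ".webp", ".bmp"}
VIDEO_EXTS = {".mp4", ".webm", ".mkv", ".avi", ".mov"}


def name_shape(path: str) -> str:
    """Reduce a filename to its 'shape' by replacing digit runs with '#'."""
    stem = os.path.splitext(os.path.basename(path))[0]
    return re.sub(r"\d+", "#", stem.lower())


def looks_like_cbz(names, max_non_image_fraction=0.2):
    """Guess whether a list of zip member names looks like a comic archive."""
    files = [n for n in names if not n.endswith("/")]  # skip directory entries
    if not files:
        return False
    exts = [os.path.splitext(n)[1].lower() for n in files]
    # Proposed rule from the post: any video file means it is not a cbz.
    if any(e in VIDEO_EXTS for e in exts):
        return False
    images = [n for n, e in zip(files, exts) if e in IMAGE_EXTS]
    if not images:
        return False
    non_image_fraction = 1 - len(images) / len(files)
    if non_image_fraction >= max_non_image_fraction:
        return False
    # "Named similar to each other": every image shares one digit-stripped shape.
    return len({name_shape(n) for n in images}) == 1
```

The member names could come from `zipfile.ZipFile(path).namelist()`. Under these assumptions, an archive of page_001.jpg, page_002.jpg and so on with one ComicInfo.xml passes, while a zip of randomly named images or one containing a video does not.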
>>20861 No, I don't think so. I can probably figure out the search tech here, but what's the ideal UI look like, do you think? How would you like the workflow and interface to work here? I can't put loads of time into this, but maybe I allow you to comma-separate the namespaces in the little text field, and also allow you to hyphenate them to discount? >>20868 1) Yep: on a new search page, click 'include current tags', which will exclude them, and then search 'system:number of tags (has tags)'. It'll only search pending tags and show you everything. 2) Yep, should be fine. Any time someone on the PTR commits a tag you have pending, it'll be committed, and if that tag happens to get deleted later, it'll disappear for you, but nothing technically bad should happen. For various reasons, some people have had millions of pending waiting. It can slow some processing down, since there is more stuff to check against, but nothing should break. To solve your duplicate issue, I am sorry we don't have nice defaults here yet, but: in the duplicate page, hit 'edit default duplicate metadata merge options' and then under the various 'tag services' merge options, change the 'edit which tags will be merged' bit so it doesn't merge 'meta' and so on. Pic related. If you only want to exclude a handful of specific tags--or if it would simply be helpful--you can type in individual tags rather than the whole namespace. If you build up a nice list of 'these tags are bad for merging', I'd love to see it so I can integrate it into a helpful default here. >>20869 Shame. Maybe there's a hydrus downloader out there, but I'd recommend an external program like gallery-dl, if that works, or just look for the content elsewhere. >>20870 >>20871 Thank you, I will try and auto-cull the filelength here! The dirty secret is if you have 'copy temp files for drag and drop' under options->gui turned on, the program makes the actual copy any time you DnD anywhere. 
An unfortunate limitation of Qt is that I have to set everything up before the DnD starts, afaik. >>20882 Thanks, I will see if I can add that text for the whole-page button too! >>20883 Hmm, this seems to be different from the other recent Linux problems, which are related to a system-level python update. This looks more like a .so file conflict. This could fundamentally be the same basic thing--that your Linux or the github cloud Ubuntu updated and thus some .so in the build isn't happy with its complement in your system any more. As you've tried, sometimes copying an older .so will work, but these things can get complicated. I think roll back to the functional v531 for now, and then explore running from source, and if it works out well, transfer your db to that new install. I recommend this for all Linux users now, especially if they have any unusual stuff like Wayland. It really does smooth out most of the kinks. The help is here, it should be pretty easy to set up these days, let me know if you have any trouble: https://hydrusnetwork.github.io/hydrus/running_from_source.html Don't forget to make backups before you try/move anything around! And let me know how it goes regardless, please.
>>20886 >but what's the ideal UI look like, do you think? Since namespaces can't start with hyphens (Hydrus just removes the hyphen) prefixing the namespace with that could be how you say "I want to exclude this namespace instead of include it". As for how to separate namespaces so you could include or exclude multiple, I'm not sure what would be good here. I just tested, and it turns out that namespaces can contain both commas and semicolons, which is cool that you get that sort of flexibility, but then I don't know how else to separate the namespaces. I suppose since the 1 thing Hydrus never allows in tags is newlines, maybe having them newline separated like a tag search would work, but maybe that would be too much work for you to be able to do it. >maybe I allow you to comma-separate the namespaces in the little text field, and also allow you to hyphenate them to discount? I think something like this would be fine to start, the UI could be improved whenever you have more time to dedicate to it, but for now being able to type something like "-meta; -artist; -character count" into the field, and have that return a search for the number of tags except for those namespaces, would be very helpful for me.
>>20886 >>20887 Oh, and you can just make it so you can use backslashes to escape the commas or semicolons
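The suggestion in these two posts (a leading hyphen to exclude a namespace, a separator between namespaces, and a backslash to escape a separator that is part of the name) could look something like the sketch below. It is only an illustration in Python of how such a text field might be parsed, not anything hydrus does today; `split_escaped` and `parse_namespace_filter` are made-up names and the default ';' separator is an assumption.

```python
def split_escaped(text: str, sep: str = ";"):
    """Split on sep, letting a backslash make the next character literal."""
    parts, current = [], []
    i = 0
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text):
            current.append(text[i + 1])  # escaped character is taken literally
            i += 2
        elif ch == sep:
            parts.append("".join(current))
            current = []
            i += 1
        else:
            current.append(ch)
            i += 1
    parts.append("".join(current))
    return parts


def parse_namespace_filter(text: str, sep: str = ";"):
    """Return (include, exclude) namespace lists; a leading '-' means exclude."""
    include, exclude = [], []
    for part in split_escaped(text, sep):
        name = part.strip()
        if not name:
            continue
        if name.startswith("-"):
            name = name[1:].strip()
            if name:
                exclude.append(name)
        else:
            include.append(name)
    return include, exclude
```

With this, "-meta; -artist; -character count" gives three excluded namespaces and nothing included, and a namespace that really contains a semicolon can be typed as `odd\;name`.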
>>20884 >open new search, change 'my files' to 'repository updates' I think you lost me on this one, how do I change that? The first method did not fix it sadly
Is there something wrong with instagram username search? It never finds anything
Is the danbooru downloader not supposed to download ugoira webms by default? I noticed it only got the zip file and that there's a second parser that's supposed to get the webm. Do I just switch to that?
>>20892 Don't bother trying to download from Sankakuchan. You have to go through a bunch of hurdles, change headers, and even when you get it all set up, the URLs only last for an hour before they are no good anymore. It's just not worth it. We need a downloader that can automatically change the key every hour.
>>20892 Actually, you can just use the cookies from your browser to download. Or at least, that works for me.
>>20886 >Hmm, this seems to be different from the other recent Linux problems, which are related to a system-level python update. thank you for the quick reply. Running it from source seemed to work, v555 opens now, and I was able to upgrade one of my databases. I'm running into another issue with mpv as stated by >>20446 . It's the same issue where even a single click on a video opens mpv, I cannot control or close the mpv window in any way, and hydrus says that it unloaded it. I followed the messages to see if there was a fix, but it seems that it's just praying mpv updates? I find it strange that mpv works correctly for me when I watch videos outside of hydrus but breaks when hydrus tries to open it. Is there a workaround such as not even invoking mpv to open videos for now?
>>20390 >>20398 && >>20786 Passing through to +1 eventual support for JXL lossless transcoding. Using cjxl and djxl, it's easy enough for me to losslessly encode and decode JPEG and PNG files, so I'd be happy to store files as JXL in my booru, even imageboard files I'd need to convert back to older formats before posting. Using the default cjxl effort settings: >smaller hydrus client (various imageboard files) 512 PNG/JPG/GIF files losslessly shrank from 120.0 MB down to 84.7 MB. (70%) >larger client (digital art downloads from boorus) 2,240 PNG/JPG files losslessly shrank from 3.9 GB down to 2.5 GB (64%) That said, I did hit OOM on a couple of files; even one large image (13,000x10,000) went OOM on 16GB memory.
Filtering by the number of urls would help find files worth tagging.
>>20889 In the normal autocomplete search dropdown of any search page, you can click the 'files' or 'tags' domain buttons to alter the search domain. If 'advanced mode' is on, you get a bunch of weird domains too, which is what we want here.
>>20895 >16GB memory Is that including your swap partition, or does it ignore swap space or something like that?
>>20884 Not him but for the tag wiki maybe links? E.g. the description for long brown hair would likely mention long hair, and brown hair, and being able to just click on them and go to their respective wiki entry would be nice. Maybe also some stats for the tag, "you have 95k images with 'Yoshi' tags", "total viewtime of this tag is 7 minutes", "this tag takes up 185 gb disk space"
>>20899 Oh also, wiki support for namespaces. For example: "Tag: medium (namespace) Description: The 'medium' namespace should be used to tag metadata about the physical properties of an image, aspect ratio, duration, what medium it is" or something like that. I don't know if having the wiki system in the PTR is planned but if it is the namespace definitions should probably be locked from changes unless a petition is suitably compelling.
>>20894 >Is there a workaround such as not even invoking mpv to open videos for now? I think you can change that under options > media, under "media viewer filetype handling".
>>20897 I got it now but the client isn't letting me delete the repo updates
(61.23 KB 658x960 bits.jpg)

>>20900 >a wiki While not unrealistic, it is not practical. The meaning of any given namespace is as arbitrary and diverse as anons' opinions. Add to that the question of who would be in charge of typing up the meaning of zillions of namespaces... for free. Also >the namespace definitions should probably be locked That's fine until an autistic mob shows up to demand changes that suit them better. Then it is a slippery slope. Again, a wiki requires a fag doing something in the background to keep it updated, and permissions to edit a wiki are restricted. So your proposal comes naturally to the logical end: who will recruit, supervise, and pay that fag?
>>20892 Sankaku idol and sankaku chan aren't the same.
>>20903 >Again, a wiki requires a fag doing something in the background to keep it updated, also permission to edit a wiki are restricted. So your proposal comes naturally to the logical end: who will recruit, supervise, and pay that fag? The PTR already has jannies and it appears to work fine.
I installed Feren OS and migrated the DB there via the Windows client. Windows can't read ext4, so the Windows client can only access the files when running in Feren via Wine. I've tried accessing the database via a .tar.gz installation of the Linux client, by trying to load the DB as a backup, but the client reloads and nothing happens. I also tried using the "-d/--db_dir" launch argument in the CLI, but I'm kind of retarded and can't get any command to work.
(71.92 KB 800x600 555_Timer.jpg)

>>20872 Congrats on the trips release
(508.19 KB 844x937 hydrus example.png)

I'm insane and I'm still on v362. Months ago I tried upgrading to the latest release (526 at the time) and I successfully managed to follow the "upgrade path" > 362 > 376 > 421 > 466 > 474 > 480 > 521 > 526 I did so with no issues. But that's beside the point. I ended up returning to 362... The new features make me wish to upgrade again, however I'm afraid there are huge deal-breakers for me. What made me decide to return to 362 were the "new" tag sibling and parenting systems ("new" in quotes as I'm unsure how new they actually are). Just so I didn't waste your time, I downloaded v555 and tested it again; it still has the same issues. I mean no offense by this; 555 is an excellent program and has many improvements over 362, yet the sibling/parent system holds it back for me. I'll first explain how I use siblings (perhaps I use them in an unorthodox way?), then I'll explain what the problem is with the new system. I use siblings as convenient alternate terminologies/spellings for tags so that I don't have to remember the exact tag name. I also use convenient "typo siblings" just in case I make a typo, then it will auto-correct to the right tag. For example, for a tag like "piercing", I may have the typo "peircing" as a sibling. Or for "blonde" I may have alt-spelling "blond" or alt-term "yellow hair". For "pigtails" I may have "twintails", "twin-tails", "pig-tails", "pig tails", and many alt terms/spellings. I also have some tags where I use short "abbreviations" like "thh" for "thigh highs" for ultra-efficient tagging. With that said, I like having lots of siblings, and I like how it auto-corrects to the right tag. But, in newer releases, when you type a tag, you'll see in the suggestion box as you type "blon": > blond -> blonde > yellow hair -> blonde It shows the sibling that you entered and what the correct tag is. You will see a ton of tag siblings as suggestions. 
With the way I use siblings, this floods the suggestion box with a bunch of "garbage" sibling tags that I don't want to see; I only wish to see the correct tag. Perhaps a bigger problem, though, is the "new" parent system (pic related). It sometimes happens where I have a general parent for a tag that fits most images of that tag. Still, sometimes I want to remove the parent from an image, as it doesn't fit in rare instances. For example, imagine I have a tag "character:princess peach". Peach usually has blonde hair, so often so that I gave her "blonde" as a parent tag. But, what if I have an image of retro Peach, who has reddish-brown hair instead of blonde hair? Or perhaps there's an image where she has a goth look with black hair. In those instances, I'd like to remove the tag "blonde" from the image. But, in newer Hydrus versions, parents are "hard-wired" to the child tag; you cannot remove a parent from its child (if you can, it's not very intuitive; I wasn't able to find out how). So, siblings and parents received major downgrades (in my opinion) in newer Hydrus versions, requiring me to stay on an old 2019 version of Hydrus. I use Hydrus all the time and so I'd love to upgrade. But until siblings and parents are fixed so that they act as they do in older versions like 362, I cannot upgrade. I am posting this here just to see if anyone else agrees that this is a problem? If anyone disagrees with my diagnosis that this is an issue, then I mean no ill will, but I thought I'd let you know my perspective as an avid user. TL;DR: Siblings, to me, should be hidden and used as convenient "alt" tags, and parents should be separate from their children just in case an exception occurs where the parent must be removed. This is how it was in old Hydrus versions, and I would greatly appreciate if this was restored.
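To make the sibling request above concrete, here is a minimal sketch of an autocomplete that matches what you type against every sibling but lists each "ideal" tag only once. This is not how hydrus implements it; the sibling table and the function names `ideal_tag` and `suggest` are made up for illustration.

```python
from typing import Dict, List

# Hypothetical sibling table: each alternate spelling maps to its "ideal" tag.
SIBLINGS: Dict[str, str] = {
    "blond": "blonde",
    "yellow hair": "blonde",
    "peircing": "piercing",
    "twintails": "pigtails",
    "pig tails": "pigtails",
    "thh": "thigh highs",
}


def ideal_tag(tag: str, siblings: Dict[str, str]) -> str:
    """Follow sibling links until reaching a tag with no further replacement."""
    seen = set()
    while tag in siblings and tag not in seen:  # 'seen' guards against loops
        seen.add(tag)
        tag = siblings[tag]
    return tag


def suggest(prefix: str, known_tags: List[str], siblings: Dict[str, str]) -> List[str]:
    """Match the prefix against tags and siblings, but list each ideal tag once."""
    results: List[str] = []
    for candidate in list(known_tags) + list(siblings):
        if candidate.startswith(prefix):
            ideal = ideal_tag(candidate, siblings)
            if ideal not in results:
                results.append(ideal)
    return results
```

With `known_tags = ["blonde", "piercing", "pigtails", "thigh highs"]`, typing "blon" or "yel" yields only `["blonde"]`, and "thh" yields `["thigh highs"]`; the "blond -> blonde" rows never reach the suggestion list.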
>>20908 >you cannot remove them, Even if the situation calls for it If the situation ever calls for it, that's not a true parent/child relationship, so you're using it wrong. >It sometimes happens where I have a general parent for a tag that fits most images of that tag. Still, sometimes I want to remove the parent from an image, as it doesn't fit in rare instances. If it doesn't fit 100% of the time, it's an incorrect relationship, and you should remove it. Hydrus doesn't (currently) have a concept of "almost" parents. The fact that you were able to do this in earlier versions was a byproduct of the relationship system being more hacky than it is now. The only solution I can think for you right now would be to make a script that uses the API to autotag any files with 1 tag, with another, and run it every once in a while. (btw, running on such an old version, I'm a bit surprised that the downloaders even still work for you.)
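The API script suggested above might look roughly like this. It is a sketch under assumptions, not a tested tool: it assumes the Client API is enabled on its default local port, and that the endpoint and parameter names (`/get_files/search_files`, `/add_tags/add_tags`, `return_hashes`, `service_keys_to_tags`) match the Client API documentation for your client version. The access key and tag service key are placeholders you would have to fill in yourself.

```python
import json
import urllib.parse
import urllib.request
from typing import Dict, List, Optional

# Placeholders and assumptions: check these against your own client.
API_URL = "http://127.0.0.1:45869"
ACCESS_KEY = "replace-with-your-access-key"
TAG_SERVICE_KEY = "replace-with-your-tag-service-key"


def build_search_url(base: str, trigger_tag: str, extra_tag: str) -> str:
    """Search for files that have trigger_tag but do not yet have extra_tag."""
    tags = [trigger_tag, "-" + extra_tag]
    query = urllib.parse.urlencode({
        "tags": json.dumps(tags),
        "return_hashes": "true",
        "return_file_ids": "false",
    })
    return f"{base}/get_files/search_files?{query}"


def build_add_tags_body(hashes: List[str], service_key: str, extra_tag: str) -> Dict:
    """Request body that adds extra_tag to the given files on one tag service."""
    return {"hashes": hashes, "service_keys_to_tags": {service_key: [extra_tag]}}


def call_api(url: str, body: Optional[Dict] = None) -> Dict:
    """GET when body is None, otherwise POST JSON; returns the decoded reply."""
    headers = {"Hydrus-Client-API-Access-Key": ACCESS_KEY}
    data = None
    if body is not None:
        data = json.dumps(body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    request = urllib.request.Request(url, data=data, headers=headers)
    with urllib.request.urlopen(request) as response:
        raw = response.read()
    return json.loads(raw) if raw else {}


def autotag(trigger_tag: str, extra_tag: str) -> int:
    """Add extra_tag to every file tagged trigger_tag; returns how many files."""
    found = call_api(build_search_url(API_URL, trigger_tag, extra_tag))
    hashes = found.get("hashes", [])
    if hashes:
        body = build_add_tags_body(hashes, TAG_SERVICE_KEY, extra_tag)
        call_api(f"{API_URL}/add_tags/add_tags", body)
    return len(hashes)
```

Running `autotag("character:princess peach", "blonde")` every so often would hard-tag the files. Note that a later run would re-add "blonde" to any exception you removed by hand, so exceptions would need their own marker tag excluded from the search.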
>>20909 >that's not a true parent/child relationship, so you're using it wrong. I suppose you may be right that it isn't a "true" parent/child relationship, but the convenience of having tags auto-added when adding a different tag is just too great. At least, it is with my workflow of manual tagging. The old parent/child system works perfectly fine (I have a database of 300000+ if that means anything), and it allows for parent/child exceptions to exist. This convenience of the older parent/child system has me staying on 362. I can understand that reverting the system back may be too much work, so I understand if it can't be done. I also understand if no one wants it to be done, but I thought I'd weigh in with my perspective just in case. >The fact that you were able to do this in earlier versions was a byproduct of the relationship system being more hacky than it is now. Was the older relationship system less performant? If not, then I would argue that the "hacky" relationship system of having the parent separate from the child was superior, as it intuitively allowed for more customization. What is "hacky" or bad about that, by the way? I don't see anything hacky about adding tags when another tag is entered. >btw, running on such an old version, I'm a bit surprised that the downloaders even still work for you. Well, to be frank, I don't use downloaders, haha... I just import images I have downloaded from my own manual browsing. I also use pixivutil to batch-download my pixiv favorites or entire artist galleries.
>>20910 >Was the older relationship system less performant? The old system seems like it hard-tags files when you apply relationships? This can lead to hard-to-reverse problems if you ever accidentally apply the wrong parent-child relations. Now, you can just decouple them, and any files that were already hard-tagged with the parent tag outside of a parent-child relationship won't have the parent removed, since hard-tagging is done on a case-by-case basis and is less likely to be done in error.
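A toy model of the difference described above, assuming parents are applied at display time and never written into the stored tags. The function `display_tags` and the data are illustrative only and say nothing about the real database layout.

```python
from typing import Dict, Set


def display_tags(storage: Set[str], parents: Dict[str, Set[str]]) -> Set[str]:
    """Stored tags plus every parent reachable from them; storage is not modified."""
    shown = set(storage)
    pending = list(storage)
    while pending:
        tag = pending.pop()
        for parent in parents.get(tag, set()):
            if parent not in shown:
                shown.add(parent)
                pending.append(parent)  # parents can have parents of their own
    return shown


# One file tagged only with the character, one also hard-tagged "blonde".
file_a = {"character:princess peach"}
file_b = {"character:princess peach", "blonde"}
parents = {"character:princess peach": {"blonde"}}
```

While the relationship exists, both files display "blonde". Once the relationship is removed (an empty `parents` dict), `file_a` goes back to showing only the character tag, and `file_b` keeps "blonde" because it was stored on the file directly.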
(25.92 KB 853x383 Capture.PNG)

>>20908 >With the way I use siblings, this floods the suggestion box with a bunch of "garbage" sibling tags that I don't want to see; I only wish to see the correct tag. i also agree this is very annoying. >>20910 >I suppose you may be right that it isn't a "true" parent/child relationship, but the convenience of having tags auto-added when adding a different tag is just too great. At least, it is with my workflow of manual tagging. the related tags tab helps with this somewhat (pic related). just by having the princess peach tag, it's suggesting earrings, crown, blonde hair, etc. >I would argue that the "hacky" relationship system of having the parent separate from the child was superior, as it intuitively allowed for more customization. i would not agree that your method of using parents is intuitive at all. >Well, to be frank, I don't use downloaders, haha... I just import images I have downloaded from my own manual browsing. why? you realise boorus already have stuff like "blonde hair" tagged for you? why would you manually add tags when other people have done that work for you? it's also more convenient in terms of getting the files into hydrus. >>b-b-but i have my own special tagging methods and i don't want it tainted by other people's tags! you can make siblings that turn common booru tags into your specific tags. you can also hide tags you don't like in the tag display options. just set up a whitelist and hide extraneous tags.
>>20905 >The PTR already has jannies and it appears to work fine. Jannies only supervise activity, but what you want is a data-entry fag rated at no less than 150 characters per minute, i.e. a mediocre typist.
>>20906 >WinOS can't read Ext4 Install Ext2Fsd. Problem solved. https://en.wikipedia.org/wiki/Ext2Fsd
Hi. I want my Hydrus to tag 4chan metadata for downloads such as the filenames and board titles. When I looked into the URL parsers, I saw it's in-built. When I test the 4chan api parser, all the metadata gets tagged. However, when I download using a watcher, I never get any tag data. Is this the same for everyone else or is it just me?
>> 14019 Thanks. My main issue is accessing the old DB via the Linux client. It should be possible to access the database with a different client, as the docs mention migrating from an existing Install to Source.
>>20914 Thanks. I managed to get access in Linux using the Windows client through Wine, but it wasn't very consistent. My main issue is accessing the old DB via the Linux client. It should be possible to access the database with a different client, as the docs mention migrating from an existing Install to Source.
>>20911 >This can lead to hard to reverse problems if you ever accidently apply the wrong parent-child relations. imo this is a small price to pay for its benefits of allowing a more flexible parenting-exceptions system, but that is subjective of course. >>20912 >i would not agree that your method of using parents is intuitive at all. It's intuitive because it simply adds more tags (I didn't realize this was so crazy), and allows you to simply remove them when you want to. I don't see anything unintuitive about it? >why? you realise boorus already have stuff like "blonde hair" tagged for you? why would you manually add tags when other people have done that work for you? Because I like tagging lol. I actually enjoy adding my own tags and looking through my image collection (even though I know I'll probably never get through most of my collection, it is what it is). Also I have many non-booru images, but anyways. >it's also more convenient in terms of getting the files into hydrus. It's plenty convenient to import my own files as well, I don't really see a downside to importing my own files. It's nice and fast and I have my own manually curated files that I know I like. Btw I actually didn't realize that Hydrus users didn't use Hydrus as a tagging system for their own files and instead mostly use it as a downloader, that's interesting. I like to browse images and download them on my phone a lot so that's where I get most of my stuff, I just use Hydrus as a convenient storage and tag my favorite images... Anyways, it seems like what I proposed is unwanted by other users and is against the design philosophy, so I'll just stick on 362 as it suits my way of doing things. I apologize for shitting up the thread. If I had to make a suggestion it would be to allow for adding some kind of (optional) "pseudo parenting" system, in addition to the newer coupled parenting system, where you can have tags added separately from "pseudo child" tags. 
Maybe that would be a nice compromise. However, considering there seemingly isn't any demand for it aside from myself, perhaps my "compromise" can be ignored.
In addition to the discussion of parents and siblings, is there a way to cross-link certain siblings and parents across tag services? Additionally, is there a way to set up auto-migrations of specific tag groups? i.e. Downloader Creator:*anything* to my tags
>>20919 You can set one tag service's relationships to apply to another service's tags in "tags → manage where tag siblings and parents apply". You can also import all the relationships directly by going to the relationship pages, selecting to show all, exporting them all, then importing them to the other service, but I wouldn't recommend doing this unless you're sure that you want them all, because there's no undoing it.
>>20898 I only had 2 GB swap (silly in hindsight, might expand it soon) and it ate up that as well, so yes it was actually 18 GB memory.
>>20918 >Anyways, it seems like what I proposed is unwanted by other users and is against the design philosophy, so I'll just stick on 362 as it suits my way of doing things. Unfortunately, this really feels like a "xkcd 1172" sort of situation. I wouldn't be against a "fake parent" feature being added, but I would likely never use it. It just doesn't make sense to me, and as >>20912 pointed out, the related tags feature helps with this somewhat anyway. Maybe Dev will have something more to say. The good news for you is that since you don't use Hydrus as a downloader at all (so it doesn't make any network connections) there's probably no security issues with continuing to use that version of Hydrus indefinitely, or at least until it succumbs to bitrot. So staying on the version like you said you would is fine, I think.
"all my files" and "deleted" lines are fine but i don't think "archive" and "inbox" lines are where they are supposed to be.
>>20915 network > downloaders > manage default import options
I had an ok week. I fixed some bugs and wrote a system to force a file to be considered a certain filetype. The release should be as normal tomorrow.
>wrote a system to force a file to be considered a certain filetype Maybe this is wishful thinking, but if this system works the way I think it does, could this be used as a way to add a form of arbitrary file support?
>>20926 meant for >>20925
https://www.youtube.com/watch?v=e3mkeK6pRaY
windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v556/Hydrus.Network.556.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v556/Hydrus.Network.556.-.Windows.-.Installer.exe
macOS
app: https://github.com/hydrusnetwork/hydrus/releases/download/v556/Hydrus.Network.556.-.macOS.-.App.dmg
linux
tar.zst: https://github.com/hydrusnetwork/hydrus/releases/download/v556/Hydrus.Network.556.-.Linux.-.Executable.tar.zst
I had an ok week. I fixed some bugs and added a system to force-set filetypes. You will be asked on update if you want to regenerate some animation thumbnails. The popup explains the decision, I recommend 'yes'.
Full changelog: https://hydrusnetwork.github.io/hydrus/changelog.html
forced filetype
The difference between a zip and an Ugoira and a cbz is not perfectly clear cut. I am happy with the current filetype scanner--and there are a couple more improvements this week--but I'm sure there will always be some fuzziness in the difficult cases. This also applies to some other clever situations, like files that secretly store a zip concatenated after a jpeg. You might want that file to be considered something other than what hydrus thinks it technically is.
So, on any file selection, you can now hit right-click->manage->force filetype. You can set any file to be seen as any other file. The changes take effect immediately, are reflected in presentation and system:filetype searches, and the files themselves will be renamed on disk (to aid 'open externally'). The original filetype is remembered, and everything is easily undoable through the same dialog. Also added is 'system:has forced filetype', under the 'system:file properties' entry, if you'd like to find what you have set one way or the other.
This is experimental, and I don't recommend it for the most casual users, but if you are comfortable, have a play with it. I still need to write better error handling for complete nonsense cases (e.g. calling a webm a krita is probably going to raise an error somewhere), but let me know how you get on.
other highlights
I fixed some dodgy numbers in Mr. Bones (deleted file count) and the file history chart (inbox/archive count). If you have had some whack results here, let me know if things are any better! If they aren't, does changing the search to something more specific than 'all my files'/'system:everything' improve things?
Some new boot errors, mostly related to missing database components, are now handled with nicer dialog prompts, and even interactive console prompts serverside.
I _may_ have fixed/relieved the 'program is hung when restored from minimise to system tray' issue, but I am not confident. If you still have this, let me know how things are now. If you still get a hang, more info on what your client was doing during the minimise would help--I just cannot reproduce this problem reliably.
Thanks to a user who figured out all the build script stuff, the Docker package is now Alpine 3.19. The Docker package should have newer libraries and broader file support.
birthday and year summary
The first non-experimental beta of hydrus was released on December 14th, 2011. We are now going on twelve years.
Like many, I had an imperfect 2023. I've no complaints, but IRL problems from 2022 cut into my free time and energy, and I regret that it impacted my hydrus work time. I had hoped to move some larger projects forward this year, but I was mostly treading water with little features and optimisations. That said, looking at the changelog for the year reveals good progress nonetheless, including: multiple duplicate search and filter speed and accuracy improvements, and the 'one file in this search, the other in this search' system; significant Client API expansions, in good part thanks to a user, including the duplicates system, more page inspections, multiple local file domains, and http headers; new sidecar datatypes and string processing tools; improvements to 'related tags' search; much better transparency support, including 'system:has transparency'; more program stability, particularly with mpv; much much faster tag autocomplete results, and faster tag and file search cancelling; the inc/dec rating service; better file timestamp awareness and full editing capability; the SauceNAO-style image search under 'system:similar files'; blurhashes; more and better system predicate parsing, and natural system predicate parsing in the normal file search input; a background database table delete system that relieves huge jobs like 'delete the PTR'; more accurate Mr. Bones and File History, and both windows now taking any search; and multiple new file formats, like HEIF and gzip and Krita, and thumbnails and full rendering for several like PSD and PDF, again in good part thanks to a user, and then most recently the Ugoira and CBZ work.
I'm truly looking forward to the new year, and I plan to keep working and putting out releases every week. I deeply appreciate the feedback and help over the years. Thank you!
next week
I have only one more week in the year before my Christmas holiday, so I'll just do some simple cleanup and little fixes.
>>20923 I'm not sure, but I may have fixed your numbers in v556. Please try again when it is convenient and let me know what you see. If it is still broken, try changing the search to something more specific--any better? >>20926 Yeah, fingers crossed. I'm already planning the next expansion of the tech I add today to do this.
(5.64 KB 629x173 12345.PNG)

>>20919 >Additionally a way to set up auto migrations of specific tag groups? i.e. Downloader Creator:*anything* to my tags as far as i know there's no automatic tag migrations. but under network > downloaders > manage default import options, you can choose which tag services get which tags. pic related.
(622.58 KB 910x1024 praise.png)

>>20928 >birthday and year summary I'm forever grateful anon.
>>20928 >hydrus is 12 years old >mfw No but seriously well done devanon, I usually can't keep my attention on a project for more than a week, much less 624 of them. I've bounced off Hydrus 2 or 3 times but I've stuck with it this time and it's great, thanks for all your hard work.
>>20932 KEK Devanon, Your dedication to this project is incredible. Please don't burn out. Take breaks when you need to.
Getting a strange glitch. I don't use downloader tags, but occasionally I may scroll up accidentally in the tag manager and put a tag under downloader tags. This is rare and I usually catch it, however, if I search "smile" in the downloader tags manager, it says there's (14) smile tags. This result will not show up in the normal search, even if I limit the search to downloader tags only, showing zero downloader tags for "smile".
Curious what devanon's thoughts are on a tag preset system? Basically being able to set a preset bundle of tags with a name that you can click to apply all of those tags at once. Would make certain routine imports muuuuch easier.
Is there a way to apply a large tag change without using parents/siblings? I swear I've done it before, however I'm racking my brain trying to figure out how. My goal is to change a bunch of files' tags from 'filename:*' to 'title:*'
>>20936 >My goal is to change a bunch of files tags from 'filename:*' to 'title:*' No easy way. This seems close to asking for namespace editing, which is still a distant dream, except even that wouldn't change your problem because I assume you only want to do this for a certain set of files.
>>20935 >Would make certain routine imports muuuuch easier You can set specific folders to always have imported files get specific tags on import, which sounds like it'd solve that issue for you.
Anyone here tried importing from iwara? I can't download any restricted content even though I've sent cookies using addon. Do I need to modify the parser somehow to make it use the cookies?
>>20939 Parser is probably fucked since the site underwent a major update some months back. I'm surprised it still works for non-r18 content.
>>20940 I'm using version 2023.04.01 and downloader components list suggests it is using api calls, not site parsing, but did the site change again since then?
So how do I use a proxy for the PTR and some sites but not the others?
Is it possible to "zoom in" the file history chart to a more recent date instead of always being the entire history? I can't do a search on import time cuz then those files will be excluded. I just want the graph itself to be constrained time-wise, but all the files to still be accounted for.
>>20941 I think the major site overhaul was a month or two later.
>>20891 Yeah, it is a bit advanced, but you can change it under network->downloader components->manage url class links. 'danbooru file page' >>20894 >>20901 I'm glad running from source works. Yeah, your best shot for now is switching the viewer for animations/video to the 'native viewer', which is an inefficient video player I wrote myself. No audio support on that, so you'll want to set audio to 'show an open externally button'. Then wait for OS/Qt/mpv updates that fix the underlying window parenting issues here, and/or wait for me to one day completely overhaul the whole system and it magically fixes. Sorry I can't offer more help--this is all held together by duct tape in the best of situations. >>20895 Thanks, interesting. >>20896 Hell yeah, I want this myself. There's some tricky database stuff to do first, which is why it hasn't happened yet, but I should just knuckle down and figure it out. >>20899 >>20900 >>20903 >>20905 >>20913 I've never been an active participant in a booru site community, so I can't talk with too much expertise, but my general feeling here would be copying the general functionality of a typical booru tag definition 'wiki', like this: https://danbooru.donmai.us/wiki_pages/samus_aran And then allow local definitions like 'my tags', where only you can see it and you can set what you like, and then perhaps also figure out some sort of PTR-sharing system, with updates and janny filtering and all that. Bear in mind making a new PTR workflow, and for more complicated data, would be a ton of work. ALSO, I'd figure out wiki parsing (e.g. of the samus page above) and easy import/export packaging, so users could just download and share straight-up compilations of everything booru x y or z already has and integrate it into their client so you have a simple easy reference without us having to retype basic data about what a skirt or blue eyes are. Since there are already multiple good compilations, this may obviate the need for PTR style sharing. 
It'd all be stored and viewable on your client, locally. No URLs to a server anywhere. >>20902 Damn. What happens when you ctrl+a and then click delete and/or right-click->delete from repository updates? Do you get the delete dialog/menu entry, but it doesn't work, or do the files not seem deletable at all? Pic related.
>>20908 Thanks for your feedback. Yeah, I'm afraid when I overhauled the sibling and parent system, I decided to pare back what it could do. Siblings and parents are so complicated under the hood that it stretches my capabilities, and I just could never make a good balance between 'yes they always apply' and 'oh except in these cases'. The system, as you've seen, is now what I called 'virtual', in that siblings and parents always apply, but they do so in a separate 'display' domain. The underlying storage tags that you edit in the 'manage tags' dialog are no longer altered by siblings and parents (which is why you are unable to remove one or two), but the benefit is they are now completely undoable. You can rearrange siblings or repeal a parent, and everything that was changed is cleared out cleanly. I cannot go back to the old way things worked. I have toyed with the idea of introducing a secondary sibling/parent system ever since I moved to the virtual system; something that would enable functionality like you would like, where there are hard-replaces of storage tags based on confirmed siblings and so on, and we may ultimately go there, particularly to fix some bad old PTR tags en masse. But it hasn't happened yet and I can't promise it in any reasonable amount of time. If you ever want to play around a bit more, as >>20912 says, you might like to play with the newer 'related tags' system. It got a big overhaul this year, getting much better statistics, and it should typically always offer 'blonde' when you have 'princess peach'. It might help fill the gap a bit in your old parents system, although I appreciate it isn't the same. For siblings, I agree, I think we need some more display and search options for the autocompletes. Sometimes you just want to see the final siblings and don't want to be overwhelmed with the nitty-gritty of different storage tags. I'll just have to put the work in to add new search and display code. 
Thanks again, let me know how you get on in future. >>20915 >>20924 Yeah, change the settings here up top, where it says 'default for watchable urls'. 'watchable' are threads, as opposed to booru file posts. Most users don't want watcher filenames parsed, so this is turned off by default. Tell your default watchers to grab everything, and you should get filename tags. >>20916 Yep the database files are completely portable across OSes. You need a new install obviously, for the particular OS, but just move your db directory around and it'll boot anywhere. As well as this page, which I expect you have seen, https://hydrusnetwork.github.io/hydrus/database_migration.html , this one might be helpful too, https://hydrusnetwork.github.io/hydrus/launch_arguments.html , particularly the '-d' argument, which can launch a db folder that is, for whatever reason, outside of the current install directory. >>20919 >>20930 Yeah, nothing repeatable, but the big guy that moves lots of tags around for a one-time job is tags->migrate tags. Be careful, it is super powerful.
>>20944 Fug. But anyway, it seems the changes to the web interface did not affect the API: parsing and downloading work exactly as expected. However, the API simply ignores web login cookies and I can't find any documentation on it.
>>20931 >>20932 >>20933 Thanks lads, keep on pushing. No big worries about burnout--I burned out and drama-bombed on several projects before this, and when I finally figured out and removed what it is that makes me crazy (working in teams, not having tight deadlines), I was able to stick out this lone marathon of weekly releases. Before I knew it, ten years had passed. I'm currently hoping AI programming will get good enough in the next eight or ten years that more weird community projects like this will be feasible, and then I'll be able to fade contently away around v1000. >>20934 When you do the search and get 'smile (14)', is the 'file domain' of that autocomplete dropdown set to 'all known files with tags'? This is an advanced domain that most users only see in 'manage tags'. It includes deleted files. If you hit help->advanced mode and then open a new search page, you can click 'my files' and change it to 'all known files with tags'. Then you can actually search files you don't have, and even edit their tags. If you like, you can remove the 'smile' from them. Or, if you prefer, you can change the default file and tag domains of your autocompletes under options->search and (more importantly) tags->manage tag display and search (for anything but 'all known tags', which isn't under 'manage tags'). >>20935 >>20938 Interesting idea. Yeah, you probably want to have a play with import folders (file->import and export folders). Under the 'metadata import/filename tagging' section, you can just spam a bunch of 'tags for all', which applies those tags to everything passing through that import context. You can do your [ 'wallpapers', 'sort later', 'sfw' ] tag spam differently for the different import folders you set up. Test out the tags and workflows you like with the manual import, and then automate those with import folders. 
I do the same with a web comic I read--I parse the creator, series name, and page tag from the filename, and then my browser is set to save jpegs from that comic web domain into the 'x web comic' import folder, which runs every 24 hours or so. I don't even think about it, I just get the comics appearing in the background with good tags. >>20936 >>20937 tags->migrate tags is the main tag transferring system, but it doesn't do renames, I don't think. Best solution for now would be the Client API. If you are feeling extremely cheesy, you could try doing a mass export/import with sidecars that applied string processing steps to rename the tags, and then do a 'migrate tags' run afterwards to delete the originals. >>20942 There's the 'no proxy' setting under options->connection, but you have to do some crazy patch of requests that I don't know enough about to talk cleverly on. If you have a proxy manager on your computer that manages the proxy forwarding, your best solution is to see if that can do the domain filtering for you on that end. Anything from 'hydrus_client' executable that goes to x domain gets proxied, all else not. In the future I'd like to have much better options here. >>20943 Not yet, but I'd like it in future.
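For the 'filename:*' to 'title:*' rename via the Client API mentioned above, the per-file step could be sketched as follows. Assumptions are loud here: the body shape `service_keys_to_actions_to_tags` and the action codes "0" (add) and "1" (delete) for a local tag service are taken from memory of the Client API docs and should be checked against your version. `rename_namespace` and `build_rename_body` are hypothetical helper names, and fetching each file's current tags and POSTing the body are left out.

```python
from typing import Dict, List

ADD, DELETE = "0", "1"  # assumed action codes for a local tag service


def rename_namespace(tags: List[str], old: str, new: str) -> Dict[str, List[str]]:
    """Work out which tags to add and delete to move tags from old: to new:."""
    old_prefix = old + ":"
    to_delete = [tag for tag in tags if tag.startswith(old_prefix)]
    to_add = [new + ":" + tag[len(old_prefix):] for tag in to_delete]
    return {ADD: to_add, DELETE: to_delete}


def build_rename_body(file_hash: str, service_key: str, tags: List[str],
                      old: str, new: str) -> Dict:
    """Request body for one file; only send it if there is something to rename."""
    actions = rename_namespace(tags, old, new)
    return {
        "hash": file_hash,
        "service_keys_to_actions_to_tags": {service_key: actions},
    }
```

Because the file set is chosen by whatever search feeds the loop, this only touches the files you select, which namespace-wide editing would not do.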
>>20947 cont I'm no real webdev and might be completely off the mark here but it seems like the website is somehow generating one-time and/or session tokens and sends them to api before accessing any restricted content. Meanwhile the downloader by design completely bypasses the website part and uses api to both look up id and download the content. If that's correct there does not seem to be any simple way to make the existing script work with user logins. And maybe there's no way to download restricted content without browser js at all.
>>20949 Good news. yt-dlp works with logins through api calls alone. Bad news. Each download request really does generate tokens that expire after 1 hour. What would be the proper way to use such login system in hydrus?
Is there a way I can set hydrus to only connect through one port? Would rather it only work when my VPN is active
>>20950 Content parsers can create http headers now. If you can get the token from the api, you can set a cookie http header.
>>20952 Tried backtracking the header-request-response parts in the browser log. The "token" api requests already have some different token filled in "HTTP Authorization request header" presumably generated/received by a heavily compressed JS. yt-dlp takes a different approach and sends plain username and password to api directly to generate one time token. No cookies involved. I could hardcode my credentials in downloader script but that's not exactly a user-friendly solution.
>>20953 Looks like the initial token is indeed generated by clientside JS. That's a dead end, right?
(12.62 KB 392x150 shitkaku.png)

Regarding shikkaku complex not working, that old beta api downloader still works partially: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/master/Downloaders/Sankaku ... but it parses file urls too soon, so they expire before they can be downloaded and end with a 403 status, or the shitty banner getting downloaded and discarded as known file. Which makes me wonder, if it would be feasible and worth the effort to give Hydrus a blacklist feature for known bad substitute files, so that it can detect this kind of download spoofing as failed download, that needs to be retried. Although probably not worth the effort for this one shithole. Are there any other sites doing this kind of file substitution? PS: I actually spent 10 minutes coming up with that shikkaku (失格) pun, and am unreasonably proud of it. Look it up.
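The blacklist idea above could be sketched like this: hash whatever came back, and treat a match against a list of known substitute files as a failed download to be retried. This is only an illustration of the proposal, not an existing hydrus feature; SHA-256 is this sketch's choice, and `fetch` stands in for whatever actually performs the download.

```python
import hashlib
from typing import Callable, Optional, Set


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of the downloaded bytes."""
    return hashlib.sha256(data).hexdigest()


def download_with_blacklist(fetch: Callable[[], bytes],
                            bad_hashes: Set[str],
                            max_attempts: int = 3) -> Optional[bytes]:
    """Retry while the result is a known substitute file; None if every try was bad."""
    for _ in range(max_attempts):
        data = fetch()
        if sha256_hex(data) not in bad_hashes:
            return data
    return None
```

For an expiring-URL site, a real version would probably need to re-parse the post to get a fresh file URL before each retry rather than call the same URL again.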
>>20954 Disregard that, the persistent login token is not generated clientside, it's just stored in browsers Local Storage instead of in a cookie. The entire workflow looks like this: >make /user/login API request with login and password as json payload >receive json with long term token, store in Local Storage For every page load: >make /token API request with long term token in http header >receive json with short term token, store in js variable For every video download: >make /video/*ID* API request with short term token in http header >receive hot link to video Can Hydrus work with that?
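The three-step workflow above can be sketched as a small session object. Only the paths `/user/login`, `/token` and `/video/*ID*` come from the post; the JSON field names ("login", "password", "token", "url"), the HTTP methods, and the "Bearer" header scheme are guesses for illustration. Reusing the short-term token for up to an hour is also an assumption based on the expiry mentioned earlier in the thread. The HTTP layer is passed in as `transport` so the logic can be tried without touching the real site.

```python
import time
from typing import Callable, Dict, Optional

# transport(method, path, headers, json_body) -> decoded JSON reply
Transport = Callable[[str, str, Dict[str, str], Optional[dict]], dict]


class TokenSession:
    """Long-term token from login, short-term token per hour, link per video."""

    def __init__(self, transport: Transport, username: str, password: str,
                 short_ttl: float = 3600.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._transport = transport
        self._username = username
        self._password = password
        self._short_ttl = short_ttl
        self._clock = clock
        self._long_token: Optional[str] = None
        self._short_token: Optional[str] = None
        self._short_expiry = 0.0

    @staticmethod
    def _auth(token: str) -> Dict[str, str]:
        return {"Authorization": "Bearer " + token}  # header scheme is a guess

    def login(self) -> None:
        """Step 1: exchange login and password for the long-term token."""
        body = {"login": self._username, "password": self._password}
        reply = self._transport("POST", "/user/login", {}, body)
        self._long_token = reply["token"]

    def short_token(self) -> str:
        """Step 2: get a short-term token, reusing it until it should expire."""
        now = self._clock()
        if self._short_token is None or now >= self._short_expiry:
            if self._long_token is None:
                self.login()
            reply = self._transport("GET", "/token", self._auth(self._long_token), None)
            self._short_token = reply["token"]
            self._short_expiry = now + self._short_ttl
        return self._short_token

    def video_link(self, video_id: str) -> str:
        """Step 3: ask for the hot link to one video with the short-term token."""
        headers = self._auth(self.short_token())
        reply = self._transport("GET", "/video/" + video_id, headers, None)
        return reply["url"]
```

The sticking point for a hydrus downloader is the state: the long-term token has to be kept somewhere between requests, and the short-term one refreshed on a timer, which is more than a parse-then-fetch chain.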
>>20955 Is that a very new or very old issue? Sankaku worked for me just yesterday, though i had similar issue in october before an update fixed it.
(2.84 KB 579x517 18-02:13:08.png)

Ever since I updated (553>556) some videos act strangely. They "play", the seekbar progresses and if I open them I can hear the audio, but I usually get picrel in the previewer and a blank screen when I fullscreen. And of course now that I try to find some examples all my videos seem fine. I don't think it's a file issue because sometimes regenerating the thumbnail fixes it, but sometimes it doesn't, so IDK. Could it have something to do with changing the thumbnail %? I changed it after updating, and I've been running the job in the foreground. The thumbnail job is done now, so the timing does line up with not seeing any more errors.
Considering the parent and sibling system is managed by janitors on the PTR, is it a bad idea to have the PTR's parents and siblings apply to my tags for consistency?
Speaking of the suggested tags tab, is there a way to get it to take every tag service into account? Or at least more than the one you're tagging on
>>20959 Not necessarily a bad idea, but I personally keep them separated to a different tag domain because it might be annoying to have a tag you consider to be 'x' "incorrectly" be sibling'd to 'y' by the PTR. I have one domain that's basically supposed to be the "canonical tags" and that tracks the parents & siblings of the PTR, and almost all tags parsed go there.
>>20961 Oh, I forgot to mention, that's also the suggestion for ai tagging: put ai tags into their own domain and then inherit parents and siblings from the PTR.
Is there a way to make any tags I commit to the PTR auto commit to my tags? I like contributing but also want them collected personally in case of any tag disputes.
>>20956 >Can Hydrus work with that? probably not. at this point i would just use yt-dlp with an import folder instead. one day when arbitrary executable support is added we can all use gallery-dl and yt-dlp and just delete hydrus's entire downloader system. it would be better for everyone if the work on internet content downloaders would be focused on fewer projects rather than spread out.
>>20923 (me) >>20929 Cool, looks like it's fixed now. It's always been broken since the file history chart was introduced but i was too lazy to report until now. Thanks dev.
>>20961 >>20962 I set it up as its own domain. Another silly question: would it be possible to set up a tag domain that pulls the parents and siblings, my tags, and downloader tags only? Say if I just wanted the PTR's random tags not cluttering a search?
>>20966 For searching specifically, you can set the query to a specific tag domain, but you can't currently exclude one specific domain AFAIK. I don't know about auto-migrating tags, I don't think that's currently possible either. >>20963 Unfortunately I don't think this is a feature yet, which is a shame because I would also want this.
(64.35 KB 808x446 repo.webm)

>>20945 basically fuckall happens
>>20968 weird. does the archive delete lock apply to repository updates?
I made a parser for sturdychan, it can probably work on any site running shamichan but I'm unaware of any others. Near as I can tell, shamichan doesn't have JSON to work with, so I did it all with html. It is set to associate URLs but it doesn't work the way I expected it to, I thought you'd get a clickable link to the parent thread under known urls. It grabs filenames and tags them under the 'filename' namespace. It gets post time, and has a watchable url entry. Any tips on how to improve it are welcome, and feel free to share it.
>>20970 Whoops noticed some pretty bad bugs. Use this version instead. This one works on the rest of the boards, and it gets the OP post.
>>20970 >>20971 I don't think associating with this website in any way, shape or form is a good idea.
hello, i got some questions, maybe you can help out a beginner, that would be much appreciated. i read the whole documentation but havent found answers to everything. those questions are partly directed to Hydev, since only Hydev can answer parts of it. anyone is invited to answer questions they know the answer to. 1) i know i can set up the media viewer to not show any tags, but if i had a wish: can Hydev please add another button to the upper media viewer bar, that maybe includes several toggles to change the visibility of different information? for example toggle on/off the WHOLE manage tags window on the left, so even when you go over with the mouse, it would act as if there was nothing at all (so no tag window popping). if thats not possible, maybe just turn off the tag visibility then. other toggles under this button could be to turn on/off notes, urls(maybe add a window to display those?),ratings on the right, and also a toggle to turn on/off the information on the very top about the media itself (for example -> 180 KB jpeg (560x1,000) | 50% | added to my files 1 year ago etc....). It is not so pleasant to the eye, if this piece of information isnt hidden by an image completely. Depending on the size of the image, sometimes it is, sometimes not. So a visibility-button with all kinds of toggles like i just explained, would be very awesome. Maybe with an eye icon or so. 2) can i search for files that have number of URLs > 0/1/2 etc. for example? i know you can search for 'system:known url', but that doesnt allow me that, if i just want to search for files that have any number of URLs. if it is not possible right now, that would be also something on my wishlist :) 3) i havent synced with PTR yet, so in general i want to know, is it possible to migrate mappings from the PTR to 'my tags' only for the tags that exist in 'my tags'? i know there is the "migrate tags" feature, where i can also filter the tags (->'tags taken' button). 
after pressing that button, there is the 'advanced' button next to the 'whitelist' and 'blacklist' buttons with an 'except for these' window. i assume here i could paste all the tags from 'my tags' to only allow these to migrate from PTR to 'my tags', is that correct? if so, can someone tell me a way how i can copy ALL tags from 'my tags' and paste it inside the 'except for these' window i just mentioned (there is also an 'import' button that asks for some JSON object, which i dont know anything about)? maybe search for 'system:everything' and then in the tag window, right click -> copy all tags? would that work reliably if i had thousands of tags? I guess if this works like i just described, doing the same in the future again and again would mean only the mappings that are missing in 'my tags' would get updated and it would not start a huge job and migrate everything for every single file again, right? If you wonder why i would want that: i feel like i would display the tags from the 'my tags' service much more and use the PTR just to get the mappings for files. Maybe because i feel that 'my tags' are "safe" no matter what happens to the PTR tag service. 4) how to transfer all mappings to another client, let's say an offline computer? i know the 'migrate tags' feature can create Hydrus Tag Archives. Can someone explain what i could do with them and if it is possible to import all the mappings/tags from client1 to client2 that way? i know an 'archive.db' file will be created. is this a database file that works like a local tag service? is it possible to update that .db file in the future by selecting it again when choosing the destination or will it be created from scratch again (i dont know how big this file can get)? also it seems i would need to create three separate .db files: mappings, tag parents, tag siblings. since i can't do them all at once. correct? 
so in general i want to know, what is the best workflow to get the mappings from client1 to client2(offline) and update it from time to time? sorry for the length, but thanks in advance to everyone who tries to answer some of them!
>>20972 I'm ool, not to derail, but why? It doesn't look any worse than other chans at first glance. And considering 8chan is banned from google results, and this isn't..
>>20969 If I return a single update to inbox, it still won't let me delete it
Hey, my dev machine died today (its GPU actually caught fire when I turned it on!), so no release tomorrow. I'll get a new machine over the holiday and restart for the new year, meaning v557 will be on January 3rd. Thanks everyone, and 𝕸𝖊𝖗𝖗𝖞 𝕮𝖍𝖗𝖎𝖘𝖙𝖒𝖆𝖘!
Is there a way to add system tags to your favourites? I.E: "system:limit is 256" or something? I've been trying to, but can't figure it out. If not I think that would be a great feature.
>>20976 >its GPU actually caught fire when I turned it on! F >𝕸𝖊𝖗𝖗𝖞 𝕮𝖍𝖗𝖎𝖘𝖙𝖒𝖆𝖘! Merry Christmas anon.
>>20976 You know you set it on fire just so you could get a new 4090 for Christmas.
>>20976 F and merry Christmas
>>20957 Could you post your login script for Sankaku? I didn't seem to get an updated version of it when I updated Hydrus.
>>20981 I gave up on login script and used browser extension to copy cookies.
>>20981 Same for me. Only that old downloader from the cuddlebear github is downloading the first few pics before the links it tries are expired. No luck with the included downloader at the moment.
>>20982 Is that something I need to set up? I already use Hydrus Companion, but since it still doesn't work I'd assume that I have to set up something manually to copy the cookies to make it work.
>>20976 Holy shit! I didn't even know that's something that could actually happen. Well uhh, Merry Christmas, Dev!
>>20984 No the companion extension should do it. There's a button in the menu that you click to send your cookies for the site you're on to Hydrus.
>>20983 Links expire after an hour on the new site.
Would really like the ability to 'delete without storing hash', in case there's a file I might want to import later but it's currently in the wrong batch of files/tags.
>>20988 Isn't that what "Permanently delete without saving deletion record" does?
>>20989 Didn't see that, does it also remove tag data?
>>20990 Nope, everything linked to that file stays and there's no way to purge that data in a user friendly way. The dev said that he wants to implement a feature that would delete files completely at some point. What you can do though is selecting all the files and just deleting the tags in the tag manager before deleting the files. You can also use the tag migration tool (which also lets you clear deletion records of deleted tags that still linger after you delete them lol). You can do the same for urls, but you can't mass delete notes if you have any. There are probably more things to clear like similar file records, times and I don't know what else, but not sure if there's a nice way to do these.
>>20991 Hmm, okay. There definitely seems to be a problem when I'm importing rips from twitter and such: I pick up a lot of junk with no real way to clean it without manual labor. Occasionally I pick up art that I already have, or gifs that have been converted to MP4 format, and I get 10 or 20 copies of the same 'reaction gif' that aren't even hash checking properly. Left quite a mess on my hands after about 65k rip files. aaaand if I delete them I don't even know if I might want them later, just messy. Appreciate the info tho, fren.
>>20493 Very late reply, but I would strongly advise against using Pixiv's tag translations. These are not community translated, or professionally translated, but outsourced to some awful crowdsourcing platform, where they're "translated" by outsiders who don't know shit about anime, games, or the character names. Examples: This character's name is "Earlette", not "Arlette": https://www.pixiv.net/en/tags/%E3%82%A2%E3%83%BC%E3%83%AC%E3%83%83%E3%83%88/artworks Crowdsourced idiots don't know the character, so they just romanized the katakana. This anime is called "Smile Precure": https://www.pixiv.net/en/tags/%E3%82%B9%E3%83%9E%E3%82%A4%E3%83%AB%E3%83%97%E3%83%AA%E3%82%AD%E3%83%A5%E3%82%A2/artworks?s_mode=s_tag Some crowdsourced moron googled it and added the name that they use to market it to 6 year old American girls, not the name that us fat otaku use when we want to fap to Miyuki. Whoever "translated" this is such an idiot: https://www.pixiv.net/en/tags/%E9%9D%A2%E5%85%B7%E5%AD%90/artworks The creator is Chinese, and his OC's name is Chinese too, but they made up some Japanese name. So 面具子 should be "Mianjuzi", not "Mennguko" (wtf kind of name is that anyway). There is no way to correct these mistranslations. You would need to register separately from Pixiv on that shitty crowdsourcing site, and apparently then you can only add a suggestion. So please, for your own sanity, and the integrity of your dataset, don't use Pixiv's "translated" tags. And if you do, please don't push them to the PTR! That would essentially be data poisoning at this point. Better to make a script that uses deepl or something to translate tags.
>again Can sankaku stop fucking up everything every 2 weeks Last time I had to redownload everything from over 300 tags What happened now?
>>20994 Also, is there a way to mass change sankaku to something else for the day this shithole stops working for good? I have over 400 entries now
>>20994 Also 2 this time isn't just redownloading everything, it stopped working
Is there a 4chan thread watcher that imports posts as notes?
What's the state of AI tagging these days? Any good integrations with images already in Hydrus? I saw there was some good projects for deduping videos being worked on too.
>>20998 I just ran some random pics through deepbooru, and got some insanely specific and accurate results, and some that are just completely wrong. E.g. all the junk it lists for Hibiki's pic, and turning Aria shachou (that white cat) into a black cat for whatever reason. No idea what's necessary to integrate it into Hydrus. I only have it because it comes with the Stable Diffusion WebUI.
>>20994 I'm with you man. They change shit so constantly. Trying to get all my favorites downloaded and move to something new, but they keep breaking the downloader. Does anyone have an updated one that's working now?
>>20998 Go to the bottom of this page. The one guy had one, but deleted it. Now, it seems another guy has taken it up. I was using the first one. I would say it gets the major tags right. If you set it to .10 threshold, it will also detect loli pretty accurately ( but only on drawn / anime, tons of false positive loli on realistic ).
Oh, and this is for Hydrus. It scans the Hydrus database and writes AI derived tags to My Tags.
>>21000 For you and other anons with issues, I've figured out the way to do this. Necessities: an auto refresh extension, Chrome, and Hydrus Companion. Set auto refresh to 60, and set Hydrus Companion to send cookies from chan.sankakucomplex.com every 0.3 days. Lastly, import the old downloader png from the archive. Occasionally you'll have to retry ignored on some, but it mostly works.
>>21004 Yeah, this looks like a good reason to abandon that trash site for good. I still fail to understand why it is the booru with the most images for everything. How do I mass change my queries from sankaku to gelbooru? I think gelbooru is the second one with the most stuff, even though it bans way more stuff and lacks core tags.
>>21005 Just copy paste the queries. I personally think the effort is pretty minimal to keep sankaku going, I imagine we could even do this within hydrus if hydrus dev implemented some inline browser to send the request for the cookie, but obviously the effort required at that point is high. Overall pretty easy to manage with this method though.
>>21006 Also to add, I fixed it and found this method with like an hour of troubleshooting, and it's almost completely automatic, so... lol
>>21006 Thing is, I use hydrus over some VM I only run every couple days and hydrus companion refuses to connect to hydrus so I need to manually add the cookies from time to time
>>21008 also, what downloader? Because even adding fresh cookies it refuses to download anything
Sankaku allows pretty much anything It sucks but its the only way to go I hope we get some fix
Guys, it's easy! Search for Cuddlebear92's site. Under downloaders, download Sankaku Downloader. READ THE INSTRUCTIONS. This will tell you how to get the key for the API and load it into the header section in Hydrus. This header will last for about 2 days, depending on the timer. Now you can do gallery downloads as normal with Hydrus. BUT, the URLs only last for an hour, so don't let the search go over about 2000-3000 URLs. Pause the search when you get there until the downloader catches up. Otherwise, your URLs will time out and fail. Once you catch up, unpause the search, and get another 2000-3000 URLs.
(973.21 KB 1402x567 h.png)

About 6 or 7 months ago when Hydrus got updated to a new version of Qt, there was this issue where Hydrus was obeying Windows scaling and the main page was scaled up. After a bit this issue got fixed and it went back to the way it was. I just copied my entire Hydrus installation to a new computer running Windows 11 and now, despite all my other settings carrying over, the scaling issue is back: the main page is blown up and each image tile takes up way more space. Does anyone remember what this fix was back when Hydrus updated to Qt5 to get the GUI to stop scaling? Or am I misremembering this issue entirely and it was some other scaling issue? Pic related, on my old pc there'd be twice as many images per row on the main page. PC scaling is at 125% but I'm pretty sure my old pc was also at either 125% or even 150% so I don't think that's the issue.
>>21012 i realize now that screenshot is really bad because it doesn't show the scale of the rest of the GUI my b
>>21011 This fucks up automation, which is the entire point of this.
>>21014 Lol, I think that was why Sankaku implemented it! But, if you're patient, you can get about 3000 URLs at a time.
>>21015 Yeah, but as said, I just use a VPS that automates all this. I can't manually refresh the thing for 4xx tags. I hope mr dev fixes it since sankaku is the only booru that matters, sadly. Or I'll just hop over to gelbooru, which still allows most stuff.
>>21016 Yeah, it would be nice. It will take more than just a script. The key for the header will have to be fetched when the program detects it's timed out, and new URLs will also have to be fetched every hour, automatically by the program. That's going to take a whole new kind of script.
>>21017 Agree though, that Sankaku probably has the biggest archive of stuff, on par with Pixiv.
(373.53 KB 1600x1238 merry-christmas.jpg)


Merry Christmas anons. Have a comfy night.
>>21018 Is there any reason for this? The site is full of ads and aids and it's as user unfriendly as it gets. Yet for some fucking artists it has twice the stuff. Pic related, 331 on gelbooru, 752 on shitkaku.
>>21020 I'm guessing some of the artists are getting paid to post there. The whole site smells of Extreme For Profit.
>>20997 You can edit it yourself. Open the 4chan parser, go to the subsidiary page parsers tab and double click on posts, then there, go to the content parsers tab and add a new one with content type set to notes. At the bottom there, click on change formula type and change it to JSON, then edit it so it grabs "com" entries. That should probably be enough, but the comments will be in html, so you might want to add some regex string processing to remove the tags and then convert "&gt;" into ">" and such.
(3.13 KB 512x124 com.png)

>>20997 >>21023 Or you know what? I made it for you. It also includes the regex, but no guarantee it won't miss something. It should replace <br> tags with new lines, then remove all html tags, then convert the most common html character entities (those for <, >, &, ", and '). Import it as described in the png.
>>21021 Sankaku ignores all the paywalled content takedown claims and steals all the kemono files. That is the main reason it has way more stuff, and how they lure people in to keep making money with that trash of a site.
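If you want to sanity check what that string processing should produce before importing anything, the same three steps (line breaks, tag stripping, entity conversion) can be sketched in a few lines of Python. This is not the parser itself, just an illustration of the order of operations; the function name is made up, and a simple tag-stripping regex like this one can be fooled by unusual markup.

```python
import html
import re


def comment_to_text(com: str) -> str:
    """Turn the html in a post's "com" field into plain text."""
    # 1) <br> (and <br/>) become real new lines
    text = re.sub(r"<br\s*/?>", "\n", com, flags=re.IGNORECASE)
    # 2) every remaining html tag is dropped, its inner text is kept
    text = re.sub(r"<[^>]+>", "", text)
    # 3) character entities such as &gt; and &#039; become plain characters
    return html.unescape(text)


sample = '<a href="#p123" class="quotelink">&gt;&gt;123</a><br>it&#039;s <span class="quote">&gt;fine</span>'
print(comment_to_text(sample))
```

Doing the entity conversion last matters: if you converted &gt; first, quoted text would turn into ">" characters that the tag-stripping step could then mistake for the end of a tag.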
>>21020 But how many more of them are low res artifacted dupes?
>>21026 I did a test with one with 180 vs 93 and only found 4 jpg dupes
>>21024 Thanks. I was and am too sleepy to notice that post and made something, too (paste on the page where "file url", "filename", "md5 hash", "post time" are). It only replaces &gt;, the entity for ' and post links, and inserts \n around quote within the quote span, though it will probably make a little mess if there is a link or some other span inside the quote. [30, 7, ["post content", 18, [31, 3, [[[0, [51, 1, [0, "com", null, null, "com"]]]], 0, [84, 1, [26, 3, [[2, [55, 1, [[[9, ["<br>", "\\n"]], [9, ["<a href=\"#p([0-9]+)\" class=\"quotelink\">&gt;&gt;\\1</a>", ">>\\1"]], [9, ["<span class=\"deadlink\">(&gt;&gt;[0-9]+?)</span>", "\\1"]], [9, ["(<span class=\"quote\">)(.*?)(</span>)", "\\1\\n\\2\\n\\3"]], [9, ["&gt;", ">"]], [9, ["&#039;", "'"]]], ""]]]]]]]], "post"]]
>>21028 This puts the post number into a "meta:post:" tag and the flag name into a "meta:posted with flag:" tag. [58, "4chan thread api parser", 2, ["4chan thread api parser", "f050195e3b333f9ec4dae0e7439a85d765c9d0894d2cc88bca8c4ffe8c8bc3b6", [55, 1, [[], "example string"]], [[[31, 3, [[[0, [51, 1, [0, "posts", null, null, "posts"]]], [2, 0], [0, [51, 1, [0, "com", null, null, "com"]]]], 0, [84, 1, [26, 3, []]]]], [58, "first post for html comment for thread title if no subject", 2, ["first post for html comment for thread title if no subject", "c831d69041ea98d3159e4caabb54c6493610b57323153a57380313a0cb59cd20", [55, 1, [[], "example string"]], [], [26, 3, [[2, [30, 7, ["thread page title from comment", 17, [27, 7, [[26, 3, [[2, [62, 3, [0, null, {}, 0, null, false, [51, 1, [3, "", null, null, "example string"]]]]]]], 1, "href", [84, 1, [26, 3, [[2, [55, 1, [[[6, 64]], "parsed information"]]]]]]]], 0]]]]], [], {}]]], [[31, 3, [[[0, [51, 1, [0, "posts", null, null, "posts"]]], [1, null]], 1, [84, 1, [26, 3, []]]]], [58, "posts", 2, ["posts", "3b4f67ee70a0879ec0d5fbd7c18492a1095f7975808079107fbb0a53129a0346", [55, 1, [[], "example string"]], [], [26, 3, [[2, [30, 7, ["file url", 7, [59, 2, [[26, 3, [[2, [60, 2, ["url", [84, 1, [26, 3, [[2, [55, 1, [[[9, ["(.*)a.4cdn.org/(.*)/thread/.*", "\\1i.4cdn.org/\\2/"]]], "https://a.4cdn.org/tg/thread/57806016.json"]]]]]]]]], [2, [31, 3, [[[0, [51, 1, [0, "tim", null, null, "tim"]]]], 0, [84, 1, [26, 3, []]]]]], [2, [31, 3, [[[0, [51, 1, [0, "ext", null, null, "ext"]]]], 0, [84, 1, [26, 3, []]]]]]]], "\\1\\2\\3", [84, 1, [26, 3, []]]]], [7, 50]]]], [2, [30, 7, ["filename", 0, [31, 3, [[[0, [51, 1, [0, "filename", null, null, "filename"]]]], 0, [84, 1, [26, 3, []]]]], "filename"]]], [2, [30, 7, ["flag_name to \"meta:posted with flag\" tag", 0, [31, 3, [[[0, [51, 1, [0, "flag_name", null, null, "flag_name"]]]], 0, [84, 1, [26, 3, [[2, [55, 1, [[[2, "meta:posted with flag:"]], "Discord"]]]]]]]], null]]], [2, [30, 7, ["md5 hash", 
15, [31, 3, [[[0, [51, 1, [0, "md5", null, null, "md5"]]]], 0, [84, 1, [26, 3, [[2, [55, 1, [[[5, "base64"]], "8gH1nXsz9z4NOcr4HMvHVA=="]]]]]]]], ["md5", "base64"]]]], [2, [30, 7, ["post content to note", 18, [31, 3, [[[0, [51, 1, [0, "com", null, null, "com"]]]], 0, [84, 1, [26, 3, [[2, [55, 1, [[[9, ["<br>", "\\n"]], [9, ["<a href=\"#p([0-9]+)\" class=\"quotelink\">&gt;&gt;\\1</a>", ">>\\1"]], [9, ["<span class=\"deadlink\">(&gt;&gt;[0-9]+?)</span>", "\\1"]], [9, ["(<span class=\"quote\">)(.*?)(</span>)", "\\1\\n\\2\\n\\3"]], [9, ["&gt;", ">"]], [9, ["&#039;", "'"]]], ""]]]]]]]], "post"]]], [2, [30, 7, ["post number to \"meta:post:\" tag", 0, [31, 3, [[[0, [51, 1, [0, "no", null, null, "no"]]]], 0, [84, 1, [26, 3, [[2, [55, 1, [[[2, "meta:post:"]], "40643692"]]]]]]]], null]]], [2, [30, 7, ["post time", 16, [31, 3, [[[0, [51, 1, [0, "time", null, null, "time"]]]], 0, [84, 1, [26, 3, []]]]], 0]]], [2, [30, 7, ["veto if no file", 8, [31, 3, [[[0, [51, 1, [0, "filename", null, null, "filename"]]]], 0, [84, 1, [26, 3, []]]]], [false, [51, 1, [2, "", null, null, ""]]]]]]]], [], {"url": "https://a.4cdn.org/mlp/thread/40643692.json", "post_index": "40"}]]], [[31, 3, [[[0, [51, 1, [0, "posts", null, null, "posts"]]], [2, 0]], 1, [84, 1, [26, 3, []]]]], [58, "first post for subject", 2, ["first post for subject", "e61b1098e4e759405760bac9c7c2aff0c77c9158998af1b1704c1b1390b37e32", [55, 1, [[], "example string"]], [], [26, 3, [[2, [30, 7, ["thread subject", 17, [31, 3, [[[0, [51, 1, [0, "sub", null, null, "sub"]]]], 0, [84, 1, [26, 3, []]]]], 1]]]]], [], {}]]]], [26, 3, []], ["https://a.4cdn.org/mlp/thread/40643692.json"], {"url": "https://a.4cdn.org/mlp/thread/40643692.json", "post_index": "40"}]]
>>21029 or maybe better to a note (so that it is visible in the viewer and can be clicked to edit and copy): [30, 7, ["post number to \"post #\" note", 18, [31, 3, [[[0, [51, 1, [0, "no", null, null, "no"]]]], 0, [84, 1, [26, 3, []]]]], "post #"]]
Any plans to fix hydrus?
>>21031 Can you elaborate?
Is "verify database integrity" working? IIRC it used to be fairly quick, but now it's been going for 12 hours and hydrus stopped responding long ago, though it still uses some CPU. According to Task Manager it's still reading/writing from disk, but Resource Monitor shows nothing. Had to end task.
>>21002 URL to the new repository for AI tagging. https://github.com/Garbevoir/wd-e621-hydrus-tagger I got this up and running, it works pretty well. You'll need to install some Visual Studio bits to get the python venv to build.
>>21034 Thanks! I've got VS on my computer. Will this work for non-furry? Also, is there a way to view what tags are being used for tagging?
Is sort by time: modified time not sorting properly? I just modified these for a bunch of files using the sidecar system so that each file is 30 seconds apart to force a specific order, but it completely ignores the times, even though they are correct when I check in manage > times. Sorting by import time works fine.
>>21032 Sankaku*
>>21037 Sankaku intentionally breaks things to prevent scraping. I don't use it, but it looks like you can get it working according to >>14130 and >>21004 Sankaku will probably break things again at some point. Best to get used to it breaking or use a less shitty website.
>>21032 >>21038 >use a less shitty website. Any idea how to mass change the site for all my tags?
>>21039 Well, what you would have to change to automate it all is Hydrus. The script would have to change the header every 48 hours, as it has the auth key. Then, the script would have to make sure the URLs are less than an hour old when it downloads them. Finally, the script would have to make a new request for the same tag with an offset every 5000 files. This is why standard Hydrus gallery downloader scripts don't work on beta.sankakucomplex.com
>>21040 You have to make all these changes manually as you go. I usually download ~3000 pics at a time. That works well. Just remember to change the header every 48 hours. Link to the downloader --> READ THE INSTRUCTIONS! https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/tree/c2f94b5d6327aac46a6b3e5a82b4a02600998c74/Downloaders/Sankaku
>>21041 I guess you could write a script to do all this using the Hydrus API to automate Hydrus.
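The timing part of such a script is the easy bit, so here is a small Python sketch of just that. The 48 hour and 1 hour lifetimes are the figures quoted in this thread, not anything documented; the safety margin and all function names are assumptions, and the part that would actually talk to Hydrus or Sankaku is deliberately left out.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

HEADER_LIFETIME = timedelta(hours=48)  # auth header lifetime reported in the thread
URL_LIFETIME = timedelta(hours=1)      # file url lifetime reported in the thread
SAFETY_MARGIN = timedelta(minutes=5)   # assumption: act a bit early


def header_needs_refresh(obtained_at: datetime, now: datetime) -> bool:
    """True once the auth header is close to (or past) its lifetime."""
    return now - obtained_at >= HEADER_LIFETIME - SAFETY_MARGIN


def url_still_usable(parsed_at: datetime, now: datetime) -> bool:
    """True while a parsed file url is comfortably inside its lifetime."""
    return now - parsed_at < URL_LIFETIME - SAFETY_MARGIN


def split_queue(queue: List[Tuple[str, datetime]], now: datetime):
    """Split (url, parsed_at) pairs into those worth downloading and those to re-parse."""
    usable = [url for url, parsed_at in queue if url_still_usable(parsed_at, now)]
    expired = [url for url, parsed_at in queue if not url_still_usable(parsed_at, now)]
    return usable, expired
```

A loop built on this would refresh the header whenever header_needs_refresh says so, and send anything in the expired list back to the search step instead of letting it fail with a 403.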
(60.05 KB 525x1006 EXCEL_uXpEytGFKV.png)

>>21035 There are two different models trained off of danbooru and e621 respectively. If you have majority anime or majority furry, it would probably be fine to run one or the other and call it good. If you have a huge mixed library of files though, I'd recommend running the Danbooru model first, as that can tag images with "furry", then run images with that tag through the e621 model afterwards. It only shows tags given to files if you give it a single file at a time by hash or filename. If you're asking to view all the tags available to the models, they're listed in the .csv file that's associated with the respective models.
>>21035 >>21043 I ended up adding the ability to view the tag output for bulk hash work as a feature. If you download the latest version of the repository and overwrite the changed files (the main.py of the two taggers) you can use the --privacy flag to control the output to the cli. I've also hidden the tag output as default for both individual hashes as well as bulk hashes. Also for anyone wondering about it, I'm starting the slow process of bumbling my way through python coding to try and merge the two taggers into one super tagger. The goal is for the user to be able to create a folder in the /model/ folder, name it whatever they want, put whatever model they want in it, then load it by giving the program the folder name. Some things to consider include whether or not a model handles ratings, as well as how to keep track of which model is which, since the folder names can be anything. The program expects arguments for the name of the model as well as the repo id associated with it, and I'd like to respect that. My idea to fix both these issues is to have some sort of information file with the models, storing information such as the model name, source, whether or not it handles ratings, and how many tags to consider as ratings tags (since it can differ between boorus; Danbooru has general, sensitive, questionable, explicit, while some other boorus only have safe, questionable, and explicit). I'd include two pre-made ones for the expected models, and for new models the user could simply copy/paste from one of the other models and adjust as needed. As always I'm open to criticisms and suggestions. I haven't started any real work on it, so feel free to suggest drastic idea changes if my current plan doesn't sound ideal (or feasible).
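To make the information file idea concrete, here is one possible shape for it as JSON plus a tiny Python loader. None of this is from the tagger repository: the field names, the "ratings come first in the label list" convention, and the example values are all assumptions for the sake of discussion.

```python
import json
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ModelInfo:
    name: str               # display name of the model
    source: str             # where it came from, e.g. a repo id
    handles_ratings: bool   # whether the model outputs rating tags at all
    rating_tag_count: int   # how many leading labels are ratings


def parse_model_info(text: str) -> ModelInfo:
    """Read one hypothetical info file; models without ratings get a count of 0."""
    raw = json.loads(text)
    handles = bool(raw.get("handles_ratings", False))
    count = int(raw.get("rating_tag_count", 0)) if handles else 0
    return ModelInfo(raw["name"], raw["source"], handles, count)


def split_labels(info: ModelInfo, labels: List[str]) -> Tuple[List[str], List[str]]:
    """Assumes rating labels are listed before all other labels."""
    return labels[:info.rating_tag_count], labels[info.rating_tag_count:]


# Example file contents for a Danbooru-style model with four rating labels.
EXAMPLE_INFO = """
{
  "name": "danbooru example",
  "source": "example/repo-id",
  "handles_ratings": true,
  "rating_tag_count": 4
}
"""
```

With something like this in each model folder, the folder name itself can be anything, since the program reads the name and source from the file rather than guessing from the path.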
>>21041 So, I did this and still the shit ass downloader just starts redownloading the entire thing every single time, ignoring past downloads. That was the actual issue to begin with.
>>21045 Also 2: I don't know what the fuck this is fetching, but the downloaded files aren't even the latest files I see on the website. Somehow it managed to hop to 2019, and this is a generic tag that has 70k files, so it's not like it's just old files from some random artist. This fucking site is cursed.
>>21046 >>21045 And despite reloading it, now I'm only getting these files. FUCK sankaku
>>21044 Thank you for putting this together, and for documenting the process to get it working so well. I probably had the most difficulty with getting the correct VSC++ bits installed, and had to put clang-cl.exe in the path variable myself. I would love to see additional models be available for tagging use, although I'm pretty ignorant of all the differences between them.
>>21047 Lol, dude, you have to do it manually with Gallery Downloader as described in the instructions. You can't use subscriptions, because the URLs only last for an hour. I've noticed that the first download of about 5000 or so will go pretty fast. However, Sankaku seems to detect that you're scraping, and things slow down to about half the speed later on. So, on your next tag scrape with the offset, set the search limit to about 2000 files at a time.
>>21049 Your first scrape with gallery downloader should start at the beginning. Then just keep scraping 2000 files or so at a time, and using offsets on the next scrapes, until you get back to where you left off. If for some reason your scrape is jumping to some old time, you can force it to start at the beginning by just using the offset. i.e. bondage id_range:<=34649595 but use your own tag, and whatever the starting number is for the latest file for that tag. Now, on your next scrape, make it bondage id_range:<=34641000 or something like that, whatever the last number is on your last downloaded file using the scrape.
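If you get tired of working out the next offset query by hand, the rule above is just "take the lowest file id you got in the last batch and use it as the new upper bound". A minimal Python sketch, with a made-up function name and using only the id_range:<= syntax shown in the post:

```python
from typing import Iterable


def next_query(tag: str, downloaded_ids: Iterable[int]) -> str:
    """Build the follow-up search from the ids seen in the previous batch."""
    lowest = min(downloaded_ids)  # raises ValueError if the batch was empty
    return f"{tag} id_range:<={lowest}"


print(next_query("bondage", [34649595, 34645000, 34641000]))
```

Because the bound is inclusive, the lowest file from the previous batch shows up again in the next search; that should be harmless since it is already in the client.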
>>21044 Cool! I'm glad you're continuing it! I'm using Abtalerico's version, which uses Smilingwolf's WD 1.4 Waifu model. Unfortunately, Abtalerico seems to have erased his version off the net. I've gone through a little over 2 million pics with it, and it seems to work really well. I have threshold set at .10, so it will recognize and tag loli, which I can then delete ( I hate that stuff ). So much loli is going untagged now, and it does a really good job at sensing it.
>>21049 >You can't use subscriptions The entire point of hydrus is to keep scraping new stuff from artists. To do 1 time downloads you already have gallery-dl.
>>21052 Yeah, but you're dealing with a site here that has anti-scraping defenses in place. Hydrus won't work as normal here. Everything has to be done manually through Gallery Downloader. Someone might be able to write a python script using the Hydrus API to automate it.
>>21053 Sounds like fixing hydrus will only make them fuck it up again I guess its time to just hop into gelbooru
>>21038 >Sankaku intentionally breaks things to prevent scraping The recent continued breakage isn't to stop scraping according to the mods. It's because the site's going through a huge overhaul and update over the past 2 years. The continued breakage is still annoying though, but there are things on Sankaku that simply aren't anywhere else, so I don't have a choice myself.
>>21055 "according to the mods" Is that why they have keys that only last 48 hours, and urls that only last 1? Lol, no. It's because they don't want the people like us scraping their site.
>>21056 That's not what he said, he said instability and frequent changes in the site isn't to stop scrapers, those features have been there for a year plus, that's not the 'frequent changes'
(765.43 KB 2560x1390 Screenshot 2023-12-29 125217.png)

(150.02 KB 1410x1003 Screenshot 2023-12-29 130057.png)

Out of curiosity I opened all of the 291,281 files (888.1GB) in my database in one tab. Hydrus handled it fine (after several minutes) and was using about 20GB of memory. Then I selected all of the items (took like 10 seconds) and pressed F3 to open the tag manager. Hydrus slowly went up to using about 26GB of RAM then gave up (on a 32GB system, page file set to 4GB, task manager said 90% of memory was used). The tag manager window never appeared but it didn't crash or anything so that's good. Better than I expected.
>>21055 >I don't have a choice myself. And how do you even plan to use it when all you can do with it is manual 1 time downloads
Anyone else suddenly getting 403's from danbooru?
>>21022 Thanks! I don't use it for tag generation very often, so I didn't know that one.
>>21062 Why would anyone want to scrape a site that is worse than gelbooru and way worse than shitkaku
Who scrapes danbooru? They don't have shit.
And if anyone is doing the manual download thing with Sankaku, they only let you do 100 pages of a tag at a time. Then you have to do the id_range:<= thing with another instance of the tag. But, with some patience, it's still scrape-able.
>>21065 I don't really understand what they did Is being able to use subscriptions on sankaku impossible ever again due the url changes thing?
>>21044 I got a wild hair (as well as the flu) and ended up doing the work to get it working. No more separate e621-hydrus-tagger and wd-hydrus-tagger folders, meaning no two separate copies of basically the same exact code. Practically everything has been updated to account for this, as well as some dependencies being updated. Python 3.12 seems to break things, so officially this is a Python 3.11 (and maybe below) piece of software. I'm leaving it in its own branch for now, just in case people find a really bad bug or some use-case I didn't account for. You can find it here: https://github.com/Garbevoir/wd-e621-hydrus-tagger/tree/e621-wd-merge I'll probably merge this into the default branch in about a week or so if nothing is found. As always, criticisms and suggestions appreciated. Praise also accepted but I really don't deserve it. I barely know what I'm doing, if my edits to the code don't make it obvious.
So, why is sankaku always broken but gallerydl can still do just fine without updating? Are they somehow targeting hydrus specifically?
So the sankaku issue seems to be pretty minor, they changed the url from sankaku.com/posts/abc123 to sankaku.com/en/posts/abc123. I modified the url class and it seems to work again. [50, "sankaku chan file page", 12, ["3dac81c6ce4ef28a310c2f2dfb55981a8d15f22d59ff0dd6e32daded25830556", 0, "https", "chan.sankakucomplex.com", [true, false, true, false, false, false, true, false], [[[51, 1, [0, "en", null, null, "en"]], null], [[51, 1, [0, "posts", null, null, "posts"]], null], [[51, 1, [3, "", null, null, "375abf2c3143c26aa63c692f88e3c850"]], null]], [], false, [51, 1, [3, "", null, null, "example string"]], [], [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], 0, [55, 1, [[], "https://hostname.com/post/page.php?id=123456&s=view"]], null, null, 1, "https://chan.sankakucomplex.com/en/posts/b427a29e4efbae0559946ff4ebf433b0"]] >>21068 I haven't looked at gallery-dl's code so I have no idea how that works, but I think it's due to Hydrus' parser being inflexible, if the example url doesn't match exactly it fails.
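To illustrate the point about exact matching: a minimal sketch, not hydrus's actual URL class logic, of normalising a post URL so that both the old /posts/... form and the new /en/posts/... form reduce to the same path. The function name and the idea of stripping the prefix are assumptions for illustration only.

```python
from urllib.parse import urlsplit, urlunsplit


def strip_language_prefix(url: str, prefix: str = "en") -> str:
    """Drop a leading language segment such as /en from the URL path."""
    parts = urlsplit(url)
    # A path like "/en/posts/x" splits into ["", "en", "posts", "x"].
    segments = parts.path.split("/")
    if len(segments) > 1 and segments[1] == prefix:
        del segments[1]
    return urlunsplit(parts._replace(path="/".join(segments)))


if __name__ == "__main__":
    new_style = "https://chan.sankakucomplex.com/en/posts/b427a29e4efbae0559946ff4ebf433b0"
    print(strip_language_prefix(new_style))
```

A matcher that compares the normalised path would accept either form, whereas a matcher that expects exactly two path components fails as soon as a third one appears.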
>>21069 How do I do this?
>>21066 Well, a big problem is their URL's only last for an hour. The script on Cuddlebear92's site works, it's just you have to do it manually with gallery downloader because you have to watch for the timeouts (i.e 2 day header, 1 hour URL, and 100 page search limit per tag). That's what screws up subscriptions on it.
>>21069 Oh, you're using the old chan. I use beta.
>>21072 Use Sankaku Chan Beta Tag Search
>>21071 >Well, a big problem is their URL's only last for an hour. What does that mean? They will change after one hour so subscriptions will not work no matter what?
>>21074 Yeah, the URL's only last for an hour once hydrus downloads the search page, and scrapes the URL's from it. If you only need a few files, and can download them within an hour, I would guess subscription would work. The real problem is the header, which you have to manually search for in the Sankaku web page. So, it would have to be changed every 2 days religiously in order to keep subscriptions working. That's their real defense against subscriptions. You just really can't automate that.
>>21075 Actually, I guess it could be automated, but it would probably take a python script using the Hydrus API, to change the header every day or so.
>>21076 You would have to get the script to search the beta.sankakucomplex.com/ webpage for the authkey, then load it into hydrus' header input, changing the old key out every day or so.
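A sketch of the timing half of such a script, using the lifetimes quoted in this thread (key about 2 days, file URLs about 1 hour). Fetching the key from the web page and pushing it into hydrus are deliberately left out, since the exact calls are not given here; the names and the safety margin are assumptions.

```python
from datetime import datetime, timedelta

KEY_LIFETIME = timedelta(hours=48)  # per the thread: auth key lasts about 2 days
URL_LIFETIME = timedelta(hours=1)   # per the thread: file URLs last about 1 hour


def needs_refresh(fetched_at: datetime, now: datetime, lifetime: timedelta,
                  safety_margin: timedelta = timedelta(minutes=5)) -> bool:
    """Return True once a credential is within `safety_margin` of expiring."""
    return now - fetched_at >= lifetime - safety_margin


if __name__ == "__main__":
    key_fetched = datetime(2024, 1, 1, 12, 0)
    now = datetime(2024, 1, 3, 11, 58)  # 47 h 58 min later
    if needs_refresh(key_fetched, now, KEY_LIFETIME):
        print("auth key is about to expire: fetch a new one and update the header")
```

The same check with URL_LIFETIME tells you whether a queued file URL is still worth trying or the search page needs to be fetched again.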
>>21059 I don't use the api. I just download the normal way, and it works fine for me. My subscriptions work.
>>21078 It wouldn't work with Sankaku. Unless you are changing the auth key every 2 days, it's not going to work. You'll get the "snackbar" error. If you're getting in without doing that, then you're going through some unknown backdoor or such.
>>21078 they don't work they redownload everything every time
Hey everyone, I am going to catch up with my messages now. I might run out of time and have to do the rest tomorrow. I'm not sure what the thread limit is, but if it is strictly 750, it also looks like we'll need a new thread pretty soon, so I'll make sure that happens too. I have a new dev machine! (my last one briefly caught fire right before the holiday lol) There are still a couple things to configure, but everything is good now and I am keen to get back to work. Since I am crazy about backups, I thankfully didn't lose anything but time and money. I otherwise had a good holiday and had the chance to spend some good time with my family. >>20951 There are very very limited proxy settings under options->connection. If your VPN doesn't do SOCKS, does it have the ability to set an application filter or application-based kill switch? If so, everything network-related should be coming from 'hydrus_client.exe' in the main install directory. If you tell your VPN to always catch traffic coming from that exe, I think that'll work for you. >>20958 Thank you for this report. I am sorry for the trouble. This specifically was triggered when you went fullscreen? Is this the 'borderless' fullscreen, and was it for videos that filled the screen completely (e.g. 1080p video on 4k monitor), or on ones that had a gap between the top/bottom or left/right and the screen edge? I am afraid I cannot reproduce this, and if you cannot either now, I wonder if yes this is somehow related to a one-off call like thumbnail regeneration. I wonder if having FFMPEG inspect the video while it is open in your mpv was causing some lock or 'already in use' state to set and mpv was dumping out? Ah, and now I read again, if you were forcing the 'regen thumbnails' job to run in the foreground, I wonder if that was choking your GPU with a hell of a lot of work, and thus mpv was dumping out? Let me know if this comes back or if you learn anything else about it. 
>>20960 Not yet, but I think this is a good idea. >>20963 >>20967 Although it is spammy to set up, I think the way to figure this out is, either, to: A) If most of your tags are parsed from downloaders, then set your 'tag import options' to grab tags for both a local 'my ptr tags' and 'ptr' services. Just parse twice. B) If you type a lot of tags, then commit your tags to the local 'my ptr tags' tag service, and then regularly hit up tags->migrate tags and pend everything in there to the PTR. Several people have asked for the ability to save what they commit to the PTR, so I think I agree it would be nice to have some 'tag service sync' tools of some sort, but I'm not totally sure what they would look like. How might you like that sort of thing to work and be set up etc...? >>20965 Hey, fantastic, thanks for letting me know.
>>20968 >>20975 Thanks, that definitely seems wrong! And I'll bet it is related to the actual problem in question here. It seems like hydrus is confused about where its update files actually are. I will review this on my end to see if I can catch and fix it better. Also, you might like to try running database->db maintenance->clear/fix orphan file records. This will try to detect any update files that are in 'repo updates' but not in 'all local files' and make it work better, either deleting the bad records or fixing that 'delete' menu entry so you can do it yourself. >>20973 Thanks for reaching out! 1) Yeah, I agree, I'd like this. A big aim of future UI updates will be to better 'modularise' the UI and add lots of customisability, both in layout and show/hide tech. In the meantime a simple eye icon button with a dropdown menu to show/hide would be a nice easy compromise. 2) Not yet, but I want this! I have to do some db work before I can. 3) Yes, your idea of pasting them all into the tag filter to make a whitelist would work. HOWEVER, I would not recommend syncing to the PTR just to do a one-time job like this. The PTR is a big investment, and it leaves a big mark on your database that I cannot delete completely yet (your client.master.db file will be something like 9GB bigger), so I think I'd suggest a different solution in general here. Most of the PTR tags are from downloaders doing normal tag parsing. If you look in the advanced 'tag import options', you should see pic related. That sounds like it does similar to what you want without the PTR baggage. Also, side thing, a user told me the tag filter works very slow when it has thousands of items, so I am planning to optimise it in the next few weeks. If you do mess around with it, you might want to wait a little for me to fix it up. 4) Yeah, you want to export from Client1 to an HTA .db file exactly as you have done. 
Then, in the Client2, you open up 'migrate tags' and then select that HTA file as your source. everything you need is contained in that one .db file, so it is completely portable to a different computer or a friend or whatever. I don't think my HTA tech will update an existing file, but maybe it does. I wrote it years ago. Try it out! And yeah you need separate archive databases for parents and siblings. If you want to regularly migrate tags and siblings and parents from one client to another, that's tricky. This HTA migration would work, but I think it would probably be frustrating. Maybe the Client API would work better, but it would be a lot of work. Some years from now, I expect to have direct client-to-client communication so you can browse one client from another, but I can't promise it in any reasonable amount of time. For a one-time migrate, I'd simply do: Client1 -> Three portable archives for mappings, siblings, and parents Move the three files to the Client2 computer. Three portable archives for mappings, siblings, and parents -> Client 2 And then I'd normally delete the portable archive .dbs. You could just do the same thing every three months, but it would take more and more time and probably be frustrating. Let me know how you get on!
>>20978 >>20980 >>20979 >>20985 Thanks all, and hope you had a good holiday. The long story short is I booted it that Tuesday morning and one of the two monitors wasn't working. While trying to figure it out, I unplugged and replugged the DP->HDMI2 cable I was using on the busted monitor, and when it went back in, the fan started blasting foul-smelling burned plastic smoke all over my hand! I jacked out the power real quick, but I also noticed the other monitor hadn't gone to 'this motherboard is dead' static, so I put the whole rig in a safe location and tried booting again, and it worked (while still smelling pretty bad)! I had it up three minutes to ensure my backups were good and put the social posts up, and that was that. I use 'Minis Forum' PCs for all my auxiliary/office computers, and never had a problem like this before. I blame the weird cable setup and the fact this machine had had a 24/7 100% CPU machine sitting on it for a year that perhaps had slightly melted/weakened something internally. When I'm certain I no longer need to recover anything, I'll open it up and we'll see if it was just a capacitor that blew up or if there is a big black scorch on the motherboard. My lessons this past couple weeks have been: - don't use cheap DP->HDMI2 cables to do mickey-mouse 4k@60Hz solutions - don't stack hot machines on top of each other - make sure your backups have regular full coverage on all computers at all times (I separately had a huge fucking pain in the ass yesterday fixing a dead partition header on my download computer usb drive, which was both not well backed up and fully encrypted, making recovery a total-jej-experience) Ultimately, thankfully, this was just a bunch of time/stress and $500. Didn't lose any work or cause a bigger fire. >>20990 You can see this, btw, by turning on 'use the advanced file deletion dialog' under options->files and trash. 
>>21012 >>21013 I often make mistakes when talking about this subject since I get confused about which thing is too big or too blurry or whatever, but I somewhat recently added an option under options->thumbnails called 'Thumbnail UI-scale supersampling %'. I _believe_ that it is 'proper' for your thumbnails to blow up larger when the UI scale increases, but due to some technical stuff I need you to actually say what your UI scale is in this place, and it'll regen your thumbnails to be crisp on your actual screen. Let me know how it goes--I hate this subject! >>21031 On my end, I am open to adding a little lever or tool here and there to help users figure out future sankaku downloader solutions, but I have written off supporting the site myself. It changes too often, and it actively does not want programs like us to download from it, so it'll never stop being a pain to access. I recommend using gallery-dl or just looking for the same basic content on other sites. Oh yeah, related, you may have seen danbooru has started giving 403 results. They have changed their CDN rules to stop many automated programs from accessing files, which hits us too. We are looking at the problem, may have a solution in the mid-term future, but it is a story in flux and may change in a few days since their rules are breaking all sorts of API access too. I am worried we'll see more of this in future. The phone-using normalfriend marches on. 💯💯💯
>>21058 Thanks, this is interesting! I know the tags manager has really bad performance when you get to about 10k+ files (there's something inefficient in how it builds and analyses the combined tags lists), but it is interesting how much memory it wants to use here. I'd have guessed it would take ten hours and 500MB to open it, not 6GB+. I'm amazed it didn't crash! >>21060 Yeah, it is a shame. We have a technical solution, but I can't implement it quickly. We'll see how much they ease the rule change over the coming days, since it is breaking a bunch of other programs too. Check for the same or similar content on safebooru or gelbooru for now--they should be fairly easy migrations.
>>21081 >>20958 My window manager makes the media viewer open in borderless windowed mode. I can't replicate it now, but it definitely did happen again after I posted, I think it was shortly after importing new files. Both 16:9 and non-16:9 videos caused the behavior.
Can anybody share the quicksync files for the PTR? I can't connect to it without VPN and the purple site no longer has those. Meanwhile the pre-populated DB for novice users is available, but it's only good for starting from scratch. Why would anybody make a choice so bad, sacrificing usability, I wonder.
>>21083 >just looking for the same basic content on other sites Sadly, shitkaku still has way more stuff than anything else and I will never understand why, because the site sucks. Of course, to scrape once gallerydl just werks, but subscriptions to get new stuff is what makes hydrus the only tool of its kind, since artists keep making content and when you are in the hundreds, manually scraping them from time to time is impossible
>>21087 You can use hydownloader, it uses gallery-dl under the hood. It has subscription support and I moved all my subscriptions to it. It supports Sankaku and a bunch of the other big sites for subscriptions, and technically supports anything gallery-dl supports for single downloads.
>>21088 Can you directly import them from hydrus or will I need to manually migrate 9xx subscriptions? Will that remember past downloads? I tend to curate the scrapes so I delete a lot of stuff
>>21087 If you're looking for Sankaku style hentai, I suggest Pixiv.
>>21090 I think they have a lot more than Sankaku, and pretty much anything goes there, so you'll be sure to find whatever you're into.
>>21091 pro tip - You'll want to look at the japanese tags that are suggested when you type in the romaji name. There's usually more content under the kanji tags.
(69.41 KB 358x195 01-08:24:02.png)

(86.16 KB 803x320 01-08:25:14.png)

>>21089 I don't think you can import them, you will need to manually migrate them unfortunately. Yes, hydl remembers past downloads, and if you pair it with Hydrus Companion which I suggest using anyways, you can even see files you've already downloaded in your browser, see pic related, the blue border indicates "downloaded but not imported".
I've been downloading a lot of heavy files (videos) off sankaku since october and the old links in my gallery queue still work. Can't something be done to generate these old type links from new, volatile ones, before enqueueing downloads?
>>21067 The default tag changed from "wd-hydrus-tagger ai generated tags" to "wd v1.4 vit v2 tagger ai generated tags". Was this intentional? Can this default tag be managed?
Who are the russophobic liberal hypocrites running the ptr.hydrus.network and limiting the access to it by country? They dare to use our Nginx software and sperg about internet freedom and le big brother while doing exactly the same.
>>21093 Either you need high level autism or I'm missing something with hydownloader I installed it, got the GUI thing, tis all green, I add subscriptions (unpaused) and it does nothing ?????
>>20884 >What features would you like most Sorry for the late reply, thought I lost my whole database but found a backup in the dumbest place after a few weeks of giving up. Have multiple backups! It would be cool if we could do booru wiki style entries where you could link to other tags or even URLs, but to me the most important part would be ease of looking up a basic definition for that tag. Maybe type it in to the add tag box and right click a candidate tag and hit 'see definition' or something. >>20912 >why would you manually add tags when other people have done that work for you? I don't use booru tags outside of basic stuff like copyright and characters because of how unreliable and varied booru tags are. Even characters are commonly fucked because different sites disagree on how to tag Japanese names or one appends the series name to the end while others don't. One site may have completely different unnamespaced tags for the same image, or the booru tagger may have made a typo during tagging... I feel like it's easier to just tag from scratch using the tags I know that I use instead of hoping that booru taggers can agree on a single tag for 'girl but with a pp', stop making typos, etc. Maybe I could use a ridiculous amount of tag siblings to fix those issues but I'd constantly be adding new siblings as I noticed more errors. >>20955 Booru shikkaku, by Anonu Devai
>>21097 It's a two part program, there's the GUI & the CLI, they're separate repositories. https://gitgud.io/thatfuckingbird/hydownloader is the main program https://gitgud.io/thatfuckingbird/hydownloader-systray is the gui
>>21094 edit the gallery page parser to remove the new "/en" part
>>21082 Hello Hydev, 14091 here. Thank you a lot for spending your time answering my questions. 1+2) Thank you for considering those features! 3) I have a lot of stuff already in Windows Explorer folders that i would like to import. In the beginning i wouldn't use hydrus downloaders a lot. So that's why i kinda have to sync to the PTR to get all the mappings for files i already have. But thank you for reminding me of that importer option, i might use it in future too. Could you please answer this part here regarding this question: "I guess if this works like i just described, doing the same in the future again and again would mean only the mappings that are missing in 'my tags' would get updated and it would not start a huge job and migrate everything for every single file again, right?" So would it mean that the system is smart enough that in future migrations the job would be much faster because it only migrates the mappings for newly imported files and only few missing mappings for old files that got updated? This question also applies to number 4) kinda, so would it be possible to just create three portable archives from Client1 for mappings, siblings and parents (and leave them there without deleting them, let's say i got enough SSD space), then import them in Client2 and in weeks/months i "update" the .db files (that i haven't deleted, because i want to update them) on Client1 again, which would be much faster? Or wouldn't that work for the HTA .db files and they are overwritten/created from new anyway each time. If they are not updatable, maybe you could make them? Of course only if it's not too much work. I'm completely fine creating them from new, really. No matter what your answer is, i would probably try it out. Shouldn't be that frustrating for me personally but we will see. That it is possible at all is already pretty cool :) Thank you a lot again, you are a legend! Happy new year!
>>21101 Hydev, me again. Forget the last part, i just read that you wrote "I don't think my HTA tech will update an existing file, but maybe it does. I wrote it years ago. Try it out!" I now understand that with "existing file" you meant the .db files, got it!
>>21098 >I don't use booru tags outside of basic stuff like copyright and characters because of how unreliable and varied booru tags are. Even characters are commonly fucked because different sites disagree on how to tag Japanese names or one appends the series name to the end while others don't. One site may have completely different unnamespaced tags for the same image, or the booru tagger may have made a typo during tagging... >I feel like it's easier to just tag from scratch using the tags I know that I use instead of hoping that booru taggers can agree on a single tag for 'girl but with a pp', stop making typos, etc. Maybe I could use a ridiculous amount of tag siblings to fix those issues but I'd constantly be adding new siblings as I noticed more errors. i see what you're saying, but keep in mind there's no reason you couldn't download the tags anyway. you can download them to a separate tag repository just as a holding area and make them not affect anything. set up your search and autocomplete to ignore that tag repository. that way it's like they don't exist in normal browsing. if you do this, you always have the option to pick and choose good tags to migrate from the booru tags to your tags at a later time. just because some of the tags are bad doesn't mean you should throw it all away. not downloading the tags seems like a waste to me. you're leaving data on the floor. why not hoard it all, even if you never use it?
>>21100 I mean, if these links do not expire, what is the issue that prevents using them for subscriptions?
>>21099 Yes I know, that is why I said I installed it then got the GUI and it's all green because it's supposed to be working, without the main thing installed and running it will not even launch
>>21103 Not him, but I do something similar with ai tags, where I put them to a separate service and turn off their display, but still keep them searchable, which helps with manual tagging, I just have to make sure to select a different service during search. My only problem with this is that you can't select multiple services like with domains. It's either one or all.
>>21105 >>21097 Oh, I misunderstood, I think subscriptions only fire off after the set time has passed, there's no initial run to grab the files. If you run a regular download it should start working immediately. You can also run a check of what files are in Hydrus already with the hydownloader-anchor-exporter so that it won't download already downloaded files.
>>21095 Yes, the tag is created using the "modelname" line in the info.json for the tagger. changing this line would change the tag given to processed files. Alternatively, you could just sibling the tag back to the original (or vice versa assuming you only use one wd tagging model.)
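A minimal sketch of the first option, assuming "modelname" is a top-level key in the tagger's info.json (the post above only says it is a line in that file). The path is hypothetical; point it at your own model folder, and keep a copy of the file before overwriting it.

```python
import json
from pathlib import Path


def set_model_name(info_path, new_name):
    """Rewrite the "modelname" value in an info.json and return the old value."""
    path = Path(info_path)
    info = json.loads(path.read_text(encoding="utf-8"))
    old_name = info.get("modelname")  # assumed top-level key
    info["modelname"] = new_name
    path.write_text(json.dumps(info, indent=2), encoding="utf-8")
    return old_name


if __name__ == "__main__":
    # Hypothetical location; the real folder depends on where the model lives.
    target = Path("model/info.json")
    if target.exists():
        print("old name:", set_model_name(target, "wd-hydrus-tagger"))
```

Files that were already tagged keep the old tag, so the sibling approach is still the easier fix for those.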
>>21108 Good idea on the siblings. Thanks.
>>21107 Do I need to do any extra configurations outside putting the daemon key? This shit is just not doing anything for hours
>>21110 Also, I could use the anchor thing to import 400k hashes from hydrus but I can't somehow export the subscriptions from hydrus?
>>21079 >>21080 I don't know what to tell you other than that I'm not using the api, just scraping, and my subscriptions do work, and it's not redownloading everything every time. It works for me.
>>21112 > I'm not using the api, What does this mean
>>21110 apparently it's lacking user/pass since it's downloading nothing you could not see without an account, but when I edit the config the stupid thing says the syntax is wrong. How is this supposed to look?
>>21113 The Hydrus API. Application Programmer Interface. A way to control Hydrus through a program, using functions supplied by Hydrus. >>21112 Out of curiosity, what downloader are you using? The Beta Sankaku one, or an old Chan. Sankaku one? I'm guessing you're connecting to some old backdoor that they haven't closed yet, that still uses the old chan rules, which had no keys or timeouts yet. Because yeah, we used to be able to download from them without any problem. Before they changed everything.
>>21114 I think you need to import your cookies from Sankaku. You can use Hydrus Companion or if you use Firefox you can use the extension "cookies.txt" no clue about Chrome, the cookie file should say: # Netscape HTTP Cookie File at the top. >>21111 I don't think so, and I don't think subscriptions are exposed through the Client API either. Maybe a possible feature request, Hydev?
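A quick way to check an exported cookie file before pointing a downloader at it. This only tests for the header line mentioned above; it does not validate the cookie rows themselves, and the function name is made up for illustration.

```python
def looks_like_netscape_cookie_file(path) -> bool:
    """Return True if the first line of the file is the Netscape cookie header."""
    with open(path, encoding="utf-8", errors="replace") as f:
        first_line = f.readline().strip()
    return first_line == "# Netscape HTTP Cookie File"


if __name__ == "__main__":
    import sys

    # Usage: python check_cookies.py cookies.txt
    for name in sys.argv[1:]:
        print(name, looks_like_netscape_cookie_file(name))
```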
>>20720 >Ultimately, I think siblings and parents need a complete visual overhaul. I wish we had nice auto-generated tree graphs. This script takes exported parents from clipboard (there is code for reading from stdin, but that's less safe), runs dot to generate a graph, and then mimeopen to display it. Requires the pyperclip Python module.
#!/usr/bin/python3
import os
import subprocess
import sys
import tempfile

import pyperclip

# get the exported siblings or parents (a list of lines representing a table)
#exported = sys.stdin.readlines() # From stdin. Be careful not to paste dangerous commands into the terminal!
exported = pyperclip.paste().splitlines() # From clipboard.

# node names are enclosed in double quotes, escape them (not tested)
exported = [s.replace('"', '\\"') for s in exported]

table = zip(exported[::2], exported[1::2]) # make an actual table

result = []
result.append("digraph {") # a directed graph
for i1, i2 in table:
    result.append('"{}" -> "{}"'.format(i1.strip(), i2.strip()))
result.append("}")

dotinput = bytes('\n'.join(result), encoding='utf8')

# sent to dot's stdin, it returns PNG data
dotoutput = subprocess.run(['dot', '-Tpng'], input=dotinput, capture_output=True)

with tempfile.NamedTemporaryFile(suffix='.png', delete=False) as png:
    png.write(dotoutput.stdout)
    pngname = png.name

subprocess.run(['mimeopen', pngname]) # Open the viewer.
input() # The next line removes the file, so the viewer may lose it. So we wait for user's Enter.
os.remove(pngname)
I had a good week back after the holiday. I fixed some bugs and improved some system:hash predicate parsing. The release should be as normal tomorrow. I will try and catch up with the messages here early and make a new thread for it too.
>>21113 Also, he could mean Pixiv, Sankaku, etc. APIs, though I don't use anything like that. Any downloaders I make use Hydrus to parse the web page code, looking for tags, URL's, etc. That is all done within Hydrus.
>>21103 I suppose you're right. Only issue I can see is services getting cluttered which I find annoying. I'll give that a try, I can always delete the tag services later if I want.
>>21119 Dunno Anything I try, sankaku subscriptions just start again and redownload everything every single check Will this be fixed or plain give up on the site?
>>21085 Thanks. The mpv window is embedded into hydrus with duct tape and a prayer, so window manager bells and whistles can sometimes mess with it. You might like to play with the advanced window settings under options->gui->frame locations, the 'media_viewer' set. Maybe setting it to start as maximised/fullscreen--or setting it to not, if it is on--helps the initialisation and the window manager hijack here. Otherwise, let me know how you get on. If it comes back, is there any pattern to why it does? etc... >>21086 I think we spoke elsewhere about this, but in case that wasn't you, this is the latest quicksync afaik: https://breadthread.gay/ . It should have the update files in its client_file directory, which you can import under services->import repository update files, but you'll still have to talk to the PTR for a short time, on VPN or otherwise, to initialise your metadata store in the review services panel. >>21096 I'm not involved in running the PTR day to day anymore, but I haven't heard anything about this. I know plenty of Russian users have to use VPN to access all sorts of spicy locations like boorus, maybe it is the same, some odd ISP-level block to that data center? In any case, I generally recommend everyone use VPN these days, no matter what imageboard-tier stuff you are doing. >>21098 >It would be cool if we could do booru wiki style entries where you could link to other tags or even URLs, but to me the most important part would be ease of looking up a basic definition for that tag. Maybe type it in to the add tag box and right click a candidate tag and hit 'see definition' or something. Thanks--I think this is what I envision for a first version as well. Well done for finding your backup! 
You've already learned the lesson, but I'll still link you to the secret help page: https://hydrusnetwork.github.io/hydrus/after_disaster.html >>21101 >>21102 >I guess if this works like i just described, doing the same in the future again and again would mean only the mappings that are missing in 'my tags' would get updated and it would not start a huge job and migrate everything for every single file again, right? Yeah in general, I can and do skip this work when possible, and it is usually possible. My general default when there are 'overwrite' conflicts is just to merge, so if you throw 100 tags at a file, and it already has 80 of them, it'll simply add the new 20, no worries, and very fast. In general, 'I don't need to add this because it already exists' rows tend to work about 100-1000x faster than a new data row. So instead of taking 100 time units to do this work, it'll take like 20 + 0.08. My general 'it might take ages if you regularly repeat this' worries are related to it taking a bit of time to set up this job just when you have to click all the UI, and then also when you sync from a gigantic datastore like the PTR to a local store, there is a bunch of unavoidable overhead just from the files you don't have. The PTR has a couple billion mappings, of which maybe a few million might apply to your local collection, so that '100-1000x faster' bulk suddenly gets a lot larger. Even if I make the 99.9% of negative matches work very fast, it is still going to probably be ten or twenty minutes of waiting around to go through them, or maybe even hours, and if all that work only adds 100 tags in some fresh sync, I generally think it isn't worth it. Logically, though, there's no problem in running updates like this. >>21117 Thank you very much--I will play with this!
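To make the '20 + 0.08' arithmetic above concrete, here is a minimal Python sketch of a merge that skips tags a file already has. It is not hydrus code: the function name `merge_tags`, the cost units, and the `skip_speedup` parameter are all hypothetical, and the 1000x figure is simply the upper end of the 100-1000x range quoted above.

```python
def merge_tags(existing, incoming, new_row_cost=1.0, skip_speedup=1000.0):
    """Merge incoming tags into existing ones and estimate the work done.

    Hypothetical cost model (an assumption, not hydrus internals):
    a new row costs `new_row_cost` time units, and an 'already exists'
    row costs `new_row_cost / skip_speedup`.
    """
    to_add = incoming - existing        # tags that need a new data row
    already_there = incoming & existing  # tags that can be skipped cheaply
    cost = (len(to_add) * new_row_cost
            + len(already_there) * new_row_cost / skip_speedup)
    return existing | incoming, len(to_add), cost


# 100 incoming tags, 80 of which the file already has.
existing_tags = {f"tag{i}" for i in range(80)}
incoming_tags = {f"tag{i}" for i in range(100)}

merged, added, cost = merge_tags(existing_tags, incoming_tags)
print(f"added {added} tags, estimated cost about {cost:.2f} time units")
```

With `skip_speedup=100.0` instead, the same call would estimate about 20 + 0.8 units, which is the lower end of the quoted range.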
How about integrating gallery-dl into the downloader? This could help solve issues with ever-changing hostile sites like Twitter.
>>21123 Apparently hydownloader does this but that shit barely works, it would be a godsend
New thread here: >>>/t/14270 I'll post v557 to it later today, and this thread should be migrated to >>>/hydrus/ soon. Thanks everyone!
>>21121 Nothing to fix in Hydrus. It's all on them. The Dev would have to do something internally with new ways of approaching the problem, and then they would just change again. I've had problems with subscriptions for Pixiv as well. Subscriptions work fine if it's just catching up with a few days. But beyond that, things seem to go weird. If you're going back further than that, you need to do a full gallery download, and just watch to see when it catches up (i.e. a lot of already in db status, showing you've caught up to where you stopped downloading the last time)

