/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


Next Big Job Poll Discussion hydrus_dev 04/10/2019 (Wed) 22:54:53 Id: 187563 No. 12152
Ok lads, as I am now finishing up OR search, I am soon going to be free to work on a new 'big job'. I am pleased that I was able to make simple Client API and OR search in much faster iterations than previously. I hope to continue like this, keeping the next big job 8-12 weeks at the most before running a new poll.
The current list is:
- Just catch up on small work for a couple of months
- Reduce crashes and ui jitter and hanging by improving ui-db async code
- Clean up code and add unit tests
- Improve tag siblings/parents and tag 'censorship'
- Add ways to display files in ways other than thumbnails (like 'details' view in file explorers)
- Add text and html support
- Add Ugoira support (including optional mp4/webm conversion)
- Add CBZ/CBR support (including framework for multi-page format)
- Add import any file support (giving it 'unknown' mime but preserving file extension)
- Improve 'known urls' searching and management
- Explore a prototype for neural net auto-tagging
- Add support for playing audio for audio and video files
- Add ui for waifu2x and other file converters/processors
- Write some ui to allow selecting thumbnails with a dragged bounding box
- Add popular/favourite tag cloud controls for better 'browsing' search
- Improve the client's local booru (this likely now means a backend migration to the Client API)
- Improve duplicate db storage and filter workflow (need this first before alternate files support)
- Improve shortcut customisation, including mouse shortcuts
- Add ratings import/export, and add 'rating import options' to auto-rate imports
- Add more commands to the undo system
- Improve display of very large/zoomed files in the media viewer
- Set thumbnail border colours on user-editable rating and namespace conditions
- Improve hydrus network encryption with client cert management and associated ui
- Add tag metadata (private sort order, presentation options, tag description/wiki support)
- Improve file lookup scripts and add mass auto-lookup
- Add multiple local file services (which will enable true nsfw/sfw partition)
- Add an incremental number tagging dialog for thumbnails (for adding page:n etc… to a sequence of files)
- Permit custom ordering of thumbnails, through mouse-dragging or otherwise
- Allow user to have multiple open split tab columns or separate windows with one or more pages
- Improve rating workflow by providing score representatives to compare with
- Add file modified/creation timestamp searching and sorting
- Write an URL Repository so clients can share known url mappings
- Add animated thumbnails for videos (animating on mouseover)
- Allow multiple custom 'open externally'-style file launch commands for files
- Add version tracking to downloader system objects and explore remote fetching of updates
- Expand file notes system
I will put up this poll with the 349 release post and then select whatever seems to be on top by 351, with the proviso that I will try to discount any non-organic (e.g. botted) votes. You will be allowed to vote for multiple items. I am happy to work on any of it. Please feel free to suggest new items or ask for longer explanations of any of the above. I will edit the list as new items are agreed on.
I like the idea of CBZ support because I've got quite a few of those, but it's more important to have hashes of the archive contents than of the archive itself, and I can't think of how to reconcile that with the 'hash is the file identity' paradigm. I recognize that this will make the storage requirement of manga marginally larger, but I would be willing to accept that for some order. As such I think it would be best to just have a way to unpack these files and improve the way hydrus manages and displays collections.
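To make the distinction in the post above concrete, here is a minimal sketch that hashes both the archive as a whole and each page inside it. It assumes the file identity hash is SHA-256 and that the archive is zip-based (a .cbz); the function names `hash_archive_pages` and `build_demo_cbz` are hypothetical helpers for illustration, not anything from hydrus itself.

```python
import hashlib
import io
import zipfile
from typing import Dict


def hash_archive_pages(archive_bytes: bytes) -> Dict[str, str]:
    """Map each member name in a zip/cbz to the SHA-256 hex digest of its bytes."""
    digests = {}
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        for info in archive.infolist():
            if info.is_dir():
                continue  # directory entries carry no page data
            digests[info.filename] = hashlib.sha256(archive.read(info)).hexdigest()
    return digests


def build_demo_cbz() -> bytes:
    """Build a tiny in-memory archive so the sketch runs without real files."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("001.png", b"page one")
        archive.writestr("002.png", b"page two")
    return buffer.getvalue()


if __name__ == "__main__":
    cbz = build_demo_cbz()
    # Identity of the archive as a single file.
    print("archive", hashlib.sha256(cbz).hexdigest())
    # Identity of each page, independent of how the archive was packed.
    for name, digest in hash_archive_pages(cbz).items():
        print(name, digest)
```

The per-page digests stay the same if the same pages are repacked into a different archive, while the archive digest changes, which is the tension the post describes.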
-public url repo (aka PURR) - imagine searching the hydrus tag db then downloading all the urls associated with the resulting hashes
-another round of api because api is great
-PCIDR: Public ipfs content id repo. ipfs content ids separate from url repository.
-Also I would make the PURR optional to pull from, because I can imagine that not scaling well.
-API
-duplicate pairs
-manage subscriptions
-add url without publishing a page
-get GUG/urlclass results
-Subscription url class: Generate subscription entry from url, good for use with hydrus companion
watch uma musume
From top to bottom, this is my main wishlist:
- Add multiple local file services (which will enable true nsfw/sfw partition)
- Improve the client's local booru (this likely now means a backend migration to the Client API)
- Add an incremental number tagging dialog for thumbnails (for adding page:n etc… to a sequence of files)
- Add more commands to the undo system
On the other hand, I'm still quite a few versions behind. I tend to keep a large number of pages open because my tagging barely keeps up with my downloading, but having a large number of pages open on upgrade means that I often lose my session on upgrade and all my organized-but-not-tagged-yet files vanish into the database. So I'm not in any real rush for any of those. Sure, a mass auto-tagger would be helpful with my tagging problems, but I don't think it necessarily needs to be built in. It looks like the Client API should provide most or all of what's needed for external tools to do that, and I've already got my own little personal-use mass-tag-search tool I wrote myself that should integrate nicely with it when I upgrade to a version that has the API.
I've seen it suggested a few times on Discord now: How about namespace aliases and/or renaming?
>>12152 Going to have to ask for an expansion of labeling why things are deleted. The current 'where it got deleted from' helps a lot. In fact, it could facilitate a jerry-riggable system if we have different places to send files. Currently we have:
Inbox
Trash
Archive
If the new 'where it got deleted from' knows that a file was in inbox or in archive when it was deleted, then other places to deposit files, such as:
Waste of Hdd
Generic Bad Meme
could be user made. While it's more steps to delete the files, they would be tagged in the detailed status as 'deleted from 'waste of space'', which would effectively work as an explanation of why things were deleted. If this is possible, I'll vote for it over just deleted file tagging, as more users have a use for more places to put files than they do for tagging reasons for removal. I do understand my problem is a minority one, but if it's possible to solve it through an encompassing issue, then I'll go that way instead.
I think something like "file sibling" would be great. For example: you have file A and B. You delete file B in the duplicate filter in favor of file A because A is better. When you import file B from a booru, it will be marked as deleted. I think it should point to A instead and give it the tags. And same with the ptr.
>>12152
1. Text, Comic Books, Documents, html/xml/json and code support.
2. Scraping documents or webpages support (see list at bottom)
3. Older video, lossy audio, lossless audio and other arbitrary file support
https://github.com/mikf/gallery-dl (Python, Art Gallery and Booru)
https://github.com/Bionus/imgbrd-grabber (C++, Booru)
https://github.com/Xonshiz/comic-dl (Python, Comic/Manga)
https://github.com/yuru-yuri/manga-dl (Python, Manga)
https://github.com/manga-download/hakuneko (JS, Manga)
https://github.com/Hamuko/cum (Python, Manga)
https://github.com/NguyenDanPhuong/MangaRipper (C#, Manga)
https://github.com/JimmXinu/FanFicFare (Python, Novel + FanFic)
https://github.com/kanasimi/work_crawler (JS, Comic/Manga + Novel)
https://github.com/riderkick/FMD (JS, Comic/Manga)
https://github.com/rg3/youtube-dl (Python, Videos) (yes I know)
https://github.com/adolfosilva/libgen.py (Python, LibGen)
https://github.com/NadalVRoMa/PyLibGen (Python, LibGen)
https://github.com/bibcure/bibcure (Python, LibGen/SciHub)
So, I just got done with the first 1000 images on the duplicate processing. Shit was slow to start due to how the program handles zooming in and out, and seeing that progress was VERY slow, I got something called magnifier 2.4: http://www.iconico.com/magnifier/ I have a second 13 inch 1080p screen now, so with the magnifier being the entire second screen, I put the cursor on the eye, zoom in 10x and flip. Shit significantly increased productivity when looking for dups. I'm wondering if there is something that could be done in program for much the same effect, or even utilizing the fact the program has the source files to look over to not only zoom in, but possibly even increase resolution till zooming is needed.
The "yiff.party attachments" simple downloader is broken, the links it finds are 404s.
>>12168 pretty much doing the same thing but with dsr and [win] + [+]. there is a built-in solution in win that does it
>>12152 From what I can tell the client UI doesn't hang enough to prioritize that right now.
Neural net auto tagging is something I'd like in the future but not before batch/sustained mass lookup.
Custom thumbnail ordering might be useful for some but I see little reason to prioritize it now.
My priority list would likely be:
1) improve local booru and go ahead with migrating to client api
2) batch/mass tag lookup and application, maybe with an option to negate application of tags before committing to either local or remote tag repo
3) db storage and filter workflow to open up alternate file support and maybe squeeze out extra performance
4) split windows/tabs for a/b trash application on manual file import
In the manage tags window, can we get the option to NOT remove tags we type in manually if they already exist? Typically when I type it I want it added, if I want to remove tags I double click them with the mouse. The way it is now is just annoying.
I haven't followed hydrus closely since the python 2 to 3 move. Are pages and filenames metadata yet? As in: on pixiv, this filename and page if part of a series; on nijie, this filename and page; in this tank, etc. Is it even doable? Do we have a separate editable metadata box? Can tags be weighted like what we see in vndb (very useful for comic/image set tags vs illustration tags)? Do we have a separation between spelling-error hard siblings (always replace) and alias soft siblings (never replace but appear as)? Can we alias a namespace easily? And most importantly, do we have a way to organize subscriptions per circle/author and not per site? If metadata is added, it should be possible to sort by specific elements of the metadata, the like of pixiv work:#. Thanks for bearing with me.
>>12173 this is workflow I use often, I don't take my hands off the keyboard when tagging so I need to be able to delete tags by typing them. If dev changes this please do it with a setting.
>>12152 My wishlist from top importance to least:
-Improve duplicate db storage and filter workflow (need this first before alternate files support)
I think the duplicate filter UI could use some work. One thing is that the top grey panel covers a lot of the images. Transparency and the ability to move the panel would both help. Another thing that could help is using symbols and color coding to make it easier and faster to recognize the details of the image. Maybe an icon with arrows could indicate resolution/dimension of the image. If the icon is green, it means the current image of the comparison is higher res. If it's red, it's lower res. A green calendar could mean more recent, and a green T could mean more tags. Grey could mean it's a tie for any of the aspects, and black could mean zero (as in no tags). There could be PNG or JPG icons to show file type too.
-Reduce crashes and ui jitter and hanging by improving ui-db async code / Just catch up on small work for a couple of months
Some good old fixup work would be great too though.
-Improve the client's local booru (this likely now means a backend migration to the Client API)
I had some thoughts about basing the booru more around the idea of shared pages than the current state of sharing batches of files.
-Add popular/favourite tag cloud controls for better 'browsing' search
The more browsing features, the better!
-Add file modified/creation timestamp searching
This could be really handy for going back into previous years of my collection where the tagging isn't so thorough compared to today.
One of the biggest areas that need improvement is the duplicates section: it will be so convenient to be able to check for all duplicates inside an opened page
Honestly all I wish right now is polishing the already existing features, instead of adding more half finished ones. Duplicate processing still has a lot of work to do, like highlighting certain important tags like "revision", allowing to check new imports for duplicates instead of only being able to check the entire collection, better comparison between images (retain zoom level with different aspect ratios, make zooming close not freeze the entire program), giving images ratings (which you mentioned). I think the client api could also be improved, and possibly make some sort of direct integration with Hydrus possible, for example being able to add custom actions to right click menus (my long term wish would be being able to have an action that processes the selected files externally with waifu2x, add them to the file service, set them as alternates to the original files, copy the original tags and adding a "waifu2x" tag, while being able to do all of this using the client/plugin api). I think a highly extensible api for Hydrus would be very healthy for the program, because for a long time the trajectory has been that of baking literally everything to the program, even if they were possible to do with simple plugins, making Hydrus even more of an unmaintainable mess as the time progresses. Also, all the usual ones like audio support and less crashing on linux.
>>12175 There should absolutely keep being a keyboard friendly way to delete tags, but IMO a better solution than toggle would be prefixing a tag with minus to delete it. either that or a modifier like alt+enter
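As a rough illustration of the minus-prefix idea in the post above, the sketch below treats typed input as "add only" unless it starts with a minus sign, in which case the named tag is removed. This is a hypothetical helper (`apply_tag_input` is an invented name), not how the manage tags dialog is actually implemented.

```python
from typing import Iterable, Set


def apply_tag_input(current_tags: Iterable[str], typed: str) -> Set[str]:
    """Return the tag set after one line of keyboard input.

    Plain input only ever adds a tag (typing an existing tag is a no-op).
    Input prefixed with '-' removes the named tag if it is present.
    """
    updated = set(current_tags)
    text = typed.strip()
    if text.startswith("-") and len(text) > 1:
        updated.discard(text[1:].strip())  # removing a missing tag is harmless
    elif text:
        updated.add(text)
    return updated


if __name__ == "__main__":
    tags = {"blue eyes", "smile"}
    tags = apply_tag_input(tags, "smile")   # already present, stays present
    tags = apply_tag_input(tags, "-smile")  # explicit removal
    print(sorted(tags))
```

A modifier key such as alt+enter would need the same two code paths; only the way the "remove" intent is signalled would differ.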
>>12180 since dev has made OR search use a modifier key too, I feel like a modifier like that is a good idea.
>>12169 oops this was supposed to go in the main release thread… >>12175 Sure, that's why I suggested an option for it. There's that cog button right there just waiting for it. A modifier key is a good idea too though.
Since people are talking about shortcuts, I think it'd be great to go another step further and update the shortcut system to accommodate an entirely keyboard-driven setup. Currently there are some things you have to use the mouse for, not to mention window focus glitches that require you to refocus with the mouse.
>>12188 Hydrus in Emacs when?
A U D I O U D I O
>>12174 The Pixiv and nijie downloaders work but it's still too messy to use in Hydrus. Your biggest concern will be trying to translate all those Japanese tags if you don't know Japanese. For 1 artist alone, it took me 1-2 weeks of changing all the Japanese tags into English using siblings, as I don't know any better way to do this.
>>12191 Vim or bust.
Anything from >>11761
>>12154 You might wanna check >>12166 which has a broader scope.
hi! could you please consider adding pagination (booru-style) to files dialog?
>>12154 Yeah, a potential route is for me to provide both actual cbz and virtual, either in a clever or dumb way, so you can access both the archive and the pages as separate files. For a first step though, I would only start with adding actual .zip/.rar inspection and presentation, so it would always be a single file in hydrus. What this job really is is adding single-file-but-multi-page presentation support to the media viewer, some kind of a secondary previous/next when looking at a cbz object so you can page through it as well as move to the previous/next actual media file. There would also be ancillary stuff like cleverer cbz/cbr thumbnail generation from first page and so on.
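A minimal sketch of the page-ordering part of that job, assuming pages are ordered by a natural sort of their names inside the archive and that the first page is used as the thumbnail source. The extension list and the function names are assumptions made for illustration only.

```python
import re
from typing import List, Optional

# Assumed set of extensions that count as pages; anything else is skipped.
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".webp")


def natural_key(name: str) -> list:
    """Sort key that compares digit runs numerically, so page2 sorts before page10."""
    return [int(part) if part.isdigit() else part.lower()
            for part in re.split(r"(\d+)", name)]


def ordered_pages(member_names: List[str]) -> List[str]:
    """Return only the image members of an archive, in reading order."""
    pages = [name for name in member_names
             if name.lower().endswith(IMAGE_EXTENSIONS)]
    return sorted(pages, key=natural_key)


def thumbnail_source(member_names: List[str]) -> Optional[str]:
    """Pick the first page as the thumbnail source, or None if there are no pages."""
    pages = ordered_pages(member_names)
    return pages[0] if pages else None


if __name__ == "__main__":
    names = ["page10.png", "page2.png", "info.txt", "page1.png"]
    print(ordered_pages(names))
    print(thumbnail_source(names))
```

The secondary previous/next in the media viewer would then step through this list, while the existing previous/next keeps moving between actual media files.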
>>12155 Thank you. I will add public url repo to the list. I want to avoid heavy work on the API for a little bit and see how fitting it into regular weekly work goes, so I will skip it this time.
I am changing "Add file modified/creation timestamp searching" to "Add file modified/creation timestamp searching and sorting".
>>12156 IPFS multihashes sort of fit into the idea of a URL Repository, so perhaps a second version of a URL Repo would support other identifiers. It could also fit into a completely different Hash Repository, for public sharing of md5 and sha1 hashes as well, which might be neat for some future booru lookup operations where looking up the md5 of something you don't have may be useful. Would you like me to add an IPFS Repo to the list, or maybe just a full iteration on current IPFS support first? I still want to add that no-blocks upload system, so maybe that is better done first. I would run a PUR I think, but it would be as optional as the PTR. As I said just above >>12211, I will keep API off this cycle. I am happy with the first version I have just done and want it to get a bit of use and extended feedback before I go heavy back to it. It'd be nice if I can just do little work here and there without having to wait a whole big job iteration to push it forward.
>>12160 That's covered by "Improve tag siblings/parents and tag 'censorship'". I'd love to have user-custom namespace siblings, including from something to nothing, like if you don't like 'clothing:bikini', you can have that appear as 'bikini'.
>>12161 Yeah, multiple local file services is probably your best bet for this. I'd add the particular local file domain's name to the deletion reason. But tbh, now I have that deleted reason table, I may also sneak in a quick option for you that allows you to set up some custom deletion reasons and then add a new button to the delete confirmation box. It'll let you select that 'bad meme' reason and pass that instead of the current generic reason. Please play with the existing new system for another week and then let me know what sort of workflow you would like.
>>12162 Yeah, I would like this. It will require some db prep work before I can do it, so you want "Improve duplicate db storage and filter workflow (need this first before alternate files support)" for now. File alternates will have some sort of file family relationships support with it.
I vote for collections (so we can read/archive mangas etc. properly) and audio in the media viewer. PS: Just something to consider: blacklist files based on pHash (similarity).
>>12168 I have this as "Improve display of very large/zoomed files in the media viewer" for now. I'll move to a tile-based rendering system, so I am only storing what's on screen in memory. Atm, I make a giganto bmp which is pretty shit for several reasons.
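The tile-based idea can be sketched as follows: split the image into fixed-size tiles and work out which tiles intersect the current viewport, so only those need to be rendered and held in memory. The 256-pixel tile size and the function name `visible_tiles` are assumptions for illustration, not details from the post.

```python
from typing import List, Tuple


def visible_tiles(view_x: int, view_y: int, view_w: int, view_h: int,
                  image_w: int, image_h: int,
                  tile_size: int = 256) -> List[Tuple[int, int]]:
    """Return (column, row) indices of the tiles that intersect the viewport.

    All coordinates are in image pixels at the current zoom level.
    """
    # Clamp the viewport rectangle to the image bounds.
    left = max(view_x, 0)
    top = max(view_y, 0)
    right = min(view_x + view_w, image_w)
    bottom = min(view_y + view_h, image_h)
    if right <= left or bottom <= top:
        return []  # viewport lies entirely outside the image

    first_col, first_row = left // tile_size, top // tile_size
    # right/bottom are exclusive edges, so the last covered pixel is edge - 1.
    last_col, last_row = (right - 1) // tile_size, (bottom - 1) // tile_size
    return [(col, row)
            for row in range(first_row, last_row + 1)
            for col in range(first_col, last_col + 1)]


if __name__ == "__main__":
    # A 1920x1080 viewport at the top-left corner of a 20000x20000 image.
    tiles = visible_tiles(0, 0, 1920, 1080, 20000, 20000)
    print(len(tiles), "tiles needed on screen")
```

With these numbers the viewport touches 8 columns by 5 rows of tiles, 40 in total, instead of one bitmap covering the whole zoomed image.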
>>12169 >>12187 No worries, should be fixed in 348.
>>12174 No, I am afraid most of that stuff is not in yet. Some of those issues would be worked on in the items in the big job list.
>>12173 Sure, that is small and I can add it in the next few weeks. I'll make a new 'add only' check item in the cog button of manage tags dialog.
I am adding "Add animated thumbnails for videos (animating on mouseover)" to the list.
>>12181 >>12180 >>12188 >>12191 >>12198 Yeah, you probably want "Improve shortcut customisation, including mouse shortcuts". There's still a lot to do here, but the ideal is to allow customisation for all actions. I'd love to speed up tag management for keyboard-only, and it would start there. So much shortcut processing is still hardcoded.
Run with command line (all argv or xargs like), context menu for thumbnail grid.
I vote to call the Public URL repository PUR for consistency, or PURe because it's memorable
>>12201 Can you explain this more? What's the 'files dialog', and how would you like it to page? Would you like the thumbnails from a regular search to be split into pages?
>>12217 For sub-paged collections in the media viewer, I think you want cbz/cbr support, which tackles this basic problem of adding two tiers of pages in the media viewer. I can tack collection paging onto the media viewer as part of that. Blacklisting by similarity is something I would love to do when the duplicate filter is more mature. If I can be 99.98% sure in code that one file is the same as another, I can auto-dupe clear on several rules for easy situations (like exact pixel dupes) and only leave complicated/fuzzy dupe processing to human eyes. And as soon as there is auto dupe processing, I can do it on import and fail the import with an appropriate message.
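A minimal sketch of blacklisting by similarity on import, assuming each file has a 64-bit perceptual hash stored as an integer and that similarity is measured by Hamming distance. How the perceptual hash itself is computed is out of scope here, and the names `hamming_distance` and `is_blacklisted` are hypothetical.

```python
from typing import Iterable


def hamming_distance(a: int, b: int) -> int:
    """Count the bit positions where two perceptual hashes differ."""
    return bin(a ^ b).count("1")


def is_blacklisted(candidate: int, blacklist: Iterable[int],
                   max_distance: int = 0) -> bool:
    """True if the candidate is within max_distance bits of any blacklisted hash.

    max_distance=0 only catches identical hashes; raising it catches
    near-duplicates at the cost of more false positives.
    """
    return any(hamming_distance(candidate, known) <= max_distance
               for known in blacklist)


if __name__ == "__main__":
    deleted_hashes = [0xFF00FF00FF00FF00]
    incoming = 0xFF00FF00FF00FF01  # differs from the deleted hash by one bit
    print(is_blacklisted(incoming, deleted_hashes, max_distance=0))
    print(is_blacklisted(incoming, deleted_hashes, max_distance=2))
```

A linear scan like this is only workable for small blacklists; a large collection would need an index built for distance search.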
>>12224 Thank you, I am adding "Allow multiple custom 'open externally'-style file launch commands for files" to the list
>>12161 I would go with a total 1 byte ranking number, which you can optionally assign labels to. Lower numbers have higher trash priority.
>>12226 Thank you for the reply. By "files dialog" I meant the "my files" page. Yes, I guess you're right. The only option I see when I have many images selected by a query but don't want to load all of them is to add system:limit to the search, but there's no system:offset. It would be awesome to be able, for example, to set the number of thumbnails per page in hydrus options and scroll through these pages even without system:limit. Of course not all thumbnails/tags should be loaded, only the ones on the current page. For the gui, I guess arrows with [curr. page]/[tot. num of pages] will be enough.
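The booru-style paging requested above boils down to a limit plus an offset over the search results. Here is a small sketch under the assumption that the search has already produced an ordered list of file ids; `page_slice` is a hypothetical helper, and system:offset does not exist as a search predicate according to the post.

```python
import math
from typing import List, Sequence, Tuple


def page_slice(file_ids: Sequence[int], page_number: int,
               page_size: int) -> Tuple[List[int], int, int]:
    """Return (ids on the page, clamped page number, total pages).

    Pages are numbered from 1. Out-of-range page numbers are clamped so the
    [curr. page]/[tot. num of pages] display is always valid.
    """
    if page_size < 1:
        raise ValueError("page_size must be at least 1")
    total_pages = max(1, math.ceil(len(file_ids) / page_size))
    page_number = min(max(page_number, 1), total_pages)
    offset = (page_number - 1) * page_size  # the missing 'system:offset'
    return list(file_ids[offset:offset + page_size]), page_number, total_pages


if __name__ == "__main__":
    results = list(range(10))
    ids, current, total = page_slice(results, page_number=2, page_size=4)
    print(ids, f"[{current}]/[{total}]")
```

Only the ids on the current page would then have their thumbnails and tags loaded.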
>>12225 If that is the case PTR should be called PeTR
(328.28 KB 590x590 stopit.png)

>manga/CBZ
>hydrus
Why? There is already good software for those things, HappyPanda/X for example. I fear that hydrus is slowly going to become an all-in-one bloated piece of software, straying from the original path of being an organizer of random images.
>>12233
>Why? There is already a good software for those things. HappyPanda/X for example.
Indeed, the most zip/cbr support hydrus needs is being able to make thumbnails for those.
>>12233 I kind of agree with this, some things just shouldn't all mix together. Doujins add a whole other layer of tagging that just doesn't mix well with single images, videos, txt files, etc. All this stuff is really starting to clash with hydrus. If you really want all that extra stuff, Hydrus needs to be split into different programs, or you need to run multiple instances of Hydrus (which I'm currently doing for videos, but mostly just testing it out). I suggested this once, but a separate version of Hydrus that's meant solely for doujins would be a pretty nice idea, as some people might not want to use a webclient like HappyPanda X or LANraragi. Managing doujins is a whole lot easier than single images, so I doubt you'd need to do much work to manage a program like that.
Siblings handle some cases where you want to replace namespaces, but this doesn't work well for single images or long tags (which you have to retype). I'd like an option to change the namespace for selected tags in one or more files on the fly, e.g. change an unnamespaced tag to an actual namespace, or replace one with another. For example, I'm manually changing many creator: tags to blog:, tumblr:, instagram: or whatever because the account owner is not the actual creator of the work. I can do this with some SQL-fu, but it would be much easier to, say, right-click a tag in the Manage Tags screen, select "change namespace", then type in the new namespace or select it from a dropdown. It would be nice to have some kind of search-and-replace interface for tags globally when doing this kind of maintenance.
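The "change namespace" action described above is a small string operation on each selected tag. A minimal sketch, assuming tags are plain strings of the form namespace:subtag with the first colon as the separator; `change_namespace` is an invented name, not an existing hydrus function.

```python
def change_namespace(tag: str, new_namespace: str) -> str:
    """Swap, add, or strip the namespace of a single tag.

    An empty new_namespace strips the namespace, e.g. 'clothing:bikini'
    becomes 'bikini'.
    """
    _, separator, subtag = tag.partition(":")
    if not separator:
        subtag = tag  # the tag had no namespace to begin with
    return f"{new_namespace}:{subtag}" if new_namespace else subtag


if __name__ == "__main__":
    selected = ["creator:someblog", "bikini"]
    print([change_namespace(tag, "tumblr") for tag in selected])
    print(change_namespace("clothing:bikini", ""))
```

A global search-and-replace would apply the same function to every matching tag rather than to a manual selection.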
>>12235
>Managing doujins is a whole lot easier
I might be slightly biased towards this (I'm the dev of one of the webclients and it's not HPX), but doujin providers are much more annoying to scope for metadata than boorus. Alongside this, you also can run into encoding issues with the filenames inside of the archive. Hydrus bypasses encoding problems by renaming files currently (and it's probably much better off thanks to that), but that wouldn't apply to files inside of a cbz. I'd personally rather see support for more esoteric formats like xml/svg than manga, but that's just me wishing for stuff that'd fit my workflow more. With the original HP dead there's not much left in terms of desktop clients for doujins.
>>12215 Honestly loving the new system, as while dup filtering is slow and generally I'm getting rid of the 120kb version instead of the 10mb one, it's slower than what I would find optimal. Let me give you an example: https://boards.4chan.org/s/thread/18749087 Generally any image I get from /s/ won't have duplicates, and generally I subscribe to the 3dpd philosophy. However, on this board I pick only the threads that have an interest for me or are really attractive. Going through this thread, I would probably keep 1/3 of the images, with most being far too low quality, and even using them as art references would be a difficult task. If I could delete them and tag them with 'unattractive' or 'low quality - real' so that's what shows up, I would, and I could likely go through all my threads from real boards and get back quite a bit of space in short order doing this. As for how I would want it: you may remember a while ago I posted a 'mockup' of what it could look like, an area for inputting a reason and several quick input buttons for saved, canned common reasons, all of this on the delete image dialogue, and all of it completely ignorable in case you don't want to add a reason. Having this as an option at the send-to-trash stage, plus, if nothing was added, a second chance to add something at the permanent delete stage, would be nice. While I love the idea of inputtable text for a delete (and in cases like this, if I delete 1 or 1000 images at once, the one input applies to all), I'm not tied to it; it just facilitates custom reasons to better explain why something is there. Let's say I have 3 quick access buttons:
Low quality
Meme
Waste of hdd space
and someone decides to just dump gore, which I don't want. I would have the ability to make a custom 'jackass dumped gore' reason and go on, which doesn't take up quick action slots and gives further context.
Let's say there is a good r34 thread on /b/. If I saw 'low quality' or 'waste of hdd space', I may still be inclined to see what it was because the rest of the thread gave me some good images, but the added context of 'jackass posted gore' or 'scat' (the common shit posted to threads to bumplimit them without falling into meme) would tell me all I need to know. If you go for a drop down and select approach, I highly suggest having some quick slots in the top 5 or so, because 'low quality' will get used far more often than 'asshole posted scat' or 'stix log shit' or other generic bullshit people will post to hit bump limits.
TLDR
 _______________________________________
| send this file to trash?              |
| do you have a reason?                 |
| {-------Generic text box here-------} |
| [1] [2] [3] [4] [5]                   |
|                                       |
|                           [yes][no]   |
 ---------------------------------------
With 1-5 being quick adds that paste text to the generic text box. Because it's so tedious to use image editing software to mock something up, here it is in text; it's also attached as an image in case posting formats it and fucks it to hell and back. This is what I consider ideal: 5 quick reasons, with a text box for a custom, more 'fuck these images, seriously' reason. Honestly I will take a drop down with some quick reasons and no 'per image group' special reason, so long as there are a few custom slots for reasons that can cover nearly everything I need; the images I would use a custom reason for would just have to fall into something generic. And BIG THANK YOU for this. If I know that something like this is coming soon, I'm capable of not downloading images for a few releases if it gets to the point of requiring a new hdd. The first steps to getting my db under control are here.
>>12218 lol well aware of those reasons, the bmp has eaten all my ram before stopping its attempts to render a few times.
Oh, one thing hdev: I have gone into this at length in the past with the manga viewing/viewer version. If you go down this road to actually doing a manga version, would you be able to have a pause like this for input before you go further? Personally I'm of the mindset that you would have to make 100% of storage user parseable. With images, I have clumps of who gives an actual fuck what it's named, so if hydrus dies tomorrow, I lose nothing, but as far as a manga reader goes, if hydrus dies, there goes my entire db/collection. Personally, I think that it would be for the best to branch hydrus into 2 programs at that point, one for manga, one for images. While having the same base codebase so they can both be worked on at the same time, they would archive and present things in 2 radically different ways.
>>12233 >>12234 >>12235 Hentai panda x is fucking garbage. Hydrus with its current tagging system and a public repository could tag every manga in existence and it would all come down automatically with little to no hassle. Because hydrus would take control of the hentai/manga itself, it would never run into the 'whoops, you moved a file, all the info is now gone' problem that every current option with tagging has, and it has nearly all the ui elements that I would need to replace acdsee as my manga reader. The only issue is it's not user parseable, which would need to change, and that is why I propose a split into a manga version and an image version, using the same codebase, just with different things ticked for different versions. Images would all be stored in hashes; manga would all be stored in group-author/manga/author folders:
group-author for hentai/doujin
manga for known manga
author for oneshots
This way if hydrus fucks itself, it has saved everything in a non destructive method, and likely in a way that would put anything we currently do to shame. Every few months I get so pissed at acdsee that I go out looking to see if ANYTHING can replace it, and long story short, nothing can, but hydrus is SO fucking close to being able to do it that I would love to see it try. Most cbr/comic readers fail miserably because they rely on outside programs to do something, usually acting as a browser, and this sucks because I can't one hand read. What I mean is, with acdsee I get to a manga I want to read:
enter, down, up, enter: I'm now reading the manga
page down: read till done
enter, backpage: I'm out
down: I'm on the next chapter
Repeat the above and I'm now reading the next chapter; modify for the manga of choice's distribution method. The closest replacement I have found is comicrack, and the whole system is near perfect, yet a few steps away from perfect at the same time.
I have to use the library in program to save any data, and I can't modify files on hdd through the library, so quick removal of shit is not possible. I also can't do easy dup searches and have to rely on things being named correctly, so shit is very suboptimal from an archive management perspective. The way I have it set up, it won't straight up lose info I add to it if a file moves, but I'm not able to even attempt to file manage anything anymore or else shit will get fucked. I'm honestly using this just for porn, as it's a far better way to do it than acdsee at this point, and hentai panda x can't handle a 45000 file archive without shitting itself horribly. Comic rack, while it's dead, does a good enough job as an intermediary between now, when acdsee isn't good for porn, and when hydrus adds manga to its functions. The worst that will happen is I need to manually redo everything I do in this program, which honestly may be fairly simple. I don't plan on going overboard, because the last time I did that with a tagging program, I lost 10k tags since I thought they were file based and moved with files, not program library based.
(471.28 KB 1200x2093 Money_and_job_Osaka.png)

>>12243 >asking essentially for a second new program with a whole new metadata storage paradigm in order to fit your workflow wew And I thought I was being greedy asking for svg
Have an idea for a duplicate auto mode. Going through the dup finder, I am encountering quite a few jpeg images that were converted to png at some point, the png being 3-5 times the file size for the exact same image. Would there be a way to check for duplicate pairs that are jpeg and png, then automatically compare them to see if they are literally the exact same quality, or within a tolerance where the increased size does not make up for the extremely minor gain in quality, if there is any gain at all?
>>12243 Just make a separate database if you don't want to mix manga with images
>>12248 Did you read what I wrote? That is not the issue at all. The issue is that manga needs to be handled FAR differently than single images. You can hash images with next to no issue, but let's say hdev dies and something stops hydrus from working again: there is 0 chance of pulling a reasonably large manga collection out of the hashes, even less so if it held them as separate images instead of cbz. The way it sorts files NEEDS to be user parseable. 2 programs, one for images and one for manga, are needed. Let's give an example: for hydrus to even potentially be useful as a manga reader it would need a folder directory that you can go in and out of, because if you don't have one, shit WILL get lost. This isn't like images, where that's ok. I get manga from a few sources, and I get some with different names. I know there was one that had 'watashi wa' and 'watashiwa'. To find that manga you would need both 'watashi' and 'wa', otherwise you would get a lot of other manga cluttering everything, but the 2 sources never agreed on what to name it. This would be a pain in the ass at best, but let's say 3-5 years down the road you forget this and you see you are missing files, so you redownload things you already have because you don't know. Having it sorted by group-artist (hentai and doujin), manga (manga), and artist (oneshots) would be the best base way to sort manga into as few unnecessary subfolders as possible, letting tagging deal with the rest. 256 folders, all hashed and not parseable by humans, will not work with manga/doujins/oneshots. Splitting the program into 2, while maintaining the same code base for both so that all that changes is a few checkboxes at the end or at install to go from image mode to manga mode, would facilitate better management of manga, as it won't be treated like images.
If hydrus went and became a great reader without implementing any of this, all I think I would ever put in it is porn, as that's hitting a point where it's hard to manage, much like my image archives did, and losing the names, while painful, I could see as an acceptable loss. But I would need one hell of a push to even consider that. With user parseable imports, even if everything goes to hell, I can still go back to how I currently handle it.
>>12243 >>12249 Be sure to check https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/issues/70 Personal opinion: It is better for Hydrus to have a rewrite in Qt before we should have complete comic support.
>>12268 Honestly, if hydrus added support for zip/rar/archives in general, it would be more than good enough to do this locally. Hydrus would need to extract and open zips/rars in ram and hold them there; that's really the only potential issue. Switching from winrar to 7zip recently for everything (it was used before, but in an all-else-fails capacity) gave me a brand new appreciation for paid software and what it can do. Hydrus would also have to handle folders of images rather than just images themselves, otherwise people will have to zip/rar everything before importing. A user-searchable in-program directory would facilitate manga the best, and user-accessible directories would facilitate general browsing rather than needing to know that a file exists. God knows I only remember half, at best, of what I have. Hydrus would need to implement a way to sort images 1 2 3 4 5 6 7 8 9 10 instead of 1 10 11 12 13 14 15 16 17 18 19 2 20, and handle the various other numbering methods people have used when sending comics. Possibly hydrus could parse a txt file inside the archives, or put one in there with all the tags and shit, which would facilitate sharing archives. As for thumbnails, hydrus should honestly only save the cover image for the folder/archive and reparse the rest on the fly. I personally have a near 4 million file archive, and that shit eats nearly 80gb of space for the thumbs. If manga did the same, I could likely add 40-50gb just through the porn, and then another 200gb through manga. Not to mention that, due to how black and white files work and manga in general, just having thumbs of everything laying around would be a little pointless. Personally, if hydrus had a 'detailed view' for a manga mode, that's what I would use for nearly everything but cover pages. It would be nice if the cover pages could come up for duplicate detection, or even, without thumbs, potentially having a duplicate image catalogue of the shit that's in the archives; I may be able to get rid of some things that way.
I said it before: hydrus is nearly there for replacing acdsee as a comic viewer, and comic rack already replaced it for porn for me, but that's more because I don't care about things being even relatively sorted. Hydrus could very easily replace both.
>>12272 human-oriented sorting is a problem. Also > duplicate thumbs
>>12272 https://github.com/SethMMorton/natsort there we have it, HyDev take note.
>>12272 Also if you want a fast implementation https://github.com/sourcefrog/natsort
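For anyone following along, the 1 2 3 ... 10 versus 1 10 11 ... 2 ordering from >>12272 can be sketched with nothing but the standard library. This is only an illustration of the idea, not how hydrus or either linked natsort project does it; `natural_key` is a name made up for this example.

```python
import re

# Runs of ASCII digits; the capture group keeps them in the split output.
_DIGITS = re.compile(r'([0-9]+)')

def natural_key(name):
    """Build a sort key where digit runs compare as numbers, not as text."""
    parts = _DIGITS.split(name)
    # re.split with one capture group alternates text, digits, text, ...
    # so odd positions are always digit runs and even positions are text.
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

pages = ['1.jpg', '10.jpg', '11.jpg', '2.jpg', '20.jpg', '3.jpg']
ordered = sorted(pages, key=natural_key)
print(ordered)
```

A plain `sorted(pages)` would keep the string order shown in `pages`; the key function is the only thing that changes.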
>>12247 Yes, that's basically something like these: https://github.com/andrewekhalel/sewar or even https://github.com/lidq92/CNNIQA or https://github.com/lidq92/CNNIQAplusplus … and so on. Having these available in Hydrus' duplicate filter (or its revised version) should help a lot. That said, you should only expect imperfect reliability in fully automatic mode. You won't ALWAYS get very accurate scoring or even just be able to identify the "better" image automatically. These can also misidentify variant images as "the same but worse", and make other mistakes like that. >>12152 Like the other anon implicitly did, I also propose the ability to run some more of these image quality / image similarity metrics as a possible feature. It would be nice to have more options to populate the filter, and assisted automatic scoring from within the duplicate filter. At the same time, it would probably be a good idea to make the duplicate filter more modal. E.g. "this is above the certainty threshold you set up, so here's how the weighted scoring of the algorithms you picked would solve this and the corresponding scores - confirm?". It probably also needs some thought put into a colored multi-selection GUI thing that makes it quick to see the scoring & resolution, so you can manually deviate and fix mistakes.
I don't know if this is a feature already or not, but a sibling thing that could have multiple dependencies, so that it could go like >if [artist] + [ocname] then swap to artist:[artist] + character:[ocname](artist), would be very convenient, especially for scraping tags from places like furaffinity where people don't use underscoring properly.
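To make the request above concrete, here is a rough sketch of what such a two-condition rule would do to one file's tags. Nothing like this is confirmed to exist in hydrus; `apply_oc_rule` and the example tags are invented purely for illustration.

```python
def apply_oc_rule(tags, artist, oc_name):
    """If BOTH bare tags are present, swap them for namespaced versions.

    Hypothetical helper: models 'if [artist] + [ocname] then swap to
    artist:[artist] + character:[ocname](artist)' for a single file.
    """
    tags = set(tags)
    if artist in tags and oc_name in tags:
        tags -= {artist, oc_name}
        tags |= {f'artist:{artist}', f'character:{oc_name}({artist})'}
    return tags

# Both conditions met: the pair is rewritten, other tags are untouched.
print(apply_oc_rule({'somefox', 'rex', 'solo'}, 'somefox', 'rex'))
# Only one condition met: nothing changes.
print(apply_oc_rule({'rex', 'solo'}, 'somefox', 'rex'))
```

The point of requiring both tags is that a bare 'rex' from a different artist is left alone.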
On the subject of character names: would there be a way, far in the future, to link 2 tags together? Instead of: character:character_(series) use: [[character]]x[[series]] where x is a dynamic taglink action. And while we're at it, instead of namespaces, a tag could have multiple lists of tag types it can be presented as for that specific media. Tear me up, it's brainstorming more than a definitive solution. I don't even know if it is possible.
>>12277 Never trust Neural Network systems verbatim. Always find an expert system that can work well first.
>>12282 >>12277 True, it wouldn't be perfect, but right now I have to compare two images by eye, and when the quality is close, what could take 1-5 seconds takes 10-20 seconds. What if we had dup tiers? The current one is blunt and stupid, a 'this shit looks close' check, and that is perfect as a first pass. Sadly I don't have any examples on hand as I deleted them, but there was a scalie thread on /trash/ where some jackass came in and made the shittiest 'corrupt' images, 7-8 times the file size of the good images. In a thumbnail they looked passable most of the time, but at full size they were unrecognizable. If the dup filter was more accurate, these duplicates would get overlooked. What I would like is the current dup detector as a baseline, and after it a more stringent dup detector to check the baseline's work. This should filter out alternative images from normal dups. From there, a far more stringent one would do something like jpeg to png comparisons. If two files are close enough to trigger here, in nearly all cases the jpeg was at some point converted to png, so scrapping the png would save space without needing to go through with a fine comb by hand. For that final level, imagine that the two images are just shown, right click is keep png, left click is keep jpeg, and due to how close they are you almost never click png. At least this is how I imagine it: dups going through 2 dup filters and then a third png to jpeg filter would turn the 10-20+ second checks into 1-5 second confirmations.
(762.49 KB 1440x864 NIMA.png)

>>12282 Of course I'd *also* want the usual expert systems from sewar and so on because they usually are faster, mostly very easy to implement, and even more suitable for some use cases. But actually we probably need both. The problem is that we don't really have an expert system that really can do the technical model analysis of something like this: https://github.com/idealo/image-quality-assessment
>>12285 I will add more information to the Optimization thread on the list of expert system vs NN repos
>>12286 Personally I preferred to KISS and just suggested a few python frameworks that might be easy to hack into Hydrus - but sure.
- parser revisioning
- remote fetching
- version tracking
>>12152 I'm not sure how big this is but here is a suggestion: implement an icon and search command for files which has any notes attached to them.
A probably small thing that would help me a lot would be a "delete both" button in the duplicate filter. I know that I can press del, but I usually don't have a hand at the keyboard while filtering, and then I also still need to press delete both anyway…
>>12232 PeTR Public Emacs??? Tag Repository
- Cookie management from API
- Tag statistics from API, so if you search for a_totally_sfw_tag and it produces a lot of creator:cname, you know you should sub to cname
>>12298 Public Integrative Tag Repo Public Incorporative Tag Repo Public Interdependent Tag Repo Public Ingrained Tag Repo (PITR)
>>12166 You might wanna get https://github.com/deanmalmgren/textract (this is some good stuff for text-like documents)
>>12230 Thanks for clarifying. Unfortunately, this is not trivial to do for hydrus. Any paged system needs sorting, and hydrus supports many clever kinds of sort, so in order to implement this, I would need to load 'media' metadata for every file in a search result before I could fetch the first page of results. This would not save much time from the current system, where most of a search delay is in fetching that same media metadata. The proper solution, and I imagine how the boorus probably do it, is by having a sort cache (and they have page caches as well, and generally simpler searches to cache), to cross-reference search results against to figure out page slices. This is more complicated than I want to make hydrus search code at the moment. I am happy with being able to display and manage thousands of results at once in the main gui, and I also don't want to further complicate the viewer with paged management and load code. As you say, I encourage users to add 'system:limit=x' if they want less laggy searches. I would be interested in your further thoughts if you have certain scenarios where search is very slow. If there are particular instances where the client runs very slow for you, I'd love to help it run faster.
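The trade-off described above can be sketched roughly as follows. This is a toy model under stated assumptions, not hydrus search code: `load_sort_value` stands in for the expensive per-file metadata fetch, and `cached_order` stands in for a precomputed 'sort cache' holding every file id already ordered by one criterion. Both names are hypothetical.

```python
import itertools

def page_without_cache(result_ids, load_sort_value, page, page_size):
    """Sorting needs the sort value of EVERY result before page 0 can be cut."""
    ordered = sorted(result_ids, key=load_sort_value)
    start = page * page_size
    return ordered[start:start + page_size]

def page_with_sort_cache(result_ids, cached_order, page, page_size):
    """Walk a precomputed global ordering and keep only ids in this search."""
    wanted = set(result_ids)
    matches = (file_id for file_id in cached_order if file_id in wanted)
    start = page * page_size
    # islice stops as soon as the requested page is filled.
    return list(itertools.islice(matches, start, start + page_size))

# Toy data: file id -> file size, sorting by size ascending.
sizes = {1: 300, 2: 100, 3: 200, 4: 50}
cached_order = sorted(sizes, key=sizes.get)   # built once, reused per search
results = [1, 2, 3]                           # ids matching some search

print(page_without_cache(results, sizes.get, 0, 2))
print(page_with_sort_cache(results, cached_order, 0, 2))
```

The cached version never touches per-file metadata at query time, but it needs one maintained ordering per sort type, which is the extra complexity being avoided for now.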
>>12213 IPFS repos would be important, also an advanced API that can trade IPFS hashes and images would be sweet
>>12234 >>12233 >>12235 >>12243 >>12245 >>12248 Yeah, I am mixed on cbz. I like the idea in the sense of waving a magic wand and having great support, but I can't do that and I know I fall to feature creep too easily. If this is voted on, I would try to make very simple support and see how that goes, and then iterate on it in future if it proves popular. I can't out-compete the programs already out there, but I can do some simple stuff, and ancillary code like navigating multi-page single-file media will have uses for things like file alternates. I really want all future big jobs to be small improvements and experiments, ideally 6-8 weeks and preferably no more than 12, so I don't get bogged down like with the downloader engine overhaul. I am open to experiments that fail and don't want to get emotionally attached or fall into the sunk cost fallacy.
>>12236 Thanks. Yeah, I would like easier sibling workflow, including from the right-click menus, as part of a tag sibling/parent improvement. I would push in this direction with "Improve tag siblings/parents and tag 'censorship'".
>>12240 Yeah, your dialog mock-up is exactly the sort of thing I was thinking of. I'll have a new options panel somewhere that turns on the advanced mode of this dialog and let you set up some favourite reasons and custom entry. Now that I have the 'set a reason' infrastructure in place, this will not be super difficult to add. I expect to have it in in the next few weeks.
>>12247 >>12277 Yeah, my first push here will be to set up a system that permits auto-decisions in a sensible and generic way, along with user ability to control what is permitted, and then in future hang new auto-decision systems on it. The 'this is a png copy of a jpg' seems like a nice simple way to start, and I know I can do very quick detection of that by just hashing image pixels. Then maybe explore some 'this jpg is definitely lower quality than this one of same resolution' stuff. The way to slice through dupe mountain will be through automatic systems to reduce the human drudgework, but I am similarly leery >>12282 >>12283 of anything too clever/vapourware to start with. Most of all I want to get the infrastructure and maintenance processing code in, and then we can test all kinds of different comparison systems for our exact purposes.
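A minimal sketch of the 'hash the image pixels' idea mentioned above, assuming both files have already been decoded to raw pixel bytes by some image library (the decoding step is not shown, and the tuple layout here is an assumption made for the example). `pixel_hash` and `is_pixel_duplicate` are hypothetical names, not hydrus functions.

```python
import hashlib

def pixel_hash(width, height, mode, pixels):
    """Hash decoded pixel data, ignoring the container format entirely.

    width/height/mode are mixed in so that the same byte string laid out
    as a different shape or colour mode does not collide.
    """
    h = hashlib.sha256()
    h.update(f'{mode}:{width}x{height}:'.encode('ascii'))
    h.update(pixels)
    return h.hexdigest()

def is_pixel_duplicate(image_a, image_b):
    """True when two decoded images are pixel-for-pixel identical."""
    return pixel_hash(*image_a) == pixel_hash(*image_b)

# Stand-ins for a decoded jpg and a png that was saved from it.
decoded_jpg = (2, 1, 'RGB', bytes([255, 0, 0, 0, 255, 0]))
decoded_png = (2, 1, 'RGB', bytes([255, 0, 0, 0, 255, 0]))
print(is_pixel_duplicate(decoded_jpg, decoded_png))
```

This only catches exact copies; a png that was resized or re-edited after conversion would hash differently and would still need the normal filter.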
>>12272 >>12273 >>12274 >>12275 Thanks. I have 'human' number sort capability in hydrus already, although I am sure there are still places to apply it. All numbered tags should sort like this atm. I am pretty confident I can get directory listings and file access of rars and zips with the python libraries I already have. Any first version of a cbz viewer would be simple and just read through the internal pages one by one, no bookmarks or per-page metadata or anything. Just something that lets you penetrate the 'list of numbered jpg' zips already in your db in the media viewer (and rename to .cbz or whatever, so you can 'open externally' to your preferred comic reader).
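For illustration only, a 'read through the internal pages one by one' viewer backend might look something like this with the standard `zipfile` module. This is a sketch, not the planned hydrus code: the helper names and the extension list are assumptions, and rar is left out because the standard library has no rar reader.

```python
import io
import re
import zipfile

IMAGE_EXTENSIONS = ('.jpg', '.jpeg', '.png', '.gif')  # assumed page types

def _natural_key(name):
    # Digit runs compare as numbers so '2.jpg' comes before '10.jpg'.
    parts = re.split(r'([0-9]+)', name)
    return [int(p) if i % 2 else p.lower() for i, p in enumerate(parts)]

def _page_names(zf):
    names = [info.filename for info in zf.infolist()
             if not info.is_dir()
             and info.filename.lower().endswith(IMAGE_EXTENSIONS)]
    return sorted(names, key=_natural_key)

def list_pages(archive):
    """Return the image members of a zip/cbz in reading order."""
    with zipfile.ZipFile(archive) as zf:
        return _page_names(zf)

def read_page(archive, index):
    """Return the raw bytes of page number `index` without extracting to disk."""
    with zipfile.ZipFile(archive) as zf:
        return zf.read(_page_names(zf)[index])

# Build a tiny in-memory archive to try it on.
demo = io.BytesIO()
with zipfile.ZipFile(demo, 'w') as zf:
    zf.writestr('10.jpg', b'ten')
    zf.writestr('2.jpg', b'two')
    zf.writestr('1.jpg', b'one')
    zf.writestr('notes.txt', b'not a page')

print(list_pages(demo))
```

`archive` can be a path or a file-like object, so the same helpers would work on a file already sitting in the db.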
>>12277 For new image recognition techniques, yeah, I designed the search system to make this possible. Much like >>12319 , the main push of duplicate detection 1.0 was to build a search system that could handle many search systems and hang one simple 'looks like' system on it. I can fairly easily add new techniques to support rotation or colour similarity or whatever on it now. This is not my priority at the moment: even with the simple system we already have, the biggest problem is that there are way too many pairs to go through, so the processing workflow is now the weakest link. Once we have that more under control, I can work on this.
>>12278 Yeah, that's a tricky one. I know exactly what you are talking about, and I would love to have a nice system for it, but the actual guts of how 'if … then' tag relations would work are way more complicated than I am confident I can currently support. For siblings and parents, my first priority is to improve the data store behind the whole system first. Once that isn't on fire behind the scenes, I'll consider carefully adding this sort of power. I am sure it could go very wrong if not thought about, so it'll be baby steps until we have some real world experience.
>>12281 Both of these thoughts are on my mind. Adding tag siblings revealed to me a big set of pain in the ass problems related to tag definitions I had not considered before. For both of these, I think the ultimate far future way to solve them is to have a tag definition structure "Add tag metadata (private sort order, presentation options, tag description/wiki support)", where clever metadata can be applied to tags. So you could say: character:shimakaze (kantai collection) And the tag definition, which would essentially be a cleverer iteration of the current siblings and parent system, would say "this has 'series:kantai collection' parent" and also perhaps "this can be displayed to the user as 'character:shimakaze'" without destroying the unique tag identifier through merging with some 'character:shimakaze (my oc series, donut steel)', basically being aware of the (kantai collection) after the main tag. I am very much on the side of letting users display tags how they want, as there are many different spergy desires here, and having a system that recognises info about tags lets us do mass management rather than the current per-tag mess and endless firefight. Same for namespaces. I am experimenting with 'clothing:' namespace on the PTR, but I know some users hate that. It would be ideally better if the tag 'bikini' had the 'property' of "clothing" rather than an explicit namespace to argue over, and then a user could say 'when a tag has "clothing" property, display it as namespace'. As it is, I expect my next step here will be more in line with little patches. Namespace sibling control (like saying 'display all creator: tags as artist: please' or 'display all clothing: as unnamespaced') seems an easy-ish next step. "Tags were a mistake." - t. hydrus_dev
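As a thought experiment only, the 'tag definition' idea above could be modelled something like this. None of it reflects actual hydrus data structures; the class, its fields, and `display_tag` are all invented to show how a property plus a per-user display rule could replace an explicit namespace.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class TagDefinition:
    tag: str                              # unique identifier, never merged away
    display: str = ''                     # optional shorter form shown to the user
    parents: tuple = ()                   # e.g. ('series:kantai collection',)
    properties: frozenset = frozenset()   # e.g. frozenset({'clothing'})

def display_tag(definition, namespace_for_property):
    """Render a tag using the user's own property -> namespace preferences."""
    shown = definition.display or definition.tag
    for prop in sorted(definition.properties):
        namespace = namespace_for_property.get(prop)
        if namespace:
            return f'{namespace}:{shown}'
    return shown

shimakaze = TagDefinition(
    tag='character:shimakaze (kantai collection)',
    display='character:shimakaze',
    parents=('series:kantai collection',),
)
bikini = TagDefinition(tag='bikini', properties=frozenset({'clothing'}))

print(display_tag(shimakaze, {}))
print(display_tag(bikini, {'clothing': 'clothing'}))  # user who wants namespaces
print(display_tag(bikini, {}))                        # user who hates them
```

The stored `tag` stays unique in every case; only the rendering differs per user.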
>>12288 Thank you, I am adding 'Add version tracking to downloader system objects and explore remote fetching of updates' to the list.
>>12296 Thanks, this is actually a small thing. I assume you want no duplicate action applied, just a basic 'get rid of these two shits' and move on to the next decision? I'll see if I can add a button to quickly do this for 349 or 350.
>>12299 Thank you, I have api cookie management in my current to-do. I won't work on client api as a big job in this cycle just so it has some time to breathe. Can you explain the 'tag statistics' idea a bit more? Could this be something to apply to the program more generally, rather than just the API, like something to click that says "show me what artists I like and do not sub to?" Subs need a db-level data overhaul before I can do clever inspection about them btw. But this is something else to integrate into the client api as well–managing subs.
>>12293 Thank you. There are multiple jobs for improving notes that I have not been able to get to. I am adding "Expand file notes system" to the list to cover a general push in this direction. This would include multiple notes and likely connecting the notes system to the downloader as well.
>>12325 Yes, thank you very much!
>>12152 Hydrus is already pretty great when it's stable. The main features I'd like to see worked on next are getting the pixiv scrape working again and more login manager support for big sites like Fur Affinity and Ink Bunny. Having hydrus scrape a largely nsfw site and miss all of that content because there is no login is kind of pointless.
>>12316 Sadly, the only programs you would be up against in terms of full featured programs are comic rack and acdsee. In terms of comic rack, nearly every feature they have, you also have. In terms of acdsee, you either use crash prone versions from 10+ years ago (version 8/first pro to version 9) or you use the recent versions, which are pure bloat for unicode support. There is very little middle ground with acdsee because so many of the in-between versions fucked with features. >>12319 I honestly never want an automatic sort. As much as it would be good for going through my clusterfuck, I am still getting full featured images going up against 23 byte black boxes, and running into this makes a pure auto system unacceptable to me. I would accept an auto system that looks at two images and then has me spot check. Let's say it finds a better/worse pair: it presents me the better and the worse, where a green border is better and a red border is worse. I scroll through like normal, but left click confirms green and right click confirms red. Confirmed green acts like better/worse does currently; confirmed red takes the pair out of the auto figure-it-out area and into a manual pick.
Are there any plans at all to (at least have the option to) keep metadata of individual tag mappings, like a file's tag history or keeping track of which process added this tag to this file? Like was it typed manually, was it imported while scraping, etc. I assume this would bloat the DB by a big factor but at least for my personal use I think that might be worth it down the line
Poll >>12358 !

