/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


The poll on the next big thing to work on hydrus_dev 11/14/2018 (Wed) 23:05:09 Id: 0eb199 No. 10654
Due to some funny voting, I am considering the poll finished earlier than expected and will start work on an API prototype in the new year. (details >>10845) Thank you everyone for voting–I really appreciate the feedback. Here is the poll if you would like to review what was overall popular and not: https://www.poll-maker.com/poll2148452x73e94E02-60 This thread remains available for discussion of anything related to the poll. Thanks everyone!
From an image archiving perspective waifu2x poses some interesting problems. 1) It will be difficult to determine which up-scales are derived from original source images vs 500px tumblr images. 2) Even if waifu2x provides deterministic up-scaling for now there will undoubtedly be different/better trained networks in the future, resulting in innumerable marginally different images. 3) It is almost inevitable someone will start x20 up-scaling their images and pass them around as original. It will be interesting to see how things transpire.
>>10684 see >>10575 This is one of the reasons I want a contender mode. A known good vs a new image makes sorting loads easier than not knowing where your known good is. Granted, an auto filter would likely not catch an original image unless it was tagged as 'source', which very well could be a thing that happens from now on when downloading from certain areas, as these are the original source areas rather than a booru source.
>>10659 Auto-lookup would automate file lookup scripts. Essentially a background task to look up tags for files. You would queue up a thousand files, and hydrus would spend a day asking danbooru or wherever 'hey, you got any tags for this file?' spaced out in a polite way.
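To make the 'spaced out in a polite way' idea concrete, here is a minimal sketch of a queued lookup loop. It is not how hydrus does it; `run_lookup_queue`, `fake_lookup` and the five second delay are all made-up names and values for illustration, and no real site is contacted.

```python
import time
from typing import Callable, Dict, Iterable, List


def run_lookup_queue(
    file_hashes: Iterable[str],
    lookup: Callable[[str], List[str]],
    delay_seconds: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, List[str]]:
    """Ask a tag source about each queued file, pausing between requests."""
    results: Dict[str, List[str]] = {}
    for index, file_hash in enumerate(file_hashes):
        if index > 0:
            # Space requests out so the remote site is not hammered.
            sleep(delay_seconds)
        results[file_hash] = lookup(file_hash)
    return results


# Hypothetical stand-in for a real site lookup; no network traffic happens here.
def fake_lookup(file_hash: str) -> List[str]:
    known = {"abc123": ["blue eyes", "skirt"]}
    return known.get(file_hash, [])


# Record the pauses instead of actually sleeping, so the example runs instantly.
pauses: List[float] = []
found = run_lookup_queue(["abc123", "def456"], fake_lookup, delay_seconds=5.0, sleep=pauses.append)
```

Passing `sleep` in as a parameter is only there so the loop can be tried without waiting; a real background task would presumably also want to persist the queue and back off on errors.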
>>10661 'unknown' mime files would get their own mime, probably something like 'application/octet-stream', so would be searchable or excludable with system:mime. I am not sure how I would do the import workflow for them, whether I could do that on the existing pipeline or better to have a separate one, but I should think it would have optional filename tagging like the current system.
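A rough sketch of what the fallback could look like, assuming files are identified by their leading bytes. The signature table is deliberately tiny, and `sniff_mime` and `filter_by_mime` are hypothetical helpers, not hydrus functions; the filter only imitates the include/exclude behaviour that system:mime would give.

```python
from typing import Dict, List

# A few well-known file signatures; a real client would recognise many more.
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
]
UNKNOWN_MIME = "application/octet-stream"


def sniff_mime(header: bytes) -> str:
    """Return the mime for a recognised signature, else the catch-all mime."""
    for signature, mime in SIGNATURES:
        if header.startswith(signature):
            return mime
    return UNKNOWN_MIME


def filter_by_mime(files: Dict[str, bytes], mime: str, exclude: bool = False) -> List[str]:
    """Return the names whose sniffed mime matches (or, with exclude, does not match)."""
    return sorted(
        name for name, header in files.items() if (sniff_mime(header) == mime) != exclude
    )


files = {
    "cat.png": b"\x89PNG\r\n\x1a\n....",
    "notes.xyz": b"some unrecognised bytes",
}
unknown_only = filter_by_mime(files, UNKNOWN_MIME)
without_unknown = filter_by_mime(files, UNKNOWN_MIME, exclude=True)
```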
>>10665 >>10675 >>10676 My plan is to do something like shift+enter to open a new OR chain. So you type:
system:inbox ENTER
skirt ENTER
blue eyes SHIFT ENTER
green eyes ENTER
And get:
system:inbox
skirt
blue eyes OR green eyes
It would support n tags in the OR group/chain. Let's start with something simple like that and then iterate on it. The db level searching will be some new nested objects in the big search object and just munching through all the code and adding UNION instead of INTERSECT when an OR turns up. Not ultra complicated, but a decent amount of work.
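As a sketch of the 'UNION instead of INTERSECT' idea, here is a toy version in SQLite. The `mappings` table and `build_search_sql` are assumptions made up for this example and say nothing about the real hydrus schema; system predicates like system:inbox are left out to keep it short.

```python
import sqlite3
from typing import List, Sequence, Tuple


def build_search_sql(groups: Sequence[Sequence[str]]) -> Tuple[str, List[str]]:
    """Each group is one search line: tags in a group OR together, groups AND together."""
    group_selects = []
    params: List[str] = []
    for group in groups:
        # UNION the per-tag lookups inside one OR group.
        union = " UNION ".join("SELECT file_id FROM mappings WHERE tag = ?" for _ in group)
        # Wrap in a subquery so the UNION is evaluated before the outer INTERSECT.
        group_selects.append(f"SELECT file_id FROM ({union})")
        params.extend(group)
    return " INTERSECT ".join(group_selects), params


db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE mappings (file_id INTEGER, tag TEXT)")
db.executemany(
    "INSERT INTO mappings VALUES (?, ?)",
    [
        (1, "skirt"), (1, "blue eyes"),
        (2, "skirt"), (2, "green eyes"),
        (3, "skirt"), (3, "red eyes"),
        (4, "green eyes"),
    ],
)

# skirt AND ( blue eyes OR green eyes )
sql, params = build_search_sql([["skirt"], ["blue eyes", "green eyes"]])
matches = sorted(row[0] for row in db.execute(sql, params))
```

Files 1 and 2 come back: file 3 has the skirt but the wrong eyes, and file 4 has green eyes but no skirt.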
>>10702 Just to clarify, each line in the hydrus search predicate list would still AND with every other line, so you could have:
blue eyes OR green eyes OR red eyes
blonde OR redhead
To get:
( blue | green | red ) & ( blonde | redhead )
i.e. blondes or redheads with blue, green or red eyes, but not a brunette with blue eyes.
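The same rule can be checked with plain sets. This is only a sketch of the logic described above (`matches` is a made-up helper), with each search line written as a list of alternative tags.

```python
from typing import Sequence, Set


def matches(file_tags: Set[str], lines: Sequence[Sequence[str]]) -> bool:
    # Every line must match (AND); a line matches if any of its tags is present (OR).
    return all(any(tag in file_tags for tag in line) for line in lines)


search = [
    ["blue eyes", "green eyes", "red eyes"],
    ["blonde", "redhead"],
]

blonde_blue = matches({"blonde", "blue eyes"}, search)
redhead_red = matches({"redhead", "red eyes"}, search)
brunette_blue = matches({"brunette", "blue eyes"}, search)
```

The brunette with blue eyes passes the first line but fails the second, so she is excluded.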
>>10685 >>10684 My general feeling on this issue is that the train is already leaving the station. Cloudflare and other hosting providers are 'optimising' and resizing content and locking down original files all over the place, so our future can expect more duplicate-matching tech and less hash-reliant tech no matter what. Since we'll have to move in this direction, adding waifu2x–and whatever other processing systems are added in the next ten years–won't be such a giant problem. I see 8bit -> 10bit+ colour channels as a big missing puzzle piece here as well, when we finally move from jpg/png to whatever HDR format takes over. I'd love to have an exe that will fill in natural green colours in all my old jpegs, but again there is no guarantee that that will be deterministic. The future is shapes and colours and 100% GPU, not hash-matching. :/
>>10702 >>10704 Shift-enter looks like the simplest way to go about it since it's based on a current way of doing things. Much easier than having to type out a big query all at once. This will be very fun to play with in the future.
Absolutely amazing how everybody wants something different. Welp, you have your work cut out for you, devbro.
>>10710 It is a nice problem to have! I get a lot out of working on hydrus, and I can't imagine my to-do running out any time soon.
>>10702 >>10704 You can't put ANDs inside ORs that way but it sounds good.
>neural net auto-tagging
I'll be honest, when I first heard of and used Hydrus, I thought the program was essentially capable of this. I was a bit disappointed that it couldn't recognize and tag even some simple memes simply because the picture isn't an exact match from some database. So clearly this is what I want most (helps that I think Hydrus is already great as is, besides the poor autotagging)
>>10733 See >>10663. With an API, adding illustration2vec would be easy. https://github.com/rezoo/illustration2vec
Damn it, the deleted one is so low on the list now. I can't do culling until something is implemented, no matter how jerry-rigged it is.
>>10733 >>10744 With all the sensitivity and specificity issues these things have, it'd be a good idea to put their tags into their own managed databases that can be regenerated when the model or neural network software gets an update, or even just allow them to generate and supply additional tags transiently and async that are then merely quick suggestions for actual manual tagging.
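One way to picture the 'own managed database' suggestion: keep machine-generated tags in a separate table stamped with the model version, so they can be wiped and rebuilt without touching anything a human entered. The table names, `regenerate_suggestions` and the lambda taggers below are all hypothetical.

```python
import sqlite3
from typing import Callable, Iterable, List


def regenerate_suggestions(
    db: sqlite3.Connection,
    file_ids: Iterable[int],
    model_version: str,
    tagger: Callable[[int], List[str]],
) -> None:
    """Throw away all machine-generated tags and rebuild them with the given model."""
    with db:  # one transaction: either the whole rebuild lands or none of it
        db.execute("DELETE FROM suggested_tags")
        for file_id in file_ids:
            db.executemany(
                "INSERT INTO suggested_tags VALUES (?, ?, ?)",
                [(file_id, tag, model_version) for tag in tagger(file_id)],
            )


db = sqlite3.connect(":memory:")
# Human-confirmed tags and machine suggestions live in separate tables.
db.execute("CREATE TABLE manual_tags (file_id INTEGER, tag TEXT)")
db.execute("CREATE TABLE suggested_tags (file_id INTEGER, tag TEXT, model_version TEXT)")
db.execute("INSERT INTO manual_tags VALUES (1, 'skirt')")

# The lambdas stand in for a neural net; a newer model may disagree with the old one.
regenerate_suggestions(db, [1], "model-v1", lambda file_id: ["smile", "hat"])
regenerate_suggestions(db, [1], "model-v2", lambda file_id: ["smile"])

suggested = db.execute("SELECT tag, model_version FROM suggested_tags").fetchall()
manual = db.execute("SELECT tag FROM manual_tags").fetchall()
```

After the second run only the v2 suggestions remain, and the manual tag is untouched.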
>>10704 Why not just make it look explicitly like:
> ( blue | green | red ) &
> ( blonde | redhead )
A user can still enter only "blue | green | red <ENTER> blonde | redhead", but at the point where he hits enter he'll get the explicit grouping round brackets and the "&" added. Maybe with some coloring applied to kinda indicate it isn't an editable thing and was added automatically. Then simply allow it all on one line, too:
> ( blue | green | red ) & ( blonde | redhead )
>>10751 Actually, I guess the grouping brackets would be there anyhow and you only get the & on a new line. Either way, if this is explicit, it will not get very confusing if the long expressions now line wrap or anything else like that, plus quite a lot of people understand better what's going on this way anyhow.
For those who want video/audio support: Just add MPV to it (GPLv2)
For those who want Office functions: Just add OnlyOffice to it (AGPL)
The only thing we need is MIME support
I believe that before adding new functions, hydrus needs to improve everything it has now, so:
- Just catch up on small work for a couple of months
- Reduce crashes and ui jitter and hanging by improving ui-db async code
- Improve duplicate db storage and filter workflow (need this first before alternate files support)
- Cleanup code and improve practices
>>10760 Agreed. Hydrus dev also needs to allow pull requests & commit during development instead of before release.
>>10761
>Hydrus dev also needs to allow pull requests
We've gone over this a couple of times. hdev is not inherently against this but feels that currently doing so would not help him. He is going to work on his style and such over the coming weeks until the 12th of December, after which he will convert to py3 for a couple of weeks. Then he will start on the newly polled feature. Come to the discord and ask for the @dev_alert role to talk to dev about it when he comes online.
>>10748 I don't know what to say m8. If you can bear it without sperging out, I think your best return on investment is to prioritise a 90% solution instead of waiting for a 100%. I'm happy with how delete works and still drowning in my inbox, and I know many other users are as well, so your concern about missing certain files due to specific reasons may be a false fear compared to a larger problem of 'how do I get through these 20,000 files in time when I know there will be 5,000 more next week'.
>>10751 >>10752 I think part of OR will be talking to people about how they actually use it once a prototype is out. If it is common to join, say, six or more tags together, multi-line wrapping makes sense. Otherwise I think I can fudge through with just adding tooltips or something. If people don't like OR as the text joiner, I can always customise it. A benefit I have here that is different to how the boorus do all this is that I store the search predicate as a complicated object. I won't be storing it as a string 'a | b | c', but a new OR predicate with list [ a, b, c ]. This makes different presentation and other kinds of handling like initial entry and even editing much easier.
As for the &, maybe I confused things with my rough explanation. I mean for OR-supported search in hydrus to work the same way as current search, where every line in the search:
system:inbox
blue eyes
blonde hair
Is implicitly &. That search is ( inbox & blue eyes & blonde hair ). I see OR search in hydrus as just allowing one or more of those exclusive predicates to be an OR, like:
inbox
blue eyes OR yellow eyes
blonde hair
To mean ( inbox & ( blue eyes | yellow eyes ) & blonde hair ), or "in inbox and blonde hair and either blue or yellow eyes". Is there a more formal way I should write that? I am just using 'pseudo-code' logic.
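A small sketch of the 'OR predicate with list [ a, b, c ]' idea, to show why storing an object rather than a string makes presentation easy to change. `TagPredicate` and `OrPredicate` are invented names for this example, not the classes hydrus uses, and inbox is treated as an ordinary tag for simplicity.

```python
from dataclasses import dataclass
from typing import List, Union


@dataclass
class TagPredicate:
    tag: str

    def render(self) -> str:
        return self.tag


@dataclass
class OrPredicate:
    predicates: List[TagPredicate]

    def render(self, joiner: str = " OR ") -> str:
        # The joiner is only presentation; the stored data is the list itself.
        return joiner.join(p.render() for p in self.predicates)


Predicate = Union[TagPredicate, OrPredicate]

# One entry per search line; lines are implicitly ANDed together.
search: List[Predicate] = [
    TagPredicate("inbox"),
    OrPredicate([TagPredicate("blue eyes"), TagPredicate("yellow eyes")]),
    TagPredicate("blonde hair"),
]

# The same objects shown two ways: as search-box lines and as a bracketed formula.
as_lines = [p.render() for p in search]
as_formula = "( " + " & ".join(
    f"( {p.render(' | ')} )" if isinstance(p, OrPredicate) else p.render() for p in search
) + " )"
```

Here `as_lines` gives the three search-box lines and `as_formula` gives the bracketed form from the post, both from the same stored list.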
>>10763 Also vote for cleaner code and better practices so we can establish code contribution rules like:
- only fork from releases so he does not have to handle non-release forks
- make sure everything works before forking or else it will not be accepted
- format the documentation so that he can understand what you have done
Currently his style looks like a one-man version of https://rfc.zeromq.org/spec:22/C4/
>>10775 Sorry, I am not interested in working directly with others on hydrus. Anyone is free to fork the code and do whatever they want, but I do not work well in teams. I would like to clean the code however and improve how my week works and build goes together, which is what this item means.
>>10779 But people will try one way or another. Code dominance is a thing, and people want to make a mark in history by getting involved, whether Dev takes it or not. The reasoning for those 3 lines is to act as a deterrent and to not be used at all.
>>10768
>I think part of OR will be talking to people about how they actually use it once a prototype is out.
I often use OR to bundle together groups of tags that I'm interested in to make a mini-gallery, kind of like a "smart playlist" in music terms. It could be, for example, artists who draw in a similar style, or characters who belong to a similar group or category such as an im@s idol group or a class of KanColle ships that can't really be tagged objectively based on the content of the image alone. These kinds of uses can result in a range of as few as 3 to as many as 10 or more tags joined with OR.
(97.36 KB 200x145 1381460519606.gif)

>audio support 7th
It's never gonna happen
>>10842 My big push now is to have quick turnaround on these jobs. Unlike the way overdue downloader overhaul, I want to quickly iterate on this stuff and keep the job time to 2-3 months. If it ends up being seventh in line, that would ideally be a couple of years. I'll keep pushing. In the meantime, there may be a library update that makes this easier, btw. If someone gets some good python ffmpeg bindings going, or I end up moving to qt next year and there is a good embeddable solution, I may be able to fit this in easier instead of in some borked way. I may also take a week sometimes to add a simple 'has audio' metadata to the client. I'm pretty sure I can grab that from my existing ffmpeg connection, and it'd be nice to display in the media viewer ui a speaker icon or something and have search capability for this, which should improve existing 'should I double-click to open externally?' workflows.
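For the simple 'has audio' flag, one possible approach is to look for an audio stream line in the info text ffmpeg prints for a file. This is a sketch under that assumption and not a description of the existing hydrus ffmpeg connection; `info_has_audio` and `file_has_audio` are made-up names, and the sample text is hand-written in the style of ffmpeg's output rather than captured from a real run.

```python
import re
import subprocess

# ffmpeg describes each stream on a line such as
# "    Stream #0:1(und): Audio: aac (LC), 44100 Hz, stereo, fltp"
AUDIO_STREAM = re.compile(r"^\s*Stream #\d+:\d+.*?: Audio:", re.MULTILINE)


def info_has_audio(ffmpeg_info: str) -> bool:
    """True if the ffmpeg info text lists at least one audio stream."""
    return AUDIO_STREAM.search(ffmpeg_info) is not None


def file_has_audio(path: str) -> bool:
    # "ffmpeg -i" with no output file exits with an error,
    # but it still prints the stream info to stderr.
    result = subprocess.run(
        ["ffmpeg", "-hide_banner", "-i", path],
        capture_output=True,
        text=True,
        errors="replace",
    )
    return info_has_audio(result.stderr)


sample_with_audio = """\
Input #0, matroska,webm, from 'clip.webm':
  Duration: 00:00:12.05, start: 0.000000, bitrate: 812 kb/s
    Stream #0:0: Video: vp8, yuv420p, 640x360, 30 fps
    Stream #0:1: Audio: vorbis, 44100 Hz, stereo, fltp
"""
sample_silent = """\
Input #0, matroska,webm, from 'silent.webm':
    Stream #0:0: Video: vp9, yuv420p, 640x360, 30 fps
"""

with_audio = info_has_audio(sample_with_audio)
silent = info_has_audio(sample_silent)
```

A file with an audio track that happens to be silent would still count as 'has audio' here, which is probably fine for a speaker icon but worth knowing.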
It looks like a whole bunch of votes came in all at once to OR in a non-organic way, see pic related. I don't want to get into the business of trying to figure out and subtract what looks like dodgy votes, so I am going to consider the vote finished now, early. As far as I can tell, the overall story of this vote was that API and OR are the most desired, with API a persistent very slight lead. I will work on an API prototype next, starting in the new year. I am pleased that we got good data up until now, and I am thankful for everyone who voted. Some things were higher priority than I expected, and some lower. I will edit my OP here and repeat it in the release post for 333. Since OR was such a close contender, I am open to working immediately on that afterwards since it will most likely win the next vote anyway, but I am open to other suggestions.
>>10844 >>10845 Please have cleaner code (2nd), please have API (1st), please have better workflow (3rd). And if possible, Async (5th), local booru (4th), metadata (8th) and Dedup (7th) support. Audio/Video support can simply be done with inclusion of MPV, so that is trivial (in fact by including that we can support many more video formats).
>>10864 Please read this https://github.com/jaseg/python-mpv to see how easy that is.
Might as well plug this in if people want it: https://github.com/rg3/youtube-dl for downloading from YouTube, Soundcloud and all the alt-tube sites.
Also regarding the API, according to most mobile devs, they would prefer Danbooru/Moebooru API over everything else, so I drafted an API spec for Hydrus: https://ghostbin.com/paste/hghv2 (Gelbooru could come later)
And Nori dev (tjg1) says the downloader script GUI is overengineered, whatever.
Sorry lads, but how can I bulk-download image files from booru imageboards?
(10.21 KB 427x185 temp.png)

>>10868 I get the error of pic related. I already followed instructions to reinstall the .dll for this, so I am at a loss what to do now. Pls halp ;_;
>>10869 >>10868 Sorry for not putting this all into one post - but I tried to install "bionus" "Imgbrd grabber" by the way. Is something like this possible with hydrus!? I never tried…
>>10870 Imgbrd is a separate project that does similar things. This board is not responsible for it, so best ask in https://github.com/Bionus/imgbrd-grabber/issues Bionus & pals have promised to make things easier for us to import data to Hydrus, but it is not yet well integrated. https://github.com/Bionus/imgbrd-grabber/issues/1001 and https://github.com/Bionus/imgbrd-grabber/issues/588 have some notes

