/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.

Next Big Job Poll hydrus_dev 04/24/2019 (Wed) 21:39:10 Id: 3ccbc4 No. 12358
Here is the poll for the next big job: https://www.poll-maker.com/poll2331269x9ae447d5-67 You can vote for multiple items. I expect to start work on the top-voted item at roughly the time of the v351 release post, in two weeks. Please feel free to discuss and ask about items on the poll in this thread.
>>12360 Sure:

>Add text and html support
This will be for file import. So if you have a .txt or .rtf story or something, I'll figure out import support for it. It'll be a 'big job' because it will need a new import routine, as file type is more difficult to figure out for text files than it is for images and videos. The html part will just be that you can import .html files too.

>Add popular/favourite tag cloud controls for better 'browsing' search
Several users would like a 'tag cloud' in the program. This is basically a big list of your favourite/most popular tags, often with a bigger font size for the 'best', that lets you quickly browse to your favourite things. This tag cloud in hydrus would let you double-click 'blonde eyes' or 'species:elf' quicker than typing it in the autocomplete.

>Improve the client's local booru (this likely now means a backend migration to the Client API)
A long time ago, I made a prototype booru as a 'big job'. It has stayed in this very basic prototype state for a long time. You can try it out yourself by turning it on under services->manage services. There's a little help for it here too: https://hydrusnetwork.github.io/hydrus/help/local_booru.html . Now that I have the nicer Client API work done, I suspect the next step for the booru is to merge it into the newer code and then hang some more bells and whistles on it. The Client API can do searching, so I can then have the booru do searching, and so on. I'd like to have more templating as well.

>Add tag metadata, private search order
I think that is bad phrasing, probably from when I was talking to a user about this. I mean 'personal' sort order, so you can, on your client, say 'when sorting tags by namespace, put creator at the top, then series, then character' rather than a strictly lexicographic sort.
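As an aside on why text import needs its own routine: images and videos can be identified by their magic numbers, but a .txt has no signature. Here is a minimal, hypothetical sketch of the kind of heuristic a client might fall back on; it is not hydrus's actual import code, just an illustration of the problem.

```python
# Hypothetical sketch: known formats get magic-number checks first, but plain
# text has no signature, so we guess from the bytes themselves.

def looks_like_text(path, sample_size=65536):
    with open(path, 'rb') as f:
        sample = f.read(sample_size)
    if b'\x00' in sample:
        return False  # null bytes almost never appear in real text files
    try:
        sample.decode('utf-8')
        return True
    except UnicodeDecodeError:
        # could still be text in a legacy encoding; a real routine would try
        # more codecs or a detection library here
        return False

def guess_mime(path):
    # in a real routine, magic-number checks for jpg/png/webm etc. would go first
    if path.endswith('.html') or path.endswith('.htm'):
        return 'text/html'
    if looks_like_text(path):
        return 'text/plain'
    return 'application/octet-stream'
```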
>>12363 Yeah, I would like to split my PTR this year. I expect it will have to be an executive override on my part, maybe September time or something, where I just choose that as my big job for a couple of months before the bandwidth and gigantic db sizes overwhelm us all.

EDIT: Reading your post more carefully, my PTR split is slightly different to what you are talking about. But yeah, my idea is to split the PTR into certain namespace shards and iterate on how some of the sync works. 500M tags is starting to strain some systems, and the problem is only getting worse with time. I also agree that auto-taggers should probably interface with a different repo, at least to begin with. The true ideal here, likely only ready in 5-10+ years time, is that we retire the PTR for many duties and instead have auto-taggers operating with our own CPU time on our own local dbs exclusively. All this stuff is definitely on my mind.

I have sympathy for the idea some Anons have expressed that waifu2x and neural net experiments may now be better offloaded to the Client API. If they are chosen, I suspect most of my work would be in adding client ui ways to interface with these systems rather than developing a whole bunch of actual nn taggers or whatever myself.
>>12396 >>12386 Yeah, the problem with mpv, in my research so far, seems to be wx. There are good Qt plugins, but my ui library is wx and I haven't found a good cross-platform mpv plugin for it. If anyone discovers one, please forward me the link!

There's a long-term plan by one Anon to convert Hydrus to Qt with an automatic system, followed by a bunch of man hours to fix the gaps. If this happens, possibly in the next 24 months, then I would probably jump over and continue with Qt from then on, which would open a bunch of options for us. I am still fond of wx, but if I could wave a magic wand, I would prob switch to Qt.

I'd love to offload all my rendering pipeline (+audio) to an mpv window I embed. I would love any suggestions like this that work on all platforms with wx. Most stuff I have looked into so far has been super crashy or only works on one platform (and not great at that). I'd also love any links to a good modern python ffmpeg wrapper library.
>>12391 Yeah, if audio isn't chosen this time, I expect I will add simple audio detection and 'this has audio' metadata in one of my 'ongoing' work weeks as the medium-size job. I can parse this true/false info easily with my current ffmpeg solution, and it would be nice to have that info for searching and (eventual) video comparison and quick 'do I want to double-click this 9 second webm to open externally for the audio?' user choices and so on.
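For reference, one simple way to get that true/false flag is to ask ffprobe (which ships with ffmpeg) to list only the audio streams; any output at all means the file has an audio track. This is a hedged sketch and may differ from hydrus's actual ffmpeg parsing.

```python
# Minimal sketch: '-select_streams a' restricts output to audio streams, so a
# silent webm produces no output and the function returns False.
import subprocess

def has_audio(path):
    result = subprocess.run(
        ['ffprobe', '-v', 'error', '-select_streams', 'a',
         '-show_entries', 'stream=codec_type', '-of', 'csv=p=0', path],
        capture_output=True, text=True)
    return bool(result.stdout.strip())
```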
>>12398 A user asked for an expansion of the ratings dialog with something like this. Perhaps as you mouse over the different stars in the ratings control, it shows in a preview-sized media window an example of a file with that rating. I would be open to more thoughts on this as well, maybe some better ratings value shortcuts in the regular media window and so on.
>>12403 Oh okay, so it fetches some random file you listed as whatever the rating? Makes sense I guess; not a huge feature, but could be neat I suppose. I thought it might have been machine-learning rating, which would be pretty tricky, or auto-rating something by a specific author or source.
(2.51 KB 517x29 ClipboardImage.png)

>>12358 On the topic of convenience and workflow improvements, I'll suggest a small one: I want this dropdown menu (pic related) to remember what I last used. Most of the time I never use 'sort by smallest filesize'; I either use 'sort by last imported' to keep a gallery chronological, the way it would look on a booru, or sort by favourites or ratings. Filesize is useful for weeding out the bloat, but loading the thumbnails in 'sort by smallest' and then reloading them as 'sort by imported' just takes extra time. If it started on my last choice, it would speed up my workflow.
>>12415 Or let me set a default for the choice in the options for the menu.
(21.33 KB 1048x95 a.jpg)

>>12415
>or let me set a default for the choice in the options for the menu
Am I missing something?
>>12414 Thank you for this suggestion. Although sort should be remembered per page in the gui session, and you can set a default under the options as >>12419 says, I had not thought of setting/remembering this value from the last actual selection. I can see how you and others might find this useful. Adding this would be a fairly simple job, so I will see if I can add a checkbox (default off) to enable this behaviour in the options, either this week or for 352.
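A hypothetical sketch of that checkbox-gated 'remember last sort' behaviour; none of these names are real hydrus objects, it just shows the idea of only persisting the selection when the option is on.

```python
# Hypothetical sketch, not hydrus code: the options dict stands in for
# whatever persistent settings store the client actually uses.

class SortChoiceMemory:

    def __init__(self, options):
        self._options = options  # assumed persistent key/value store

    def on_sort_changed(self, new_sort):
        # only overwrite the stored default if the (default off) checkbox is ticked
        if self._options.get('remember_last_sort', False):
            self._options['default_sort'] = new_sort

    def get_initial_sort(self):
        return self._options.get('default_sort', 'sort by time imported')
```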
>>12424 In addition to having sort remembered from a previous session, I was wondering if you could have collection settings saved between sessions. I like to have different collection preferences per page, and each time I start Hydrus it gets jumbled up.
>>12419 >>12424 Huh, didn't know that was there; I'll be satisfied with setting the defaults for a while. Thanks.
>>12399 You forgot markdown; markdown would be very useful as well, alongside text/txt and html/xml.

> Add tag metadata (private sort order, presentation options, tag description/wiki support)
Sounds like a good addition.

> Clean up code and add unit tests
> Reduce crashes and ui jitter and hanging by improving ui-db async code
> Just catch up on small work for a couple of months
Should really be done together.

> Write an URL Repository so clients can share known url mappings
> Improve the client's local booru (this likely now means a backend migration to the Client API)
> Improve 'known urls' searching and management
Would be really useful together.

> Add CBZ/CBR support (including framework for multi-page format)
> Add import any file support (giving it 'unknown' mime but preserving file extension)
> Add Ugoira support (including optional mp4/webm conversion)
> Allow multiple custom 'open externally'-style file launch commands for files
Do it together as an overhaul?

> Add version tracking to downloader system objects and explore remote fetching of updates
Unironically useful.

> Add popular/favourite tag cloud controls for better 'browsing' search
Might have to be rewritten once the Qt migration is done.
>>12400
> The true ideal here, likely only ready in 5-10+ years time, is that we retire the PTR for many duties and instead have auto-taggers operating with our own CPU time on our own local dbs exclusively. All this stuff is definitely on my mind.
I'd not completely plan for this myself. For example, what if reliable tagging with just a few topical training sets (e.g. celebrity name, everyday objects, clothing type, facial expression type) takes 5 minutes per image on a fast computer (relative to what people have at home of course, not national supercomputers), and a day for a video even if only a frame every x-th second of it is analyzed? Maybe instead of retiring the PTR for auto-taggers, it'll be used even more for them - with individual database tables derived from some original tagging program (with its program version, parameters and training data of course varying over time, presumably sometimes necessitating a full regen of the database).

> If they are chosen, I suspect most of my work would be in adding client ui ways to interface with these systems rather than developing a whole bunch of actual nn taggers or whatever myself.
Yea, it is probably already a fair amount of work to figure out some good set(s) of training data and other parameters, plus to build basic UI/database interactions for some of the decent existing NNs. Apart from that, even with your dedication and current NN frameworks, I'd not necessarily expect you to be able to quickly reimplement your own variant of even the simpler non-tagging NNs like NIMA image quality assessment or waifu2x; there is just a lot of work involved in making them work well. We will probably get the most use if Hydrus is able to use some of the good existing ones.
>>12400 >>12452 I would rather see a social media engine made for Hydrus such that managing tagging could be easier. See https://github.com/r888888888/danbooru (written in Ruby) and https://github.com/rr-/szurubooru (written in Python). Over-reliance on neural network based taggers would ultimately lead to the worsening of the dataset. Dev, if you want to, add this to the table for the next round of voting, but I do have to warn you about the SQL version issues.
>>12401 Can https://github.com/jaseg/python-mpv/issues/42 and https://github.com/mpv-player/mpv-examples/tree/master/libmpv#methods-of-embedding-the-video-window give good pointers as to how we could embed the mpv window in wx? Otherwise maybe we should wait for the Qt migration first before adding a video/audio player.
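For anyone experimenting, this is roughly what the 'wid' embedding method from those links looks like with python-mpv pointed at a wx panel. Whether wx.Window.GetHandle() hands libmpv something usable on every platform is exactly the unresolved question, so treat this as a sketch to try, not a working recipe.

```python
# Rough sketch of libmpv's 'wid' embedding against a wx panel.
import wx
import mpv  # the python-mpv wrapper around libmpv

class MpvPanel(wx.Panel):

    def __init__(self, parent):
        super().__init__(parent)
        # libmpv draws into an existing native window identified by 'wid';
        # GetHandle() returns the native handle (HWND / X11 id), which may or
        # may not be accepted by libmpv depending on platform and backend
        self.player = mpv.MPV(wid=str(self.GetHandle()))

    def play(self, path):
        self.player.play(path)
```

The mpv-examples page linked above also describes a render-API (OpenGL) method of embedding as an alternative to passing a window id.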
>>12453
>I would rather see a social media engine made for Hydrus such that managing tagging could be easier.
Do you essentially mean a Danbooru for Hydrus?
>>12456 yeah.
Can't believe people here actually discuss using Hydrus as a movie catalog or audio player. "What a mess we made, when it all went wrong"
>>12470 It's pretty convenient to open the webm right then and there. I would say the most practical use is for videos, webms and other similar files under 2 minutes; if someone wants to add one that's longer, then it's kind of insane. But currently it's not very practical, because it seems (for me at least) that webms are much smoother when viewed with anything other than Hydrus, even something basic like Windows Media Player. People should just use that, but if hydrus improves, then it'll be convenient.
(142.27 KB 570x712 3465346.jpg)

So I just had this crazy idea that I wish I could have posted earlier in the other Big Project thread. You know how artists always upload their works to various sites (boorus, tumblr, twitter, pixiv, etc.) and how annoying it is to have to search through all those sites just to find roughly 40-60% of their content? Well, what if we started getting more artists to use Hydrus to upload their works to, and have them set up their own client for other people to link up to and follow? I'm not saying they should stop using twitter or other social media sites in favor of Hydrus, but rather that it could be a better alternative backup for their works.

Every day you see artists have to put up with the most retarded bullshit from some of these sites: random shit being deleted, censorship, no porn, flagged, account suspended, etc., forcing artists to move elsewhere or link to 3rd party sites like imgur just to continue on. With them running a Hydrus client, if something gets taken down they don't have to worry about any of that, because everything is still stored in the client. The biggest benefit to this is that there are no restrictions on what they upload. So when an artist lists all the sites where you can find them (tumblr, twitter, pixiv, patreon), they'll also list their Hydrus client key with it.

Like I said, this is just a crazy idea and just something neat to think about. For this to happen, though, I think a special version of Hydrus needs to be made. A content creator Hydrus? With a bit more options for artists. Something that would be really easy to set up and use; users would browse an artist's client like they are browsing a site, and even when the artist's client isn't up, as long as other people who are also linked to that artist's client are online, you'll still get their works. I know that artists love using Patreon, and I'm not really sure how Patreon works or would work with Hydrus, so I guess anything involving money would probably be off the table. I also know artists want to have certain things released and certain things not released yet, but I doubt they would want to weed through clients or followers, so maybe something for that. I think the real issue here, though, is: would artists be willing to use Hydrus or any personal client as their main upload source, and would Hydrus be able to fit those needs?
> Well, what if we started getting more artists to use Hydrus to upload their works to, and have them set up their own client for other people to link up to and follow?
Most artists are Patreon-loving and/or really tight with money. The rest get censored heavily but are too dumb to /tech/. Which is why this idea might have a lot of holes.

> Something that would be really easy to set up and use; users would browse an artist's client like they are browsing a site, and even when the artist's client isn't up, as long as other people who are also linked to that artist's client are online, you'll still get their works
So dev needs to work on IPFS a bit more before this can even be considered. But I like IPFS, so please, dev, help this guy out and make an API strong enough to allow a deviantArt or Pixiv-like blog wrapper to work.

t. person researching an alternative to the Patreon revenue model for artists
>>12472 Yea, I don't think most artists will switch at any point soon, and until then coverage won't be good anyway if we can't rely on Hydrus downloading the files. It's probably more important to keep going with downloading from the sites actually in use.

> Every day you see artists have to put up with the most retarded bullshit from some of these sites: random shit being deleted, censorship, no porn, flagged, account suspended, etc., forcing artists to move elsewhere or link to 3rd party sites like imgur just to continue on.
This isn't going to be different if you present Hydrus as a service; they'll go after hydrus_dev, and/or he'll be struggling to host it all due to bandwidth issues and various providers banning porn and so on. It only kind-of works on p2p clients, and I'd say that to keep Hydrus itself clear, it should probably be a 3rd party addon so hydrus_dev can continue as usual.
>>12477 See >>12473; we are getting to p2p someday. Sorry for burdening dev, though.
>>12434 I forget why, but when I added 'sort by' to the session object that is saved and loaded, I did not add 'collect by'. I think it was more complicated so was not easy to add. The code is much cleaner these days, so I'll check this again and see if it is easier now. If it is, I'll slip it into normal weekly work.
>>12452 As I understand it, most of the CPU work for neural nets is in the training. You put hours and hours into learning from examples, and then the actual use of the trained net is relatively quick. Also, the new libraries being developed right now (including some stuff in OpenCV, which is already in the client) actually offload this work to the GPU. Assuming future tech here is deployed all over the place for vidya and phone facial recognition etc., I imagine future GPU hardware will have accelerated instruction sets specifically for NN fun, just like we have video decoding acceleration now. I figure it will only get faster and more accurate from here on out.

I am no expert at all, but my working assumption atm is that this tech is going to get big in the next ten years. I'm prepared for it to fizzle, but if it works out, I'd like it if Anons had our own implementations rather than the only option being big tech companies. I am happy to fail a couple of times with this in the pursuit of getting it right.

Anyway, I don't want to talk firmly about anything more than a year in the future, but this stuff is on my mind. If we can gen simple tags on our own time, that's a great solution to many problems. And if we can one day get 'apply this learned memetic data to this image' a la those 'dogs everywhere' NN morphs you see, I see that as an incredible potential source of banter value.
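To illustrate the GPU-offloading point, here is a rough sketch using OpenCV's dnn module. The model file and the 224x224 input size are placeholders rather than a real tagger, and the CUDA backend constants only help on OpenCV builds compiled with CUDA support.

```python
# Rough sketch of running a pre-trained classifier through OpenCV's dnn module.
import cv2

net = cv2.dnn.readNetFromONNX('model.onnx')  # placeholder model

# these only take effect on OpenCV builds with CUDA enabled; otherwise
# inference falls back to the CPU
net.setPreferableBackend(cv2.dnn.DNN_BACKEND_CUDA)
net.setPreferableTarget(cv2.dnn.DNN_TARGET_CUDA)

image = cv2.imread('example.jpg')
blob = cv2.dnn.blobFromImage(image, scalefactor=1.0 / 255, size=(224, 224))
net.setInput(blob)
scores = net.forward()  # one score per class/tag; thresholding is up to the caller
```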
>>12453 >>12456 I am afraid I probably will not work on much with a social media focus unless it can be through an Anonymous-friendly lens. I'm an imageboard guy more than anything and am not really interested in discussion or other team workflows that include usernames. And there are plenty of good implementations of boorus already out there. As well as not having the personal interest, I don't think I have the experience in making social systems like that to do them at all well. If you would like to do this sort of thing yourself, plugged into the Client API, please go ahead and let me know what else you would like the API to provide.
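For anyone who does want to build that kind of front-end on the Client API, here is a hedged sketch of a search call. The endpoint name, default port and access-key header are taken from the Client API documentation as it stands today; the API was much younger at the time of this thread, so check them against your client's version.

```python
# Hedged sketch of querying the Hydrus Client API from an external front-end.
import json
import requests

API = 'http://127.0.0.1:45869'  # default Client API port
HEADERS = {'Hydrus-Client-API-Access-Key': 'YOUR_ACCESS_KEY_HERE'}

def search(tags):
    # /get_files/search_files expects the tag list as a JSON-encoded string
    response = requests.get(
        API + '/get_files/search_files',
        params={'tags': json.dumps(tags)},
        headers=HEADERS)
    return response.json()['file_ids']

file_ids = search(['species:elf', 'blonde hair'])
```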
>>12455 Thank you, I will read these properly and do some tests my end. If we can figure this out in a sensible multiplatform way, I'll happily move over and ditch all my code.
>>12472 >>12477 My File Repository serves some of these functions. Although I know some users use it as a private file archive amongst their friends, it never took off as well as the tag repo. As the hydrus community grows, perhaps it will get a new focus and I can put some more time into it.

Remember, though, that even those artists who want to share their creations widely are sometimes not very proficient at it. Much like the 'big gif' issue here >>12476 >>12481 , many artists do not appear to understand how resizing (e.g. when you upload to twitter) affects image quality, nor how important preserving original file bytes can be. I hope these issues will get wider traction in future. I am confident P2P systems like IPFS will have a role to play there.

Then again, some days the phone-only and pro-censorship community seems only to be increasing, so maybe things will only get worse, ha ha. On the other hand, with decent P2P, as long as some lads know what they are talking about, someone will be curating a decent collection somewhere. I imagine I can help out with some of the metadata requirements of that sort of curation. I personally hate censorship and files suddenly disappearing for other reasons, so I made a file db you keep on your own machine, that you control completely, where you can store all your good stuff without having to worry about it. :^)
>>12487 Maybe there is a way out: database receives API order => database takes image => image gets resized => image delivered through API (of course it is cached for 15 minutes~1 hour) => phone displays resized image. >>12474 already requires those options.
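A hypothetical sketch of that flow with a simple 15-minute cache; nothing here is real hydrus or Client API code, it just shows resized copies being reused so repeated phone requests don't hit the full-size file.

```python
# Hypothetical resize-and-cache helper for a phone-facing API endpoint.
import io
import time
from PIL import Image

CACHE = {}  # (file_path, max_edge) -> (expiry_timestamp, jpeg_bytes)
CACHE_SECONDS = 15 * 60

def get_resized(file_path, max_edge=1024):
    key = (file_path, max_edge)
    cached = CACHE.get(key)
    if cached is not None and cached[0] > time.time():
        return cached[1]
    image = Image.open(file_path).convert('RGB')
    image.thumbnail((max_edge, max_edge))  # shrinks in place, preserving aspect ratio
    buffer = io.BytesIO()
    image.save(buffer, format='JPEG', quality=85)
    data = buffer.getvalue()
    CACHE[key] = (time.time() + CACHE_SECONDS, data)
    return data
```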
VOTE DONE

I am just putting my 351 release post together now. 'Improve duplicate db storage and filter workflow (need this first before alternate files support)' stayed on top for almost all of the past two weeks and is still top now, so that is the winner.

This item will involve overhauling the duplicate db structure to be more efficiently group-based rather than pair-based, and will continue the recent ui-side workflow improvements. If there is time, I might even make a skeleton for optional auto-decision making on things like 'if one file is jpg, the other png, and they have the exact same pixels, set the jpg as better'. I will keep file 'alternate' support in mind throughout.

Thank you for voting everyone. I appreciate your input and the surrounding discussions over what to do with hydrus.
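As an illustration of that auto-decision rule (not the planned implementation), here is a Pillow-based sketch of 'exact same pixels, jpg beats png':

```python
# Illustration only: compare decoded pixels, not file bytes, and prefer the
# jpeg when the two files render identically.
from PIL import Image

def prefer_jpeg_over_pixel_identical_png(path_a, path_b):
    """Return the path to keep, or None if the rule doesn't apply."""
    image_a = Image.open(path_a)
    image_b = Image.open(path_b)
    if {image_a.format, image_b.format} != {'JPEG', 'PNG'}:
        return None
    if image_a.size != image_b.size:
        return None
    if image_a.convert('RGBA').tobytes() != image_b.convert('RGBA').tobytes():
        return None  # pixels differ, so pass the decision up to a human
    return path_a if image_a.format == 'JPEG' else path_b
```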
>>12505 I have 2 suggestions for the duplicate auto filter.

1a) There was a thing a number of years back that would take a png and a jpeg 95% and show you a negative that would be the difference in quality; I know this can also be done with music to show what's lost in compression. Would there be a way to pair files off in something like this?

1b) Would it be possible to pair files off in this regardless of size or ratio differences? This honestly may completely fuck the results, but I have run into cases where the file size is half the normal one in ratio and it's fairly obvious what's better, and it would be good to have an auto duplicate sorter at least give it a once over for a possible determination.

1c) If you do something like this, I would assume it has to be done at or around run time, as in no database of files. Would it be possible to show the negative so it may be easier to see where my eyes should focus?

1d) Because I personally subscribe to a 'some loss is acceptable depending on the circumstances' way of thinking, would it be possible to have a better/worse check determine one image was better but go with the lower quality one because it fell under a certain % difference?

1e) Because of 4chan and a few of the people there, I know another thing they did to piss space away: someone posted an image, and for the entire thread all he did was reopen and resave the same image as a jpeg over and over again, slowly corrupting it. I know that if I had an image set like this, I may have gen 1 (good) get replaced with gen 2 (1% worse), and maybe that one is accepted, but then it goes down from gen 2 to 3 etc. until gen 150, where it's a fucking wreck. I know it's not likely, but a fail-safe in automation against that would be greatly appreciated, where if it sees 3 images deceptively close in quality to each other it triggers manual review immediately.

2a) Due to the /trash/ scalie thread spam, I was made aware of a method to troll people who use an auto dupe filter: they have the base image, then they make a png, corrupt the fuck out of it, and that bloats the png, which in a filter would be seen as a better image even though it's worse. So even if we have an auto dupe 'advanced' filter, would it be possible to user-parse it?

2b) Personally I would like to see the stats always on for better/worse, with reasons why something passed or failed. Could this be added and color coded per image, such as jpeg<png, 342kb<1.9mb? Then flipping images would change the text from green to red to denote why this image won, or why it lost.

2c) With the manual dupe filter you have black and grey, which is great when you are manually looking at the images; it's about as neutral as you can get to easily see which is better or worse without coloring the image due to your eyes being tricked. But with an auto filter, would it be possible to show the images with blue/red or green/red or something, so it's FAR more obvious which one of the pair being shown won? Personally, I would love to have images auto reviewed and then manually confirmed; it would make shit go SO much faster for me.

3) There was a problem a while ago where, if an image went through the better/worse filter and then got reimported for some reason, it would not show up in the better/worse filter again. Is there a way to flush the better/worse whatever in the program, or at least recognize that there is a better/worse pair again and show it?

I may have some other thoughts, so I'll post about them later on.
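For reference, the 'difference negative' from 1a is easy to sketch with Pillow: subtract the two images pixel by pixel, then invert, so identical regions come out white and compression damage shows up as darker marks. This assumes the two files share a resolution and is only an illustration.

```python
# Illustration of a per-pixel difference negative between two renders of the
# same picture (e.g. an original png and a jpeg 95% resave).
from PIL import Image, ImageChops, ImageOps

def difference_negative(path_a, path_b):
    image_a = Image.open(path_a).convert('RGB')
    image_b = Image.open(path_b).convert('RGB')
    if image_a.size != image_b.size:
        raise ValueError('images must be the same resolution to compare')
    difference = ImageChops.difference(image_a, image_b)
    return ImageOps.invert(difference)  # white where identical, darker where they differ

difference_negative('original.png', 'resave_q95.jpg').show()
```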
>>12506 Check the big list of repos in >>12295 especially https://github.com/andrewekhalel/sewar (it will come in handy)
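A quick, hedged look at what sewar (linked above) offers for this: full-reference quality metrics that score a suspect copy against an original. Both images must share a resolution, and the exact return values vary by metric (ssim, for instance, returns a pair), so check the library's docs.

```python
# Sketch of scoring a resave against an original with sewar's metrics.
import numpy as np
from PIL import Image
from sewar.full_ref import psnr, ssim

original = np.array(Image.open('original.png').convert('RGB'))
candidate = np.array(Image.open('resave_q95.jpg').convert('RGB'))

print('PSNR:', psnr(original, candidate))  # higher = closer to the original
print('SSIM:', ssim(original, candidate))  # structural similarity, ~1.0 = near identical
```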
>>12511 >>12506
1a) I am chiefly limited in how much time I have. I'd love to do more complicated autodecision workflows and backup ui, and I think that sort of thing would help a lot, but I don't have time in this rewrite. I also read up on some jpeg quality estimation here https://www.politesi.polimi.it/bitstream/10589/132721/1/2017_04_Chen.pdf , but it is too complicated for me to implement in the time I have. I would also like to focus on the db side more in this cycle.

1b) I am moving to a simpler comparison system in this rewrite. You'll always be seeing files compared to the 'best' of a group once it is done, which should exaggerate filesize and resolution differences. I am squeamish about autodecision on filesize or resolution alone, as there are plenty of stupid bloated pngs of jpegs out there, but I think that bias could be part of a larger system that takes multiple variables to auto-decide.

1c) Sorry, I just don't have time to write clever ui like this atm.

1d) Yeah, I am afraid of the edge cases here. My thrust will always be default to off and lots of user customisation. I'll prep any auto-system with rules like "if exact same pixels and one is jpeg, one is png, the jpeg is better", but I'd like for you eventually to be able to write your own rules for what you want out of it.

1e) Yeah, single pixel edits are a problem here. In a future iteration of any autodecision system that took multiple rules to make decisions, I think a blanket "if one pixel different, the older file is better" could be the ticket. There are also issues with file metadata being stripped or altered by CDNs.

2a) Yeah, that's the difficult stuff. Any dupe filter can't be simple, or any simple rules should be able to gauge certainty and pass the decision up to human eyes when something smells fishy. This clashes with situations like waifu2x, where it is a blow-up of the original, but a clever and presumably desirable one.

2b&c) My thoughts on the autodecision system would be to have it run in the background for very easy decisions, but confirming decisions (or maybe just some decisions with low confidence) with the user could be another way to go.

3) Duplicate metadata is not deleted when files are, so the delete/reimport cycle doesn't affect it. Saying 'remove this from all dupe pairs and requeue it in the system' is not easy at the moment (I think you'll have to do a bunch of right-clicking on the thumbnail in advanced mode to show the exact pairs and then sever the relationships and requeue), but this will be easier with the new data structure I am designing. In the new system, I'd also like thumbs to load with their dupes, whereas at the moment that info is only fetched on right-click.
>>12542 Yea, all of 1 more or less requires the image difference comparison to be a thing. As for 1e, it's not even single pixel edits; it's resaving the exact same image over and over again as jpeg, slowly corrupting it over the course of 100-150 images. That's where I see a slow better/worse auto filter jumping from a good image and quickly ending up with the most corrupted one if no fail-safe is implemented.

