/hydrus/ - Hydrus Network

Archive for bug reports, feature requests, and other discussion for the hydrus network.


(12.02 KB 480x360 CWYTOOtplY8.jpg)

Version 331 hydrus_dev 11/21/2018 (Wed) 23:12:19 Id: 6850cd No. 10747
https://www.youtube.com/watch?v=CWYTOOtplY8

windows
zip: https://github.com/hydrusnetwork/hydrus/releases/download/v331/Hydrus.Network.331.-.Windows.-.Extract.only.zip
exe: https://github.com/hydrusnetwork/hydrus/releases/download/v331/Hydrus.Network.331.-.Windows.-.Installer.exe
os x
app: https://github.com/hydrusnetwork/hydrus/releases/download/v331/Hydrus.Network.331.-.OS.X.-.App.dmg
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v331/Hydrus.Network.331.-.OS.X.-.Extract.only.tar.gz
linux
tar.gz: https://github.com/hydrusnetwork/hydrus/releases/download/v331/Hydrus.Network.331.-.Linux.-.Executable.tar.gz
source
tar.gz: https://github.com/hydrusnetwork/hydrus/archive/v331.tar.gz

I had a good week. The login manager is done, I have added proxy support, and some tag stuff is faster!

login manager

The 'manage logins' dialog has some small usability improvements, and it also has a neat 'do login now' button to easily manually attempt logins. The network->data->review session cookies panel now hides 'empty' sessions by default and allows for cookies.txt drag-and-drop import! And login scripts can now be rolled into the 'easy import' downloader pngs! They get listed like all the other objects.

With these last jobs, I have finished my last big jobs for the login manager. I would still like to write some decent help, and I'll fold new login scripts into regular updates as users create and share them, but this phase's bulky work is all complete. Sites with captcha and other complicated login systems (like exhentai.org) are not supported in this first version (so you'll have to manage without, or use manual cookies.txt import from your browser), but I am overall really happy with how it has gone.

proxy support

A user reminded me last week to add proxy support back in. This was an important objective of the big network engine rewrite (I had previously removed my tentative support as the old engine became too convoluted to support), and adding it this week couldn't have been smoother–the 'requests' library for python is just great.

If you have an http or socks4/socks5 proxy you would like to use, please go to options->connection and fill in the simple new options. It has some text to explain it (including how to add socks support if you are running from source and lack the right library). This is client-wide for now, so if you set a proxy, all client requests will go through it. Let me know if this causes any problems. I may revisit this and have it work on a per-domain basis, like how bandwidth rules work.

other

Searches that include 'system:num_tags' are now much faster and have accurate tag counts even when multiple services share the same tag! This is still a pretty CPU-expensive predicate, so I recommend you reduce the search domain by including a tag or a simple system predicate like 'system:size', but it should nonetheless run a lot quicker. I wrote a new and unusual query to pull this off, so please let me know if it fails in some situations. If it works well, I will use it elsewhere.

I have also started an extremely basic tag cache for the client db. It makes autocomplete results and file search results appear significantly faster in many situations. Like the new system:num_tags query, I am going to play with it a bit on my IRL client, see if there is any 'I noticed the difference in a good/bad way' feedback, and likely do a similar cache for file data and iterate on them a bit.

There are better and more "show all these things' files" menu entries on the gallery import and thread watcher management panels' lists' right-click menus. You can now choose between showing the 'presented', 'new', and 'all' files, and they now show more quickly in the current page.
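For anyone curious what the client-wide routing looks like in the 'requests' library the post credits, here is a minimal sketch. The proxy URL and helper name below are my own illustration, not hydrus settings:

```python
# Sketch of client-wide proxying with the 'requests' library. The proxy
# URL is a made-up example; hydrus's real option lives under
# options->connection.
def build_proxies(proxy_url):
    # requests accepts a scheme->proxy mapping; pointing both schemes
    # at one proxy mimics a single client-wide setting
    return {"http": proxy_url, "https": proxy_url}

# with pysocks installed, a socks5 url works the same way, e.g.:
# import requests
# requests.get("https://example.com",
#              proxies=build_proxies("socks5://127.0.0.1:9050"))
```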
I updated last week's e-hentai.org login script to one that works better, and I added a 4channel.org thread url class to permit watching the soon-to-be-separated sfw 4chan boards.

full list

- added a 'do login now' button to the manage logins dialog. it only enables when the selected logins are active and not invalid and so on, and will ok the dialog and queue up some login attempts, which will make report popups as normal
- 'review session cookies' panels now support drag and drop cookies.txt import! cookies.txt importing will also handle errors a bit better and report total number of added cookies
- the 'review session cookies' panel now defaults to not showing sessions with zero cookies. a new checkbox controls this
- login scripts can now be rolled into easy import pngs! should work for export and import just like the other objects (although they won't be auto-added based on domain in export dialog)
- brushed up some of the 'change login script' code–particularly, it now puts login scripts that have matching domains first in the selection list, for easier selection
- after striking a reckless bargain with a daemon from the database-plane, system:num_tags now runs significantly faster and produces accurate tag counts even when searching over multiple tag services that have duplicate tags. if this works out, the immaterial beast promises greater gains for similar jobs with no possibility of anything going wrong
- prototyped a new tag cache in the db that affects (and should speed up) many tag fetching routines. let's see how it goes
- added complete, global proxy support for the new network engine! there are new options under options->connection (with some explanation text) to handle it. if pysocks is installed, socks4/5 proxies are also available!
- updated the e-hentai.org login script to the new one on the github. your existing mappings for e-hentai.org _should_ all be updated right. exhentai.org is likely too difficult to properly support in the current system
- the different panels where you enter system predicate information now all run on the new panel sizing system–if you have had problems with these, please let me know how they size now!
- added a '4channel thread' url class to support watchers for the new 4channel sfw domain. it works for now, but let's see if their api changes when the split actually happens
- the list right-click menu on gallery import and thread watcher panels now has three options to show combined importers' files–presented, new, and all. it also now shows the files (more smoothly) in the same page, clearing any existing highlight.
- misc ui improvements
- updated 'running from source' in help

next week

I have three more weeks until I break up for the holiday and start the big python 3 rewrite. With the login manager work done, I am now just going to catch up on small jobs. I'd like to focus on some code cleanup, particularly, and maybe untangle some hellish db siblings/parents code.

If you missed the 'next big thing' poll last week, please check it out here: >>10654
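On the cookies.txt drag-and-drop import mentioned above: the cookies.txt files browsers export use the old Netscape format, seven tab-separated fields per line. A rough sketch of what parsing one involves (the function and dict keys are my own, not hydrus's internals):

```python
def parse_cookies_txt(text):
    """Parse Netscape-format cookies.txt: 7 tab-separated fields per line."""
    cookies = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith('#'):
            continue  # skip comments and blank lines
        fields = line.split('\t')
        if len(fields) != 7:
            continue  # malformed line; a real importer would report this
        domain, subdomains, path, secure, expiry, name, value = fields
        cookies.append({'domain': domain, 'path': path, 'name': name,
                        'value': value, 'expires': int(expiry)})
    return cookies
```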
Two more versions and we'll be at: Three! Three! Three! (333)
Got the problem again where adding too many watchers gums the system up to the point it stops working, or at least I think that's what it is.
Ok, issue sorted. Either I was having the network clog problem, or something else was happening in the background that made the network stop… I'm unsure which, because when the client restarted it started to vacuum, but I also paused everything but one page, so it is entirely possible that I jammed the program.

Now, having a bit of fun dicking around with show new and show all, but I notice a few things missing, and a few things I can do in right click not translating 1:1.

First thing: when I highlight something with right click, clear highlight is not an option; to clear it I have to highlight something else and then clear that.

Playing around with right click gave me an idea: would it be possible to 'add highlight'? I can think of several uses, but the ones that come to mind most were the threads on /a/ where people just dumped entire volumes of manga, but they got split into different threads. Say you had them in the watcher list: you highlight one, then instead of clearing and highlighting the next one, you just append the next thread to the bottom of the current one. I don't think this would have a whole lot of uses, but the program is able to add images from a drag and drop, so this function could work.

And as for the missing thing: a version or so ago, I had thread watchers that were just open, mostly crap I wanted to go through, and there was an option to open the watcher/gallery in a new tab. I was finding the ability to do that very useful. GRANTED, my use case comes down to a mix of the tism that doesn't let me let something go till it's done, ADD that doesn't let me focus on what I want to do, and having a watcher list 4630 watchers big, so letting a watcher go would make finding it again a daunting task if I didn't know the query.

So I have to ask, would it be possible to have the 3 show file options in right click also able to open the same queries in new tabs? And at that, would it be possible, at least when one is selected, to base the tab name off the watcher/gallery name?
Ok, finally getting around to closing out a shit ton of smaller image pages in hopes of getting a 70-80k session down to around 1000 images plus whatever the subs update, and I ran into an issue.

Right now I'm doing a fast pass of 'do I want to mark the image as I really like it' and 'mark for holy shit, make sure I tag this image in depth'. I press these buttons individually, they register fine. I press them at the same time, they register fine. I press them in quick succession, they have issues. Pressing both at the same time is uncomfortable, and pressing them slower… well… is slower. I think there is something to look at here.

If I had to assume how it works, it immediately writes a change the moment a change is made. You don't need to do this, though; instead, queueing up the changes and committing them the moment you move to the next image would likely solve it, at least the way I'm imagining it in my head. The problem may be that the program hangs for only the slightest time when writing, but long enough that the second input is missed.
>>10757 I don't have anything exciting planned, unfortunately! But I'm open to ideas, if you can think of something small and fun. Maybe some more Mr Bones stuff.
>>10758 >>10759 >>10767 Thank you for this feedback. I am still thinking about multiple highlight. I may add it, I may hold off in order to keep things simpler. If you want the media to go to a new page, I recommend you show it in the current page and then ctrl+a and thumbnail-drag up to the notebook page tab area to give it its own page. You can also right-click->show media in new page. Ctrl+r should also be the default for quick 'remove' if you want to clear the page of current media without highlight/clear. If doing that in the current page is too inconvenient, let me know and I'll see if I can add some options or yeah new menu entries. I don't want that menu to get too cluttered though. Thank you for this report about the shortcuts as well. This is shortcut keys, right? Is it for ratings or a tag? I will look into this this week.
(8.22 KB 293x78 Untitled.jpg)

should i be concerned?
>>10769 No, no more Mr. Bone, more "Do it for her"
Not sure if this is the right thread for this, but a small thing that I would find really useful would be a "delete both" option in the duplicate filter. I often grab a thread or something and then go over duplicates first, and right now I just randomly keep one if there are duplicates that I don't want to keep, and have to delete the other once I filter the thread itself.
>>10776 This is already an option, just hit the Delete key while in the filter and arrow key over to the "send both to trash", hit it again and arrow key over to "permanently delete both". You might need advanced mode on, I don't know. I wish there were a way to make them the default selected options, because I never have any reason to delete just one since the filter automatically does that and applies the tags to the better one whenever I hit "this is better" anyways.
>>10771 This is probably fine. Depending on when these orphans got in, it could have been my old delete code not clearing some things out right or maybe it was my newer, slightly better delete code not clearing some things out right. It is probably something like the client took a pause before it went to actually physically delete some files and then you shut it down. Or maybe something esoteric like you restored some deleted files from your recycle bin and some hydrus files got caught up in it. Let me know if you have more problems in this area–I think you can also set clear orphans to put the files somewhere, can't you? In which case you might like to run it in a couple months like this and see what it thinks was an orphan.
>>10776 >>10777 This may be a bit awkward to do as a matter of course, but if you click the 'custom action' button in the top hover window, you can tell it to 'delete both files' there just for that one action. I think it would be approx:

click 'custom action'
double-click 'not dupes' or whatever
click 'delete both files'
click 'apply'

Which is a bit of a pain, but should work.
>>10781 So basically this would override the not dupes action? Is there any way to add my own custom action with its own button, or to bind "not dupes" to the del key? "same quality" would be fine too, I never use that because it just keeps both versions but doesn't mark them like with "alternates".
>>10782 It would still do the database-level duplicate action of 'not duplicates' (which iirc is basically a single row update from the existing 'I think there is a relationship, but it needs human eyes in the dupe filter to judge', to 'not dupes'), but also deletes both files afterwards. The client can be aware of duplicate relationships for files that it does not have, so the file delete does not override anything as far as dupe status goes. (e.g. if you were to reimport those deleted files, the client would not end up re-queueing them for duplicate filter, since it would still remember they were 'not dupes') Since you are deleting both files, you probably don't really care what their dupe status is, but you might as well be accurate. If they are same quality, you may as well set that so tags can be shared either now or in the future where the client cleverly uses its knowledge about those files to infer some other dupe status somewhere. You can bind 'not dupes' to a different key, like 'd' or 'ctrl+n' or something. Delete is reserved for now. Check file->shortcuts->duplicate filter->add->duplicate_filter_not_dupes.
>>10780 Yeah they were all old stuff I had deleted, except the thumbnails obviously. It autodeleted those.
I have an ambitious scraping idea but I don't know if it's even possible under current Hydrus capabilities. Basically, it would need to be able to:
>go to the front page of a subreddit
>follow the retarded endless scrolling backwards to the first post of the subreddit (or go directly to the oldest post if there's some way to do that I don't know of)
>check each post and skip those that had [M] but no [F], for example
>take those with no brackets or those with acceptable brackets and open and text search them and their comments
>download any linked images and call downstream downloaders for linked galleries and clip hosting sites like erome or imgur
>give all of the files obtained from these various sources a creator:[username] tag and a unique post:[hexadecimal id] tag
>once text is supported, add a text file dump of all of the comments in the thread including the OP, ideally tab or dash delineated for reply chains, also tagged with the user and batch IDs, because text is small and it can be helpful for finding sauce
>also highlight or list at the top of the file any URLs which were dead, are for known paywall-only sites, had files but none could be downloaded, etc, with the general reason why listed instead of downloaded from

If this stuff isn't possible now, any guesses how far off we are from it being? Would it be less of a clusterfuck to just write my own application-specific program for this and then port stuff to Hydrus via filename tagging?
>>10790 Some of this stuff is doable, other stuff like comment parsing is not. I don't use reddit, but I think there are some parsers floating around already. Maybe try this for basic support: https://github.com/CuddleBear92/Hydrus-Presets-and-Scripts/blob/master/Download%20System/All-in-Ones/Single-Sites/easy-import-reddit-2018.09.21.png Since you know how to script, you might like to dip into the new downloader system–and look through that reddit parser to see how it basically works–here: https://hydrusnetwork.github.io/hydrus/help/downloader_intro.html Some jobs like parsing multiple links from one tag's text, seem to be a bit tricky, and I may add a new parsing formula or something to help these 1->n situations, but for the most part I consider the downloader engine to be at v1.0 and will not revisit it for a significant iteration unless it comes up in the big poll. Parsing text into file notes and supporting multiple notes per file is another thing I may slip in in regular work, but I can't provide any confident timeframe. If you can script up what you want in thirty lines of clever code, that may be the better solution. (Also, it looks like API is scoring high on the current poll, so the ways you can talk to the client look to expand this coming year).
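The bracket-tag filtering step from the wishlist above is the easy part to script in "thirty lines of clever code". A sketch of the stated rule, 'skip those that had [M] but no [F]' (the function name is mine, and a real scraper would feed it titles from reddit's listings):

```python
def wanted(title):
    # skip titles tagged [M] unless they also carry [F]; everything
    # else (no brackets, or acceptable brackets) passes through
    return not ('[M]' in title and '[F]' not in title)
```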
>>10770 Ratings, I find them easier to work with to cut through images faster than tagging currently. As for making the drop down too cluttered… I honestly don't think it would be, however you could add a nest for 'more open options'. As it stands, the method for opening a watcher, getting the images to a new tab, and then removing them is a bit cumbersome, but for the time being I'll see if I find myself wanting to do it often enough to warrant the 3 options.

>>10774 On the topic of mr bones, it would be nice if it reported the trash and had a GB amount next to it.
(655.03 KB 1357x871 ClipboardImage.png)

(232.29 KB 1898x794 おばあちゃん.jpg)

Hello hydrus, I have migrated machines and ever since I have been getting a DB malformed error. I have been analyzing and testing with the database and am confident that it is not malformed. I only encounter this issue when trying to update the PTR; so far I have not encountered it otherwise. Even after resetting the process cache, backing up then restoring the db, and more, the problem persists, and the service enters a paused state. Is any of this useful for you to hear? Attached is a customary bonus picture, and here is my traceback.

DBException
DatabaseError: database disk image is malformed
Traceback (most recent call last):
File "include\ClientServices.py", line 1436, in SyncProcessUpdates
( did_some_work, did_everything ) = HG.client_controller.WriteSynchronous( 'process_repository', self._service_key, only_when_idle, stop_time )
File "include\HydrusController.py", line 705, in WriteSynchronous
return self._Write( action, HC.LOW_PRIORITY, True, *args, **kwargs )
File "include\HydrusController.py", line 209, in _Write
result = self.db.Write( action, priority, synchronous, *args, **kwargs )
File "include\HydrusDB.py", line 914, in Write
if synchronous: return job.GetResult()
File "include\HydrusData.py", line 1503, in GetResult
raise e
DBException: DatabaseError: database disk image is malformed
Database Traceback (most recent call last):
File "include\HydrusDB.py", line 536, in _ProcessJob
result = self._Write( action, *args, **kwargs )
File "include\ClientDB.py", line 11823, in _Write
elif action == 'process_repository': result = self._ProcessRepositoryUpdates( *args, **kwargs )
File "include\ClientDB.py", line 8597, in _ProcessRepositoryUpdates
self._ProcessRepositoryDefinitionUpdate( service_id, definition_update )
File "include\ClientDB.py", line 8456, in _ProcessRepositoryDefinitionUpdate
self._CacheRepositoryAddHashes( service_id, service_hash_ids_to_hashes )
File "include\ClientDB.py", line 645, in _CacheRepositoryAddHashes
hash_id = self._GetHashId( hash )
File "include\ClientDB.py", line 4120, in _GetHashId
self._c.execute( 'INSERT INTO hashes ( hash ) VALUES ( ? );', ( sqlite3.Binary( hash ), ) )
DatabaseError: database disk image is malformed


If I can help gather information, please let me know. Thanks.
>>10794 Thanks. I am afraid I cannot reproduce this shortcuts issue, either for ratings or tags. Can you try it again on some thumbnails, as opposed to the media viewer? If you can get the same problem, can you then try it again on thumbnails with help->debug->report modes->shortcut report mode on? It should make some popups about the shortcuts it is getting that may help figure this out. If you have the screen space for it, having both the media viewer open and then hitting the shortcuts on that file's thumbnail will help you do this as you'll see the ratings flip (or not) as you action the thumb. In some testing on my end, I noticed the media viewer doesn't use a particular shortcut processing object that has the report mode hooks, so perhaps that is what is missing here.
>>10795 'image is malformed' errors are SQLite-specific. They typically mean hard drive error (e.g. some part of the database is scrambled due to previous disk fault), but they can also mean hard drive access error. It happens at a lower level than my code, and I don't think I can even produce one. It is a serious error. I do not recommend you run your client until you get it figured out. The 'help my db is broke.txt' in your install_dir/db directory is a good starting point for at least some background reading. If you are very confident the backup is all good (and even better, if you can get it to run or 'PRAGMA integrity_check;' on a different drive), then I suspect your new machine's hard drive is funky in some way. Maybe a 'dirty bit' is set, or the folder hydrus is running from has odd access permissions, or the cable is joggy. Let me know how you get on!
>>10799 >>10795 I should add, as the client seems to boot ok but has a problem inserting into 'hashes' table, if there is a fault, it is likely in 'client.master.db'. As you go through 'help my db is broke.txt', make sure you run the integrity check on that file particularly.
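For reference, the 'PRAGMA integrity_check;' mentioned above can be run from a SQLite shell or from a few lines of Python. A sketch, assuming you point it at a copy of the suspect file on a known-good drive:

```python
import sqlite3

def integrity_check(db_path):
    """Run SQLite's built-in check; a healthy db returns ['ok']."""
    con = sqlite3.connect(db_path)
    try:
        return [row[0] for row in con.execute('PRAGMA integrity_check;')]
    finally:
        con.close()

# e.g. integrity_check('client.master.db') run against a copy of the
# file; anything other than ['ok'] lists the corruption found
```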
>>10798 https://pastebin.com/nkB6gTsW

Ok, here is what I got. Num 0 is the 'tag further' rating, alt+b is favorite.

If alt+b is pressed, it processes no matter what. If num 0 is pressed second, it doesn't process if it's pressed in semi-rapid succession.

And just tested: if I don't use my macro pad and instead use the keyboard, it still shows the same behavior.
>>10799 >>10800 Ahh, it seems like my "client.mappings.db" was fine, but my "client.master.db" was not. I either missed this line >If you do not know which file is already broken, try opening the other files in new shells or skipped that file before. I lost the log, but there was a single integrity issue. I cloned it to a new db, it was slightly smaller, not sure if this was the result of a vacuum or data loss, but nothing is obviously missing. I let the PTR process overnight and it seems to be working fine. Thanks hydrus.
>>10802 Great, thanks. By these lines:

>2018/11/28 04:38:34: Key shortcut "alt+numpad 0" passing through <include.ClientGUIMedia.MediaPanelThumbnails object at 0x00000002708CF798>. I am in a state to catch it.
>2018/11/28 04:38:34: Key shortcut "alt+numpad 0" passing through <include.ClientGUI.FrameGUI object at 0x000000004150B948>. I am in a state to catch it.

As opposed to these:

>2018/11/28 04:39:41: Key shortcut "numpad 0" passing through <include.ClientGUIMedia.MediaPanelThumbnails object at 0x00000002708CF798>. I am in a state to catch it.
>2018/11/28 04:39:41: Shortcut "numpad 0" matched to command "set ratings "1.0" for Tag Further" on <include.ClientGUIMedia.MediaPanelThumbnails object at 0x00000002708CF798>. It was processed.

It looks like the alt+ part is getting mixed up with the 'numpad 0' sometimes, and hence it isn't getting processed as the rating action. I guess your macro pad has a slight delay on unsetting the alt status sometimes, or maybe the OS is, and so it is occasionally mixing up the shortcuts sent. I recommend you change the alt+b to just b or the numpad 0 to alt+numpad 0 so they have the same modifier, so if the modifier hangs around longer than expected, it won't mix and match what you mean to send. Let me know if that doesn't fix it!
>>10803 Great, thanks for letting me know. The slightly smaller db file is yeah probably the result of a vacuumed new db. On the off-chance you have lost fifty hash records (out of like 19 million), they are probably files you will never see, but there's a tiny chance some tag processing will iterate over them again and be surprised to find them missing. If this happens, the client will give you some popup errors complaining about it. Let me know if this happens and we'll see if we can fix it manually.
Just stumbled across this tidbit on /e/ which might help anyone trying to process stuff from Twitter: You can get the "raw" resolution from a twitter image by changing the url of the image.
For instance to get the raw resolution for this image:
- original url: (elided)/media/DiX6UF-V4AAC8le?format=jpg
- raw url: (elided)/media/DiX6UF-V4AAC8le?format=jpg&name=orig

You can play with the "jpg" part and change it to "png". If it returns an error, then no png version exists.

If the url of the image uses the "old" scheme like so (elided)/media/DiX6UF-V4AAC8le.jpg, you can replace the file extension to get the raw (elided)/media/DiX6UF-V4AAC8le.jpg-orig. Depending on the browser you use, it will use the file extension from the url instead of using the extension for the content-encoding the server returns in the response. In other words, it will use "jpg-orig" instead of "jpg" as the file extension.

If you need to rename your "jpg-orig" file you can use the following powershell code in a powershell terminal (windows 7 and up, macos and linux supported with "powershell core") IN THE DIRECTORY WITH THE FILES TO RENAME.

Get-ChildItem *.jpg-orig | ForEach-Object { Rename-Item -LiteralPath $_.FullName "$($_.BaseName).jpg" }

It should work but use at your own risk.
>>10808 For me, I always take a url like https://pbs.twimg.com/media/DrkOeeqU0AUdsbI.jpg:large and turn it into https://pbs.twimg.com/media/DrkOeeqU0AUdsbI.jpg:orig I get the first url from just "view image"'ing them in the browser.
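That :large to :orig rewrite is mechanical enough to script. A sketch (the function name is mine; the size tokens are the common twitter variants, which is an assumption on my part):

```python
def to_orig(url):
    """Swap a trailing twitter size token like ':large' for ':orig'."""
    base, sep, size = url.rpartition(':')
    if sep and size in ('thumb', 'small', 'medium', 'large'):
        return base + ':orig'
    return url  # no size suffix recognised; leave the url alone
```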
>>10747 I said this a while back: you may as well stop with exhentai/e-hentai. I found a script that will get download links and even stagger the downloads, and even then, at 1 dl a minute, they still catch it. It's just not worth the effort for them.
>>10812 What were you trying to download? I sometimes open various(10-20) doujinshis at once and use a full page script to leave them all opening at once to then save them, but it's a short rapid burst. If you're downloading a lot of shit over an extended uninterrupted period of time that could be why.
>>10822 The last one I downloaded was an 8 page comic with everything spaced out 1 minute apart. I did all my testing on very small downloads and regardless of what I did, I got 1-1 week long lockouts. It seems they want you to pay them for the ability to download things, as 20$ gets you a weekly 5gb pass on downloads, something I would gladly pay, but I would need to link bank accounts and do money transfers for cryptocurrencies, and I believe my bank was one that was punishing people for even trying.
>>10811 >>10808 Thanks. I hadn't heard of the jpg-orig trick. Hydrus currently does the :orig conversion, which works well as long as the artist themselves didn't originally upload a 350x400 80% quality jpg. :^) I know that twitter can provide direct mp4 links for videos, but I haven't figured out a way to parse this info. This service provides it: https://twitter.com/this_vid But takes a couple of minutes to produce an answer, so my guess is they manually request it from twitter using a legacy "I'm an old phone" http header or something and then forward the link. Normal twitter video works with some DASH streaming or something that we can't handle (yet). If you discover a way to get simple mp4 links for twitter vids, please let me know!

