Wow, this thread got big! Thank you all for your comments. I can't really marshal my thoughts into anything that sounds coherent right now, but I've not replied to this thread for too long, so I'll apologise and just say a mix of things. If you would like a specific response to something I miss, please say so or send me an email.
I mean for hydrus to be a store for files that are 'complete'. Such files can have their metadata changed, but there is no reason to alter their content. They are read-only.
Curating a thousand files in explorer is easy, but a hundred thousand is impossible. I see no point at all in trying to manage that many files in a brittle directory structure.
Throughout development, there has been a temptation to permit links, to allow hydrus to manage files at a remove, in their original locations, but this betrays my two former points. External files are subject to renaming, editing, or deleting, and they remain in an impossible-to-manage structure.
Filenames and folders are useful for things you are working on right now, where you might have several versions, or the program you are using to edit relies on filenames to manage library imports or whatever. The only file that should be imported to hydrus here is the final .png or whatever that comes out of the whole process.
Filenames and folders are also useful for a few hundred files, perhaps to be wrapped up in a zip and sent to a friend. Hydrus tries to support this with its export dialog.
Filenames and folders are not useful for storage or search, which is mostly what hydrus aims to do.
Any useful filename data, like in "artist_name-series_name-chapter_24-page_10.jpg", is better parsed and stored in an intelligent database that can regurgitate that information at will. Adding filename: namespaces is fine, but I think that data is better slowly converted to more granular and powerful tags. You can go tags->filename easily, but filename->tags takes more effort.
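As a rough illustration of the filename->tags and tags->filename directions, here is a minimal Python sketch. It is not hydrus code: the regex, the function names, and the choice of 'creator', 'series', 'chapter', and 'page' namespaces are all assumptions made for the example, and it only understands filenames shaped like the one above.

```python
import re
from pathlib import PurePath

# Hypothetical pattern for names like
# "artist_name-series_name-chapter_24-page_10.jpg" (extension already removed).
FILENAME_PATTERN = re.compile(
    r'^(?P<creator>.+?)-(?P<series>.+?)-chapter_(?P<chapter>\d+)-page_(?P<page>\d+)$'
)


def filename_to_tags(filename):
    """Parse a filename into a set of 'namespace:value' tags."""
    stem = PurePath(filename).stem
    match = FILENAME_PATTERN.match(stem)
    if match is None:
        # Unrecognised layout: fall back to a single filename: tag.
        return {'filename:' + stem}
    tags = set()
    for namespace, value in match.groupdict().items():
        if namespace in ('chapter', 'page'):
            value = str(int(value))  # normalise "010" to "10"
        else:
            value = value.replace('_', ' ')
        tags.add(namespace + ':' + value)
    return tags


def tags_to_filename(tags, extension):
    """Rebuild a filename from tags; assumes all four namespaces are present."""
    values = {}
    for tag in tags:
        namespace, _, value = tag.partition(':')
        values[namespace] = value.replace(' ', '_')
    template = '{creator}-{series}-chapter_{chapter}-page_{page}{ext}'
    return template.format(ext=extension, **values)


if __name__ == '__main__':
    tags = filename_to_tags('artist_name-series_name-chapter_24-page_10.jpg')
    print(sorted(tags))
    print(tags_to_filename(tags, '.jpg'))
```

The tags->filename direction is a single format string, while filename->tags needs a pattern written for every naming scheme in use, which is the extra effort mentioned above.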
>>537 has it right on hashes: the database stores a decent amount of metadata about files, including the hash. Those hash indices are cross-referenced whenever needed; the actual files are only ever loaded when the whole file is needed for something (display, uploading, etc.). When a file is imported, its hash is calculated and then checked against the db. If it is already in the db, it tells the import controller 'redundant' so it can report that and move on to the next file. There's a fair amount of similar checking for image URLs as well, and for md5 hashes with APIs that report them.
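The import check described above can be sketched roughly like this. This is an illustration only, not the real hydrus code: the table layout, the function names, the choice of SHA-256, and the 'successful' status string are assumptions made for the example. Only the 'redundant' result comes from the description above.

```python
import hashlib
import os
import sqlite3


def make_db():
    """Create an in-memory db with a hypothetical one-table schema."""
    db = sqlite3.connect(':memory:')
    db.execute('CREATE TABLE files (hash BLOB PRIMARY KEY, size INTEGER)')
    return db


def hash_file(path, chunk_size=65536):
    """Hash a file in chunks so large files are never held in memory whole."""
    hasher = hashlib.sha256()  # assumption: SHA-256 as the content hash
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            hasher.update(chunk)
    return hasher.digest()


def import_file(db, path):
    """Return 'redundant' if the file's hash is already known, else record it."""
    file_hash = hash_file(path)
    row = db.execute(
        'SELECT 1 FROM files WHERE hash = ?', (file_hash,)
    ).fetchone()
    if row is not None:
        return 'redundant'
    db.execute(
        'INSERT INTO files (hash, size) VALUES (?, ?)',
        (file_hash, os.path.getsize(path)),
    )
    db.commit()
    return 'successful'
```

Because the lookup is keyed on the content hash, two copies of the same file under different names come back as 'redundant' on the second import, and the file itself is only read once, at import time.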
>>513 I don't really want to mess around with %APPDATA%, because I want to keep the program together in one place. I want the program to be portable, easy to back up, and able to run from multiple locations at once without conflict. The Windows installer violates this, but I make that release to be a simple solution for people who don't want to mess around with extracting archives.
Having said that, if there is a good way to set the proper user access control to the db directory under program files with ISTool, which is what I put the Windows Installer release together with, then I am interested to know.
Keeping short sequences of images together is a continuing problem. The page namespace works well for large groups, but not for small ones. I may create a new file format, like the .cbr used for online comics, to represent these collections as single multi-page units.
I'll have a look at tag censorship for namespaces; thank you for reporting it. I'll also have a think about options for namespace hide on the 'tags in selection' box.
>>528 Suggested tags is something I definitely want to do.
You can change the default tag repo in file->options->gui.
Page up and page down can do that for the manage tags window right now. I can add this to options if you want, so you can change it to whatever you want (although binding left and right to a text control can be a problem, because then you can't use those buttons to move the caret around).
I haven't touched like/dislike and numerical ratings in a long time. I want to get back to them.
Doing safe/questionable/explicit has been tough for me to think through. I think you are right, saying do it with tags, probably with a 'safety' namespace or whatever. It is really subjective though, so maybe it'll be better done with a numerical rating? If people want to filter safe vs explicit, I now suggest they just run two clients: one for sfw, the other for nsfw and anything else private. That way, they can browse their sfw client with someone looking over their shoulder and have nothing to worry about.
>>530 At some point, I want to make a service to accept and sync that sort of metadata to clients. I'll be writing a better dupe checker as well, and add a gui that'll show two very similar images to the user and ask questions like 'which is better?' and 'should tags be merged?'. The complicated parts of that data will be made automatic for users, so images will be replaced/rotated/whatever a bit like tag siblings are now. I'll run a public service, and anyone else who disagrees with my ruleset can run their own. I hope this is a decent compromise between the current mess and the impossible pursuit of 'file purity'.
>>536 I will see about selecting multiple tags. Thank you for the suggestion.
>>540 Export dragging is unfortunately a massive pain to code. I have tried several times to figure out a way to do it, but I can't find a wx-compatible multiplatform solution for virtual file drag and drop. Dropping real writable files with nice filenames is easy, but I want to insert an interim step that'll copy and rename, which I can't figure out in wxPython.