/t/ - Technology

Discussion of Technology

Archiving Discussion Anonymous 12/06/2020 (Sun) 20:54:36 No. 1940
Over the past couple of years I've been using my own technique to put together a "locally stored Google" of sorts. It's nothing complicated: I use the legacy add-on Session Exporter to export my tab history to an HTML file, and Ctrl+F through those files whenever I need to find something. I do this periodically while keeping Pale Moon set to remember tabs from the last session, and I use FEBE to keep the old profile archived while I start fresh to save memory. Lately I've been trying to use this technique to keep track of all the election-related shit from thedonald.win, and I'm finding it difficult to keep pace with this method alone. Between the sheer volume of links and media files, and the slowness of archive.is, which forces a Jewgle captcha if you try to archive too many links at once, I am left looking for a technical solution that can address my needs. So, let's have a thread about archiving in general. Share your techniques and helpful advice.
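As a rough sketch, the Ctrl+F step can also be done across every exported session file at once from a shell; this assumes the exports are kept in a folder like ~/session-exports (the folder name and search term are just examples, not from the post above):

# list every exported session file that mentions the term, case-insensitive
grep -ril --include='*.html' 'thedonald' ~/session-exports/
# or show the matching lines with a little context around each hit
grep -rin -C 2 'thedonald' ~/session-exports/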
>>1940 I use WinHTTrack to archive websites that are of personal interest to me, like wikis, online-only documentation, and (originally) resources for game development. I imagine it could help with your specific issue to some degree, since it seems to archive that kind of content just as well. I also use Tartube as a frontend for yt-dl to maintain a database of videos I like, so I can still watch them without having to go to the site or even a proxy like Invidious. As for advice, the most I can say as a novice is to archive your archives, and make backups of those; I use the term archive loosely to also mean "gather up a bunch of resources and put them in a 7z or zipped folder", to differentiate it from proper backup and restoration methods. I plan on building a couple of home servers: one to act as a makeshift seedbox, and another to keep archives and proper backups of things like personal dumps of videogames.
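For anyone who would rather drive yt-dl directly instead of through Tartube, a rough sketch of the sort of invocation that keeps a local video archive and skips anything already fetched on later runs (the output path and channel URL are placeholders):

# record every downloaded video ID in archive.txt so reruns only grab new uploads
youtube-dl --download-archive archive.txt \
  -o 'video-archive/%(uploader)s/%(title)s-%(id)s.%(ext)s' \
  'https://www.youtube.com/channel/EXAMPLE'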
(467.03 KB 900x720 338541_421949_5551.jpg)

>>1940 I have 100+ TB of media, mostly pictures and music. Also, don't use archive(.)is, it's cuckflared.
>>1964 What's an alternative that you recommend?
>>1964 There are no good alternatives to archive.is, and as far as I know they have an ongoing dispute with cuckflare over DNS configurations.
>>1940 The Session Exporter trick is neat, although I don't think I'd have a use for it. I only really have three use cases for local archiving: archiving video or media content before it's inevitably taken down, archiving a single page that has good info, or archiving an entire site or a section of it.

For the first case I use youtube-dl, although it doesn't work on every page. When it fails, I try to sniff the download links manually, which is sometimes straightforward, but most sites have migrated to DASH formats, so it's a bit more involved; in those cases I use curl to download all the pieces and a script to put them back together with ffmpeg. The disadvantage is that if it's a really long video with hundreds or thousands of pieces, curl will only download one at a time, but you can always subdivide the range and run multiple instances. Multiple videos can also be done in parallel by using multiple instances.

For the second case I simply download the page through the browser. For the third case I couldn't find anything that gave me the flexibility of wget. I tried HTTrack but it always performed very poorly for me, while I can get almost the same functionality from wget, and it lets me fuck around with regexes to tell it what to download from the site with enormous precision. The downsides are that it's single-threaded, so be prepared to wait a lot, and although it has a mirroring mode that converts links in the downloaded files for offline browsing, the devs changed the behavior so that only the files wget downloaded in a given invocation get converted, which is rather infuriating. So if you previously downloaded a 50 GB booru and wanted to update it, the sane course of action would be to download only the HTML again plus the new images, but no, wget requires you to either fix the HTML manually or download the entire 50 GB again, which is utterly retarded. There's wget2, which fixes the multithreading issue, but although it supposedly aims to be command-line compatible with wget, I couldn't get it to accept my regexes correctly even though they worked perfectly in wget.

>>1964 Nice, I wish I could have that kind of storage space, but with so many drives I'd go insane.
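For reference, a rough sketch of that kind of wget invocation; the URL and regex are made up purely for illustration:

# mirror one section of a site, rewrite links for offline browsing,
# and only follow URLs that match the regex; --wait keeps the server load low
wget --mirror --convert-links --page-requisites --no-parent \
  --wait=1 --accept-regex 'https://example\.com/wiki/.*' \
  https://example.com/wiki/

The curl-on-DASH-pieces approach can use curl's URL globbing, so each instance takes a different slice of the range:

# grab segments 1-500; a second instance could take [501-1000] in parallel
curl -o 'seg_#1.m4s' 'https://example.com/video/seg_[1-500].m4s'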
>>1971 >Nice, I wish I could have that kind of storage space, but with so many drives I'd go insane. Beyond 4 or 5 drives (capping out around 40 TB with modern 8 TB drives) you really want to build a NAS so it can just be one big volume accessible from anywhere. That makes it much more manageable.
>>1965 Clearnet or Onions?
>>1985 Both; I'd only use Clearnet, but onions would also be useful for any anons who use Tor.
(21.63 KB 475x423 0ac.jpg)

>>1971
>I have a script to put all the pieces together again with ffmpeg.
How do I do that the proper way? I've been archiving some ongoing livestreams from YouTube that I know won't have an archive (or will get privated), but the timestamps on these fragments fuck me up.
>>2126 I'm not sure exactly what you're trying to do or what it looks like, but I just make a text file listing the fragments, one per line in the form file '<name of fragment>', and then run ffmpeg -f concat -safe 0 -i files.txt -c copy output.mkv. You may want to look at how youtube-dl handles concatenation of the pieces, since it also does it somehow. I remember trying a few different things and this being the one that worked best, although I'm not convinced it's an ideal way to do it.
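If the fragments sort correctly by filename, the list can be generated rather than typed out by hand; a small sketch assuming the pieces are named frag_*.ts (the naming is just an example):

# build the concat list in glob order
for f in frag_*.ts; do printf "file '%s'\n" "$f"; done > files.txt
# stitch the pieces together without re-encoding
ffmpeg -f concat -safe 0 -i files.txt -c copy output.mkv
# if the timestamps still come out broken, regenerating them on input sometimes helps
ffmpeg -fflags +genpts -f concat -safe 0 -i files.txt -c copy output.mkv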
>>2063 What a coincidence, I was reading through that book. I want to make a tool to give to anons ASAP.
>he doesn't have his own locally stored google
>>2139
>I want to make a tool to give to anons ASAP
You shouldn't. They should learn on their own if they really care about archiving.
basc-archiver for 4chinz
wget to download websites
web.archive.org for preserving webpages
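For the web.archive.org route, pages can also be pushed into the Wayback Machine from a shell via its save endpoint; a rough sketch with a placeholder URL:

# ask the Wayback Machine to capture a page right now
curl -sL 'https://web.archive.org/save/https://example.com/some-page' > /dev/null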

