Draco
Draco is a script that converts a reddit thread to an Org document. It accepts a URL & prints the Org document to STDOUT, including comments along with their replies.
Project Home | Draco |
Source Code | Andinus / Draco |
GitHub (Mirror) | Draco - GitHub |
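Since the Org document goes to STDOUT, saving a thread is a matter of redirecting the output. The following is a minimal sketch, not part of Draco: archive_thread is a hypothetical helper, and it assumes draco is installed, takes the thread URL as its only argument & exits non-zero on failure. It writes to a temporary file first so that a failed fetch doesn't leave a half-written Org file behind.

```shell
# Hypothetical helper, not part of Draco.
# Assumes `draco <url>` prints the Org document to STDOUT
# and exits non-zero when something goes wrong.
archive_thread() {
    at_url=$1
    at_out=$2
    at_tmp="$at_out.tmp"

    # Only replace the target file when draco exits successfully.
    if draco "$at_url" > "$at_tmp"; then
        mv "$at_tmp" "$at_out"
    else
        rm -f "$at_tmp"
        return 1
    fi
}

# Usage: archive_thread "<thread url>" thread.org
```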
Why?
I reference things from the web in my Journal & don't want those links to break, so I save them locally. I used to archive whole threads manually; this automates it.
Demo
This was recorded with asciinema(1).
- Draco v0.3.2: https://asciinema.org/a/375432
- Draco v0.1.2: https://asciinema.org/a/373860
- Draco 2020-11-19: https://asciinema.org/a/373851
- alt-link (download): https://andinus.nand.sh/static/draco/
Installation
Follow these instructions to get draco & then install the dependencies listed below. All dependencies are in the Debian & Fedora repositories.
Check the News section before updating or downloading the latest release.
Release
Release archives are generated by cgit/GitHub.
- Download the release:
- Extract the file.
- cd into the directory.
- Run make install as root.
- Install dependencies.
From Source
All commits will be signed by my PGP Key.
# Clone the project.
git clone https://git.tilde.institute/andinus/draco
cd draco

# Install draco. Use `sudo' if `doas' is not present.
doas make install

# Install dependencies. See the section below.
Dependencies
OpenBSD
doas pkg_add p5-Unicode-LineBreak p5-JSON-MaybeXS p5-IO-Socket-SSL
cpan install HTTP::Tiny
Debian (apt)
sudo apt install libunicode-linebreak-perl libjson-maybexs-perl \
libhttp-tiny-perl libio-socket-ssl-perl
Fedora (dnf)
sudo dnf install perl-JSON-MaybeXS perl-HTTP-Tiny perl-Unicode-LineBreak \
perl-IO-Socket-SSL
News
v0.3.3 - 2022-08-09
- Add IO::Socket::SSL dependency. Required for HTTPS support.
- Print response 'content' on errors. From https://metacpan.org/pod/HTTP::Tiny:
  Errors during request execution will result in a pseudo-HTTP status code of 599 and a reason of "Internal Exception". The content field in the response will contain the text of the error.
v0.3.2 - 2020-11-26
- Add author_flair_text to the properties section of each comment.
- Print all dots on a single line. The dots were added in v0.3.0, but each one was printed on a new line, which was annoying for huge posts.
v0.3.1 - 2020-11-25
Minor improvement.
- Put author name in code block if it begins & ends with "_". Org underlines headings that begin & end with "_".
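The rule can be sketched in shell. This is a hypothetical helper, not Draco's code, and it assumes Org's ~...~ inline code markup; the marker Draco actually emits isn't stated here.

```shell
# Hypothetical helper, not Draco's code.
# Wrap an author name in Org inline code markers (assumed: ~...~)
# when it begins & ends with "_", so Org doesn't underline it.
format_author() {
    case "$1" in
        _*_) printf '~%s~\n' "$1" ;;
        *)   printf '%s\n' "$1" ;;
    esac
}
```

Names that only begin or only end with "_" pass through unchanged, since Org needs both markers to underline.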
v0.3.0 - 2020-11-24
This version adds code to fetch all the comments in a thread. Now users can archive the whole thread.
Everyone should get this update. The code has become a lot more complex since v0.1.3, so if you don't want to update, you can get the patches for small changes from the git history.
- Fetch all the comments.
- Add debug message for HTTP calls. It'll print a "." for every HTTP call. Users will be able to tell when the script is making HTTP calls.
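A rough sketch of the progress-dot idea, not Draco's code: print one "." per HTTP call without a newline, then end the line once all calls are done. It assumes the dots go to STDERR so they don't mix with the Org document on STDOUT; fetch_one is a hypothetical stand-in for the real HTTP call.

```shell
# Hypothetical stand-in for a real HTTP call; does nothing here.
fetch_one() {
    :
}

# Print a "." to STDERR for every call, all on one line.
fetch_all() {
    for fa_url in "$@"; do
        printf '.' >&2
        fetch_one "$fa_url"
    done
    # Finish the line of dots once every call has been made.
    printf '\n' >&2
}
```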
v0.2.2 - 2020-11-24
This version is mostly structural changes. It'll now be easier to add code to fetch comments hidden behind "continue this thread".
- Add more debug information.
v0.2.1 - 2020-11-24
- Previously, fetching comments hidden under "load more comments" would fail if the URL passed by the user ended in "/". This has been fixed in this release.
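One way to guard against this class of bug is to normalise the URL before appending anything to it. The sketch below uses a hypothetical helper; how Draco handles it internally isn't shown here.

```shell
# Hypothetical helper, not Draco's code.
strip_trailing_slash() {
    sts_url=$1
    # Remove every trailing "/" so later suffixes join cleanly.
    while [ "${sts_url%/}" != "$sts_url" ]; do
        sts_url=${sts_url%/}
    done
    printf '%s\n' "$sts_url"
}
```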
v0.2.0 - 2020-11-23
This version makes the script a lot more complex. If you download only small threads then this update is not required.
The previous version (v0.1.3) might throw some errors on threads that have comments hidden behind "load more comments", but the rest of the thread will be saved.
This version will load all the comments hidden behind "load more comments", but not those hidden behind "continue this thread". This is a known bug.
- Add "[S]" after submitter's comments.
- Print comments hidden under "load more comments".
- Document environment variables in manual.
- Add "limit=500" & "sort=top" to all posts/comments.
- Print more information when debug is on.
- Add help option.
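The "limit=500" & "sort=top" item above amounts to adding query parameters to each request URL. A sketch, assuming reddit's ".json" suffix is used to fetch a thread as JSON; build_json_url is a hypothetical name & the exact URL Draco builds may differ.

```shell
# Hypothetical helper, not Draco's code.
# Assumes the thread is fetched through reddit's ".json" suffix.
build_json_url() {
    # Drop a single trailing "/" before adding the suffix & query.
    printf '%s.json?limit=500&sort=top\n' "${1%/}"
}
```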