Use Google cache
Google's cache often has a full copy of an article.
Check first that a cached copy exists for a given URL, since not every URL is cached.
I had originally used curl for this.
This is not an exact science. Sometimes the cache is needed and sometimes it is not, but the more information we bring in (such as whether a cached copy exists), the more informed the decision.
Credit to alphapapa (GitHub) for this one.
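(defun url-found-p (url)
  "Return non-nil if URL is found, i.e. HTTP 200."
  ;; `url-retrieve-synchronously' can return nil instead of a buffer,
  ;; so guard before switching to it.
  (let ((buffer (url-retrieve-synchronously url nil t 5)))
    (when buffer
      (with-current-buffer buffer
        ;; Read the status, then discard the response buffer.
        (prog1 (eq url-http-response-status 200)
          (kill-buffer))))))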
Try them out
(url-found-p "http://webcache.googleusercontent.com/search?q=cache:https://medium.com/riselab/functional-rl-with-keras-and-tensorflow-eager-7973f81d6345")
(url-found-p "http://webcache.googleusercontent.com/search?q=cache:https://news.ycombinator.com/")
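Both cache URLs above are just the original URL with a fixed prefix in front. A minimal sketch of a helper that builds them follows; the name my-google-cache-url is hypothetical and not part of the original setup.

```emacs-lisp
;; Hypothetical helper: build the Google cache URL for URL by
;; prepending the same prefix used in the calls above.
(defun my-google-cache-url (url)
  "Return the Google cache URL for URL."
  (concat "http://webcache.googleusercontent.com/search?q=cache:" url))
```

With this in place, a check could be written as (url-found-p (my-google-cache-url "https://news.ycombinator.com/")).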
Add some advice to eww--dwim-expand-url, the eww function that expands URLs just before they are loaded
;; This makes it so for certain URLs, the Google cache is loaded instead
(defun eww--dwim-expand-url-around-advice (proc &rest args)
  (let* ((url (car args))
         (cached-url (concat "http://webcache.googleusercontent.com/search?q=cache:" url)))
    ;; Only swap in the cache for the listed sites, never for a URL that
    ;; already points at the cache, and only when a cached copy exists.
    (when (and (or (string-match-p "towardsdatascience" url)
                   (string-match-p "medium\\.com" url))
               (not (string-match-p "webcache\\.google" url))
               (url-found-p cached-url))
      (setq url cached-url))
    (funcall proc url)))

(advice-add 'eww--dwim-expand-url :around #'eww--dwim-expand-url-around-advice)
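The substring checks in the advice also match any URL that merely mentions one of those sites somewhere in its path or query. A stricter variant, sketched here on the assumption that the input already carries a scheme (url-generic-parse-url may not find a host otherwise), compares the parsed host against a list. The names my-eww-cache-hosts and my-eww-cache-candidate-p are hypothetical, and the host list is illustrative rather than taken from the original configuration.

```emacs-lisp
(require 'seq)
(require 'url-parse)

;; Hypothetical, illustrative list; adjust to taste.
(defvar my-eww-cache-hosts '("medium.com" "towardsdatascience.com")
  "Hosts whose pages should be loaded from the cache when possible.")

(defun my-eww-cache-candidate-p (url)
  "Return non-nil if URL's host is, or is a subdomain of, a listed host."
  (let ((host (url-host (url-generic-parse-url url))))
    (and host
         (seq-some (lambda (listed)
                     (or (string= host listed)
                         ;; "blog.medium.com" ends with ".medium.com"
                         (string-suffix-p (concat "." listed) host)))
                   my-eww-cache-hosts))))
```

Because the comparison is on the host alone, a URL that already points at webcache.googleusercontent.com is not a candidate, so no separate exclusion is needed. The advice itself can be undone at any time with (advice-remove 'eww--dwim-expand-url #'eww--dwim-expand-url-around-advice).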