blog

Watch website changes with RSS

I use the Go program below to get notifications in my RSS reader when websites that don’t offer RSS feeds themselves change. For each website, you add a new command in main(): choose a shortname, enter the URL, and enter an HTML node selector for the part you are interested in (which also excludes surrounding content that might be generated dynamically on each visit). You then have your RSS reader call the program with “go run webwatcher SHORTNAME”.

package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	. "git.fireandbrimst.one/goutil.git/html"
	"git.fireandbrimst.one/goutil.git/misc"
	xnetHtml "golang.org/x/net/html"
	"io/ioutil"
	"os"
	"path"
	"text/template"
	"time"
)

const (
	DL_LIMIT     = 15 * 1024 * 1024
	CACHE_FOLDER = "cache"
)

const RSS_TEMPLATE string = `<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:content="http://purl.org/rss/1.0/modules/content/">
<channel>
<title><![CDATA[ {{.Shortname}} ]]></title>
<link><![CDATA[ {{.URL}} ]]></link>
<description><![CDATA[ {{.Shortname}} ]]></description>

    <item>
      <title><![CDATA[ {{.URL}} ]]></title>
      <content:encoded><![CDATA[ {{.LastContent}} ]]></content:encoded>
      <guid><![CDATA[ {{.URL}}/{{.LastModified.Format "20060102-150405"}} ]]></guid>
      <link><![CDATA[ {{.URL}} ]]></link>
      <pubDate>{{.LastModified.Format "Mon, 02 Jan 2006 15:04:05 -0700"}}</pubDate>
    </item>

</channel>
</rss>
`

func optPanic(err error) {
	if err != nil {
		panic(err)
	}
}

type command struct {
	shortname string
	URL       string
	selector  func(n *HtmlNode) bool
}

func (c *command) filename() string {
	return path.Join(CACHE_FOLDER, c.shortname)
}

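// getContent downloads the page and returns the rendered HTML of the node
// matched by the selector.
func (c *command) getContent() string {
	b, err := misc.DownloadAll(c.URL, DL_LIMIT)
	optPanic(err)
	tmpdoc, err := xnetHtml.Parse(bytes.NewReader(b))
	optPanic(err)
	doc := (*HtmlNode)(tmpdoc)
	n := doc.Find(c.selector)
	var buf bytes.Buffer
	// Fail loudly if the node cannot be rendered.
	optPanic(xnetHtml.Render(&buf, (*xnetHtml.Node)(n)))
	return buf.String()
}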

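// unmarshalHPObject loads the cached state. A missing or corrupt cache file
// yields an empty HP.
func unmarshalHPObject(filename string) HP {
	var hpObject HP
	// Named data rather than bytes to avoid shadowing the bytes package.
	data, err := ioutil.ReadFile(filename)
	if err != nil {
		return hpObject
	}
	if err := json.Unmarshal(data, &hpObject); err != nil {
		return HP{}
	}
	return hpObject
}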

func marshalHPObject(filename string, hp HP) {
	data, err := json.MarshalIndent(hp, "", "  ")
	optPanic(err)
	err = ioutil.WriteFile(filename, data, 0644)
	optPanic(err)
}

func (c *command) genRSS() {
	content := c.getContent()
	hpObject := unmarshalHPObject(c.filename())
	hpObject.Shortname = c.shortname
	hpObject.URL = c.URL
	if content != hpObject.LastContent {
		hpObject.LastContent = content
		hpObject.LastModified = time.Now()
	}
	err := rssTemplate.Execute(os.Stdout, hpObject)
	optPanic(err)
	marshalHPObject(c.filename(), hpObject)
}

type HP struct {
	Shortname    string
	URL          string
	LastContent  string
	LastModified time.Time
}

var rssTemplate *template.Template

func main() {
	rssTemplate = template.Must(template.New("rss").Parse(RSS_TEMPLATE))
	os.Mkdir(CACHE_FOLDER, 0755)

	commands := []command{
		command{"stilldrinking", "https://www.stilldrinking.org/", IsTag("div").And(HasID("cont"))},
	}

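	// Without this check, os.Args[1] panics when no shortname is given.
	if len(os.Args) < 2 {
		fmt.Fprintln(os.Stderr, "usage: webwatcher SHORTNAME")
		os.Exit(1)
	}

	for _, cmd := range commands {
		if cmd.shortname == os.Args[1] {
			cmd.genRSS()
			os.Exit(0)
		}
	}
	fmt.Fprintln(os.Stderr, "unknown command", os.Args[1])
	os.Exit(1)
}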

Inline style block with Content Security Policy

To my surprise, it is possible to serve inline style blocks with Content Security Policy enabled:
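
One mechanism that allows this is a hash source in the style-src directive: the browser accepts an inline style block if the SHA-256 of its exact content is listed in the policy. Whether that is the mechanism used on this site is an assumption on my part; a nonce works similarly. A minimal Go sketch, where css, styleHashSource and handler are names made up for the example:

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
	"net/http"
)

// css is the exact text between <style> and </style>. The hash must cover
// every byte of it, including whitespace.
const css = "body { font-family: sans-serif; }"

// styleHashSource returns a CSP source expression of the form
// 'sha256-<base64 digest>' for the given inline style content.
func styleHashSource(content string) string {
	sum := sha256.Sum256([]byte(content))
	return "'sha256-" + base64.StdEncoding.EncodeToString(sum[:]) + "'"
}

// handler serves a page whose only permitted style is the hashed inline block.
func handler(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Security-Policy",
		"default-src 'none'; style-src "+styleHashSource(css))
	w.Header().Set("Content-Type", "text/html; charset=utf-8")
	fmt.Fprintf(w, "<!DOCTYPE html><html><head><style>%s</style></head><body>hello</body></html>", css)
}

func main() {
	// Print the source expression that would go into the policy.
	fmt.Println(styleHashSource(css))

	// To try it in a browser, uncomment the following lines:
	// http.HandleFunc("/", handler)
	// http.ListenAndServe("localhost:8080", nil)
}
```

Any edit to the style block changes the hash, so for a static site the policy has to be regenerated whenever the stylesheet changes.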

Snippet latexpic

I uploaded a new snippet called ‘latexpic’ to my git server. It can be used to quickly generate PNGs from partial LaTeX source. It requires LaTeX, ImageMagick and Bash.

Example:

> latexpic "white on black"  '$ f(x) = e^x $ \\ $ f(\bar x) = \bar x^2 $'

… generated this:

[Image: latexpic result]

TDL rewrite

I decided to rewrite TDL, my language for weightlifting plans. The program generates concrete training days from an abstract schedule description. You can find the source on my git server and releases at releases.fireandbrimst.one.

The output looks something like this (see the input and output files):

$ ./TDL schedule.tdl 
schedule20201104-223938.txt

$ cat schedule20201104-223938.txt 
Maxes
=====

Squat: 175.00
HexDeadlift: 200.00
Bench: 110.00
Press: 80.00

Day 1
======

Press
--------

| Set            | Plates           | Resulting |
| -------------- | ---------------- | --------- |
| 32.00 x 5      | 5, 0.5           | 31.00     |
| 40.00 x 5      | 10               | 40.00     |
| 48.00 x 3      | 10, 2.5, 1.25    | 47.50     |
| 48.00 x 7      | 10, 2.5, 1.25    | 47.50     |
| 52.80 x 4      | 10, 5, 1.25      | 52.50     |
| 57.60 x 10     | 10, 5, 2.5, 1.25 | 57.50     |
| 40.00 x 10 x 5 | 10               | 40.00     |

5 sets of 10 chin-ups or 10 rows

[...]
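
The Plates and Resulting columns can be reproduced with a small greedy calculation. The sketch below is not TDL’s source. It assumes a 20 kg bar and one pair each of 10, 5, 2.5, 1.25 and 0.5 kg plates, which is what the Press table above is consistent with; plates are listed per side, and the names bar, pairs and load are made up for the example.

```go
package main

import "fmt"

// All weights are in hundredths of a kilogram to avoid floating-point drift.
const bar = 2000 // assumed 20 kg barbell

// pairs is the assumed plate inventory, heaviest first, one pair of each.
var pairs = []int{1000, 500, 250, 125, 50}

// load greedily picks plates for one side of the bar and returns them
// together with the total weight that actually ends up on the bar.
func load(target int) (plates []int, resulting int) {
	perSide := (target - bar) / 2
	resulting = bar
	for _, p := range pairs {
		if p <= perSide {
			plates = append(plates, p)
			perSide -= p
			resulting += 2 * p
		}
	}
	return plates, resulting
}

func main() {
	// Targets taken from the Set column of the Press table.
	for _, target := range []int{3200, 4000, 4800, 5280, 5760} {
		plates, resulting := load(target)
		fmt.Printf("%.2f -> %v -> %.2f\n",
			float64(target)/100, plates, float64(resulting)/100)
	}
}
```

Under these assumptions the resulting weights come out as 31.00, 40.00, 47.50, 52.50 and 57.50, matching the table. The 32.00 row lands on 31.00 because only one pair of 0.5 kg plates is assumed to be available.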

RSS all the things

I don’t really know why I ever stopped using a feed reader, but I recently went back to one to cut down my online time. In that vein, my static site generator wombat now generates RSS feeds for blogs.

Sadly, Twitter no longer provides RSS feeds itself, but I made a template for drivel, my Twitter CLI, to get around that. It’s useful with the home, mentions and timeline commands.

Lastly, I put some bash glue around youtube-dl, jq and GNU date to generate RSS for YouTube channels or playlists. irq0 correctly pointed out that YouTube provides RSS, but only for channels and only with Flash. Personally, I like the embedded iframe better. youtube-dl does not provide the upload date in playlists, so a faux one is created to order entries by playlist position.
