
Calvin and Hobbes JSON

Sources

Comic text sourced from Calvin and Hobbes quotations by S Anand

See also:

Recreation:

Collect quotes

curl 'https://www.s-anand.net/comic.calvin.jsz' --compressed --output quotes.tsv

Collect images

cut -f1 quotes.tsv | grep -v id | xargs -I {} date --date={} +'%Y-%m-%d' | xargs -P10 -I {} ./dl.py {} | tee aria2c.list

Panel detection

kumiko -i comic.jpg --min-panel-size-ratio=.15
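A minimal sketch of how the detected panels might be turned into crop rectangles. It assumes Kumiko prints a JSON list with one object per image, each holding a `filename` and a `panels` list of `[x, y, width, height]` boxes; check that against the output of your Kumiko version. The function name `panel_crop_boxes` is hypothetical.

```python
import json


def panel_crop_boxes(kumiko_json: str) -> dict:
    """Map each filename to a list of (left, top, right, bottom) boxes.

    Assumption: the input is a JSON list of objects with a "filename"
    key and a "panels" key holding [x, y, width, height] entries.
    """
    boxes = {}
    for page in json.loads(kumiko_json):
        boxes[page["filename"]] = [
            # Convert origin + size into the two opposite corners.
            (x, y, x + w, y + h)
            for x, y, w, h in page["panels"]
        ]
    return boxes
```

The `(left, top, right, bottom)` layout is the box format that Pillow's `Image.crop` accepts, so each tuple could be passed to it to save one image per panel.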

Ideas

  • Image boundary detection to split the comics into separate panels
  • Character identification to enumerate characters per panel
  • OCR to split text into per-panel text
  • Transform data into fine-tuning style JSON
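The last idea could be sketched as a small TSV to JSON Lines converter. This is only an illustration under stated assumptions: the column names `id` and `text`, the `prompt`/`completion` record shape, and the function name `tsv_to_jsonl` are all hypothetical and should be adapted to the real header of quotes.tsv and to the format the fine-tuning tool expects.

```python
import csv
import io
import json


def tsv_to_jsonl(tsv_text: str, id_col: str = "id", text_col: str = "text") -> str:
    """Convert tab-separated rows into one JSON object per line.

    Assumption: the first row is a header containing id_col and text_col.
    """
    # QUOTE_NONE keeps quotation marks in the comic text untouched.
    reader = csv.DictReader(
        io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE
    )
    lines = []
    for row in reader:
        record = {
            "prompt": f"Calvin and Hobbes strip {row[id_col]}",
            "completion": row[text_col],
        }
        lines.append(json.dumps(record, ensure_ascii=False))
    return "\n".join(lines)
```

Once the panel splitting and OCR ideas are in place, the same record shape could carry per-panel text instead of whole-strip text.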