scrapem
a foray into undocumented APIs
TODO
- scrapem
  - fix downloading songs more than once (flat file db?)
  - handle ‘/’ characters correctly
  - catch more errors
- pusher
  - fix multiple playlist creation in pusher
  - catch more errors
- create a cron job
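The two open scrapem items (the flat file db and the ‘/’ handling) could be approached roughly as follows. This is only a sketch, written in Python for illustration even though scrapem.js itself is JavaScript. The file name `downloaded.txt`, the helper names `safe_filename`, `load_seen`, and `mark_seen`, and the choice to replace ‘/’ with ‘-’ are all assumptions and not part of the project.

```python
import os

SEEN_DB = "downloaded.txt"  # hypothetical flat-file "database", one track id per line


def safe_filename(title, replacement="-"):
    """Replace path separators so a title like 'AC/DC' stays a single file name."""
    for sep in ("/", "\\"):
        title = title.replace(sep, replacement)
    return title.strip()


def load_seen(path=SEEN_DB):
    """Return the set of track ids already recorded (empty if the file is missing)."""
    if not os.path.exists(path):
        return set()
    with open(path, encoding="utf-8") as fh:
        return {line.strip() for line in fh if line.strip()}


def mark_seen(track_id, path=SEEN_DB):
    """Append a track id so that later runs can skip it."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(track_id + "\n")
```

A run would call `load_seen()` once, skip any track whose id is already in the set, and call `mark_seen()` after each successful save.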
scrapem.js
A PhantomJS/CasperJS website scraper that specifically targets hypem.com’s dynamically generated content, builds the correct URLs, and downloads the binary files (.mp3s). These requirements motivate the choice of tools: PhantomJS for dynamic content, CasperJS for easier binary downloading.
Example usage:
casperjs scrapem.js --url=http://hypem.com/popular
The methods in this script are expressly forbidden in The Hype Machine’s Terms of Use: “Subscriber shall not download or store audio Content from the Site”. As such, this project should not be used in any way. It does, however, serve as an example of the dynamic web scraping capabilities that a headless WebKit browser provides.
Special thanks to BlissOfBeing’s userscript that motivated this project’s creation.
pusher.py
A simple script that uses an undocumented Google Music API to upload music to Google Music and then add it to a specific playlist. I found that the traditional uploading application sorely lacked this ability, and my library ended up full of tracks whose origin I no longer knew. Theoretically, this could be used to keep a folder-to-playlist mapping between local folders and gmusic.
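The folder-to-playlist idea could start from something like the sketch below, which only builds the mapping on the local side and makes no Google Music calls. It assumes that each immediate subfolder of a music root corresponds to one playlist named after the folder. The function name `folder_playlists` and the extension list are hypothetical.

```python
import os

AUDIO_EXTENSIONS = (".mp3",)  # assumption: only .mp3 files are of interest


def folder_playlists(root):
    """Map each immediate subfolder of `root` to a sorted list of its audio files."""
    mapping = {}
    for name in sorted(os.listdir(root)):
        folder = os.path.join(root, name)
        if not os.path.isdir(folder):
            continue  # skip loose files sitting directly in the root
        tracks = sorted(
            os.path.join(folder, f)
            for f in os.listdir(folder)
            if f.lower().endswith(AUDIO_EXTENSIONS)
        )
        if tracks:  # folders without audio files get no playlist
            mapping[name] = tracks
    return mapping
```

Each key would be a playlist name and each value the list of files to upload and add to it. The upload step itself is left out here.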