Extracting notes from Kindle may be easier than I thought.
Further to my earlier post, I dusted off Calibre and plugged in the Kindle (two things I don't do very often).
Calibre, when connected to the Kindle, shows a text file called My Clippings. The file is very well structured, with book title, author, Note or Highlight details and the note or highlight itself.
There is no separation between books, but of course the title will have changed from one to the next. That would make it possible to break the single file into individual files, one per book.
That could be automated, but in the end it is probably quicker to do it by hand. The first time would take a while, but from then on I can just scroll to the bottom of the file, because the clippings are in chronological order.
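If I did automate the split, a minimal sketch in Python might look like the following. It assumes every clipping is delimited by a line of ten equals signs and that the first line of each clipping is the book title, as in the layout listed further down. The function name group_by_title is made up for illustration. Grouping by title rather than by position means it does not matter if clippings from different books are interleaved.

```python
from collections import defaultdict

# Assumed delimiter between clippings in the My Clippings file.
SEPARATOR = "=========="


def group_by_title(text):
    """Map each book title to the list of raw clippings that belong to it."""
    books = defaultdict(list)
    for block in text.split(SEPARATOR):
        block = block.strip()
        if not block:
            continue  # nothing between two separators
        # The first line is taken to be the title; the rest is the clipping.
        title, _, rest = block.partition("\n")
        books[title.strip()].append(rest.strip())
    return dict(books)
```

Each value in the returned dictionary could then be written out to its own file, named after the title.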
Breaking up each note and formatting it has to be automated, but that should be reasonably simple with decent text manipulation. Each clipping is laid out like this:
- line 0 is ==========
- Next line is the title, which will also be the name of the per-book file, so it can be ignored inside that file.
- Next line is - Your Highlight/Note on page xxx | location xxxx-xxxx | Added on Monday, 2 May 2016 19:22:20
  - Four pieces of information
    - Highlight or note
    - Page in book
    - Location in Kindle file
    - Date (irrelevant)
  - So it should be easy to wrap those in HTML with appropriate classes
- Next line is blank
- Next line is the note itself
All I need to do now is make a script to do all that. Ha.
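A first rough sketch of that script in Python might look like this. It assumes the layout listed above holds exactly: a ========== line, the title, a metadata line in the shape of the example, a blank line, then the note. The regular expression, the function names parse_clippings and to_html, and the HTML class names are all my own placeholders, not anything defined by Kindle or Calibre.

```python
import html
import re

# Assumed delimiter between clippings.
SEPARATOR = "=========="

# Assumed shape of the metadata line, based on the single example above.
META_RE = re.compile(
    r"- Your (?P<kind>Highlight|Note) on page (?P<page>\d+)"
    r" \| location (?P<location>[\d-]+)"
    r" \| Added on (?P<added>.+)",
    re.IGNORECASE,
)


def parse_clippings(text):
    """Split the raw file text into one dict per clipping."""
    clippings = []
    for block in text.split(SEPARATOR):
        lines = [line.strip() for line in block.strip().splitlines()]
        if len(lines) < 3:
            continue  # empty or incomplete block
        match = META_RE.match(lines[1])
        if match is None:
            continue  # metadata line is not in the shape assumed here
        # Everything after the metadata line, minus blank lines, is the note.
        body = "\n".join(line for line in lines[2:] if line)
        clippings.append({"title": lines[0], "body": body, **match.groupdict()})
    return clippings


def to_html(clip):
    """Wrap one clipping in HTML; the class names are illustrative only."""
    kind = clip["kind"].lower()
    return (
        f'<blockquote class="clipping {kind}">\n'
        f'  <p class="text">{html.escape(clip["body"])}</p>\n'
        f'  <footer><span class="page">p. {clip["page"]}</span> '
        f'<span class="location">loc. {clip["location"]}</span></footer>\n'
        f"</blockquote>"
    )
```

Clippings whose metadata line does not match the assumed pattern are silently skipped, so the pattern would need loosening if the real file turns out to be less regular than the example suggests. The date is captured but not used in the output, since it is irrelevant here.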
Nifty!
Jeremy, glad you figured it out. If you sync your books/notes to the Kindle desktop app, you can alternatively export your notes and highlights from individual texts as raw HTML with one of its interface buttons. You can then more easily cut and paste into your own site. With a bit of modified CSS magic, you can format it as you like, thus: http://boffosocko.com/2012/06/17/big-history/
Your clippings file isn't bad, but it doesn't sync across devices, so if you use more than one you have to export from each of them. The clippings are also time ordered, so if you read multiple things at once, or interspersed, it makes for more processing between titles. There's also a service called https://www.clippings.io/ which may give you some better raw data, though I think they charge a fee for some services.
See also http://indieweb.org/read for other ideas.
Chris Aldrich, Feb 10 2017 on stream.boffosocko.com