Categories
Uncategorized

Quoteblog ala Atom

Recently there was much hullabaloo about “quoteblogs,” which seem to be defined as blogs that just take a whole post from another blog and reprint it. (See Quoteblogs vs. Linkblogs and Quoteblogs follow up for background…)

So in a comment to this post by Scoble about the quoteblog, I started to think about how I’d do it, using the data found in Atom feeds… I said:

Well, in Atom feeds, you can create a summary for each post (it’s an optional element.) This summary, which is typically a brief one or two line description of the full post, would be ideal to use in Robert’s situation. Actually, in my own RSS 2.0 feed I use the description element in this way, with the full post in a content:encoded element. I think Atom has an advantage here, because you could easily use the summary if it exists, and if not, create a summary from the content element if that exists… Of course if none of those exist, you can always use the title, which is required in Atom. In RSS I don’t think it’s quite as clear what the description will contain…

(Yes, I just quoted my comment, in full, from Robert’s blog… how’s that for adding to the confusion!)

Anyway, I’m tossing this out there… A mockup of an Experimental Atom Quote Blog.

I’ve grabbed a few Atom feeds I subscribe to, and pretty much did what I outlined above. The data for each post comes from the associated Atom feed. You can pretty easily map each element of an item to its corresponding piece in the feed. Granted, this is a small sampling, as I didn’t create working code (yet); I just did some copy/paste work here…
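For the curious, here is a rough sketch of the fallback described in the quoted comment: use the summary if it exists, otherwise trim the content, otherwise fall back to the title. This is not the code behind the mockup (there isn't any yet). The function name `quote_text`, the `max_chars` cutoff, and the choice of the Atom 1.0 namespace are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Assumption: the feed uses the Atom 1.0 namespace; older drafts used a different one.
ATOM = "{http://www.w3.org/2005/Atom}"


def _text(entry, name):
    """Return the whitespace-normalized text of a child element, or "" if absent."""
    el = entry.find(ATOM + name)
    if el is None:
        return ""
    return " ".join("".join(el.itertext()).split())


def quote_text(entry, max_chars=200):
    """Pick the text to quote: summary, else trimmed content, else title."""
    summary = _text(entry, "summary")
    if summary:
        return summary
    content = _text(entry, "content")
    if content:
        if len(content) > max_chars:
            # Hypothetical trimming rule: hard cut plus an ellipsis.
            content = content[:max_chars].rstrip() + "..."
        return content
    return _text(entry, "title")
```

Escaped HTML inside a content element would come through as raw markup here, so a real version would probably want to strip tags before trimming.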

I did attempt to use blockquote and cite properly, and display the copyright if one existed… I think Atom lends itself well to such use. I honestly don’t know about RSS. I’m sure it could be done, but I didn’t feel like it was a worthwhile exercise for me. YMMV.


Kill Mork

When last you joined us you learned about Mozilla’s history file, jwz and mork. Well, jwz was not content to just sit by and do nothing, not when Bugzilla is around… So jwz is launching some fireworks:

Bugzilla Bug 241438 – please make history.dat be machine-parsable (i.e., not Mork)

Obviously this bug needs more votes… so go vote! I mean, if you care about applications having a simple way to get data out of them instead of locking them behind some insane, difficult to parse format, then go vote…


Using Mozilla Data

I want more out of Mozilla. It’s got my bookmarks, and it keeps a history of URLs I’ve visited. I want that data! Here’s what I’ve hacked together thus far.

bminer is an application that parses my Mozilla bookmarks.html file, grabbing the URL, title, and date added for each entry. It then shoves it all into MySQL. (If a URL already exists, we don’t bother to insert it again…) cron makes this run at regular intervals. So, we’ve in effect got a solid backup of the bookmarks that we can run queries against, and with a little CGI magic, access from elsewhere. Oh, and we could (in theory) run this on all the machines we have Mozilla on to get a comprehensive list of all of our bookmarks.

So much for the bookmarks. Of course the code to do this is hacky Perl code with regular expressions that parses an HTML file. Could there be a better solution? Probably, but it seems to work. I should probably clean up the code and release it.
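To give a feel for the idea, here is a small sketch in Python rather than the actual Perl. It is not bminer. It swaps the regular expressions for the standard library's HTML parser and swaps MySQL for SQLite so it runs without a server. The class name `BookmarkParser`, the function `store`, and the table layout are all made up for illustration. It assumes the usual bookmarks.html layout where each entry is an anchor with HREF and ADD_DATE attributes.

```python
import sqlite3
from html.parser import HTMLParser


class BookmarkParser(HTMLParser):
    """Collect (url, title, add_date) tuples from a bookmarks.html file."""

    def __init__(self):
        super().__init__()
        self.bookmarks = []
        self._current = None  # the bookmark being read, if we are inside <A>...</A>

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us.
        if tag == "a":
            a = dict(attrs)
            if a.get("href"):
                raw = a.get("add_date") or ""
                add_date = int(raw) if raw.isdigit() else 0
                self._current = {"url": a["href"], "title": "", "add_date": add_date}

    def handle_data(self, data):
        if self._current is not None:
            self._current["title"] += data

    def handle_endtag(self, tag):
        if tag == "a" and self._current is not None:
            c = self._current
            self.bookmarks.append((c["url"], c["title"].strip(), c["add_date"]))
            self._current = None


def store(conn, bookmarks):
    """Insert bookmarks, skipping URLs already stored. Returns the number of new rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS bookmarks "
        "(url TEXT PRIMARY KEY, title TEXT, add_date INTEGER)"
    )
    before = conn.total_changes
    # The PRIMARY KEY on url plus OR IGNORE gives the "don't insert it again" behavior.
    conn.executemany("INSERT OR IGNORE INTO bookmarks VALUES (?, ?, ?)", bookmarks)
    conn.commit()
    return conn.total_changes - before
```

Run from cron, something like this would feed the file to the parser and call `store` each time. Because duplicate URLs are ignored, running it more often than needed does no harm.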

Now, for the history file, it’s a bit more work. history.dat is in some insane format known as ‘mork db’ that McCusker came up with. Previously I was unable to find a good way to parse this beast. Luckily jwz solved this problem with mork.pl recently. So next on the list is some code to store the data from history.dat in MySQL as well, with the ultimate goal of tracking where we’ve been and when. This is a little more tricky than the bookmarks. Honestly, my bookmarks file might only change a few times a day, and we have cron set to parse it more often than needed. For the history, though, we need to determine how often to parse it, and we can’t just parse the real file: we first need to copy it, then convert the line endings from classic Mac line endings to Unix line endings. (Sigh, please, please, please! This is Mac OS X, banish all classic Mac line endings!)
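The copy-then-convert step could look something like the sketch below. The function name `snapshot_with_unix_newlines` is hypothetical, and the paths are whatever you pass in. It only prepares a copy for a parser such as mork.pl and does not attempt to parse Mork itself.

```python
import shutil
from pathlib import Path


def snapshot_with_unix_newlines(src, dst):
    """Copy src to dst, then rewrite dst with Unix line endings.

    The live file is only read (by the copy), never modified.
    """
    shutil.copyfile(src, dst)
    data = Path(dst).read_bytes()
    # Handle CRLF first so a Windows-style ending does not become two newlines,
    # then convert any remaining bare CR (classic Mac) to LF.
    data = data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")
    Path(dst).write_bytes(data)
    return dst
```

Working on bytes rather than text avoids guessing at the file's character encoding.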

Anyway, when complete we should have what we’re after, better tracking of our browser history. Need to find all pages with ‘perl’ in the URL that you visited last week Tuesday? Need to find a bookmark you added months ago with ‘foo’ in the title? We got it…
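Once the history is in a database, the "perl in the URL on a given Tuesday" question becomes a short query. The sketch below uses SQLite in place of MySQL and assumes a made-up `history` table with a `url` column and a `visited` column holding Unix timestamps. None of that schema exists yet, and the function name `visits_matching` is invented for illustration.

```python
import sqlite3
from datetime import datetime, timedelta


def visits_matching(conn, needle, day):
    """Return URLs containing needle that were visited on the given local day."""
    start = datetime(day.year, day.month, day.day)  # local midnight
    end = start + timedelta(days=1)
    rows = conn.execute(
        "SELECT url FROM history "
        "WHERE url LIKE ? AND visited >= ? AND visited < ? "
        "ORDER BY visited",
        (f"%{needle}%", start.timestamp(), end.timestamp()),
    )
    return [row[0] for row in rows]
```

Note that `%` or `_` inside `needle` would act as LIKE wildcards here. The same shape of query against the bookmarks would match on the title instead of the URL.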

Nobody ever said parsing, cleaning, storing, and retrieving data was easy…

(Sidenote: Keep an eye on MozWho from Surfmind’s MozWho Lab which looks very interesting…)


Nearest Book, Page 23, 5th Sentence

Ok, it’s being done by many… the whole Nearest Book, Page 23, 5th Sentence thing, so here are the instructions if you missed them, and mine:

  1. Grab the nearest book.
  2. Open the book to page 23.
  3. Find the fifth sentence.
  4. Post the text of the sentence on your blog along with these instructions.

The sentence in question, from The Perl Cookbook:

It removes the leading whitespace from the text of the here document.

Now this is tricky, because the nearest “book” would have probably been a PDF file on my Mac, as the keyboard and mouse are the closest things to me. Heck, even the hard drive is closer than a paper-based book. Of course by that notion, I suppose any electronic book could have been considered as close. Perhaps Lessig’s Free Culture should have been the book of choice, as it’s just a click away…

The whole exercise was also tricky because the book I chose contained a code sample, so I had to determine exactly what a “sentence” consisted of. I chose a string of words followed by a period, which when read seemed to make sense. This is somewhat fitting considering the book I chose. Of course it would have been more fitting if I had chosen the book under it, Mastering Regular Expressions, which may have actually been nearer, but I didn’t measure, so who knows.

(Hmmm, I just realized, the nearest book is actually an old QuarkXPress manual that is under my monitor, which isn’t really very accessible. Since we’re all about accessibility, we won’t even think about using that book!)

Sheesh, do I make things complicated or what?


Hey y'all!

Ya know, people from the south, I mean the South, actually type things like "y'all" – that's right, they don't just say "y'all", they actually type "y'all" in written communications.

I can't decide if this is quaint, cute, charming, or just plain goofy…