posts tagged with the keyword ‘script’

2010.08.16

If you’re interested in exporting (or backing up) your Google Reader subscription list, you can log into Google Reader, go to Manage Subscriptions, then Import/Export, and export your subscriptions as an OPML file (which is basically an XML file).

[Screenshot: Google Reader - Export Subscription List OPML]

If you want to automate this process, there are a few steps involved… I used curl, which is easy, but other tools would work as well.

The first thing you need to do is get an Auth code:


curl -d accountType=GOOGLE -d Email=[USERNAME]@gmail.com -d Passwd=[PASSWORD] -d service=reader https://www.google.com/accounts/ClientLogin

Substitute your own Google username for [USERNAME] and your own Google password for [PASSWORD].

Once you do this, you’ll get three lines returned that look something like this:


SID=HFDY49j4ljlkfgdg4tfh03fdkjgldkhfl945840598djglkjh40hi5h... 
LSID=HFDY49j9ljlkfh03fhgfh565dkjgldkhfl945840598djglkjhhi5h... 
Auth=HFDY49j7ljlkfh03fdkjgldkhfl945840598djglkjhjgh6640hi5h... 

Note: I’ve shortened these (and made them up), but it’s basically three keys, SID, LSID, and Auth, and their associated values. You’ll need the value for the one labeled Auth.
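If you’re scripting this step, here’s one way you could grab just the Auth value into a shell variable (a quick sketch of my own using grep and cut, not something from the Google docs):


# hypothetical sketch: log in and keep only the Auth value
AUTH=$(curl -s -d accountType=GOOGLE -d Email=[USERNAME]@gmail.com -d Passwd=[PASSWORD] -d service=reader https://www.google.com/accounts/ClientLogin | grep '^Auth=' | cut -d= -f2-)
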

Now, use curl to request the following:


curl -H "Authorization:GoogleLogin auth=HFDY49j7ljlkfh03fdkjgldkhfl945840598djglkjhjgh6640hi5h..."  http://www.google.com/reader/subscriptions/export 

Again, I’ve shortened the Auth code (it’s really long!). You’re basically passing the authorization in the header of the request. It should go without saying that the SID, LSID, and Auth values should be kept private. (Which is why I just made up a random string in the example above.)

OK, if it all worked, curl returned your subscription list as OPML. Hooray! Also, you just handled Google’s ClientLogin authentication by hand, so Double Hooray!

And here’s our shell script, which will download/backup your subscription list as an OPML file. (It’s similar to our MySQL backup shell script.)


#!/bin/bash

# Date stamp for the filename, e.g. 20100816
DT=$(date +"%Y%m%d")

# Quietly fetch the OPML export and save it with today's date in the name
curl -s -o "/home/backups/SubscriptionList-$DT.opml" -H "Authorization: GoogleLogin auth=HFDY49j7ljlkfh03fdkjgldkhfl945840598djglkjhjgh6640hi5h..." http://www.google.com/reader/subscriptions/export

Each time you run it, it grabs the current date (year, month, and day) and uses it in the filename, so %Y%m%d produces something like 20100816. This should work fine if you run just one backup per day. (And of course you can store it somewhere besides /home/backups/ if you like. cron is your friend here.)
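For example, a crontab entry like this would run it every morning (assuming you saved the script as /home/backups/reader-backup.sh and made it executable; that path and the time are just placeholders):


# run the Reader backup daily at 3:15 AM (path and schedule are hypothetical)
15 3 * * * /home/backups/reader-backup.sh
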

I know most people believe that Google will never lose their data, or that if the day comes when they want to export this data they’ll just go to the site and export it. But this lets you prepare for the day you can’t get to the site and export your data… or the day Google loses it, or deletes it, or whatever.

By the way… I found most of this information in the Google Reader API wiki. It’s nice that Google is providing an API for things; I just wish some of the info was easier to find… as of this post, that’s the only damn page in that entire wiki!

This is all part of my renewed interest in putting my own data into my own hands, and I may be bugging Jason (@plural) a bit more in the future. ;)

Update: Jason reminded us of dataliberation.org, which I’ll discuss in another post. :)

And just for fun, this gem from 2007: Data Loss At Google Reader.

2010.08.08

Let’s say you’ve got a file named “file.xml” and want it pretty printed, all indented nice and everything…

For just such an occasion I have a Perl script named “pretty.pl” and I run my XML file through it like so: cat file.xml | perl pretty.pl

Here’s the code I use:

#!/usr/bin/perl
use strict;
use warnings;

use XML::Twig;

# Build a twig that indents its output
my $xml = XML::Twig->new(pretty_print => 'indented');

# Read XML from STDIN, then print the indented version
$xml->parse(\*STDIN);
$xml->print();

You can even pass it through right as it comes in over the wire: curl http://example.com/data/file.xml | perl pretty.pl
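Since the script has a shebang line, you can also make it executable once and skip typing perl each time:


chmod +x pretty.pl
curl -s http://example.com/data/file.xml | ./pretty.pl
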

Here’s an example of data from Foursquare without pretty printing (I used curl to grab the data, and I added some line breaks just to make it a little more readable):

<?xml version="1.0" encoding="UTF-8"?>
<checkins><checkin><id>123847273</id>
<created>Mon, 09 Aug 10 00:50:33 +0000</created>
<timezone>America/Chicago</timezone><venue><id>2357761</id>
<name>The Kiltie</name><primarycategory><id>79067</id>
<fullpathname>Food:Ice Cream</fullpathname><nodename>Ice Cream</nodename>
<iconurl>http://foursquare.com/img/categories/food/icecream.png</iconurl>
</primarycategory><address></address><city></city><state></state>
<geolat>43.107391</geolat><geolong>-88.464475</geolong></venue>
<display>Pete P. @ The Kiltie</display></checkin></checkins>

And here’s the same data, again using curl to grab it, and then passing it through the pretty.pl script:

<?xml version="1.0" encoding="UTF-8"?>
<checkins>
  <checkin>
    <id>123847273</id>
    <created>Mon, 09 Aug 10 00:50:33 +0000</created>
    <timezone>America/Chicago</timezone>
    <venue>
      <id>2357761</id>
      <name>The Kiltie</name>
      <primarycategory>
        <id>79067</id>
        <fullpathname>Food:Ice Cream</fullpathname>
        <nodename>Ice Cream</nodename>
        <iconurl>http://foursquare.com/img/categories/food/icecream.png</iconurl>
      </primarycategory>
      <address></address>
      <city></city>
      <state></state>
      <geolat>43.107391</geolat>
      <geolong>-88.464475</geolong>
    </venue>
    <display>Pete P. @ The Kiltie</display>
  </checkin>
</checkins>

I still find Perl extremely useful for this sort of task… I’m sure there are other command line ways to do this, but this one works for me.
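For instance, if you have libxml2 installed, xmllint will do the same job reading from STDIN:


curl -s http://example.com/data/file.xml | xmllint --format -
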

(Hat tip to A Curious Programmer, where I picked up this Perl code…)

2007.04.28

I needed to find all files on a Linux box that had been modified in the last 24 hours. Years ago I had a custom Perl script for such things, which I used for various software development projects, but I found this awesome command:

find / -type f -mtime -1 -exec ls -al {} \;

You can change the ‘/’ if you want to look somewhere specific, like your home directory or /etc. Of course you can pipe it to grep to grab just certain matches. The -1 specifies one day. I always love finding simple commands that are so powerful.
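For example, here’s the grep idea in practice (the /etc path and the .conf pattern are just for illustration):


# files under /etc modified in the last day, filtered to .conf files
find /etc -type f -mtime -1 -exec ls -al {} \; | grep '\.conf'
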

Oh, I found that bit at DZone Snippets; it’s titled Search for files modified the last … days.
