Categories
Uncategorized

More (dirty) XML Secrets

As we all know, XML can be easily parsed with an XML parser. Right? Right… So what happens when XML is not really XML? Well, as we all know, when XML is not XML you resort to text and regular expressions. It’s one of the dirty secrets of XML. And hey, I’m not the only one who uses regex to parse XML. There’s also the speed/memory issue, but right now I’m just concerned with the not-really-XML part of it.

The Universal Feed Parser tries to use XML, and if that fails, does the regex dance.

If XML parsing fails due to well-formedness errors in the feed… …it will automatically fall back to the 2.x-style parser based on regular expressions.
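The “try XML first, then do the regex dance” idea can be sketched in a few lines. This is only an illustration in Python using the standard library, not the Universal Feed Parser’s actual code; the function name `extract_titles` and the title-only regex are made up for the example, and it assumes a plain RSS feed without namespaced elements.

```python
import re
import xml.etree.ElementTree as ET

# Loose pattern for the fallback path: grabs whatever sits between title tags.
TITLE_RE = re.compile(r"<title[^>]*>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def extract_titles(feed_text):
    """Return (titles, method), where method is 'xml' or 'regex'."""
    try:
        root = ET.fromstring(feed_text)
    except ET.ParseError:
        # The feed is not well-formed, so fall back to a regex scan of the raw text.
        return [t.strip() for t in TITLE_RE.findall(feed_text)], "regex"
    # Well-formed: walk the tree and collect every <title> element's text.
    return [(el.text or "").strip() for el in root.iter("title")], "xml"
```

The second value in the result tells the caller which path was taken, which is handy if you want to log how many feeds out there needed the fallback.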

If you’ve processed a form of XML commonly known as RSS, you might have run into these issues before, because there are feeds that are not well-formed, and therefore invalid, and if you want to be picky, they aren’t really XML… Perl needs a module that does the “try it as XML, and fall back on regex if it ain’t” thing. Why? Because once again I figured I could just use something like XML::DOM to deal with an RSS file, which is supposed to be XML, but when you’ve got an & instead of an &amp; in there, it all blows up. (Hmmm, perhaps we should go the other way around, create a pre-filter that takes in XML, fixes all the errors making it valid XML, and then passes it on to the XML parser! Could this be done?)
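For the bare ampersand case at least, a pre-filter like that is doable. Here is a rough sketch in Python (not Perl, and not an existing module); the names `escape_bare_ampersands` and `parse_leniently` are hypothetical. It only escapes ampersands that do not already start an entity reference, so it won’t rescue feeds with other problems such as unclosed tags or undefined entities like `&nbsp;`, and it will also rewrite ampersands inside CDATA sections, where they were legal to begin with.

```python
import re
import xml.etree.ElementTree as ET

# An ampersand that does NOT begin a named, decimal, or hex entity reference.
BARE_AMP_RE = re.compile(r"&(?![A-Za-z][A-Za-z0-9]*;|#[0-9]+;|#x[0-9A-Fa-f]+;)")


def escape_bare_ampersands(text):
    """Turn stray '&' into '&amp;' while leaving existing entities alone."""
    return BARE_AMP_RE.sub("&amp;", text)


def parse_leniently(feed_text):
    """Parse as XML; if that fails, repair bare ampersands and try once more."""
    try:
        return ET.fromstring(feed_text)
    except ET.ParseError:
        # Second attempt on the filtered text; this still raises if the
        # feed is broken in some other way.
        return ET.fromstring(escape_bare_ampersands(feed_text))
```

Trying the strict parse first means well-formed feeds are never touched by the filter; only the broken ones get the cleanup pass.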

I guess I’ll blame the developers creating the software that creates the invalid XML/RSS. Want more secrets? I’m probably one of them. Most of the code that creates my RSS feeds and Atom feed is a bunch of Perl with home-brewed templates, and regular expressions… Why? Why don’t I use the proper tools? Laziness, lack of… whatever, it doesn’t matter. People are going to do it this way, and even though you would think RSS is simple and you could create valid markup, we don’t always do that. Sure, I’ve implemented feed checking into my system, as I don’t want to be a wonk that outputs garbage, but I still have to deal with the garbage out there, and damn is it frustrating.

To rephrase “Be liberal in what you accept, and conservative in what you send” I’d say: “Garbage in” is bad but “garbage out” is worse…

Is there hope? Well, there’s always hope, right? Will Atom save the day, doing what RSS can’t always do? It would be nice, but I’m just not sure… Should we rely on software that requires well-formed XML, and can fall back on plain old regular expressions if needed? I don’t know… I tend to think that’s a hack we shouldn’t need, but only time will tell…

Atom, RSS, Google, Choices, Etc…

See the following bit, Google spurns RSS for rising blog format, where there’s a quote from Dave:

“A good way to provide feedback to the Google people is to switch away from them,” Winer wrote on his site, citing a blogger who had suggested RSS supporters bolt from Blogger. “Let them make the connection that the day they started playing unfair, is the day the users started moving away.”

Meanwhile, Dave had this bit about one of my comments:

In a comment on the Cadenhead site, a guy named Pete says: “Just a reminder, you don’t have to use Google.” Perfect. A good way to provide feedback to the Google people is to switch away from them. Let them make the connection that the day they started playing unfair is the day the users started moving away. Companies always respond to this kind of input. It’s where users have the most power.

Just to put things in context (since I am that guy named Pete) I wasn’t specifically recommending that people move away from using Google because of this RSS/Atom controversy. I was recommending that people move away from Google when they become uncomfortable with any of their practices. Don’t sit on your backside and complain about Google, do something about it. There are alternatives, use another search engine. Do the others not have the features you like? Suggest them! Though you might be surprised by what some of the others can do. When’s the last time you used a search engine that wasn’t Google? If you have to really think about that question, it might be time to switch. Remember, users are customers, and the customers are a big part of what made Google so successful. The whole “don’t be evil” thing is a good guideline, but I often think once a company becomes popular and grows to a certain size, evilness will creep in. It’s just inevitable: you just can’t please everyone, and the more customers you have, the larger the percentage that might think you are evil.

Is Google the next Microsoft? Let’s hope not, but once again, if you want to look at a company that people continually complain about, Microsoft is it. I can’t tell you how many co-workers complain about some Windows problem they’ve had at home, and whine about Microsoft. How many times have I heard of people who use Windows all day at work, but when they get home use a Mac, or Linux because they actually want to enjoy using a computer… Sure, they’ll argue that Macs cost more, or they don’t really know how to use Linux, but again, the reality is, you have the choice of what kind of computer you use in your own home, don’t you?

Just remember this: You’ve always got a choice… You can choose what products to buy, what companies to support, what operating systems to run, and what search engines to use…

Linux Under The Desktop

We hear a lot of talk about Linux on the desktop, and how this is the year it will really happen. Heck, I’ve heard that IBM has its own desktop Linux distro with 15,000 internal beta users. I’ve no doubt that Linux on the desktop will continue to improve and get more popular, as will Mac OS X, and I think both will happen at the expense of Windows. That’s just my opinion of course, and what I’m here to talk about is Linux under the desktop.

What is Linux under the desktop? It’s the practice of sticking a server under your desk to get the job done. In some companies they’ve got a lot of Windows servers, and they do all these official things like email, file/print services, DNS, etc. But when some geek type needs something done that can’t be easily done by Windows, they stick an old PC under their desk, load up a Linux distro, and install the tools needed to get the job done.

Over time, these machines become useful, or even critical, and you need to move them into the server room alongside all of the Windows boxes. And then gradually, over time, the Linux boxes outnumber the Windows boxes, and guess what? They’re more reliable, and they’re cheaper to put in place, and as long as you have people who know what they are doing, they’re easier to maintain.

That’s the plan anyway…

Oh yeah, what about Linux on the desktop? It’s coming… I hear this is the year it will really happen!

DIY PowerBook Repairs

As PowerBook (and iBook) users know all too well, those damn power adapters can go bad. Well, mine finally got to the point of not working. So for the last few weeks or so I’ve been without the use of pbox, our lovely old PowerBook G3 Wallstreet.

I looked on eBay and at some of the 3rd party suppliers of power adapters, but because I’m what you might call frugal, and a hacker, I took matters into my own hands, and in my own hands I put some tools. Pliers, utility knife, wire cutters, and some duct tape. Ah, there’s always room for duct tape…

So now the power adapter works again. I did manage to lose a tiny resistor in the hackery of it all, but as the saying goes “We got power!” I mean, what could that little resistor be doing anyway? Sure, there is a chance I might get an electrical shock when checking email, or launching Firebird, I mean Firefox, might cause it to burst into flames. Oh well, such is the price you pay for attempting to keep up with the fast pace of technology on a limited budget…

[Image: Flaming PowerBook warning label]

Dave and IE

Dave is meeting with the IE team.

I want to talk about how the browser can be made more useful to people who use RSS and who write weblogs. I’m going to ask for features that work for all blogging software and all aggregators, foolish me, maybe I’m the only one who thinks we all do better if everyone has a chance to compete.

Um, Dave, you do realize that the next version of IE will be a pay-to-play affair? IE for Windows will only be included when you purchase Longhorn, and IE for Mac will only be available if you purchase MSN. So if you really want to help “all blogging software and all aggregators” and want us all to have “a chance to compete” will you also be meeting with the Mozilla folks, the Opera folks, the Safari folks?

The funny thing is, if Dave wanted to make the browser work better for these things, he might do well to look at the Mozilla project, which has a huge pile of extensions at MozDev, many of which aim to make the browser known as Mozilla a better blogging and aggregator component. If Mozilla has these components, and they are done right, won’t Microsoft sit up and take notice, and do something about it?