RSS Primary Source Detection

Here’s the problem: I read about 100 blogs and generally end up reading the same story more than once. It’s like some kind of bizarre internet echo. Many bloggers merely link to another’s article, sometimes without so much as adding their $0.02. Sometimes I wonder, “Did I already read this? Is this just deja vu? Am I going insane?”

I want an aggregator which digs down a few levels of indirection and attempts to identify the primary source for some story. This would be the root of a whole tree of links. Then, merely show me this “uberpost” and give me an easy way to de-prioritize the others.
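Here is a rough sketch of what “digging down a few levels of indirection” could look like. It assumes the outbound links of each post have already been extracted into a dictionary (the `links` mapping and the example URLs are made up for illustration), and it simply treats the deepest page reached within `max_depth` hops as the primary source. A real aggregator would need a smarter rule for posts that link to several unrelated things.

```python
from collections import defaultdict


def find_primary_source(url, links, max_depth=3):
    """Follow outbound links up to max_depth hops; return the deepest page reached.

    `links` maps a page URL to the list of URLs it links to (hypothetical,
    pre-extracted data). Ties at the same depth go to the first page found.
    """
    best = url
    seen = {url}
    frontier = [url]
    for _ in range(max_depth):
        next_frontier = []
        for page in frontier:
            for target in links.get(page, []):
                if target not in seen:
                    seen.add(target)
                    next_frontier.append(target)
        if not next_frontier:
            break  # nothing deeper to follow
        best = next_frontier[0]
        frontier = next_frontier
    return best


def group_by_primary_source(posts, links, max_depth=3):
    """Bucket posts under the primary source each one leads to."""
    groups = defaultdict(list)
    for post in posts:
        groups[find_primary_source(post, links, max_depth)].append(post)
    return dict(groups)


# Hypothetical echo chain: a links to b, b links to the original.
example_links = {
    "a.example/echo": ["b.example/echo"],
    "b.example/echo": ["c.example/original"],
    "c.example/original": [],
}
example_groups = group_by_primary_source(list(example_links), example_links)
```

With this data every post lands under `c.example/original`, so the reader could show that one “uberpost” and tuck the other two beneath it.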

If I have this problem with 100 blogs, just imagine what Scoble has to deal with; he reads over 600. Case in point: Lifehacker, Hackszine, Four-Hour Workweek (this is the primary source).

Who will build this first?

Update: Upon reflection, I think the real problem is the sense of deja vu. I might be satisfied with a feed reader which simply restores my confidence in my own sanity. So maybe just give me a view that shows the related posts together. Or perhaps just give me a UI indication: “Links to primary source (Read Tuesday at 9:22 AM).”
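The “related posts together” view could be sketched as a grouping problem. The version below assumes two posts are related when they share an outbound link or link to one another, and merges them with a small union-find. The `links` mapping and the example URLs are hypothetical, and this is only one possible notion of “related.”

```python
def group_related(links):
    """Group posts that share an outbound link or link to one another.

    `links` maps each post URL to the URLs it links to (hypothetical data).
    Returns a sorted list of groups, each a sorted list of post URLs.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps trees shallow
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    # Joining each post to its targets connects posts that share a target.
    for post, targets in links.items():
        find(post)
        for target in targets:
            union(post, target)

    groups = {}
    for post in links:  # only report the posts themselves, not their targets
        groups.setdefault(find(post), []).append(post)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical feed items and the links found in each.
example_feed = {
    "blog-a.example/post": ["photo.example/lolcat"],
    "blog-b.example/post": ["photo.example/lolcat"],
    "blog-c.example/post": ["blog-a.example/post"],
    "blog-d.example/post": ["other.example/story"],
}
example_related = group_related(example_feed)
```

Here posts a, b, and c end up in one group and d stands alone. One caveat: a link that nearly everyone includes (a site's home page, say) would merge unrelated posts, so a real reader would probably want to ignore very common URLs.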


2 Responses to “RSS Primary Source Detection”


  1. John Knox, June 22, 2007 at 12:57 pm

    It would be very useful if the RSS reader could categorize and group posts like that.

    Maybe the easiest way to do this could use explicit tagging. Unfortunately, I doubt that many sites provide a useful level of tagging (if any). So relying on the pages to supply tags is probably out.

    URLs could act as a unique marker indicating that several posts are the same. Three posts containing the same link to a LOLCATS! photo can safely get grouped together. Same thing for posts that link to each other.

    Some sort of implicit tagging might work, but how do you decide which words are important? Include lists (e.g. we call the phrase “Unicorn Chaser” valid categorization data)? Exclude lists (e.g. we say the word “the” should be ignored when looking for keywords)?

    Actually, the notion of implicit tagging feels similar to search engine functionality. Perhaps some Google API can indicate the degree of Kevin Bacon between two links? That would outsource a lot of the difficult magic.

    I wouldn’t be surprised if somebody was working on this problem (if a solution doesn’t already exist). The kind of RSS reader I want is one that also filters out all posts that aren’t both practical and heavy in informational content. That sounds more difficult to me.

  2. Mark, June 22, 2007 at 1:09 pm

    Yeah, John, I’m definitely looking for something automatic. I’ve never been a big fan of tagging.

    I really think you could just follow a few levels of link indirection.

    My original intent, detecting a “primary source”, is a little fuzzy. It would be pretty hard, for example, to distinguish between a follow-up (with new, useful information) and a simple “me too” type of post. It would be subjective, as well.

    But, really, you don’t have to do this. Just notice the link structure and remind me, “This post links to X, which you have already read.”

    Oh, and I think your Kevin Bacon idea is a good one too. I’d appreciate something like, “This post is 2 links away from X, which you have already read.”
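The two reminders discussed in the comments (“links to X” and “2 links away from X”) can both come out of one breadth-first search over the link graph. The sketch below assumes the reader keeps a `links` mapping of already-extracted outbound links and a `read` set of URLs the user has seen. Both names, and the example URLs, are hypothetical.

```python
from collections import deque


def nearest_read(post, links, read, max_hops=2):
    """Return (url, hops) for the closest already-read page, or None.

    Breadth-first search guarantees the first hit is the fewest links away.
    """
    seen = {post}
    queue = deque([(post, 0)])
    while queue:
        page, hops = queue.popleft()
        if hops > 0 and page in read:
            return page, hops
        if hops == max_hops:
            continue  # do not follow links any deeper
        for target in links.get(page, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, hops + 1))
    return None


def reminder(post, links, read, max_hops=2):
    """Build the reminder text for a post, or None if nothing read is nearby."""
    hit = nearest_read(post, links, read, max_hops)
    if hit is None:
        return None
    url, hops = hit
    if hops == 1:
        return f"This post links to {url}, which you have already read."
    return f"This post is {hops} links away from {url}, which you have already read."


# Hypothetical data: a links to b, b links to x, and x has been read.
example_links = {"a.example/post": ["b.example/post"], "b.example/post": ["x.example/story"]}
example_read = {"x.example/story"}
example_message = reminder("a.example/post", example_links, example_read)
```

Keeping `max_hops` small matters here: a few hops out, nearly everything on the web is connected to something you have read, and the reminder stops being useful.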





