15 October 2009

Last Monday, I noticed that my Python bot script, which uses FeedParser, was not working properly. The WordPress feed was fetched and parsed, but the first entry was wrong. I tested it in the Python interpreter with something like this (the URL is a placeholder):

>>> import feedparser
>>> f = feedparser.parse('http://example.com/feed/')
>>> f.entries[0].title

It did not show the latest entry on our blog, but the one before it. So I am sure nothing is wrong with the rest of my script, because FeedParser itself returned the wrong first entry. Was FeedParser reading from a cache (its own, if it has one, or maybe our Web proxy)? To check, I loaded the feed in Opera, and the entries were displayed correctly; W3M, a console-based browser, gave the same result. Finally, I downloaded the feed, saved it as a file, and parsed it from that file, and the first entry was still wrong.
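One way to cross-check a saved feed without going through FeedParser at all is to read it with the standard library and sort the items by date. The following is only a rough sketch, assuming the feed is RSS 2.0 (the default for a WordPress /feed/ URL) with a pubDate on each item; the function name and the sample feed are made up for illustration and are not from my bot script.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def rss_items_newest_first(xml_text):
    """Return (published, title) pairs from an RSS 2.0 document, newest first."""
    root = ET.fromstring(xml_text)
    items = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # items without a date cannot be ordered, so skip them
        items.append((parsedate_to_datetime(pub_date), title))
    # Sort by publication date so the order in the file does not matter.
    items.sort(key=lambda pair: pair[0], reverse=True)
    return items


# Hypothetical sample standing in for the saved feed file.
SAMPLE = """<?xml version="1.0"?>
<rss version="2.0"><channel>
<title>Example</title>
<item><title>Older post</title><pubDate>Mon, 12 Oct 2009 08:00:00 +0000</pubDate></item>
<item><title>Newer post</title><pubDate>Wed, 14 Oct 2009 09:30:00 +0000</pubDate></item>
</channel></rss>"""

if __name__ == "__main__":
    for published, title in rss_items_newest_first(SAMPLE):
        print(published.isoformat(), title)
```

For a real saved file, the contents could be read as bytes (open the file in 'rb' mode) and passed to the same function. If the newest title here differs from f.entries[0].title, the difference lies in the parsing step rather than in the download.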

The bot script itself runs as a cron/scheduler entry and, surprisingly, yesterday its report was correct and showed the new entry from the feed. I still do not know how this got fixed.
