Monday, January 31, 2011

Real-Time Delivery Explained


One of the most in-demand features of Feed My Inbox is real-time notifications. We get several inquiries and questions each week about it. In an effort to be more transparent and better educate our customers, today we're going to dissect what real-time truly means and all the efforts we're making to deliver emails to you as they are posted.



The question we hear the most is, "How long does it take a real-time notification to reach my inbox?" It's a very fair question. However, the answer is more complicated than we'd like. We always have to say, "It depends." This post will cover each dependency, where it might fall short and why.



There are two ways FMI learns about a feed update. The first is by polling, which means our servers go out and manually check feeds for updates and store anything new. The second is through a real-time protocol. Instead of FMI having to go out and poll the feed for updates, real-time protocols proactively notify us of new updates as they happen.



To understand our real-time notifications is to understand these two methods better. So let's dive in!



Feed Polling


As mentioned before, polling is the process of our servers checking a feed manually for updates. On a pretty consistent basis, we can guarantee that a feed will be polled and an email is sent to you within about 5-15 minutes. We're always optimizing our systems and adding infrastructure as needed to keep this number as low as possible.
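To make the idea concrete, here is a minimal sketch of one polling pass, assuming an RSS 2.0 feed whose items carry a `guid` or `link`. The function name `new_items` and the sample feed are made up for illustration; this is not FMI's actual code, and a real poller would download the feed over HTTP on a schedule rather than read it from a string.

```python
import xml.etree.ElementTree as ET


def new_items(feed_xml, seen_ids):
    """Return (title, link) pairs for RSS items not seen on earlier polls."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Prefer <guid>; fall back to <link> when a feed omits it.
        item_id = item.findtext("guid") or item.findtext("link")
        if item_id is None or item_id in seen_ids:
            continue
        seen_ids.add(item_id)
        fresh.append((item.findtext("title", default=""),
                      item.findtext("link", default="")))
    return fresh


# Hypothetical feed content, standing in for what a poll would download.
SAMPLE_FEED = """<rss version="2.0"><channel>
<title>Example</title>
<item><title>First</title><link>http://example.com/1</link><guid>http://example.com/1</guid></item>
<item><title>Second</title><link>http://example.com/2</link><guid>http://example.com/2</guid></item>
</channel></rss>"""

seen = set()
first_poll = new_items(SAMPLE_FEED, seen)   # both items are new
second_poll = new_items(SAMPLE_FEED, seen)  # nothing changed, so nothing new
```

Only the items returned by a pass would be emailed; an unchanged feed produces an empty list and no email.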



The Problem with Feed Polling


Real-time emails are delivered more slowly than the 5-15 minute window when the feed you are subscribed to goes through a middle-man first. In that case there's the original feed, then another service that sits on top of the feed, then your subscription.



Examples


Millions of feeds around the world are hosted by a service called FeedBurner (owned by Google). It's a great service, which provides valuable analytics to feed owners about subscribers, popular posts and much more. We use it for the Brightwurks feed.



FeedBurner is what I am referring to as a middle-man. They sometimes take 10-30 minutes to poll your feed for updates and add new content to the feed they host for you. It then takes Feed My Inbox the 5-15 minute window to poll FeedBurner for the update. So a 45-minute delay from the post going up to you getting an email is very possible (although unlikely) in this scenario.
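The arithmetic behind that 45-minute figure can be sketched as follows. The helper `delay_range` is a hypothetical name, and the ranges are simply the numbers quoted above, treated as stages that happen one after another.

```python
def delay_range(*stages):
    """Sum (min, max) minute ranges for stages that run one after another."""
    best = sum(low for low, high in stages)
    worst = sum(high for low, high in stages)
    return best, worst


# Ranges in minutes, taken from the figures quoted in this post.
MIDDLE_MAN = (10, 30)  # FeedBurner, when it is slow to poll the original feed
FMI_POLL = (5, 15)     # our own polling window

best, worst = delay_range(MIDDLE_MAN, FMI_POLL)
# best is 10 + 5 = 15 minutes, worst is 30 + 15 = 45 minutes
```

Every additional middle-man in the chain adds its own range to the total.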



Services such as Google Reader, Yahoo! Pipes and Feed Rinse, among many others, create this same problem for us. Basically any service that serves as a go-between from the original content source to Feed My Inbox can cause delays. They have to poll the original feed, then we poll the middle-man feed. Worst of all, this isn't something we can control. If a middle-man takes 30 minutes to poll the original source, your email will be 30 minutes delayed and we can't do a thing about it. All we can do is continue making our own polling faster.



Real-time Protocols


Feeds that support a real-time protocol solve all our problems, right? Actually, not always. These are the protocols we're using:




PubSubHubBub


This is a nifty little technology. The feed file identifies a "Hub" server URL. You can create a hub, use a community hub or use the hub your publishing software has set up. When the feed is updated, the software generating the feed knows to ping the hub, which in turn pings Feed My Inbox to send out a real-time email.



Instead of having to poll feeds manually for updates, PubSubHubBub lets us know when there is a new post; then we email it out. It all happens in 1-2 minutes.
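For the curious, here is a rough sketch of the subscriber side, assuming an Atom feed that advertises its hub with a `link rel="hub"` element and the subscription parameters from the 0.3 draft of the spec. The URLs and the function names `find_hub` and `subscribe_body` are placeholders for illustration, not our production setup, and the sketch stops short of actually sending the request.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

ATOM_LINK = "{http://www.w3.org/2005/Atom}link"


def find_hub(feed_xml):
    """Return the hub URL a feed advertises, or None if it names no hub."""
    root = ET.fromstring(feed_xml)
    for link in root.iter(ATOM_LINK):
        if link.get("rel") == "hub":
            return link.get("href")
    return None


def subscribe_body(topic_url, callback_url):
    """Build the form-encoded body a subscriber would POST to the hub."""
    return urlencode({
        "hub.mode": "subscribe",
        "hub.topic": topic_url,        # the feed we want updates for
        "hub.callback": callback_url,  # where the hub should ping us
        "hub.verify": "async",         # used by the 0.3 draft; later revisions dropped it
    })


# Hypothetical feed and callback URLs.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
<title>Example</title>
<link rel="hub" href="http://hub.example.com/"/>
<link rel="self" href="http://example.com/feed"/>
</feed>"""

hub_url = find_hub(SAMPLE_FEED)
body = subscribe_body("http://example.com/feed", "http://example.org/callback")
```

Once a subscription like this is accepted, the hub does the work of telling the subscriber about new posts, which is what removes the polling delay.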



Hubs such as the ones provided by our friends at SuperFeedr work beautifully, sending updates to FMI almost instantly. The aforementioned FeedBurner supports PubSubHubBub for all its feeds, but we still fall victim to its manual polling before we get the update.



For PubSubHubBub-enabled feeds, we still manually poll them because of slow-moving hubs or middle-man issues as discussed. For whatever reason, sometimes our polling beats the hub. We've seen emails go out at 1 minute, 1 minute, 2 minutes, 1 minute, then 15 minutes. The 15-minute email is when our polling beat out the hub.
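Running both methods side by side means the same post can arrive twice, once from the hub and once from polling. One way to handle that is a "first arrival wins" check, sketched below. The class name `FirstArrival` and the entry ids are hypothetical; this illustrates the idea rather than describing how FMI implements it.

```python
class FirstArrival:
    """Remember which source delivered each entry first."""

    def __init__(self):
        self.sent = {}  # entry id -> source that got there first

    def offer(self, entry_id, source):
        """Return True if this arrival should trigger the email."""
        if entry_id in self.sent:
            return False  # already emailed via the other path
        self.sent[entry_id] = source
        return True


tracker = FirstArrival()
results = [
    tracker.offer("post-1", "hub"),   # hub is first: send the email
    tracker.offer("post-1", "poll"),  # polling catches up later: skip
    tracker.offer("post-2", "poll"),  # polling beats a slow hub: send
    tracker.offer("post-2", "hub"),   # late hub ping: skip
]
```

In the second case the email goes out on polling's schedule, which is how an occasional 15-minute delivery shows up among the 1-2 minute ones.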



RSS Cloud


RSS Cloud is much like PubSubHubBub, with only one crucial difference. When the cloud (another name for "hub") pings Feed My Inbox, we have to go fetch the feed content. With PubSubHubBub, the ping sent to us from the hub already contains the content; so we don't have to go fetch it. Otherwise, these two are very much the same.
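The difference comes down to what the receiving side has to do when a ping arrives. Here is a simplified sketch with made-up handler names and a fake fetch function standing in for a real HTTP request; it leaves out the actual request parsing for both protocols.

```python
def handle_pubsubhubbub(body):
    """The hub's ping already carries the new feed content."""
    return body


def handle_rss_cloud(feed_url, fetch):
    """The cloud's ping only says which feed changed; fetch it ourselves."""
    return fetch(feed_url)


fetched = []


def fake_fetch(url):
    # Stand-in for an HTTP GET; records the URL so the extra trip is visible.
    fetched.append(url)
    return "<rss>latest</rss>"


push_content = handle_pubsubhubbub("<feed>new entry</feed>")
cloud_content = handle_rss_cloud("http://example.com/feed", fake_fetch)
```

The extra fetch is one more round trip before the email can go out, but otherwise the two flows look alike from the subscriber's point of view.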



We've had pretty good experience with RSS Cloud, but it hasn't been adopted at the same rate as PubSubHubBub because Google didn't choose to use it. Big sites like WordPress.com have implemented support for RSS Cloud on their millions of blogs, so it's still worth paying attention to.



Both of these real-time protocols provide an edge that helps us deliver emails faster, but we still have to rely on our polling as well in case they flub up. Usually the results are great and feeds using these protocols can be delivered to our customers in a couple of minutes.



Twitter API


I should be clear that Twitter isn't a real-time feed protocol. Fact is, our Twitter integration doesn't even look at feeds anymore. It's too slow and we were sending them too many requests as FMI grew. We use their API to get profile updates almost instantaneously. It's awesome, and very fast (like less than a minute). However, we still do the old feed polling method for Twitter searches and non-profile feeds.



For some of you, I understand this article might be too much information. But we think it's very important to document why feed delivery times vary for those who are interested. Like most things in life, it's not as simple as it may seem on the surface. Our real-time updates are not always perfect, but in our tests Feed My Inbox is still MUCH faster than other web apps trying to do the same thing.



Last week's maintenance included a number of performance improvements and we've been working more on real-time performance this week. Real-time delivery is getting better on a consistent basis. So even if it's not good enough for you right now, check back with us later. Most of all, let us know if real-time isn't satisfactory for you and we can tell you why or try to make it better.





