Storm is a distributed realtime computation system.

The past decade has seen a revolution in data processing. MapReduce, Hadoop, and related technologies have made it possible to store and process data at scales previously unthinkable. Unfortunately, these data processing technologies are not realtime systems, nor are they meant to be. There’s no hack that will turn Hadoop into a realtime system; realtime data processing has a fundamentally different set of requirements from batch processing. However, realtime data processing at massive scale is increasingly a requirement for businesses. The lack of a “Hadoop of realtime” has become the biggest hole in the data processing ecosystem. Storm fills that hole.

Storm has two basic units of processing: Spouts and Bolts. Spouts are the elements that generate the data to be processed. They may get that data from external sources or generate it themselves, but in either case their job is to introduce it into the cluster. Bolts are pr...
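
To make the division of labour between Spouts and Bolts more concrete, here is a minimal sketch in plain Python. It does not use Storm's API at all: the names `SentenceSpout`, `SplitBolt`, `CountBolt`, and `run_topology` are hypothetical, and the whole thing runs in a single process rather than on a cluster. It only illustrates the idea that a spout introduces tuples into the system and that bolts consume tuples and may emit new ones.

```python
from collections import Counter
from typing import Iterable, Iterator, List


class SentenceSpout:
    """Toy spout (hypothetical): introduces data as one-field tuples."""

    def __init__(self, sentences: Iterable[str]) -> None:
        # Stands in for an external source such as a queue or a log file.
        self._sentences = list(sentences)

    def emit(self) -> Iterator[tuple]:
        for sentence in self._sentences:
            yield (sentence,)


class SplitBolt:
    """Toy bolt (hypothetical): turns a sentence tuple into word tuples."""

    def process(self, tup: tuple) -> Iterator[tuple]:
        (sentence,) = tup
        for word in sentence.split():
            yield (word.lower(),)


class CountBolt:
    """Toy bolt (hypothetical): keeps a running count for each word."""

    def __init__(self) -> None:
        self.counts: Counter = Counter()

    def process(self, tup: tuple) -> Iterator[tuple]:
        (word,) = tup
        self.counts[word] += 1
        yield (word, self.counts[word])


def _apply(bolt, stream: Iterator[tuple]) -> Iterator[tuple]:
    # Feed every incoming tuple to the bolt and pass on whatever it emits.
    for tup in stream:
        yield from bolt.process(tup)


def run_topology(spout, bolts) -> List[tuple]:
    """Chain the spout and bolts in order and collect the final tuples."""
    stream = spout.emit()
    for bolt in bolts:
        stream = _apply(bolt, stream)
    return list(stream)


if __name__ == "__main__":
    counter = CountBolt()
    spout = SentenceSpout(["Storm is fast", "storm is distributed"])
    for word, count in run_topology(spout, [SplitBolt(), counter]):
        print(word, count)
```

In a real Storm deployment the spout would typically run indefinitely, and tuples would be routed between worker processes across the cluster; this sketch collapses all of that into a single in-memory chain so that the data flow is easy to follow.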