A group of four Spanish researchers, including one from IMDEA Networks, has developed a model for online resource search based on modified random walks, designed to deliver results faster. The new mechanisms reduce the average length of searches by up to 90% compared with conventional random walks.
When we run a search online, we hardly ever stop to think about the technical process involved in delivering the expected results to us. Behind its apparent simplicity, each search made by an internet user triggers a whole series of computing and mathematical algorithms. The effectiveness of these algorithms, measurable through indicators such as search speed, search length, or bandwidth used, depends on the amount of information available about the network.
Antonio Fernández Anta (IMDEA Networks Institute), Víctor López Millán (CEU University of San Pablo), Vicent Cholvi (Universitat Jaume I) and Luis López (Universidad Rey Juan Carlos) have presented a model for resource search based on so-called random walks, which has allowed them to design efficient search mechanisms for networks where no centralized information about resources is available. The performance of these mechanisms has been assessed both with analytical models and experimentally by means of simulations.
Random walks have been widely studied in mathematics, where they are analyzed as discrete-time Markov chains, and they are used in a wide variety of fields, such as statistical physics, population dynamics and bioinformatics. When applied to communication networks, random walks have had a profound impact on algorithmics and complexity theory: thanks to their simplicity, their low consumption of processing and storage resources at each node, and the fact that they use only local information, they avoid the communication overhead required by other mechanisms.
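As a concrete illustration of the basic mechanism, a random-walk search over a graph stored as an adjacency list could look like the following minimal sketch. This is not the authors' implementation; the function and parameter names are hypothetical:

```python
import random

def random_walk_search(graph, start, has_resource, max_steps=10_000):
    """Search for a resource by walking randomly over `graph`.

    `graph` maps each node to a list of its neighbors, and
    `has_resource` is a predicate saying whether a node holds the
    resource. Returns the number of hops taken, or None on failure.
    """
    node = start
    for step in range(max_steps):
        if has_resource(node):
            return step                    # found after `step` hops
        node = random.choice(graph[node])  # hop to a uniform random neighbor
    return None                            # step budget exhausted
```

Note that each step needs only the current node's neighbor list, which is exactly the "local information" property that makes random walks so cheap to run.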
The researchers used self-avoiding walks (SAWs) instead of regular random walks, as well as precomputed partial walks, in order to build longer random walks more efficiently. The main goal was to improve on the performance of normal random walks by avoiding revisits to the same nodes, thereby increasing the odds of locating the node holding the desired resource.
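To make the self-avoidance idea concrete, the following hypothetical sketch (again, not the paper's code) prefers unvisited neighbors and only falls back to a random neighbor when every neighbor has already been seen:

```python
import random

def saw_search(graph, start, has_resource, max_steps=10_000):
    """Self-avoiding walk: prefer neighbors that have not been visited."""
    visited = {start}
    node = start
    for step in range(max_steps):
        if has_resource(node):
            return step
        unvisited = [n for n in graph[node] if n not in visited]
        # Prefer an unvisited neighbor; if all neighbors were already
        # seen, fall back to a uniform random neighbor so the walk
        # never gets stuck.
        node = random.choice(unvisited or graph[node])
        visited.add(node)
    return None
```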
In the experiments conducted, SAWs achieved reductions of 50% to 75% in average search length compared to regular random walks, depending on the network type and average degree. Scale-free networks showed the biggest reductions, while in networks with resource replication the reductions dropped to between 7% and 26%, showing that the benefit of avoiding node revisits is partially offset by the knowledge of neighbors' resources that normal random walks can exploit.
Searches with partial random walks achieve even larger reductions: their average length is on the order of the square root of the average length of normal random-walk searches, provided the false-positive rate of the Bloom filters is kept small. This translates into search-time reductions of 88% to 98%, without significant dependence on the network type. Furthermore, combining partial walks with SAWs yields additional reductions of 5% to 12%.
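The role of the Bloom filters here is to summarize compactly which items a precomputed partial walk has covered, so a search can test the filter before deciding whether following that walk is worthwhile. The sketch below is a generic, self-contained Bloom filter under that assumption; the class and its parameters are illustrative, not taken from the paper:

```python
import hashlib

class BloomFilter:
    """Compact set summary with a tunable false-positive rate."""

    def __init__(self, num_bits=1024, num_hashes=4):
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = 0  # bit array packed into a single integer

    def _positions(self, item):
        # Derive `num_hashes` bit positions from salted SHA-256 digests.
        for salt in range(self.num_hashes):
            digest = hashlib.sha256(f"{salt}:{item}".encode()).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, item):
        for pos in self._positions(item):
            self.bits |= 1 << pos

    def __contains__(self, item):
        # May answer True for an item never added (a false positive),
        # but never False for an item that was added.
        return all(self.bits >> pos & 1 for pos in self._positions(item))
```

A node precomputing a partial walk would `add` every identifier it sees, and a later search checks `item in walk_filter` before reusing that walk, accepting a small false-positive risk in exchange for far less storage and bandwidth.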
When the mechanisms based on partial random walks are adapted to dynamic resources, significant reductions in average search length are also observed. The experiments showed that these reductions shrink as the dynamicity of the resources increases, but they remain substantial even for highly volatile resources.
The study has shown that mechanisms based on SAWs and partial random walks, applied to networks with a random structure, where any node may be connected by a link to any other node in the network with uniform probability, can achieve performance that surpasses that of searches based on normal random walks.
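For readers who want to experiment with this kind of setting, such a uniformly random topology (an Erdős–Rényi-style graph) can be generated in a few lines. This generator is an illustrative assumption, not the authors' experimental setup:

```python
import random

def random_graph(num_nodes, avg_degree):
    """Build a random graph where each node pair is linked independently
    with the uniform probability that yields the requested average degree."""
    p = avg_degree / (num_nodes - 1)
    graph = {v: [] for v in range(num_nodes)}
    for u in range(num_nodes):
        for v in range(u + 1, num_nodes):
            if random.random() < p:
                graph[u].append(v)
                graph[v].append(u)
    return graph
```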
These results have a number of practical implications for everyday internet users. When users run a search on the internet, they will most likely use platforms like Google or similar engines, which crawl the web and store information about its contents in their databases. But what if those centralized search engines cannot be counted on, or are unavailable at a given time? It is for such a scenario, or for P2P networks that lack a centralized index of contents, that the model proposed by this group of researchers would prove useful: it would allow the network to be explored without resorting to file-sharing directories or search engines, with an average search length significantly lower than that of searches based exclusively on random walks.