Webgraph

From HandWiki
Short description: Graph of connected web pages

The webgraph describes the directed links between pages of the World Wide Web. A graph, in general, consists of several vertices, some pairs connected by edges. In a directed graph, edges are directed lines or arcs. The webgraph is a directed graph, whose vertices correspond to the pages of the WWW, and a directed edge connects page X to page Y if there exists a hyperlink on page X, referring to page Y.

Properties

  • The degree distribution of the webgraph strongly differs from the degree distribution of the classical random graph model, the Erdős–Rényi model:[1] in the Erdős–Rényi model, there are very few large degree nodes, relative to the webgraph's degree distribution. The precise distribution is unclear,[2] however: it is relatively well described by a lognormal distribution, as well as the Barabási–Albert model for power laws.[3][4]
  • The webgraph is an example of a scale-free network.

Applications

The webgraph is used for:

  • computing the PageRank[5] of the WWW-pages;
  • computing the personalized PageRank;[6]
  • detecting webpages of similar topics, through graph-theoretical properties only, like co-citation;[7]
  • and identifying hubs and authorities in the web for HITS algorithm.

References

  1. P. Erdős, A. Renyi, Publ. Math. Inst. Hung. Acad. Sci. 5 (1960)
  2. Meusel, R.; Vigna, S.; Lehmberg, O.; Bizer, C. (2015). "The Graph Structure in the Web - Analyzed on Different Aggregation Levels". Journal of Web Science 1 (1): 33–47. doi:10.1561/106.00000003. https://air.unimi.it/bitstream/2434/372411/2/Vigna_JWebScience_2015.pdf. 
  3. Clauset, A.; Shalizi, C. R.; Newman, M. E. J. (2009). "Power-law distributions in empirical data". SIAM Rev. 51 (4): 661–703. doi:10.1137/070710111. Bibcode2009SIAMR..51..661C. 
  4. Barabási, Albert-László; Albert, Réka (October 1999). "Emergence of scaling in random networks". Science 286 (5439): 509–512. doi:10.1126/science.286.5439.509. PMID 10521342. Bibcode1999Sci...286..509B. http://www.nd.edu/~networks/Publication%20Categories/03%20Journal%20Articles/Physics/EmergenceRandom_Science%20286,%20509-512%20(1999).pdf. .
  5. S. Brin, L. Page, Computer Networks and ISDN Systems 30, 107 (1998)
  6. Glen Jeh and Jennifer Widom. 2003. Scaling personalized web search. In Proceedings of the 12th international conference on World Wide Web (WWW '03). ACM, New York, NY, USA, 271–279. doi:10.1145/775152.775191
  7. Kumar, Ravi; Raghavan, Prabhakar; Rajagopalan, Sridhar; Tomkins, Andrew (1999). "Trawling the Web for emerging cyber-communities". Computer Networks 31 (11–16): 1481–1493. doi:10.1016/S1389-1286(99)00040-7. 

External links