Topic

TrustRank

About: TrustRank is a research topic. Over the lifetime, 84 publications have been published within this topic receiving 32744 citations.

...read moreread less

Topic Tools

Find unexplored research gaps

Generate a literature review

Explore related concepts

Papers published on a yearly basis

Papers

Journal Article•10.1016/S0169-7552(98)00110-X•

The anatomy of a large-scale hypertextual Web search engine

[...]

Sergey Brin¹, Lawrence Page¹•Institutions (1)

Stanford University¹

1 Apr 1998

TL;DR: This paper provides an in-depth description of Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext and looks at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

...read moreread less

Abstract: In this paper, we present Google, a prototype of a large-scale search engine which makes heavy use of the structure present in hypertext. Google is designed to crawl and index the Web efficiently and produce much more satisfying search results than existing systems. The prototype with a full text and hyperlink database of at least 24 million pages is available at http://google.stanford.edu/. To engineer a search engine is a challenging task. Search engines index tens to hundreds of millions of web pages involving a comparable number of distinct terms. They answer tens of millions of queries every day. Despite the importance of large-scale search engines on the web, very little academic research has been done on them. Furthermore, due to rapid advance in technology and web proliferation, creating a web search engine today is very different from three years ago. This paper provides an in-depth description of our large-scale web search engine -- the first such detailed public description we know of to date. Apart from the problems of scaling traditional search techniques to data of this magnitude, there are new technical challenges involved with using the additional information present in hypertext to produce better search results. This paper addresses this question of how to build a practical large-scale system which can exploit the additional information present in hypertext. Also we look at the problem of how to effectively deal with uncontrolled hypertext collections where anyone can publish anything they want.

...read moreread less

16,670 citations

Proceedings Article•

The PageRank Citation Ranking : Bringing Order to the Web

[...]

Lawrence Page, Sergey Brin, Rajeev Motwani, Terry Winograd

11 Nov 1999

TL;DR: This paper describes PageRank, a mathod for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them, and shows how to efficiently compute PageRank for large numbers of pages.

...read moreread less

Abstract: The importance of a Web page is an inherently subjective matter, which depends on the readers interests, knowledge and attitudes. But there is still much that can be said objectively about the relative importance of Web pages. This paper describes PageRank, a mathod for rating Web pages objectively and mechanically, effectively measuring the human interest and attention devoted to them. We compare PageRank to an idealized random Web surfer. We show how to efficiently compute PageRank for large numbers of pages. And, we show how to apply PageRank to search and to user navigation.

...read moreread less

16,420 citations

Book Chapter•10.1016/B978-012088469-8.50052-8•

Combating web spam with trustrank

[...]

Zoltan Gyongyi¹, Hector Garcia-Molina¹, Jan Pedersen²•Institutions (2)

Stanford University¹, Yahoo!²

31 Aug 2004

TL;DR: This paper proposes techniques to semi-automatically separate reputable, good pages from spam, and shows that they can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites.

...read moreread less

Abstract: Web spam pages use various techniques to achieve higher-than-deserved rankings in a search engine's results. While human experts can identify spam, it is too expensive to manually evaluate a large number of pages. Instead, we propose techniques to semi-automatically separate reputable, good pages from spam. We first select a small set of seed pages to be evaluated by an expert. Once we manually identify the reputable seed pages, we use the link structure of the web to discover other pages that are likely to be good. In this paper we discuss possible ways to implement the seed selection and the discovery of good pages. We present results of experiments run on the World Wide Web indexed by AltaVista and evaluate the performance of our techniques. Our results show that we can effectively filter out spam from a significant fraction of the web, based on a good seed set of less than 200 sites.

...read moreread less

1,322 citations

Proceedings Article•10.1145/1062745.1062762•

Identifying link farm spam pages

[...]

Baoning Wu¹, Brian D. Davison¹•Institutions (1)

Lehigh University¹

10 May 2005

TL;DR: Algorithms for detecting link farms automatically are presented by first generating a seed set based on the common link set between incoming and outgoing links of Web pages and then expanding it, providing a modified web graph to use in ranking page importance.

...read moreread less

Abstract: With the increasing importance of search in guiding today's web traffic, more and more effort has been spent to create search engine spam. Since link analysis is one of the most important factors in current commercial search engines' ranking systems, new kinds of spam aiming at links have appeared. Building link farms is one technique that can deteriorate link-based ranking algorithms. In this paper, we present algorithms for detecting these link farms automatically by first generating a seed set based on the common link set between incoming and outgoing links of Web pages and then expanding it. Links between identified pages are re-weighted, providing a modified web graph to use in ranking page importance. Experimental results show that we can identify most link farm spam pages and the final ranking results are improved for almost all tested queries.

...read moreread less

292 citations

Proceedings Article•10.1145/1390334.1390412•

BrowseRank: letting web users vote for page importance

[...]

Yuting Liu¹, Bin Gao², Tie-Yan Liu², Ying Zhang³, Zhi-Ming Ma⁴, Shuyuan He⁵, Hang Li² - Show less +3 more•Institutions (5)

Beijing Jiaotong University¹, Microsoft², Nankai University³, Chinese Academy of Sciences⁴, Peking University⁵

20 Jul 2008

TL;DR: Experimental results show that BrowseRank indeed outperforms the baseline methods such as PageRank and TrustRank in several tasks.

...read moreread less

Abstract: This paper proposes a new method for computing page importance, referred to as BrowseRank. The conventional approach to compute page importance is to exploit the link graph of the web and to build a model based on that graph. For instance, PageRank is such an algorithm, which employs a discrete-time Markov process as the model. Unfortunately, the link graph might be incomplete and inaccurate with respect to data for determining page importance, because links can be easily added and deleted by web content creators. In this paper, we propose computing page importance by using a 'user browsing graph' created from user behavior data. In this graph, vertices represent pages and directed edges represent transitions between pages in the users' web browsing history. Furthermore, the lengths of staying time spent on the pages by users are also included. The user browsing graph is more reliable than the link graph for inferring page importance. This paper further proposes using the continuous-time Markov process on the user browsing graph as a model and computing the stationary probability distribution of the process as page importance. An efficient algorithm for this computation has also been devised. In this way, we can leverage hundreds of millions of users' implicit voting on page importance. Experimental results show that BrowseRank indeed outperforms the baseline methods such as PageRank and TrustRank in several tasks.

...read moreread less

220 citations

...

Expand

Performance Metrics

Papers

638

Citations

No. of papers in the topic in previous years
Year	Papers
2021	1
2020	9
2019	1
2018	4
2017	3
2016	1

TrustRank

Topic Tools

Papers published on a yearly basis

Papers

The anatomy of a large-scale hypertextual Web search engine

The PageRank Citation Ranking : Bringing Order to the Web

Combating web spam with trustrank

Identifying link farm spam pages

BrowseRank: letting web users vote for page importance

Related Topics (5)

Performance Metrics