About: Microsoft SmartScreen is a research topic. Over its lifetime, 61 publications have been published within this topic, receiving 2710 citations. The topic is also known as SmartScreen and Windows Defender SmartScreen.
TL;DR: It is found that it is often possible to tell whether or not a URL belongs to a phishing attack without requiring any knowledge of the corresponding page data.
Abstract: Phishing is a form of identity theft that combines social engineering techniques and sophisticated attack vectors to harvest financial information from unsuspecting consumers. Often a phisher tries to lure her victim into clicking a URL pointing to a rogue page. In this paper, we focus on studying the structure of URLs employed in various phishing attacks. We find that it is often possible to tell whether or not a URL belongs to a phishing attack without requiring any knowledge of the corresponding page data. We describe several features that can be used to distinguish a phishing URL from a benign one. These features are used to model a logistic regression filter that is efficient and has a high accuracy. We use this filter to perform thorough measurements on several million URLs and quantify the prevalence of phishing on the Internet today.
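The idea of classifying a URL from its string alone can be sketched as follows. This is a minimal illustration, not the paper's filter: the abstract does not list its features, so the lexical features below (URL length, dot count, IP-address host, `@` sign, hyphen in host), the function names, and the toy training URLs are all assumptions made for the example.

```python
import math
from urllib.parse import urlparse


def url_features(url):
    """Hypothetical lexical features computed from the URL string alone."""
    host = urlparse(url).hostname or ""
    return [
        1.0,                                              # bias term
        len(url) / 100.0,                                 # scaled URL length
        host.count(".") / 5.0,                            # scaled dot count in host
        1.0 if host.replace(".", "").isdigit() else 0.0,  # host looks like an IPv4 address
        1.0 if "@" in url else 0.0,                       # '@' present in the URL
        1.0 if "-" in host else 0.0,                      # hyphen in the host name
    ]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(weights, url):
    """Estimated probability that the URL is phishing."""
    z = sum(w * x for w, x in zip(weights, url_features(url)))
    return sigmoid(z)


def train(samples, epochs=500, lr=0.5):
    """Fit logistic regression weights with plain stochastic gradient descent."""
    weights = [0.0] * len(url_features("http://example.com/"))
    for _ in range(epochs):
        for url, label in samples:
            x = url_features(url)
            error = predict(weights, url) - label
            weights = [w - lr * error * xi for w, xi in zip(weights, x)]
    return weights


# Toy, made-up training set (label 1 = phishing, 0 = benign); not data from the paper.
SAMPLES = [
    ("http://192.0.2.10/paypal.example/login.php", 1),
    ("http://secure-login.account.update.example.test/verify?id=1", 1),
    ("http://user@203.0.113.5/signin", 1),
    ("https://www.example.com/", 0),
    ("https://example.org/docs/index.html", 0),
    ("https://news.example.net/today", 0),
]

if __name__ == "__main__":
    weights = train(SAMPLES)
    for url, label in SAMPLES:
        print(label, round(predict(weights, url), 3), url)
```

A filter of this kind needs no page fetch at prediction time, which is what would make measurements over millions of URLs practical.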
TL;DR: The design and performance characteristics of a scalable machine learning classifier developed to detect phishing websites are described and this classifier is used to maintain Google’s phishing blacklist automatically.
Abstract: Phishing websites, fraudulent sites that impersonate a trusted third party to gain access to private data, continue to cost Internet users over a billion dollars each year. In this paper, we describe the design and performance characteristics of a scalable machine learning classifier we developed to detect phishing websites. We use this classifier to maintain Google’s phishing blacklist automatically. Our classifier analyzes millions of pages a day, examining the URL and the contents of a page to determine whether or not a page is phishing. Unlike previous work in this field, we train the classifier on a noisy dataset consisting of millions of samples from previously collected live classification data. Despite the noise in the training data, our classifier learns a robust model for identifying phishing pages which correctly classifies more than 90% of phishing pages several weeks after training concludes.
TL;DR: Although several anti-phishing tools and techniques exist for detecting potential phishing attempts in emails and phishing content on websites, phishers come up with new and hybrid techniques to circumvent the available software and techniques.
Abstract: Phishing is a form of identity theft that occurs when a malicious Web site impersonates a legitimate one in order to acquire sensitive information such as passwords, account details, or credit card numbers. Although several anti-phishing tools and techniques exist for detecting potential phishing attempts in emails and phishing content on websites, phishers come up with new and hybrid techniques to circumvent the available software and techniques.
TL;DR: Over a period of three weeks, the effectiveness of the blacklists maintained by Google and Microsoft was tested with 10,000 phishing URLs, and the existence of page properties that can be used to identify phishing pages was explored.
Abstract: Phishing is an electronic online identity theft in which the attackers use a combination of social engineering and web site spoofing techniques to trick a user into revealing confidential information. This information is typically used to make an illegal economic profit (e.g., by online banking transactions, purchase of goods using stolen credentials, etc.). Although simple, phishing attacks are remarkably effective. As a result, the numbers of successful phishing attacks have been continuously increasing and many anti-phishing solutions have been proposed. One popular and widely-deployed solution is the integration of blacklist-based anti-phishing techniques into browsers. However, it is currently unclear how effective such blacklisting approaches are in mitigating phishing attacks in real life. In this paper, we report our findings on analyzing the effectiveness of two popular anti-phishing solutions. Over a period of three weeks, we automatically tested the effectiveness of the blacklists maintained by Google and Microsoft with 10,000 phishing URLs. Furthermore, by analyzing a large number of phishing pages, we explored the existence of page properties that can be used to identify phishing pages.
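One way to summarize a blacklist test of this kind is to record, for each phishing URL, a series of timed lookups and then compute how many URLs were listed at the first check and how many were ever listed. The sketch below assumes a hypothetical record layout (`url -> [(hour, is_blacklisted), ...]`) and a hypothetical `summarize` helper; the abstract does not describe the actual test harness, and the values shown are illustrative, not results from the paper.

```python
def summarize(checks):
    """Summarize blacklist lookups.

    checks maps each URL to a list of (hour, is_blacklisted) tuples,
    sorted by hour, one tuple per lookup against the blacklist.
    """
    total = len(checks)
    if total == 0:
        return {"total": 0, "ever_listed": 0.0, "listed_at_first_check": 0.0}
    # URLs that appeared on the blacklist in at least one lookup.
    ever = sum(1 for results in checks.values() if any(hit for _, hit in results))
    # URLs that were already on the blacklist at the very first lookup.
    at_first = sum(1 for results in checks.values() if results and results[0][1])
    return {
        "total": total,
        "ever_listed": ever / total,
        "listed_at_first_check": at_first / total,
    }


# Illustrative lookups only; the URLs and outcomes are invented for the example.
EXAMPLE_CHECKS = {
    "http://phish-a.example/": [(0, True)],
    "http://phish-b.example/": [(0, False), (6, True)],
    "http://phish-c.example/": [(0, False), (6, False)],
    "http://phish-d.example/": [(0, True), (6, True)],
}

if __name__ == "__main__":
    print(summarize(EXAMPLE_CHECKS))
```

Separating "listed at first check" from "ever listed" distinguishes how quickly a blacklist reacts from how much it eventually covers.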
TL;DR: A phishing detection system with several notable properties: it requires very little training data, scales well to much larger test data, is language-independent, fast, resilient to adaptive attacks and implemented entirely on client-side.
Abstract: Phishing is a major problem on the Web. Despite the significant attention it has received over the years, there has been no definitive solution. While the state-of-the-art solutions have reasonably good performance, they require a large amount of training data and are not adept at detecting phishing attacks against new targets. In this paper, we begin with two core observations: (a) although phishers try to make a phishing webpage look similar to its target, they do not have unlimited freedom in structuring the phishing webpage, and (b) a webpage can be characterized by a small set of key terms, and how these key terms are used in different parts of a webpage differs between legitimate and phishing webpages. Based on these observations, we develop a phishing detection system with several notable properties: it requires very little training data, scales well to much larger test data, is language-independent, fast, resilient to adaptive attacks and implemented entirely on client-side. In addition, we developed a target identification component that can identify the target website that a phishing webpage is attempting to mimic. The target detection component is faster than previously reported systems and can help minimize false positives in our phishing detection system.
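Observation (b) can be illustrated by comparing how terms are distributed across parts of a page. The sketch below is an assumption-laden example rather than the paper's method: the choice of page parts, the `terms`, `distribution`, and `hellinger` helpers, and the use of Hellinger distance as the comparison measure are all picked here for illustration and are not specified in the abstract.

```python
import math
import re
from collections import Counter


def terms(text):
    """Split text into lowercase runs of 3+ letters (Unicode letters, not only ASCII)."""
    return re.findall(r"[^\W\d_]{3,}", text.lower())


def distribution(text):
    """Relative frequency of each term in the text."""
    counts = Counter(terms(text))
    total = sum(counts.values())
    return {t: c / total for t, c in counts.items()} if total else {}


def hellinger(p, q):
    """Hellinger distance between two term distributions.

    Returns 0 for identical distributions and 1 for distributions
    that share no terms.
    """
    keys = set(p) | set(q)
    squared = sum(
        (math.sqrt(p.get(k, 0.0)) - math.sqrt(q.get(k, 0.0))) ** 2 for k in keys
    )
    return math.sqrt(squared / 2.0)


if __name__ == "__main__":
    # Hypothetical page parts; the content is invented for the example.
    page = {
        "title": "Example Bank secure login",
        "body": "Please verify your Example Bank account to continue",
        "url": "http://login.update.example.test/bank/verify",
    }
    dists = {part: distribution(text) for part, text in page.items()}
    for a in dists:
        for b in dists:
            if a < b:
                print(a, b, round(hellinger(dists[a], dists[b]), 3))
```

Pairwise distances like these could serve as features for a classifier; because they depend only on whether the parts agree with each other, not on a vocabulary, they stay usable across languages.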