
AT&T Researchers Look at Spotting Malicious Domains Using Word Segmentation

Posted on the 16 June 2015 by Worldwide @thedomains

Two researchers from AT&T, Wei Wang and Kenneth E. Shirley, had a domain-related paper published on the Cornell University website. The paper looks at whether certain word patterns and data can be useful in spotting malicious domains.

From the paper:

In recent years, vulnerable hosts and maliciously registered domains have been frequently involved in mobile attacks. In this paper, we explore the feasibility of detecting malicious domains visited on a cellular network based solely on lexical characteristics of the domain names. In addition to using traditional quantitative features of domain names, we also use a word segmentation algorithm to segment the domain names into individual words to greatly expand the size of the feature set.

Experiments on a sample of real-world data from a large cellular network show that using word segmentation improves our ability to detect malicious domains relative to approaches without segmentation, as measured by misclassification rates and areas under the ROC curve. Furthermore, the results are interpretable, allowing one to discover (with little supervision or tuning required) which words are used most often to attract users to malicious domains. Such a lightweight approach could be performed in near-real time when a device attempts to visit a domain. This approach can complement (rather than substitute) other more expensive and time-consuming approaches to similar problems that use richer feature sets.
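To give a rough idea of what segmenting a domain name into words can look like, here is a minimal Python sketch. It is not the authors' algorithm, which is not reproduced in this post. It assumes a tiny made-up vocabulary (`WORD_FREQ`) with invented frequencies and uses a simple dynamic-programming search for the cheapest split. The names `word_cost`, `segment` and `domain_words` are hypothetical helpers introduced only for illustration.

```python
import math
from collections import Counter

# Toy vocabulary with invented frequencies (an assumption for illustration;
# a real system would use a large dictionary or corpus-derived counts).
WORD_FREQ = {
    "cheap": 50, "nike": 40, "shoes": 60, "outlet": 30, "payday": 20,
    "loan": 35, "loans": 25, "pay": 80, "day": 90, "sale": 45, "com": 70,
}
TOTAL = sum(WORD_FREQ.values())


def word_cost(chunk):
    """Cost of a chunk: negative log-probability for known words,
    a heavy length-based penalty for unknown strings."""
    if chunk in WORD_FREQ:
        return -math.log(WORD_FREQ[chunk] / TOTAL)
    return 5.0 + 10.0 * len(chunk)


def segment(label):
    """Split one domain label into the lowest-cost sequence of chunks."""
    n = len(label)
    best = [0.0] + [math.inf] * n  # best[i] = minimum cost of label[:i]
    back = [0] * (n + 1)           # back[i] = start index of the last chunk
    for i in range(1, n + 1):
        for j in range(i):
            cost = best[j] + word_cost(label[j:i])
            if cost < best[i]:
                best[i], back[i] = cost, j
    # Walk the back-pointers from the end to recover the chunks.
    words, i = [], n
    while i > 0:
        words.append(label[back[i]:i])
        i = back[i]
    return words[::-1]


def domain_words(domain):
    """Count the words found in every dot-separated label of a domain."""
    counts = Counter()
    for label in domain.lower().split("."):
        counts.update(segment(label))
    return counts


print(segment("cheapnikeshoes"))  # ['cheap', 'nike', 'shoes']
```

The word counts returned by `domain_words` are the kind of expanded feature set the abstract describes. How those features are then weighted and classified is up to whatever model is fitted on top of them.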

Among the largest 400 out of these 5327 coefficients (i.e. those most strongly associated with maliciousness) were several words that fell into groups of related words, which we manually labeled in the following list:

1) Brand names: rayban, oakley, nike, vuitton, hollister, timberland, tiffany, ugg
2) Shopping: dresses, outlet, sale, dress, offer, jackets, watches, deals
3) Finance: loan, fee, cash, payday, cheap
4) Sportswear: jerseys, kicks, cleats, shoes, sneaker
5) Basketball Player Names (associated with shoes): kobe, jordan, jordans, lebron
6) Medical/Pharmacy: medic, pills, meds, pill, pharmacy
7) Adult: webcams, cams, lover, sex, porno
8) URL spoof: com

Read the full paper on
