Learning from Authoritative Security Experiment Results
Web Adoption: An Attempt Toward Classifying Risky Internet Web Browsing Behavior
Alexander Kent, Los Alamos National Laboratory
Lorie Liebrock, New Mexico Institute of Mining and Technology
Joshua Neil, Los Alamos National Laboratory
Background. This paper explores how computer compromise events are associated with web browsing activity across a population of computers.
Aim. Our hypothesis was that a computer is more likely to be compromised than other computers when, in a given time period, it regularly browses to web sites before other computers visit the same sites (early adopters) or browses to web sites that no other computer visits (unique adopters).
Method. Web proxy data and associated computer-specific compromise events, covering more than 24,000 computers over a contiguous 6-month period, were used to group computers into various adopter categories and to compare potential compromise events between the groups.
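To make the adopter categories concrete, the following is a minimal sketch rather than the procedure used in the paper. It assumes proxy records have already been reduced to (timestamp, computer, site) tuples for one time period, and it treats a computer as an early adopter of a site if it is the first of several visitors and as a unique adopter if it is the only visitor. The function name `adopter_counts` and the record layout are hypothetical.

```python
from collections import defaultdict


def adopter_counts(visits):
    """Count early and unique adoptions per computer.

    visits: iterable of (timestamp, computer, site) tuples, all assumed
    to fall within a single time period (an assumption of this sketch).
    Returns {computer: {"early": n, "unique": n}}.
    """
    first_seen = {}               # site -> (timestamp, computer) of earliest visit
    visitors = defaultdict(set)   # site -> set of computers that visited it

    for ts, computer, site in visits:
        visitors[site].add(computer)
        if site not in first_seen or ts < first_seen[site][0]:
            first_seen[site] = (ts, computer)

    counts = defaultdict(lambda: {"early": 0, "unique": 0})
    for site, computers in visitors.items():
        if len(computers) == 1:
            # No other computer visited this site: a unique adoption.
            (only,) = computers
            counts[only]["unique"] += 1
        else:
            # Several visitors: credit the earliest one as the early adopter.
            counts[first_seen[site][1]]["early"] += 1

    return {computer: dict(c) for computer, c in counts.items()}
```

Computers could then be grouped by thresholds on these counts and the groups compared by their observed compromise rates. Any such thresholds would be a further assumption not stated here.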
Results. We found distinctions in some web browsing behavior, in some cases with the chance of compromise differing between certain adopter categories by 2.5-fold to over 418-fold. However, the study also showed no additional value in predicting compromise using these more complex adopter categories when compared to using simple unique web activity counts. As additional contributions, we have characterized several large, real-world data sets relevant to cyber defense and introduced a method for simplifying web URLs (client web requests) that reduces unwanted uniqueness from dynamic content while preserving key characteristics.
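The abstract does not describe the URL simplification method itself, so the sketch below only illustrates the general idea of reducing uniqueness caused by dynamic content. All of its rules are assumptions: it lowercases the host, replaces purely numeric or long hexadecimal path segments with a placeholder, and keeps query parameter names while dropping their values. The name `simplify_url` is hypothetical.

```python
import re
from urllib.parse import urlsplit, parse_qsl

# Assumed heuristic: a path segment that is all digits, or 16 or more hex
# characters, is treated as dynamic content (an ID, hash, or token).
_DYNAMIC_SEGMENT = re.compile(r"^(\d+|[0-9a-fA-F]{16,})$")


def simplify_url(url):
    """Return a simplified form of url that hides dynamic values."""
    parts = urlsplit(url)
    host = (parts.hostname or "").lower()

    # Replace dynamic-looking path segments with a "*" placeholder.
    segments = [
        "*" if _DYNAMIC_SEGMENT.match(segment) else segment
        for segment in parts.path.split("/")
    ]
    path = "/".join(segments)

    # Keep only the sorted, de-duplicated query parameter names.
    names = sorted({name for name, _ in parse_qsl(parts.query, keep_blank_values=True)})
    query = "?" + "&".join(names) if names else ""

    return f"{host}{path}{query}"
```

Under these assumed rules, two requests that differ only in an item number or session value map to the same simplified string, so they are no longer counted as distinct web activity.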
Conclusions. We found that a count of unique web visits over time has the same level of predictive power for potential compromise as the more complicated web adopter model. Both models predict better than chance, but the results also reinforce the idea that many factors beyond web browsing activity are associated with computer compromise events. Nonetheless, our adopter model may still have value in objective computer risk determination based on web browsing behavior.