
Personalization on the Net Using Web Mining: Introduction

The Internet has seen massive growth in systems that personalize the content delivered to individual users. The science behind personalization has changed tremendously in recent years, yet the basic goal of personalization systems remains the same: to provide users with what they want or need without requiring them to ask for it explicitly. Personalization is the provision to the individual of tailored products, services, or information, including information relating to products or services. It is a broad area that also covers recommender systems, customization, and adaptive Web sites.

Three aspects of a Web site affect its utility in providing the intended service to its users: the content provided on the Web site, the layout of the individual pages, and the structure of the entire Web site itself. The relevance of each of the objects comprising a Web page to the users’ needs will clearly affect their level of satisfaction. The structure of the Web site, defined by the links between its pages, restricts the user’s navigation to predefined paths and therefore determines how easily a user can reach relevant pages. However, the definition of relevance is subjective. It is here that a mismatch can arise between the Web site designer’s perception of what users need and the true needs of those users. Such a mismatch may have a major impact on the effectiveness of a Web site.


Personalization technology involves software that learns patterns, habits, and preferences. On the Internet, its use is primarily in systems that support e-business. Personalization works in this context because it helps users to find solutions, but perhaps more importantly, it also empowers e-business providers with the ability to measure the quality of that solution. In terms of the fast emerging area of Customer Relationship Management (CRM), personalization enables e-business providers to implement strategies to lock in existing customers, and to win new customers.

Daniel E. O’Leary from the University of Southern California coined the phrase "AI Renaissance" in 1997 (see footnote 1) to describe how artificial intelligence (AI) can make the Internet more usable. Personalization technology is part of that renaissance. In parallel with the academic progress covered in this special section, the commercial world is witness to unprecedented growth in personalization technology companies. It is sometimes difficult to find a commonality in technology foundation that spans the breadth of commercial product offerings and global academic efforts in personalization, as well as the broad cross section of emerging efforts in digital markets.

Initial attempts at achieving personalization on the Internet have been limited to check-box personalization, where portals allow users to select the links they would like on their "personal" page. However, this is of limited use because it depends on users knowing beforehand which content is of interest to them. Arguably, collaborative filtering was the first attempt at using AI to achieve personalization in a more intelligent manner. Collaborative filtering lets a user benefit from the recorded behavior of other users, weighted by a measure of similarity between them. These techniques require users to divulge some personal information about their interests, likes, and dislikes, information that many Web users would not necessarily wish to divulge. An alternative is observational personalization, which attempts to circumvent the need for users to divulge any personal information. The underlying assumption in this approach is that hidden within records of a user’s previous navigation behavior are clues to how services, products, and information need to be personalized for enhanced Web interaction.
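
The idea behind collaborative filtering can be made concrete with a minimal sketch. The one below assumes a user-based variant with cosine similarity, which is only one of several possible similarity measures; the user names, topics, and scores are hypothetical and are not drawn from any system discussed in this section.

```python
from math import sqrt

# Hypothetical explicit ratings (user -> {item: score}); illustrative only.
ratings = {
    "ann": {"news": 5, "sports": 1, "travel": 4},
    "bob": {"news": 4, "sports": 1, "travel": 5, "music": 4},
    "cat": {"news": 1, "sports": 5, "music": 2},
}


def cosine(a, b):
    """Cosine similarity between two sparse rating vectors."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b)


def recommend(user, ratings, k=1):
    """Rank items the user has not rated by similarity-weighted average score."""
    target = ratings[user]
    scores, weights = {}, {}
    for other, theirs in ratings.items():
        if other == user:
            continue
        sim = cosine(target, theirs)
        if sim <= 0:
            continue  # ignore users with nothing in common
        for item, score in theirs.items():
            if item not in target:
                scores[item] = scores.get(item, 0.0) + sim * score
                weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(((scores[i] / weights[i], i) for i in scores), reverse=True)
    return [item for _, item in ranked[:k]]


print(recommend("ann", ratings))
```

In this toy data, "ann" is far more similar to "bob" than to "cat", so bob's opinion of an unseen item dominates the prediction. The sketch also shows why the approach needs users to reveal their preferences: without the ratings table there is nothing to compare.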

In our view, there are three principal components to observational personalization: analytics, representation, and deployment. Web mining provides the tools to analyze Web log data in a user-centric manner, through techniques such as segmentation, profiling, and clickstream discovery. The knowledge mined with these tools is increasingly being represented using W3C standards such as XML, and it may be deployed on Web servers through personalization or recommender systems.

Perhaps for the first time since the beginning of O’Leary’s AI renaissance, research and commercial goals are identical. These goals are high performance, high quality analytics, flexible and novel representational schema, and real-time application or deployment of the represented knowledge on Web servers, WAP servers, and so forth.

This special section contains four articles chosen to reflect various aspects of personalization on the Net using Web mining. Spiliopoulou provides a rationale for why Web log data should be mined. The effectiveness of a Web site in providing users with the content they need in the most optimized manner is the key to retaining them. Traditional focus group based methods for evaluating the effectiveness of a Web site’s structure are expensive and difficult to implement because little is known about the underlying population of users of the site and their needs. Spiliopoulou describes a process by which mining for navigational patterns may be used to gain insight into a Web site’s usage and optimality with respect to its current user population.

Cingil, Dogac, and Azgin describe the need for interoperability when mining the Web and how the various W3C standards can be used to achieve personalization applications. In particular, the article looks at how the recent data exchange, metadata and privacy standards from the World Wide Web Consortium (W3C), namely, Extensible Markup Language (XML), Resource Description Framework (RDF), and Platform for Privacy Preferences (P3P) may be used to support personalization activities.

Mobasher, Cooley, and Srivastava provide a framework for mining Web log files to discover knowledge for the provision of recommendations to current users based on their browsing similarities with previous users. The process for discovering such knowledge includes gathering and preprocessing the data necessary for discovering user behaviors, application of data mining techniques to discover usage patterns, and aggregation and filtering of the data mining results in order to create decision rules for customizing Web site content based on an individual user’s behavior.
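
The three stages named above (preprocessing, pattern discovery, and recommendation) can be illustrated with a deliberately small sketch. It is not the authors' framework: it assumes a hypothetical log of (user, timestamp, page) records, a 30-minute inactivity timeout for splitting sessions, and simple page co-occurrence counts standing in for the data mining step.

```python
from collections import Counter, defaultdict
from itertools import combinations

SESSION_GAP = 30 * 60  # seconds of inactivity that end a session (assumed value)

# Hypothetical, already-cleaned log records: (user, timestamp in seconds, page).
log = [
    ("u1", 0, "/home"), ("u1", 60, "/laptops"), ("u1", 120, "/bags"),
    ("u2", 10, "/home"), ("u2", 90, "/laptops"), ("u2", 150, "/bags"),
    ("u3", 20, "/home"), ("u3", 100, "/phones"),
    ("u1", 7200, "/home"), ("u1", 7260, "/phones"),
]


def sessionize(log, gap=SESSION_GAP):
    """Preprocessing: split each user's visits into sessions by inactivity gap."""
    by_user = defaultdict(list)
    for user, ts, page in sorted(log, key=lambda r: (r[0], r[1])):
        by_user[user].append((ts, page))
    sessions = []
    for visits in by_user.values():
        current, last_ts = [], None
        for ts, page in visits:
            if last_ts is not None and ts - last_ts > gap:
                sessions.append(current)
                current = []
            current.append(page)
            last_ts = ts
        if current:
            sessions.append(current)
    return sessions


def co_occurrence(sessions):
    """Pattern discovery: count how often two pages share a session."""
    counts = Counter()
    for pages in sessions:
        for a, b in combinations(sorted(set(pages)), 2):
            counts[(a, b)] += 1
    return counts


def recommend(active_pages, counts, k=1):
    """Recommendation: suggest unseen pages that co-occur with the active session."""
    seen = set(active_pages)
    scores = Counter()
    for (a, b), n in counts.items():
        if a in seen and b not in seen:
            scores[b] += n
        elif b in seen and a not in seen:
            scores[a] += n
    return [page for page, _ in scores.most_common(k)]


sessions = sessionize(log)
print(recommend(["/home", "/laptops"], co_occurrence(sessions)))
```

Note that nothing here asks the visitor for personal information; the recommendation is derived entirely from observed navigation, which is the point of the observational approach. A production system would also need the data cleaning, user identification, and filtering steps that this sketch omits.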

Perkowitz and Etzioni address personalization as a process that adapts an Internet site through the automated generation of index pages for the Web site. The article explores adaptive Internet sites: sites that automatically improve their organization and presentation by learning from visitor access patterns. Adaptive Web sites mine the data buried in Web server logs to produce more easily navigable Web sites.

We hope you find these articles interesting and useful, and that they show the potential and diverse uses available for personalization systems. In February 1999, the Gartner Group stated that "matching direct or inferred reader requests through content personalization will be the most dramatic development in the Internet…through 2002, and will help differentiate the Web as a new medium." Clearly this is an important technology, and it is being applied in many Internet systems in use today. As digital markets converge, personalization will be ubiquitous in digital interactive devices, from handheld computers through mobile telephony devices to digital TV.

Little attention has been paid to date to evaluating the success of Web sites in terms of their utility to users. Server statistics produced by Web log analysis tools provide metrics for evaluating the success of the server in serving pages to users; however, they offer no insight into how useful the content was to the user. Evaluation of the effectiveness of a Web site must be based on the business goals of the e-business, and metrics derived from those goals would provide the basis for evaluating the success of personalization activities. Web Intelligence tools based on Web mining have an important role to play in the development of these e-metrics. ACM’s KDD conference in Boston (Aug. 20–23) will feature e-metrics and the role of Web mining in personalization.

The success of personalization on the Web depends on the ability of the personalization community to promote responsible use of the technology by e-businesses. ("Responsible," like "relevant," is a subjective term.) A recent survey (see footnote 2) of Web users suggests that over 70% find online solicitation from a Web site a hindrance rather than a help, and that over 50% are worried about their privacy (see footnote 3). How well the providers of personalized services tread the fine line between personalization and personal intrusion, and how well they control the hype around the technology, will determine the future of this field. Technologically, the scene is set, for the first time, for a mutually equitable relationship to develop between businesses and their customers.


    1. See O'Leary, D. The Internet, intranets, and the AI Renaissance. IEEE Computer, (Jan. 1997).

    2. See www.personalization.org.

    3. These are users who always read the privacy statement on a Web site before divulging any personal information.
