Researchers in India and Japan say they have discovered an automatic way to distinguish Web sites that express personal opinions from corporate marketing Web sites designed to trick users into thinking they are personal Web pages. In an upcoming issue of the International Journal of Business Intelligence and Data Mining, Takahiro Hayashi of Niigata University and colleagues describe their approach, which extracts subjective expressions from Web pages and scores them based on the degree of subjectivity. The researchers tested their system on 1,200 Web pages about products, tourist spots, restaurants, and movies, and found that their method was more effective at finding personal opinion pages than a general search engine.
The researchers say that finding genuine personal opinions is significantly harder than finding commercially biased sites because search engines tend to ignore personal home pages, personal blogs, Web forum sites, and smaller customer opinion sites. The system relies on the observation that marketers and advertisers tend not to report negative comments about a product or service, while personal opinion sites tend to contain both positive and negative comments.
Expressions with a negative meaning, sentence-final particles, interjections, and specific symbols can be extracted from a Web page and fed into the researchers' algorithm, which determines a weighted and categorized ratio of negative to positive expressions and provides an indicator of whether a page is commercial or personal.
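As a rough illustration of the idea, the following Python sketch counts a few categories of expressions in a page's text and combines them into a weighted ratio. It is not the researchers' algorithm: the word lists, the category weights, the smoothing term, the threshold, and the function names (`count_features`, `personal_score`, `classify`) are all assumptions made for this example, and English interjections stand in for features such as sentence-final particles.

```python
import re

# Hypothetical toy lexicons (assumptions for this sketch only).
NEGATIVE = {"bad", "poor", "disappointing", "broken", "worst", "noisy"}
POSITIVE = {"good", "great", "excellent", "best", "amazing", "comfortable"}
INTERJECTIONS = {"wow", "ugh", "oops", "hmm"}
SYMBOLS = ("!", "?", ":)", ":(")

# Assumed category weights; the published weighting is not given in this summary.
WEIGHTS = {"negative": 2.0, "interjection": 1.0, "symbol": 0.5, "positive": 1.0}


def count_features(text):
    """Count occurrences of each expression category in the text."""
    words = re.findall(r"[a-z']+", text.lower())
    return {
        "negative": sum(w in NEGATIVE for w in words),
        "positive": sum(w in POSITIVE for w in words),
        "interjection": sum(w in INTERJECTIONS for w in words),
        "symbol": sum(text.count(s) for s in SYMBOLS),
    }


def personal_score(text):
    """Weighted ratio of negative/subjective cues to positive expressions."""
    counts = count_features(text)
    subjective = sum(
        WEIGHTS[k] * counts[k] for k in ("negative", "interjection", "symbol")
    )
    # The +1.0 is an assumed smoothing term that avoids division by zero.
    return subjective / (WEIGHTS["positive"] * counts["positive"] + 1.0)


def classify(text, threshold=1.0):
    """Label a page 'personal' when its score reaches the assumed threshold."""
    return "personal" if personal_score(text) >= threshold else "commercial"


review = "Ugh, the battery is bad and the fan is noisy! The screen is good though."
ad = "The best laptop with an amazing screen and excellent battery life."
print(classify(review), classify(ad))
```

In this toy setup, a page that mixes complaints with praise scores higher than one containing only praise, which mirrors the intuition described above. A real system would need language-specific lexicons and weights tuned on labeled pages.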
From Inderscience Publishers
Abstracts Copyright © 2009 Information Inc., Bethesda, Maryland, USA