
    Web Page Categorization

    By Jiyan Wei

    v-Fluence’s benchmark research products identify and evaluate online environments to support our clients’ needs and enable effective and measurable actions.  A core component of this research involves establishing evaluation criteria and definitions that help reveal patterns and trends of information on the Web.  One key element is the categorization of individual Web pages identified as influencing a specific topic area.  Several models, both academic and commercial, exist for such categorizations.  To support public affairs, issues management and marketing online, we use the categorization system that is defined and justified in this paper.  Web-page-type information is enhanced with a range of other page- and site-specific evaluations and categorizations, such as stakeholder type and site features.  This paper focuses specifically on the page categorization definitions and their rationales.





    Why Web Page Categorization?

    According to a 2005 study conducted by Gulli & Signorini, the size of the indexable (1) Web is nearing 12 billion pages (2).  Major search engines provide coverage of 82 percent of the indexable Web.  For Internet users, the challenge of access to information has taken on an entirely new meaning in the Information Age.  That challenge is no longer caused by insufficient privileges or obstacles to the transmission of data, but by the sheer vastness of the data available to Web users.

    The conventional search engine process represents some of the most in-depth work done so far to categorize and organize Web pages.  Search engines focus predominantly on a process of semantic matching, in which the user’s search term is matched with a corresponding list of Web documents, ordered according to the search engine’s algorithm.  Search engines have played a major role in helping users locate and retrieve information on the Web, but the methodologies used by the major engines are still very much a work in progress.  Indeed, organizing and categorizing documents throughout the Web to facilitate efficient search and retrieval of information is one of the key issues in making the Web more useful and accessible to the general public.
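
    The matching process described above can be illustrated with a minimal sketch.  It is a toy model and not any search engine's actual algorithm: it assumes that "matching" means every search term appears in the page text, and that pages are ordered by how often those terms occur.  The function names and the sample pages are hypothetical.

```python
from collections import defaultdict


def build_index(pages):
    """Map each lowercase word to the set of URLs whose text contains it."""
    index = defaultdict(set)
    for url, text in pages.items():
        for word in text.lower().split():
            index[word].add(url)
    return index


def search(index, pages, query):
    """Return URLs containing every query term, most frequent matches first."""
    terms = query.lower().split()
    if not terms:
        return []
    # Keep only the pages that contain all of the search terms.
    matches = set.intersection(*(index.get(term, set()) for term in terms))

    def score(url):
        # Assumed ranking rule: total occurrences of the query terms.
        words = pages[url].lower().split()
        return sum(words.count(term) for term in terms)

    # Ties are broken alphabetically by URL so the order is deterministic.
    return sorted(matches, key=lambda url: (-score(url), url))


# Hypothetical sample data.
sample_pages = {
    "http://example.com/a": "web page categorization for research",
    "http://example.com/b": "search engines index the web and rank each web page",
    "http://example.org/c": "public relations news",
}
sample_index = build_index(sample_pages)
print(search(sample_index, sample_pages, "web page"))
```

    Note that nothing in this kind of term matching says what type of page each result is, which is the gap that page categorization addresses.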

    Categorization of Web page type is not offered through the search engine model, in part because no process has yet been developed that automates categorization while keeping it accurate.  Currently, categorization of Web page type remains a qualitative analytical process used mainly for highly focused research projects.


    ________________________________

    1 - An indexable Web page is one that can be crawled and indexed by a search engine spider.
    2 - A Web page (also referred to as a Web document) is an HTML document that is stored on a Web server.  This document can be viewed over the World Wide Web through a Web browser.  One key identifier of a distinct Web page is its Uniform Resource Locator (URL).  Aggregations of Web pages sharing a common domain are collectively referred to as a Web site.  
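
    As an illustration of the definitions in note 2, the following sketch groups page URLs into sites.  It assumes that the "common domain" is simply the host name in each URL, which is a simplification (for example, it treats www.example.com and example.com as different sites).  The function name and sample URLs are hypothetical.

```python
from collections import defaultdict
from urllib.parse import urlsplit


def group_pages_by_site(urls):
    """Group page URLs by host name, treating each host as one Web site."""
    sites = defaultdict(list)
    for url in urls:
        # urlsplit(...).hostname is lowercased, or None if the URL has no host.
        host = urlsplit(url).hostname or ""
        sites[host].append(url)
    return dict(sites)


# Hypothetical sample URLs.
sample_urls = [
    "http://example.com/articles/one.html",
    "http://example.com/articles/two.html",
    "http://example.org/index.html",
]
for site, site_pages in group_pages_by_site(sample_urls).items():
    print(site, len(site_pages))
```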

     




    Copyright © 2006 v-Fluence Interactive Public Relations