Series
GVU Technical Report Series

Series Type
Publication Series
Description
Associated Organization(s)
Organizational Unit

Publication Search Results

  • Item
    Life, Death, and Lawfulness on the Electronic Frontier
    (Georgia Institute of Technology, 1997) Pitkow, James Edward ; Pirolli, Peter
    To facilitate users' ability to make sense of large collections of hypertext we present two new techniques for inducing clusters of related documents on the World Wide Web. Users' ability to find relevant information might also be enhanced by finding lawful properties of document behavior and use. We present models and analyses of document use and change for the World Wide Web.
  • Item
    Surveying the Territory: GVU's Five WWW User Surveys
    (Georgia Institute of Technology, 1997) Kehoe, Colleen Mary ; Pitkow, James Edward
    Five years is not very long on most historical scales, but for the World Wide Web (WWW) it constitutes a lifetime. A question almost as old as the web itself is, "Who is using it, and for what?" One way to answer this question is to use paper surveys, telephone surveys, or diaries, which are some of the same methods used to measure the audiences of other one-way media such as television and radio. However, something interesting happened in early 1994: the implementation of HTML Forms turned the web into a two-way medium which made it possible to contact the audience directly. To test the viability of the web as a survey medium and collect preliminary data on the web population, the first GVU WWW User Survey was conducted in January 1994. Subsequent surveys have been conducted approximately every six months. The collection of responses from over 55,000 Web users over five surveys has given us a unique perspective on the advances in surveying technology and methodology and changes in the web population itself. In the following sections, we discuss what we have learned in each of these areas.
  • Item
    In Search of: Reliable Usage Data on the World Wide Web
    (Georgia Institute of Technology, 1997) Pitkow, James Edward
    The WWW is currently the hottest testbed for future interactive digital systems. While much is understood technically about how the WWW functions, substantially less is known about how this technology is used collectively and on an individual basis. This disparity of knowledge exists largely as a direct consequence of the decentralized nature of the Web. Since each user of the Web is not uniquely identifiable across the system and the system employs various levels of caching, measurement of actual usage is problematic. This paper establishes terminology to frame the problem of reliably determining usage of WWW resources while reviewing current practices and their shortcomings. A review of the various metrics and analyses that can be performed to determine usage is then presented. This is followed by a discussion of the strengths and weaknesses of the hit-metering proposal [Mogul and Leach 1997] currently under consideration by the HTTP working group. Lastly, new proposals based upon server-side sampling are introduced and assessed against the hit-metering proposal. It is argued that server-side sampling provides more reliable and useful usage data while requiring no change to the current HTTP protocol and enhancing user privacy.
  • Item
    Federal Trade Commission Public Workshop on Consumer Information Privacy : Supplemental Comments for Project Number: P954807 Document Number: 18
    (Georgia Institute of Technology, 1997) Pitkow, James Edward ; Kehoe, Colleen Mary
    The following document contains supplemental comments made to the Federal Trade Commission Workshop on Consumer Information Privacy held June 10 - June 13, 1997. The supplemental filing contains results from GVU's Seventh WWW User Survey.
  • Item
    Characterizing World Wide Web Ecologies
    (Georgia Institute of Technology, 1997) Pitkow, James Edward
    One of the fastest growing sources of information today is the World Wide Web (WWW), having grown from only fifty sources of information in January of 1993 to over a half million four years later. The exponential growth of information within the Web has created an overabundance of information and a poverty of human attention, with users citing the inability to navigate and find relevant information on the Web as one of the biggest problems facing the Web today. The primary goal of the research presented here is to put forth new techniques and models that can be used to help efficiently manage people's attentional processes when dealing with large, unstructured, heterogeneous information environments. The primary model is based upon the desirability of items on the Web. This research searches for lawful patterns of structure, content, and use. Methods are developed to exploit these patterns to organize and optimize users' information foraging and sense-making activities. These enhancements rely on prediction, categorization, and allocation of attention. Several methods are explored for inducing categorical structures for the WWW. Some of these enhancements involve clustering in a high-dimensional space of content, use, and structural features. Others derive from cocitation analysis methods used in the study of scientific communities. A user would also be aided by retrieval mechanisms that predicted and returned the most likely needed WWW pages, given that the user is attending to some given page(s). The approach of this research uses a spreading activation mechanism to predict the needed, relevant information, computed using past usage patterns, degree of shared content, and WWW hyperlink structure.
  • Item
    Federal Trade Commission Workshop on Consumer Information Privacy: Consumer Privacy 1997 - request to participate, P954807
    (Georgia Institute of Technology, 1997) Pitkow, James Edward ; Kehoe, Colleen Mary
    The following document contains the initial comments made to the Federal Trade Commission Workshop on Consumer Information Privacy held June 10 - June 13, 1997. The initial filing contains results from GVU's Sixth WWW User Survey.
  • Item
    Supporting the Web: A Distributed Hyperlink Database System
    (Georgia Institute of Technology, 1996) Pitkow, James Edward ; Jones, R. Kipp
    In our last paper [Pitkow & Jones 1995], we presented an integrated scheme for an intelligent publishing environment that included a locally maintained hyperlink database. This paper takes our previous work full cycle by extending the scope of the hyperlink database to include the entire Web. While the notion of hyperlink databases has been around since the beginnings of hypertext, the Web provides the opportunity to experiment with the largest open distributed hypertext system. The addition of hyperlink databases to the Web infrastructure positively impacts several areas including: referential integrity, link maintenance, navigation and visualization. This paper presents an architecture and migration path for the deployment of a scalable hyperlink database server called Atlas. Atlas is designed to be scalable, autonomous, and weakly consistent. After introducing the concept and utility of link databases, this paper discusses the Atlas architecture and functionality. We conclude with a discussion of subscriber and publisher policies that exploit the underlying hyperlink infrastructure and intelligent publishing environments.
  • Item
    Emerging Trends in the WWW User Population
    (Georgia Institute of Technology, 1996) Pitkow, James Edward ; Kehoe, Colleen Mary
    Vast amounts of attention and resources have recently been devoted towards the World Wide Web (WWW) [Berners-Lee 94], but relatively little research has been conducted that examines Web usage and societal implications. With the goals of understanding the Web user population and promoting the Web as a viable surveying medium, GVU's WWW User Surveys were initially conducted during January 1994. Subsequent surveys were administered approximately every six months thereafter. The surveys employ non-random sampling techniques, which limit the ability of the results to generalize to the entire Web population. Each survey is conducted using the limited interactivity of the Web, where users point and click on responses within their Web browsers and submit results to a centralized server for processing. Each survey is conducted for a one-month period. This paper examines the emerging trends of the WWW user population.
  • Item
    Silk from a Sow's Ear: Extracting Usable Structures from the Web
    (Georgia Institute of Technology, 1996) Pirolli, Peter ; Pitkow, James Edward ; Rao, Ramana
    In its current implementation, the World-Wide Web lacks much of the explicit structure and strong typing found in many closed hypertext systems. While this property probably relates to the explosive acceptance of the Web, it further complicates the already difficult problem of identifying usable structures and aggregates in large hypertext collections. These reduced structures, or localities, form the basis for simplifying visualizations of and navigation through complex hypertext systems. Much of the previous research into identifying aggregates utilizes graph theoretic algorithms based upon structural topology, i.e., the linkages between items. Other research has focused on content analysis to form document collections. This paper presents our exploration into techniques that utilize both the topology and textual similarity between items as well as usage data collected by servers and page meta-information like title and size. Linear equations and spreading activation models are employed to arrange Web pages based upon functional categories, node types, and relevancy.
  • Item
    Results from the Third WWW User Survey
    (Georgia Institute of Technology, 1996) Pitkow, James Edward ; Kehoe, Colleen Mary
    The tremendous success of the World Wide Web has led to an ever-increasing user base. Intuitively, one would expect this base to change over time as more people from different segments of the population become Web users and advocates. What exactly have these changes been? How do the original Web users differ from the new users from major online service providers like Prodigy? What trends exist and what picture do they paint for the future of the Web user population? This paper, drawing on results from three User Surveys spanning over a year and a half, attempts to answer these and other questions about who is using the Web and why. Additionally, a review of the methodology, questionnaires, and new architectural enhancements is presented. Although the surveys lack the scientific rigor of controlled and accepted methods of surveying, we discuss analyses that help us understand the limitations and process of this new type of surveying. Finally, new quantitative analysis techniques are presented based upon post-hoc log file analysis, yielding guidelines for Web-based survey design.