Navathe, Shamkant B.
ArchiveSpace Name Record
Publication Search Results
Now showing 1 - 10 of 14
A methodology for application design using active database technology (Georgia Institute of Technology, 1993)
Navathe, Shamkant B.; Georgia Institute of Technology. Office of Sponsored Programs; Georgia Institute of Technology. College of Computing
A Mathematical Optimization Approach To Improve Server Scalability In Intermittently Synchronized Databases (Georgia Institute of Technology, 1999)
Yee, Wai Gen; Navathe, Shamkant B.; Datta, Anindya; Mitra, Sabyasachi
This paper addresses a scalability problem in the process of synchronizing the states of multiple client databases that have only deferred access to the server. It turns out that the process of client update file generation is not scalable with the number of clients served. In this paper we concentrate on developing an optimization model that addresses the scalability problem at the server by aiming for an optimal grouping of data fragments, given the "interest sets" of the clients - the set of fragments each client deals with for its "local" processing. The objective is to minimize the total cost of server operation, which includes processing updates from all clients and the cost of transmitting the right set of updates to each client based on its interest set. An integer programming formulation is developed and solved on an illustrative problem, yielding interesting results.
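The abstract above does not give the integer programming formulation itself, but the cost trade-off it describes (one update file per group versus each client downloading every fragment in the groups it touches) can be sketched with a toy cost model and an exhaustive search over groupings. All cost weights and names here are illustrative assumptions, not the paper's actual model:

```python
def server_cost(groups, interests, file_cost=2.0, send_cost=1.0):
    # One update file is generated per group (file_cost), and the group's
    # whole file is sent to every client whose interest set overlaps it
    # (send_cost per fragment). Cost weights are illustrative only.
    cost = 0.0
    for g in groups:
        cost += file_cost
        n_subscribers = sum(1 for interest in interests if g & interest)
        cost += send_cost * len(g) * n_subscribers
    return cost

def partitions(items):
    # Yield all set partitions of items (exponential; toy scale only).
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for smaller in partitions(rest):
        for i in range(len(smaller)):
            yield smaller[:i] + [smaller[i] | {first}] + smaller[i + 1:]
        yield smaller + [frozenset({first})]

def best_grouping(fragments, interests):
    # Exhaustively search all groupings and keep the cheapest; a real
    # formulation would solve this as an integer program instead.
    return min(partitions(list(fragments)),
               key=lambda parts: server_cost(parts, interests))
```

With two clients whose interest sets overlap heavily, a single shared group can beat per-fragment groups: fewer update files outweigh the few irrelevant fragments each client downloads.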
Toward A Method of Grouping Server Data Fragments for Improving Scalability in Intermittently Synchronized Databases (Georgia Institute of Technology, 1999)
Yee, Wai Gen; Donahoo, Michael J.; Navathe, Shamkant B.
We consider the class of mobile computing applications with periodically connected clients. These clients wish to share data; however, due to the expense of mobile communication, they only connect periodically -- and not necessarily synchronously -- to a common network. Traditionally, a continuously-connected server, containing an aggregate of client data, facilitates sharing amongst clients by allowing the clients to upload local updates and download updates submitted by other clients. The server computes and transmits these updates on a client-by-client basis; consequently, the complexity of these operations is on the order of the number of clients, limiting scalability. Recent research proposes exploiting client data overlap by grouping updates according to how the data is shared amongst clients (data-centric) instead of on a client-by-client basis (client-centric). Each client downloads updates for the relevant set of groups. By grouping, update operation distribution is computed only once per group, irrespective of the number of clients downloading a particular group's updates. Additionally, we may gain bandwidth scalability by employing broadcast delivery since, unlike the case in the per-client approach, multiple clients may be interested in a group's updates. Clearly, group composition directly affects the scalability of this approach. Given a relative cost of resources such as server processing, bandwidth, and storage space, we focus on developing a group derivation approach that significantly improves the scalability of the resources. We construct a formal specification of this problem and discuss the intractability of an optimal solution.
Based on observations from the specification, we derive a heuristic-based approach and evaluate its efficacy with respect to the client-centric approach. Experiments on an implemented system demonstrate that as the overlap between client subscriptions increases, the data-centric approach, with groups generated by our heuristic-based algorithm, yields significant cost reductions compared to the traditional client-centric approach.
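The simplest instance of the data-centric grouping the abstract describes is to group fragments wanted by exactly the same set of clients: every subscriber then wants the whole group, no client downloads irrelevant data, and each group's update file is generated once regardless of how many clients download it. A minimal sketch (names and data shapes are assumptions, not the paper's heuristic):

```python
def group_by_subscribers(fragments, interests):
    # interests: dict mapping client id -> set of fragment ids it subscribes to.
    # Fragments with identical subscriber sets are packaged into one group.
    groups = {}
    for f in fragments:
        subs = frozenset(c for c, s in interests.items() if f in s)
        groups.setdefault(subs, set()).add(f)
    return list(groups.values())
```

If client c1 subscribes to {a, b, c} and c2 to {b, c}, fragments b and c share the subscriber set {c1, c2} and form one group, while a sits alone; the paper's heuristic additionally trades off group count against download precision.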
Modeling of database constraints in active databases (Georgia Institute of Technology, 1993)
Navathe, Shamkant B.; Georgia Institute of Technology. Office of Sponsored Programs; Georgia Institute of Technology. College of Computing
An Efficient Algorithm for Mining Association Rules in Large Databases (Georgia Institute of Technology, 1995)
Omiecinski, Edward; Navathe, Shamkant B.; Savasere, Ashok
Mining for association rules between items in a large database of sales transactions has been described as an important database mining problem. In this paper we present an efficient algorithm for mining association rules that is fundamentally different from known algorithms. Compared to the previous algorithms, our algorithm reduces both CPU and I/O overheads. In our experimental study it was found that for large databases, the CPU overhead was reduced by as much as a factor of seven and I/O was reduced by almost an order of magnitude. Hence this algorithm is especially suitable for very large databases. The algorithm is also ideally suited for parallelization. We have performed extensive experiments and compared the performance of the algorithm with one of the best existing algorithms.
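The abstract does not spell out how the I/O reduction is achieved, but one well-known way to mine frequent itemsets in two database scans is to partition the transactions, find locally frequent itemsets per partition, then count only those candidates globally: any globally frequent itemset must be locally frequent in at least one partition, so nothing is missed. A toy sketch of that idea (illustrative, not necessarily the paper's exact algorithm; it enumerates all subsets per transaction, so it is only workable for tiny transactions):

```python
from itertools import combinations

def frequent_itemsets(transactions, min_support, n_partitions=2):
    # min_support is an absolute count over the whole database.
    n = len(transactions)
    size = (n + n_partitions - 1) // n_partitions
    candidates = set()
    # Pass 1: count itemsets locally within each partition.
    for start in range(0, n, size):
        part = transactions[start:start + size]
        counts = {}
        for t in part:
            for k in range(1, len(t) + 1):
                for itemset in combinations(sorted(t), k):
                    counts[itemset] = counts.get(itemset, 0) + 1
        local_min = min_support * len(part) / n  # threshold scaled to partition
        candidates |= {s for s, c in counts.items() if c >= local_min}
    # Pass 2: count only the surviving candidates over the full database.
    global_counts = {s: 0 for s in candidates}
    for t in transactions:
        ts = set(t)
        for s in candidates:
            if set(s) <= ts:
                global_counts[s] += 1
    return {s for s, c in global_counts.items() if c >= min_support}
```

Each pass reads the database once, which is where the order-of-magnitude I/O saving over repeated-scan algorithms comes from.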
Adaptive and Automated Index Selection in Relational DBMS (Georgia Institute of Technology, 1994)
Omiecinski, Edward; Navathe, Shamkant B.; Frank, Martin Robert
We present a novel approach for a tool that assists the database administrator in designing an index configuration for a relational database system. A new methodology for collecting usage statistics at run time is developed which lets the optimizer estimate query execution costs for alternative index configurations. Defining the workload specification required by existing index design tools may be very complex for a large integrated database system. Our tool automatically derives the workload statistics. These statistics are then used to efficiently compute an index configuration. Execution of a prototype of the tool against a sample database demonstrates that the proposed index configuration is reasonably close to the optimum for test query sets.
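The workflow the abstract outlines, collecting workload statistics and then computing an index configuration from estimated query costs, can be illustrated with a greedy selection over a deliberately crude cost model. Everything below (the log-cost lookup model, the single-column indexes, the function names) is an assumption for illustration, not the paper's tool:

```python
import math

def estimate_query_cost(query_cols, indexes, table_rows):
    # Toy cost model: an index on any queried column turns a full scan
    # into an indexed lookup.
    if any(col in indexes for col in query_cols):
        return math.log2(table_rows)
    return table_rows

def select_indexes(workload, candidate_cols, table_rows, max_indexes):
    # workload: list of (query_cols, frequency) pairs gathered at run time.
    # Greedily add the candidate index with the largest estimated saving;
    # a real tool would also model index maintenance and composite indexes.
    chosen = set()

    def total_cost(idx):
        return sum(freq * estimate_query_cost(cols, idx, table_rows)
                   for cols, freq in workload)

    while len(chosen) < max_indexes:
        best_col, best_cost = None, total_cost(chosen)
        for col in candidate_cols - chosen:
            c = total_cost(chosen | {col})
            if c < best_cost:
                best_col, best_cost = col, c
        if best_col is None:
            break
        chosen.add(best_col)
    return chosen
```

Given a workload dominated by queries on one column, the greedy pass picks the index for that column first, mirroring how run-time statistics steer the configuration toward the actual workload rather than a hand-written specification.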
A Clustering Algorithm to Discover Low and High Density Hyper-Rectangles in Subspaces of Multidimensional Data (Georgia Institute of Technology, 1999)
Omiecinski, Edward; Navathe, Shamkant B.; Ezquerra, Norberto F.; Ordoñez, Carlos
This paper presents a clustering algorithm to discover low and high density regions in subspaces of multidimensional data for Data Mining applications. High density regions generally refer to typical cases, whereas low density regions indicate infrequent and thus rare cases. For typical applications there is a large number of low density regions, and only a few of these are interesting. Regions are considered interesting when they have a minimum "volume" and involve some maximum number of dimensions. Our algorithm discovers high density regions (clusters) and low density regions (outliers, negative clusters, holes, empty regions) at the same time. In particular, our algorithm can find empty regions; that is, regions having no data points. The proposed algorithm is fast and simple. There is a large variety of applications in medicine, marketing, astronomy, finance, etc., where interesting and exceptional cases correspond to the low and high density regions discovered by our algorithm.
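The basic mechanics of finding high-density, low-density, and empty hyper-rectangles can be sketched with an equal-width grid: bucket the points into cells and label each cell by its count. This is an illustrative sketch of the general technique, with assumed thresholds and names, not the paper's algorithm (which works in subspaces and with interestingness constraints):

```python
from itertools import product

def grid_density(points, bins, lo_threshold, hi_threshold):
    # Bucket points into an equal-width grid, then label each cell as
    # 'high' (cluster), 'low' (rare cases), 'empty' (no points), or 'normal'.
    dims = len(points[0])
    mins = [min(p[d] for p in points) for d in range(dims)]
    maxs = [max(p[d] for p in points) for d in range(dims)]

    def cell_of(p):
        idx = []
        for d in range(dims):
            width = (maxs[d] - mins[d]) / bins or 1.0  # guard zero-width dims
            idx.append(min(int((p[d] - mins[d]) / width), bins - 1))
        return tuple(idx)

    counts = {}
    for p in points:
        counts[cell_of(p)] = counts.get(cell_of(p), 0) + 1
    labels = {}
    for cell in product(range(bins), repeat=dims):
        c = counts.get(cell, 0)
        if c == 0:
            labels[cell] = "empty"
        elif c >= hi_threshold:
            labels[cell] = "high"
        elif c <= lo_threshold:
            labels[cell] = "low"
        else:
            labels[cell] = "normal"
    return labels
```

Empty cells fall out of the enumeration for free, which is the noteworthy point: regions with no data points are discovered explicitly rather than merely being absent from the output.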
A knowledge-based approach to integrating and querying distributed heterogeneous information systems (Georgia Institute of Technology, 1995)
Navathe, Shamkant B.; Georgia Institute of Technology. Office of Sponsored Programs; Georgia Institute of Technology. College of Computing
The Impact Of Data Placement Strategies On Reorganization Costs In Parallel Databases (Georgia Institute of Technology, 1995)
Omiecinski, Edward; Navathe, Shamkant B.; Achyutuni, Kiran Jyotsna
In this paper, we study the data placement problem from a reorganization point of view. Effective placement of the declustered fragments of a relation is crucial to the performance of parallel database systems having multiple disks. Given the dynamic nature of database systems, the optimal placement of fragments will change over time and this will necessitate a reorganization in order to maintain the performance of the database system at acceptable levels. This study shows that the choice of a data placement strategy can have a significant impact on the reorganization costs. Up until now, data placement heuristics were designed with the principal purpose of balancing the load. However, this paper shows that such a policy can be beneficial only in the short term. Long term database designs should take reorganization costs into consideration while making design choices.
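The tension the abstract identifies, load balance versus reorganization cost, shows up even in a toy comparison of two placement strategies: count how many fragments must move when a disk is added. The strategies and the move-count cost proxy below are assumptions for illustration, not the paper's model:

```python
def place_hash(fragments, n_disks):
    # Mod-hash placement: balances load well, but a change in disk count
    # remaps most fragments.
    return {f: f % n_disks for f in fragments}

def place_range(fragments, n_disks):
    # Range placement: each disk holds a contiguous run of fragment ids,
    # so resizing only shifts the range boundaries.
    n = len(fragments)
    return {f: f * n_disks // n for f in fragments}

def reorganization_cost(fragments, placement, old_disks, new_disks):
    # Crude reorganization cost: number of fragments that change disks.
    before = placement(fragments, old_disks)
    after = placement(fragments, new_disks)
    return sum(1 for f in fragments if before[f] != after[f])
```

Growing from 3 to 4 disks over 12 fragments, mod-hash placement moves 9 fragments while range placement moves 6: both balance load perfectly before and after, yet their reorganization costs differ, which is the paper's point that load balance alone is a short-term criterion.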
Towards Transactional Data Management over the Cloud (Georgia Institute of Technology, 2010)
Tiwari, Rohan G.; Navathe, Shamkant B.; Kulkarni, Gaurav J.; Georgia Institute of Technology. College of Computing; Georgia Institute of Technology. Database Research Group
We propose a consistency model for a data store in the Cloud and work towards the goal of deploying Database as a Service over the Cloud. This includes consistency across the data partitions and consistency of any replicas that exist across different nodes in the system. We target applications which need stronger consistency guarantees than the applications currently supported by the data stores on the Cloud. We propose a cost-effective algorithm that ensures distributed consistency of data without compromising availability for fully replicated data. This paper describes a design in progress, presents the consistency and recovery algorithms for relational data, highlights the guarantees provided by the system, and presents future research challenges. We believe that the current notions of consistency for databases might not be applicable over the Cloud, and a new formulation of the consistency concept may be needed, keeping in mind the application classes we aim to support.
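One standard way to get strong consistency over fully replicated data without requiring every replica to be reachable is majority-quorum replication: writes and reads each contact a majority of replicas, and because any two majorities intersect, a read always observes the latest committed write. The sketch below illustrates that general technique only; it is not the consistency protocol proposed in the paper, and all names are assumptions:

```python
class QuorumStore:
    # Toy majority-quorum replication over a fully replicated key-value store.
    def __init__(self, n_replicas):
        self.replicas = [dict() for _ in range(n_replicas)]  # key -> (version, value)
        self.majority = n_replicas // 2 + 1

    def write(self, key, value, reachable):
        quorum = sorted(reachable)[:self.majority]
        if len(quorum) < self.majority:
            raise RuntimeError("not enough replicas for a write quorum")
        # Next version = 1 + highest version in the quorum; majority
        # intersection guarantees this sees the latest committed version.
        version = 1 + max(self.replicas[r].get(key, (0, None))[0] for r in quorum)
        for r in quorum:
            self.replicas[r][key] = (version, value)

    def read(self, key, reachable):
        quorum = sorted(reachable)[:self.majority]
        if len(quorum) < self.majority:
            raise RuntimeError("not enough replicas for a read quorum")
        # Return the value carrying the highest version seen in the quorum.
        version, value = max(self.replicas[r].get(key, (0, None)) for r in quorum)
        return value
```

A write that reached replicas {0, 1} is still visible to a later read that can only contact {1, 2}, since the two majorities share replica 1: availability survives any single-node outage while reads stay consistent.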