Organizational Unit:
H. Milton Stewart School of Industrial and Systems Engineering


Publication Search Results

  • Item
    Sequential estimation in statistics and steady-state simulation
    (Georgia Institute of Technology, 2014-04-09) Tang, Peng
    At the onset of the "Big Data" age, we are faced with ubiquitous data in various forms and with various characteristics, such as noise, high dimensionality, and autocorrelation. The question of how to obtain accurate and computationally efficient estimates from such data has stoked the interest of many researchers. This dissertation mainly concentrates on two general problem areas: inference for high-dimensional and noisy data, and estimation of the steady-state mean for univariate data generated by computer simulation experiments. We develop and evaluate three separate sequential algorithms for the two topics. One major advantage of sequential algorithms is that they allow for careful experimental adjustments as sampling proceeds. Unlike one-step sampling plans, sequential algorithms adapt to different situations arising from the ongoing sampling; this makes these procedures effective as problems become more complicated and more-delicate requirements need to be satisfied. We elaborate on each research topic below.
    Concerning the first topic, our goal is to develop a robust graphical model for noisy data in a high-dimensional setting. Under a Gaussian distributional assumption, the estimation of undirected Gaussian graphs is equivalent to the estimation of inverse covariance matrices. Particular interest has focused on estimating a sparse inverse covariance matrix, which, in keeping with the principle of parsimony, reveals insight into the data. For estimation with high-dimensional data, the influence of anomalous observations becomes more severe as the dimensionality increases. To address this problem, we propose a robust estimation procedure for the Gaussian graphical model based on the Integrated Squared Error (ISE) criterion. Robustness is obtained by using ISE as a nonparametric criterion for seeking the largest portion of the data that "matches" the model, and an l₁-type regularization is applied to encourage sparse estimation. To address the non-convexity of the objective function, we develop a sequential algorithm in the spirit of a majorization-minimization scheme. We summarize the results of Monte Carlo experiments supporting the conclusion that our estimator of the inverse covariance matrix converges weakly (i.e., in probability) to the true matrix as the sample size grows large. The performance of the proposed method is compared with that of several existing approaches through numerical simulations, and we further demonstrate its strength with applications in genetic network inference and financial portfolio optimization.
    The second topic consists of two parts, both concerning the computation of point and confidence interval (CI) estimators for the mean µ of a stationary discrete-time univariate stochastic process X ≡ {X_i : i = 1, 2, ...} generated by a simulation experiment. Point estimation is relatively easy when the underlying system starts in steady state, but the traditional way of calculating CIs usually fails because simulation output data are typically serially correlated. We propose two distinct sequential procedures, each of which yields a CI for µ with user-specified reliability and absolute or relative precision.
    The first sequential procedure is based on variance estimators computed from standardized time series applied to nonoverlapping batches of observations; it is notable for its simplicity relative to methods based on batch means and for its ability to deliver CIs for the variance parameter of the output process (i.e., the sum of covariances at all lags). The second procedure is the first sequential algorithm that uses overlapping variance estimators based on standardized time series to construct asymptotically valid CI estimators for the steady-state mean. Its advantage is that, compared with other popular procedures for steady-state simulation analysis, it yields significant reductions both in the variability of its CI estimator and in the sample size needed to satisfy the precision requirement. The effectiveness of both procedures is evaluated via comparisons with state-of-the-art methods based on batch means under a series of experimental settings: the M/M/1 waiting-time process with 90% traffic intensity; the M/H_2/1 waiting-time process with 80% traffic intensity; the M/M/1/LIFO waiting-time process with 80% traffic intensity; and an AR(1)-to-Pareto (ARTOP) process. We find that the new procedures perform comparatively well in terms of their average required sample sizes as well as the coverage and average half-lengths of their delivered CIs.
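    As a rough, non-sequential illustration of the building block underlying the first procedure, the sketch below computes the constant-weight standardized-time-series area estimator of the variance parameter on nonoverlapping batches and uses its average to form a fixed-sample-size CI for the steady-state mean. It is written in Python with NumPy/SciPy; the function names, the batch count, and the AR(1) test data are illustrative assumptions and do not reproduce the stopping rules or weights developed in the dissertation.

      import numpy as np
      from scipy import stats

      def sts_area_estimator(batch):
          """Constant-weight STS area estimator of the variance parameter
          sigma^2 (the sum of covariances at all lags) for one batch of size m:
          A = (12 / m^3) * [ sum_{j=1}^{m} j * (xbar_m - xbar_j) ]^2,
          where xbar_j is the mean of the first j observations."""
          m = len(batch)
          running_means = np.cumsum(batch) / np.arange(1, m + 1)
          j = np.arange(1, m + 1)
          area = np.sum(j * (running_means[-1] - running_means))
          return 12.0 * area**2 / m**3

      def sts_batch_ci(data, num_batches=32, alpha=0.05):
          """Fixed-sample CI for the steady-state mean using the average of the
          STS area estimators over b nonoverlapping batches (b degrees of freedom)."""
          b = num_batches
          m = len(data) // b                        # batch size
          data = np.asarray(data[: b * m])
          var_param = np.mean([sts_area_estimator(x) for x in data.reshape(b, m)])
          half_width = stats.t.ppf(1 - alpha / 2, b) * np.sqrt(var_param / len(data))
          return data.mean(), half_width

      # Illustrative data only: an AR(1) process with steady-state mean 0.
      rng = np.random.default_rng(0)
      phi, n = 0.7, 2**16
      x = np.zeros(n)
      for i in range(1, n):
          x[i] = phi * x[i - 1] + rng.standard_normal()
      print(sts_batch_ci(x))      # point estimate and CI half-width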
  • Item
    Variance parameter estimation methods with re-use of data
    (Georgia Institute of Technology, 2008-08-25) Meterelliyoz Kuyzu, Melike
    This dissertation studies three classes of estimators for the asymptotic variance parameter of a stationary stochastic process. All of the estimators are based on the concept of data "re-use," and all transform the output process into functions of an approximate Brownian motion process. The first class of estimators consists of folded standardized time series area and Cramér-von Mises (CvM) estimators. Detailed expressions are obtained for their expectations at folding levels 0 and 1; those expressions explain the puzzling increase in small-sample bias as the folding level increases. In addition, we use batching and linear combinations of estimators from different folding levels to produce estimators with significantly smaller variance. Finally, we obtain very accurate approximations of the limiting distributions of batched folded estimators; these approximations are used to compute confidence intervals for the mean and variance parameter of the underlying stochastic process.
    The second class consists of folded overlapping area estimators, computed by averaging folded versions of the standardized time series corresponding to overlapping batches. We establish the limiting distributions of the proposed estimators as the sample size tends to infinity and obtain statistical properties such as their bias and variance. Further, we find approximate confidence intervals for the mean and variance parameter of the process by approximating the theoretical distributions of the proposed estimators. In addition, we develop algorithms to compute these estimators with only order-of-sample-size work.
    The third class consists of reflected area and CvM estimators, computed from reflections of the original sample path. We obtain the expected values and variances of the individual estimators. We show that it is possible to obtain linear combinations of reflected estimators with smaller variance than that of each constituent estimator, often at no cost in bias; a quadratic optimization problem is solved to find the linear combination that minimizes the variance of the combined estimator. For all classes of estimators, we provide Monte Carlo examples to show that the estimators perform as well in practice as advertised by the theory.
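    The linear-combination device used for both the folded and the reflected estimators can be made concrete with a small amount of linear algebra: for estimators that share (approximately) the same expectation, the weights minimizing the variance of the combination, subject to summing to one, have the closed form w* = Σ⁻¹1 / (1ᵀΣ⁻¹1). The sketch below, in Python with NumPy, solves this small quadratic program; the covariance matrix shown is hypothetical and merely stands in for the covariances of, say, estimators at folding levels 0, 1, and 2.

      import numpy as np

      def min_variance_weights(cov):
          """Weights w minimizing Var(w'V) = w' cov w subject to sum(w) = 1,
          where V is a vector of (approximately) unbiased estimators of the
          same quantity.  Closed form: w* = cov^{-1} 1 / (1' cov^{-1} 1)."""
          ones = np.ones(cov.shape[0])
          solved = np.linalg.solve(cov, ones)
          return solved / ones.dot(solved)

      # Hypothetical covariance matrix of three variance estimators
      # (illustrative numbers, not taken from the dissertation).
      cov = np.array([[1.00, 0.30, 0.10],
                      [0.30, 0.80, 0.25],
                      [0.10, 0.25, 0.90]])
      w = min_variance_weights(cov)
      print(w, w @ cov @ w)   # combined variance is no larger than any single estimator's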
  • Item
    Folded Variance Estimators for Stationary Time Series
    (Georgia Institute of Technology, 2005-04-19) Antonini, Claudia
    This thesis is concerned with simulation output analysis. In particular, we are interested in estimating the variance parameter of a steady-state output process. The estimation of the variance parameter has immediate applications in problems involving (i) the precision of the sample mean as a point estimator for the steady-state mean µ_X, and (ii) confidence intervals for µ_X. The thesis focuses on new variance estimators arising from Schruben's method of standardized time series (STS). The main idea behind STS is to let such series converge to Brownian bridge processes; their properties are then used to derive estimators for the variance parameter. Following an idea from Shorack and Wellner, we study different levels of folded Brownian bridges. A folded Brownian bridge is obtained from the standard Brownian bridge process by folding it down the middle and then stretching it so that it spans the interval [0,1]. We formulate the folded STS and deduce a simplified expression for it. Similarly, we define the weighted area under the folded Brownian bridge, and we obtain its asymptotic properties and distribution. We study the square of the weighted area under the folded STS (known as the folded area estimator) and the weighted area under the square of the folded STS (known as the folded Cramér-von Mises, or CvM, estimator) as estimators of the variance parameter of a stationary time series. To obtain results on the bias of the estimators, we provide a complete finite-sample analysis based on the mean-squared error of the given estimators. Weights yielding first-order unbiased estimators are found in both the area and CvM cases. Finally, we perform Monte Carlo simulations to test the efficacy of the new estimators on a test bed of stationary stochastic processes, including the first-order moving average and autoregressive processes and the waiting-time process in a single-server Markovian queueing system.
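    A minimal sketch of the folding operation, assuming the level-1 fold has the form T1(t) = T0(t/2) - T0(1 - t/2) evaluated on the grid j/m (the exact definitions and the first-order unbiased weights from the thesis are not reproduced), is given below in Python with NumPy. On independent N(0,1) data the variance parameter equals 1, so both the level-0 and level-1 constant-weight area estimators should average to roughly that value.

      import numpy as np

      def sts(batch):
          """Scaled standardized time series sigma*T0(j/m) = j*(xbar_m - xbar_j)/sqrt(m),
          j = 0..m; the unknown sigma cancels when the area is squared."""
          m = len(batch)
          xbar = np.concatenate(([0.0], np.cumsum(batch) / np.arange(1, m + 1)))
          j = np.arange(m + 1)
          return j * (xbar[m] - xbar) / np.sqrt(m)

      def folded_sts(t0):
          """One folding level, using the discrete approximation
          T1(j/m) ~ T0(floor(j/2)/m) - T0((m - ceil(j/2))/m)."""
          m = len(t0) - 1
          j = np.arange(m + 1)
          return t0[j // 2] - t0[m - (j + 1) // 2]

      def area_estimator(ts):
          """Constant-weight (sqrt(12)) area estimator from a (possibly folded)
          standardized time series evaluated at j/m, j = 1..m."""
          return 12.0 * np.mean(ts[1:]) ** 2

      # Illustration on i.i.d. N(0,1) batches, whose variance parameter equals 1.
      rng = np.random.default_rng(1)
      batches = rng.standard_normal((400, 1024))
      level0 = np.mean([area_estimator(sts(b)) for b in batches])
      level1 = np.mean([area_estimator(folded_sts(sts(b))) for b in batches])
      print(level0, level1)    # both should be close to 1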
  • Item
    Estimation Techniques for Nonlinear Functions of the Steady-State Mean in Computer Simulation
    (Georgia Institute of Technology, 2004-12-08) Chang, Byeong-Yun
    A simulation study consists of several steps, such as data collection, coding and model verification, model validation, experimental design, output data analysis, and implementation. Our research concentrates on output data analysis. In this field, many researchers have studied how to construct confidence intervals for the mean µ of a stationary stochastic process. However, the estimation of the value of a nonlinear function f(µ) has not received much attention in the simulation literature. A batch-means-based methodology for this problem was proposed by Munoz and Glynn (1997); their approach did not consider consistent estimators for the variance of the point estimator of f(µ). This thesis, however, considers consistent variance estimation techniques to construct confidence intervals for f(µ). Specifically, we propose methods based on the combination of the delta method with nonoverlapping batch means (NBM), standardized time series (STS), or a combination of both. Our approaches are tested on moving average, autoregressive, and M/M/1 queueing processes. The results show that the resulting confidence intervals (CIs) often perform better than the CIs based on the method of Munoz and Glynn in terms of coverage, mean CI half-width, and variance of the CI half-width.
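    The combination of the delta method with nonoverlapping batch means can be illustrated in a few lines. The sketch below, in Python with NumPy/SciPy, forms the NBM estimator of the variance parameter and the CI f(xbar) ± t·|f'(xbar)|·sqrt(V/n); the function names and the choice f(µ) = µ² are illustrative assumptions, and the sketch is only a schematic of the general approach rather than the specific procedures developed in the thesis.

      import numpy as np
      from scipy import stats

      def nbm_variance_parameter(data, num_batches):
          """Nonoverlapping-batch-means estimator of sigma^2 = lim n*Var(xbar_n):
          V = m/(b-1) * sum_j (ybar_j - xbar)^2, where ybar_j are the batch means."""
          b = num_batches
          m = len(data) // b
          y = np.asarray(data[: b * m]).reshape(b, m).mean(axis=1)   # batch means
          return m * np.sum((y - y.mean()) ** 2) / (b - 1)

      def delta_method_ci(data, f, fprime, num_batches=32, alpha=0.05):
          """CI for f(mu) via the delta method:
          f(xbar) +/- t_{b-1, 1-alpha/2} * |f'(xbar)| * sqrt(V / n)."""
          data = np.asarray(data)
          xbar = data.mean()
          v = nbm_variance_parameter(data, num_batches)
          hw = (stats.t.ppf(1 - alpha / 2, num_batches - 1)
                * abs(fprime(xbar)) * np.sqrt(v / len(data)))
          return f(xbar) - hw, f(xbar) + hw

      # Illustration: f(mu) = mu^2 for an AR(1) process with mean 5 (true value 25).
      rng = np.random.default_rng(2)
      phi, n, mu = 0.6, 2**15, 5.0
      x = np.full(n, mu)
      for i in range(1, n):
          x[i] = mu + phi * (x[i - 1] - mu) + rng.standard_normal()
      print(delta_method_ci(x, lambda u: u**2, lambda u: 2 * u))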