Performance optimization of complex resource allocation systems

Li, Ran
Reveliotis, Spyros
The typical control objective for a sequential resource allocation system (RAS) is the optimization of some (time-based) performance index, while ensuring the logical/behavioral correctness of the underlying automated processes. There is a rich set of results on logical control, and these results are quite effective in their ability to control RAS with very complex structure and behavior. On the other hand, the existing results on performance-oriented control, i.e., on the scheduling, of RAS are limited. The research program presented in this document seeks to provide a complete and systematic methodological framework for the RAS scheduling problem, integrated with the logical control, by leveraging formal representations of the RAS behavior. These representations enable (i) the formulation of the RAS scheduling problem in a way that takes into consideration the RAS behavioral control requirements, and (ii) the definition of pertinent policy spaces that provide an effective trade-off between the representational and computational complexity of the pursued formulations and the operational efficiency of the derived policies. Although the presented methodological framework can be applied to any general RAS, this research focuses mainly on a class of RAS that abstracts the capacitated re-entrant line (CRL) model, and it uses this RAS class to demonstrate the overall methodology.

The presented framework is divided into two parts: a “modeling” and an “algorithm” part. The “modeling” part consists of the procedures that model the RAS dynamics as a generalized stochastic Petri net (GSPN) that supports a seamless integration of both the logical and the performance control problems. This part also formulates the performance optimization of a logically controlled GSPN as a mathematical programming (MP) problem that is derived from the semi-Markov process (SMP) modeling the timed dynamics of this GSPN.
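The objective of such an MP can be sketched in the standard renewal-reward form for an SMP; the notation below is illustrative and need not match the thesis exactly. If $\theta$ denotes the vector of parameters adjusting the embedded transition probabilities, $\pi(s;\theta)$ the resulting stationary distribution of the embedded chain, $r(s)$ the reward rate accrued while the system sojourns at state $s$, and $\bar{\tau}(s)$ the expected sojourn time at $s$, then the steady-state average reward is

\[
\eta(\theta) \;=\; \frac{\sum_{s}\pi(s;\theta)\,r(s)\,\bar{\tau}(s)}{\sum_{s}\pi(s;\theta)\,\bar{\tau}(s)},
\]

and the MP maximizes $\eta(\theta)$ over $\theta$, subject to constraints keeping the adjusted transition probabilities nonnegative and summing to one at each state.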
In the resulting MP formulation, the decision variables are parameters that adjust the embedded transition probabilities of the SMP, and the objective function is the steady-state average reward with respect to a given immediate reward function. The explosive growth of the MP formulation with respect to the size of the underlying RAS is addressed by (re-)defining the target policy space and its detailed representation. More specifically, three steps of complexity control are applied to the original policy space: The first step is a “refinement” process that simplifies the representation of the original policy space without harming its performance potential. The second step is a “restriction”, which further reduces the number of decision variables by coupling the decision-making logic that corresponds to “similar” states. Numerical studies show a dramatic reduction in the dimension of the solution space upon the implementation of these first two steps. The third step of the proposed complexity control method is a partial “disaggregation” process that tries to break certain couplings formed in the second step, and thus obtain more degrees of freedom to pursue a further improvement in the optimized system performance under the applied aggregation. This third step is the mechanism that explicitly controls the trade-off between the representational and computational complexity of the target policies and their operational efficiency. Since the complexity control adopted in the “modeling” part is applied only to the policy space, the analytical solution of the resulting MP remains intractable, because the evaluation of the objective function requires the underlying steady-state distribution of the system sojourn at each state. As a consequence, this MP is solved through a simulation optimization method called stochastic approximation (SA) in the “algorithm” part.
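The “restriction” and “disaggregation” steps can be illustrated with a minimal sketch, in which states judged “similar” by some signature function share a single decision variable, and a selected cluster is later split by a finer signature. The state tuples and signature rules below are purely illustrative and are not the similarity logic developed in the thesis.

```python
def restrict(states, signature):
    """Group states by a signature function; each group shares one
    decision variable, reducing the dimension of the policy space."""
    clusters = {}
    for s in states:
        clusters.setdefault(signature(s), []).append(s)
    return clusters

def disaggregate(clusters, key, refined_signature):
    """Split one cluster with a finer signature, adding decision
    variables only where the extra freedom may improve performance."""
    group = clusters.pop(key)
    for s in group:
        clusters.setdefault((key, refined_signature(s)), []).append(s)
    return clusters

# Toy CRL-like states of the form (stage, buffer occupancy).
states = [(1, 0), (1, 1), (2, 0), (2, 1), (2, 2)]

# Restriction: couple all states of the same stage -> 2 decision variables.
clusters = restrict(states, signature=lambda s: s[0])
n_vars_restricted = len(clusters)

# Disaggregation: split the stage-2 cluster by occupancy -> 3 variables.
clusters = disaggregate(clusters, 2, refined_signature=lambda s: s[1] > 0)
n_vars_after = len(clusters)
```

The cluster count is the number of decision variables the SA algorithm must optimize, which makes the dimension reduction of the first two steps, and the controlled dimension increase of the third, explicit.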
To this end, the adopted GSPN representation has provided a succinct and efficient simulation platform, and it has facilitated the systematic estimation of the necessary gradients. At the same time, the adopted SA algorithms have been strengthened by integrating into their basic evaluation and exploration logic results from the area of statistical inference. These results have enabled our SA algorithms to proceed to a near-optimal region in a robust and stable way, while avoiding the expenditure of computational resources in areas of the underlying response surface with little potential gain.
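As an illustration of the simulation-optimization step, the sketch below runs a generic simultaneous-perturbation SA (SPSA) loop on a toy noisy response surface. The thesis’s actual SA variants and GSPN-based gradient estimators are more elaborate; here `noisy_obj` merely stands in for a simulation-based performance estimate, and the gain sequences are generic textbook choices.

```python
import random

def spsa_maximize(f, theta0, iters=300, a=0.3, c=0.1, seed=0):
    """Generic SPSA: each iteration estimates the gradient of the noisy
    objective f from only two evaluations taken along a random +/-1
    perturbation direction, then ascends along that estimate."""
    rng = random.Random(seed)
    theta = list(theta0)
    for k in range(1, iters + 1):
        ak = a / k              # decaying gain sequence
        ck = c / k ** 0.25      # decaying perturbation width
        delta = [rng.choice((-1.0, 1.0)) for _ in theta]
        plus = [t + ck * d for t, d in zip(theta, delta)]
        minus = [t - ck * d for t, d in zip(theta, delta)]
        g = (f(plus) - f(minus)) / (2.0 * ck)  # directional difference quotient
        # For +/-1 perturbations, the i-th gradient estimate is g * delta[i].
        theta = [t + ak * g * d for t, d in zip(theta, delta)]
    return theta

# Toy noisy response surface with its maximum at (1, -2); in the thesis the
# evaluation would instead come from simulating the logically controlled GSPN.
noise = random.Random(1)
def noisy_obj(x):
    return -((x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2) + noise.gauss(0.0, 0.01)

theta = spsa_maximize(noisy_obj, [0.0, 0.0])
```

The appeal of this family of methods in the present setting is that each iteration needs only a constant number of (simulated) performance evaluations, regardless of the dimension of the decision vector.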