Scaling synchronization primitives

Kashyap, Sanidhya

dc.contributor.advisor	Kim, Taesoo
dc.contributor.advisor	Min, Changwoo
dc.contributor.author	Kashyap, Sanidhya
dc.contributor.committeeMember	Gavrilovska, Ada
dc.contributor.committeeMember	Calciu, Irina
dc.contributor.committeeMember	Arulraj, Joy
dc.contributor.department	Computer Science
dc.date.accessioned	2020-09-08T12:48:21Z
dc.date.available	2020-09-08T12:48:21Z
dc.date.created	2020-08
dc.date.issued	2020-07-28
dc.date.submitted	August 2020
dc.date.updated	2020-09-08T12:48:21Z
dc.description.abstract	Over the past decade, multicore machines have become the norm. A single machine can have thousands of hardware threads or cores. Even cloud providers offer such large multicore machines for data processing engines and databases. Thus, a fundamental question arises: how efficient are the existing synchronization primitives (timestamping and locking) that developers use to design concurrent, scalable, and performant applications? This dissertation focuses on understanding the scalability of these primitives and presents new algorithms and approaches that leverage either the hardware or application domain knowledge to scale up to hundreds of cores. First, the thesis presents Ordo, a scalable ordering or timestamping primitive that forms the basis for designing scalable timestamp-based concurrency control mechanisms. Ordo relies on invariant hardware clocks and provides a notion of a globally synchronized clock within a machine. We use the Ordo primitive to redesign a synchronization mechanism and concurrency control mechanisms in databases and software transactional memory. Next, this thesis focuses on the scalability of locks in both virtualized and non-virtualized scenarios. In a virtualized environment, we identify that these locks suffer from various preemption issues due to a semantic gap between the hypervisor scheduler and a virtual machine scheduler, known as the double scheduling problem. We address this problem by bridging the gap: the hypervisor and the virtual machines share minimal scheduling information to avoid the preemption problems. Finally, we focus on the design of lock algorithms in general. We find that locks in practice have discrepancies from locks in design. For example, popular spinlocks suffer from excessive cache-line bouncing in multicore (NUMA) systems, while state-of-the-art locks exhibit sub-par single-thread performance.
We classify several dominating factors that impact the performance of lock algorithms. We then propose a new technique, shuffling, that can dynamically accommodate all these factors without slowing down the critical path of the lock. The key idea of shuffling is to reorder the queue of threads waiting to acquire the lock according to a pre-established policy. Using shuffling, we propose a family of locking algorithms, called SHFLLOCKS, that respect all factors, efficiently utilize waiters, and achieve the best performance.
dc.description.degree	Ph.D.
dc.format.mimetype	application/pdf
dc.identifier.uri	http://hdl.handle.net/1853/63677
dc.language.iso	en_US
dc.publisher	Georgia Institute of Technology
dc.subject	OS
dc.subject	Concurrency
dc.subject	Mutual exclusion
dc.subject	File system
dc.subject	Scalability
dc.subject	Timestamping
dc.subject	Database
dc.title	Scaling synchronization primitives
dc.type	Text
dc.type.genre	Dissertation
dspace.entity.type	Publication
local.contributor.advisor	Kim, Taesoo
local.contributor.corporatename	College of Computing
local.contributor.corporatename	School of Computer Science
relation.isAdvisorOfPublication	e96debb0-758f-49d4-8ed9-307227ecad78
relation.isOrgUnitOfPublication	c8892b3c-8db6-4b7b-a33a-1b67f7db2021
relation.isOrgUnitOfPublication	6b42174a-e0e1-40e3-a581-47bed0470a1e
thesis.degree.level	Doctoral