Permanent link (DOI): https://doi.org/10.7939/R38911Z75


Communities

This file is in the following communities:

Graduate Studies and Research, Faculty of

Collections

This file is in the following collections:

Theses and Dissertations

Staged Grid NewSQL Database System for OLTP and Big Data Applications (Open Access)

Descriptions

Other title
Subject/Keyword
NewSQL
Bigdata
Database
Type of item
Thesis
Degree grantor
University of Alberta
Author or creator
Wu, Lengdong
Supervisor and department
Professor Li-Yan Yuan (Computing Science)
Professor Jia-Huai You (Computing Science)
Examining committee member and department
Professor Denilson Barbosa (Computing Science)
Professor Davood Rafiei (Computing Science)
Professor Ke Wang (Computing Science)
Department
Department of Computing Science
Specialization

Date accepted
2016-05-02T14:26:02Z
Graduation date
2016-06
Degree
Doctor of Philosophy
Degree level
Doctoral
Abstract
Big data applications demand, and have consequently driven the development of, diverse scalable data management systems, ranging from NoSQL systems to the emerging NewSQL systems. To serve thousands of applications and their huge amounts of data, data management systems must be able to scale out to clusters of commodity servers. The overarching goal of this dissertation is to propose principles, paradigms and protocols for architecting efficient, scalable and practical NewSQL database systems that address the unique challenges posed by the big data trend. This dissertation shows that, with a careful choice of design and features, it is possible to implement scalable NewSQL database systems that efficiently support transactional semantics to ease application design. We first investigate, analyze and characterize current scalable data management systems in depth, and develop comprehensive taxonomies for critical aspects covering the data model, the system architecture and the consistency model. Based on an analysis of the scalability limitations of current systems, we then highlight the key principles for designing and implementing scalable NewSQL database systems.
This dissertation advances the state of the art by providing satisfactory solutions to critical facets of NewSQL database systems. First, we specify a staged grid architecture to support scalable and efficient transaction processing on clusters of commodity servers. The key insight is to disintegrate and reassemble system components into encapsulated staged modules. Effective behavior rules for communication are then defined to orchestrate independent staged modules deployed on networked computing nodes into one integrated system. Second, we propose a new formula-based protocol for distributed concurrency control to support thousands of concurrent users accessing data distributed over commodity servers. The formula protocol for concurrency is a variation of the multi-version timestamp concurrency control protocol, which guarantees serializability. We reduce the overhead of the conventional implementation with techniques including logical formula caching and dynamic timestamp ordering. Third, we identify a new consistency model, BASIC (Basic Availability, Scalability, Instant Consistency), which matches requirements where no extra effort should be needed to manipulate the inconsistent soft states of weak consistency models. BASIC extends the current understanding of the CAP theorem by characterizing precisely the degree to which each dimension can be achieved, rather than simply what cannot be done.
We introduce all these ideas and features based on the implementation of Rubato DB, a highly scalable NewSQL database system. We have conducted extensive experiments showing that Rubato DB is highly scalable with efficient performance under both the TPC-C and YCSB benchmarks. These results verify that the staged grid architecture and the formula protocol provide a satisfactory solution to one of the important challenges in NewSQL database systems: developing a highly scalable database management system that supports various consistency levels from ACID to BASE.
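Illustrative note: the abstract describes the formula protocol as a variation of multi-version timestamp concurrency control. The sketch below shows only the textbook multi-version timestamp ordering rule for a single data item, not the formula protocol or any Rubato DB code. The names `Version`, `MVTOItem`, `read` and `write` are hypothetical, and transaction timestamps are assumed to be positive integers, with 0 reserved for the initial value.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass
class Version:
    wts: int       # timestamp of the transaction that wrote this version
    rts: int       # largest timestamp of any transaction that has read it
    value: object


class MVTOItem:
    """One data item under textbook multi-version timestamp ordering (illustrative)."""

    def __init__(self, initial_value):
        # Versions are kept sorted by write timestamp; timestamp 0 is the initial load.
        self.versions = [Version(wts=0, rts=0, value=initial_value)]

    def _visible_index(self, ts):
        # Index of the version with the largest write timestamp <= ts.
        keys = [v.wts for v in self.versions]
        return bisect_right(keys, ts) - 1

    def read(self, ts):
        # Reads never abort: they return the version visible at ts
        # and record that a transaction with timestamp ts has read it.
        v = self.versions[self._visible_index(ts)]
        v.rts = max(v.rts, ts)
        return v.value

    def write(self, ts, value):
        # Returns True if the write is accepted, False if the writer must abort.
        i = self._visible_index(ts)
        v = self.versions[i]
        if v.rts > ts:
            # A younger transaction already read v, so this write arrives too late.
            return False
        if v.wts == ts:
            v.value = value  # the same transaction overwrites its own version
        else:
            self.versions.insert(i + 1, Version(wts=ts, rts=ts, value=value))
        return True


item = MVTOItem("a")
item.write(5, "b")          # accepted: creates the version written at timestamp 5
item.read(3)               # sees "a" and raises the read timestamp of version 0 to 3
item.write(2, "x")         # rejected: version 0 was already read at timestamp 3
item.write(4, "y")         # accepted: no reader younger than 4 has seen version 0
```

In this sketch a rejected write simply returns `False`; a real system would abort and restart the transaction, and the overhead of such bookkeeping is what the abstract says the formula protocol aims to reduce.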
Language
English
DOI
doi:10.7939/R38911Z75
Rights
This thesis is made available by the University of Alberta Libraries with permission of the copyright owner solely for the purpose of private, scholarly or scientific research. This thesis, or any portion thereof, may not otherwise be copied or reproduced without the written consent of the copyright owner, except to the extent permitted by Canadian copyright law.
Citation for previous publication
Lengdong Wu, Li-Yan Yuan, Jia-Huai You: Survey of Large-Scale Data Management Systems for Big Data Applications. J. Comput. Sci. Technol. 30(1): 163-183 (2015)
Li-Yan Yuan, Lengdong Wu, Jia-Huai You, Yan Chi: A Demonstration of Rubato DB: A Highly Scalable NewSQL Database System for OLTP and Big Data Applications. SIGMOD Conference 2015: 907-912
Lengdong Wu, Li-Yan Yuan, Jia-Huai You: BASIC: An alternative to BASE for large-scale data management system. BigData Conference 2014: 5-14
Li-Yan Yuan, Lengdong Wu, Jia-Huai You, Yan Chi: Rubato DB: A Highly Scalable Staged Grid Database System for OLTP and Big Data Applications. CIKM 2014: 1-10

File Details

Date Uploaded
Date Modified
2016-05-02T20:26:11.926+00:00
Audit Status
Audits have not yet been run on this file.
Characterization
File format: pdf (Portable Document Format)
Mime type: application/pdf
File size: 4109255
Last modified: 2016:11:16 13:15:10-07:00
Filename: wu_lengdong_201604_PhD.pdf
Original checksum: 0a634af8f0e03bd201d1acb9504fe5b9
Well formed: true
Valid: true
File title: thesis front page
File title: Lengdong Wu
File author: gracet
Page count: 148