Information systems have evolved from small servers to complex architectures designed to remain operational as much of the time as possible (high availability) and to serve a number of customers that grows every day (scalability). In practice, this means offering a 24x7 service (24 hours a day, 7 days a week).
Databases are a widely used solution for efficient data storage. However, a database by itself does not provide high availability, since it is a single point of failure. Database replication addresses this problem: with several copies of the data (replicas) at different nodes, if one or more nodes fail, the remaining nodes continue offering the service. Moreover, a replicated database can also provide scalability if it can exploit the processing capacity of all nodes. This Ph.D. thesis focuses on high-performance and highly available replicated databases.
Traditional solutions to database replication deal with full replication (each node stores a copy of all data). Full replication offers limited scalability, even from a theoretical point of view [JPAK03].
The limits come from maintaining consistency between replicas: when a data item is modified, the change must be applied at all of its replicas. This thesis presents a theoretical study of the potential scalability of partially replicated databases, where each node stores only part of the data, and compares it against the scalability of full replication. It also proposes, implements and empirically evaluates different partial replication protocols.
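To illustrate why keeping replicas consistent limits scalability, the following minimal sketch uses a deliberately simplified model. It is neither the analytical model of [JPAK03] nor the one developed in this thesis. It assumes that all nodes have the same capacity, that a fraction `w` of the operations are writes, that a write costs as much at every replica as at the node that first executed it, and that each data item is stored at `r` of the `n` nodes (`r == n` corresponds to full replication). The names `scale_out` and `full_replication` and all parameters are hypothetical.

```python
def scale_out(n: int, w: float, r: int) -> float:
    """Throughput of n nodes relative to a single node (simplified model).

    An operation costs one unit of work at a node. A read is executed once;
    a write is executed at all r replicas of the item it modifies. The
    average operation therefore costs (1 - w) + w * r units system-wide,
    and n nodes offer n units of capacity.
    """
    if not 1 <= r <= n:
        raise ValueError("replication degree r must satisfy 1 <= r <= n")
    if not 0.0 <= w <= 1.0:
        raise ValueError("write fraction w must be in [0, 1]")
    return n / ((1.0 - w) + w * r)


def full_replication(n: int, w: float) -> float:
    """Special case in which every node stores every item (r == n)."""
    return scale_out(n, w, r=n)
```

Under these assumptions, the scale-out of full replication with `w > 0` stays below `1 / w` no matter how many nodes are added, whereas with a fixed replication degree `r` it keeps growing with `n`.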
With partial replication, there may be no single node able to execute a given query, because the query accesses data that is not stored together at any node. In that case, the query is executed across multiple nodes, and its execution time increases because the nodes executing it need to exchange messages. This thesis proposes the use of optimization techniques to reduce this message exchange. Specifically, it presents, implements and evaluates an autonomic reconfiguration mechanism that determines the data distribution that minimizes query execution across multiple nodes.
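The situation described above can be sketched as follows. The example assumes a hypothetical placement map from node names to the set of data fragments each node stores, and represents a query by the set of fragments it accesses. The names `local_candidates` and `distributed_fraction`, as well as the sample placements and workload, are illustrative assumptions and not the mechanism implemented in this thesis.

```python
from typing import Dict, FrozenSet, Iterable, List

# Hypothetical representation: node name -> fragments stored at that node.
Placement = Dict[str, FrozenSet[str]]


def local_candidates(placement: Placement, fragments: FrozenSet[str]) -> List[str]:
    """Nodes that store every fragment the query accesses."""
    return sorted(node for node, stored in placement.items() if fragments <= stored)


def distributed_fraction(placement: Placement,
                         workload: Iterable[FrozenSet[str]]) -> float:
    """Fraction of queries that no single node can execute on its own."""
    queries = list(workload)
    if not queries:
        return 0.0
    distributed = sum(1 for q in queries if not local_candidates(placement, q))
    return distributed / len(queries)


# Two candidate data distributions over the same two nodes.
placement_a: Placement = {"n1": frozenset({"A", "B"}), "n2": frozenset({"C", "D"})}
placement_b: Placement = {"n1": frozenset({"A", "C"}), "n2": frozenset({"B", "D"})}

# Sample workload: each query is the set of fragments it accesses.
workload = [frozenset({"A", "C"}), frozenset({"B", "D"}), frozenset({"A"})]
```

For this sample workload, `placement_a` forces two of the three queries to run across both nodes, while `placement_b` lets every query run at a single node. A reconfiguration mechanism could use a metric of this kind to prefer one data distribution over another.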
Finally, this thesis proposes and evaluates a recovery protocol [BHG87] for partially replicated databases that allows failed nodes to be reincorporated. Traditionally, recovery implies stopping the system in order to transfer data to the node that rejoins it. This period of inactivity does not fulfill the requirements of high availability. Therefore, this thesis proposes an online recovery protocol, i.e., one that performs recovery without stopping the replicated database.