
Dialnet


Abstract of Dataclay: next generation object storage

Jonathan Martí

  • Existing solutions for data sharing are not fully compatible with multi-provider contexts. Traditionally, providers offer their datasets through hermetic Data Services with restricted APIs. Consumers are therefore compelled to adapt their applications to the functionality currently offered, and their chances of contributing their own know-how are very limited.

    With regard to data management, the database management systems (DBMSs) that underpin these Data Services are designed for a single-provider scenario, which forces a centralized administration carried out by a single role, the database administrator (DBA). The DBA defines the conceptual schema and the corresponding integrity constraints, and determines the external schema offered to end users. The problem is that a multi-provider environment cannot assume the existence of a central role that administers all the datasets.

    In terms of data processing, the different representations of the data model at different tiers, from the application level down to the Data Service or DBMS layers, cause applications to dedicate between 20% and 50% of their code to performing the necessary transformations. This has a negative impact both on developers' productivity and on the overall performance of data-intensive workflows.

    In light of the foregoing, this thesis proposes three novel techniques that enable a data store to support a multi-provider ecosystem, facilitating collaboration among all the players and the development of efficient data-intensive applications. In particular, once the database administration has been suitably decentralized, this thesis contributes to the community with:

    1) the proper mechanisms to enable consumers to extend the current schema and functionality without compromising providers' constraints.

    2) the proper mechanisms to enable any provider to define its own policies and integrity constraints in such a way that they can never be jeopardized.

    3) the integration of a parallel programming model with the data model, which drastically reduces data transformations and is designed to be compatible with near-future storage devices.

    These contributions have been validated by means of the design and implementation of dataClay, an example of a multi-provider data store that fulfills the defined requirements. Furthermore, for the first and third contributions, several performance analyses are presented to evaluate and demonstrate their feasibility (note that the second contribution is purely logical).
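
    The first two contributions listed above can be illustrated with a small sketch. It is not dataClay's API: the class `ProviderDataset`, its methods, the range constraint, and the `mean_reading` extension are all hypothetical names chosen for illustration. The assumption is simply that a provider exposes data only through methods that enforce its own integrity constraints, while a consumer may register additional functionality that runs on top of those methods and therefore cannot bypass them.

```python
from typing import Callable, Dict


class ConstraintViolation(Exception):
    """Raised when a write would break a provider-defined constraint."""


class ProviderDataset:
    """Hypothetical provider-owned object (illustrative, not dataClay's API)."""

    def __init__(self, min_value: float, max_value: float) -> None:
        self._min = min_value
        self._max = max_value
        self._readings: Dict[str, float] = {}
        self._extensions: Dict[str, Callable[["ProviderDataset"], object]] = {}

    def set_reading(self, sensor: str, value: float) -> None:
        # Provider's integrity constraint, checked on every write.
        if not (self._min <= value <= self._max):
            raise ConstraintViolation(
                f"{value} is outside [{self._min}, {self._max}]"
            )
        self._readings[sensor] = value

    def readings(self) -> Dict[str, float]:
        # Return a copy so callers cannot mutate the stored state directly.
        return dict(self._readings)

    def register_extension(
        self, name: str, func: Callable[["ProviderDataset"], object]
    ) -> None:
        # Consumers add functionality; existing extensions are not overwritten.
        if name in self._extensions:
            raise ValueError(f"extension {name!r} already registered")
        self._extensions[name] = func

    def call(self, name: str) -> object:
        # Extensions receive the dataset and work through its public methods.
        return self._extensions[name](self)


def mean_reading(dataset: ProviderDataset) -> float:
    """Hypothetical consumer-side extension: average of all readings."""
    values = list(dataset.readings().values())
    return sum(values) / len(values)


dataset = ProviderDataset(min_value=0.0, max_value=100.0)
dataset.set_reading("s1", 10.0)
dataset.set_reading("s2", 30.0)
dataset.register_extension("mean", mean_reading)
print(dataset.call("mean"))  # mean of 10.0 and 30.0

try:
    dataset.set_reading("s3", 500.0)  # violates the provider's range
except ConstraintViolation as err:
    print("rejected:", err)
```

    Python does not enforce access control on the underscore-prefixed attributes, so this sketch only models the intent; a real multi-provider store would have to enforce the boundary itself rather than rely on convention.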

