High-performance and fault-tolerant techniques for massive data distribution in online communities

e-Archivo Repository



dc.contributor.advisor Carretero Pérez, Jesús
dc.contributor.advisor Isaila, Florín Daniel
dc.contributor.author Higuero Alonso-Mardones, Daniel
dc.date.accessioned 2013-12-19T08:59:15Z
dc.date.available 2013-12-19T08:59:15Z
dc.date.issued 2013-06
dc.date.submitted 2013-07-15
dc.identifier.uri http://hdl.handle.net/10016/18074
dc.description.abstract The amount of digital information produced and consumed is increasing each day. This rapid growth is driven by advances in computing power and hardware technologies, and by the popularization of user-generated content networks. New hardware can process larger quantities of data, which yields finer results and, as a consequence, generates more data. Scientific applications have evolved to benefit from these new hardware capabilities. This type of application typically requires large amounts of information as input and generates a significant amount of intermediate data, resulting in large files. This increase appears not only in terms of volume but also in terms of size, so we need to provide methods that permit efficient and reliable data access. Producing such a method is a challenging task due to the number of aspects involved. However, we can leverage the knowledge found in social networks to improve the distribution process. The advent of Web 2.0 has popularized the concept of the social network, which provides valuable knowledge about the relationships among users and between users and the data. However, extracting this knowledge and defining ways to actively use it to increase the performance of a system remains an open research direction. Additionally, we must take other existing limitations into account. In particular, the interconnection between the different elements of the system is one of the key aspects. The availability of new technologies such as mass-produced multicore chips, large storage media, and better sensors has contributed to the increase in the data being produced. However, the underlying interconnection technologies have not improved at the same pace.
This leads to a situation where vast amounts of data can be produced and need to be consumed by a large number of geographically distributed users, but the interconnection between both ends does not meet the required needs. In this thesis, we address the problem of efficient and reliable data distribution in geographically distributed systems. We focus on providing a solution that 1) optimizes the use of existing resources, 2) does not require changes in the underlying interconnection, and 3) provides fault-tolerant capabilities. To achieve these objectives, we define a generic data distribution architecture composed of three main components: a community detection module, a transfer scheduling module, and a distribution controller. The community detection module leverages the information found in the social network formed by the users requesting files and produces a set of virtual communities that group entities with similar interests. The transfer scheduling module produces a plan to efficiently distribute all requested files while improving resource utilization. For this purpose, we model the distribution problem using linear programming and offer a method that allows the problem to be solved in a distributed manner. Finally, the distribution controller manages the distribution process using the aforementioned schedule, controls the available server infrastructure, and launches new on-demand resources when necessary.
dc.format.mimetype application/pdf
dc.language.iso eng
dc.rights Attribution-NonCommercial-NoDerivs 3.0 Spain
dc.rights.uri http://creativecommons.org/licenses/by-nc-nd/3.0/es/
dc.subject.other Social networks
dc.subject.other Network architecture
dc.subject.other Fault-tolerant systems
dc.title High-performance and fault-tolerant techniques for massive data distribution in online communities
dc.type doctoral thesis
dc.type.review PeerReviewed
dc.subject.eciencia Computer Science
dc.rights.accessRights open access
dc.contributor.departamento UC3M. Departamento de Informática
