April 12, 2021
Maroua Tikat (WIMMICS)
Interactive multimedia visualization for exploring and fixing a multi-dimensional metadata base of popular music
This PhD thesis is concerned with the use of information visualization techniques as a means of exploring a large dataset of music metadata. In this paper we review some of the major music datasets available, the data they contain, and how information visualization techniques have been used to explore such data. As we shall see, music is a complex entity that can be described by a multitude of multimedia attributes (e.g., lyrics, chords, audio, graphics describing sound analysis). Thus, music datasets are often created by collecting data from specialized datasets. Integrating data from diverse sources can create data quality problems (e.g., ambiguities, imprecision, incomplete sources, conflicts). Traditionally, information visualization techniques are used to understand a corpus of data and to identify causal relationships, trends, and patterns of data concentration. We suggest that information visualization techniques can also be used to inspect the data quality of multivariate datasets and to highlight the parts that need to be fixed or improved. Moreover, we suggest that, combined with interactive techniques, information visualization could serve as an entry point for repairing the dataset. In the context of this PhD thesis, the research questions are: how to communicate data quality problems to users, and how to visually represent the outcomes of methods used for data completion and correction (such as crowdsourcing, matrix vectorization, and graph reasoning, among others).