Nowadays, with large amounts of data becoming available, answering biological questions is increasingly a data-driven activity. To support this, there is a need for tools that enable the integration of the many available sources of data. This thesis presents several avenues that can be taken, showing how integration can support research in the life sciences. Biological data can be integrated at several levels. We present a categorization of these different strategies within the context of the Data-Information-Knowledge-Wisdom paradigm. We argue that bioinformatics research should not only concentrate on the individual levels, but also on the transitions between these domains. Throughout the thesis we present possible solutions for a number of these transitions.

One of these transitions bridges the gap that exists between the two lowest levels of data integration: data representation (e.g. databases) and data analysis (e.g. pattern recognition, statistical analysis). The wealth of different data sources, each describing a different aspect of the molecular system (such as gene expression, genomic locations, physical binding partners, etc.), has driven data representation approaches towards flat and flexible formats. Data analysis, in contrast, prefers structured, multi-dimensional, array-based formats. We argue that this gap cannot be closed, but rather needs to be bridged through the use of novel query systems. As a solution, we introduce the tool IBIDAS, which allows one to easily handle not only tables but also data with a more complex, nested structure, making it a flexible tool for the exploration and analysis of data.

Another question is at what point different data sources should be integrated. One strategy (`late' integration) is to analyze each data source separately, after which the results are integrated. This strategy, however, prevents the discovery of connections that transcend the individual data sources. The alternative, `early' integration, in which the data is first concatenated (e.g. into a single feature vector) before analysis, is not always feasible when complex data types, such as DNA sequences, need to be taken into account. We therefore advocate an `intermediate' data integration approach, in which each data source is first transformed into a suitable “kernel space”. In this space, the data sources can be combined in a straightforward manner, even in a non-linear fashion, after which the combined data can be analyzed together. We show the strength of such kernel methods by combining data sources to predict interacting proteins.

When combining similar data sets, we emphasize that this, too, should be treated as a data integration problem rather than a simple concatenation of the data. Batch effects, for example, can seriously affect the data distributions of gene expression experiments and need to be resolved when such experiments are analyzed jointly. One way to do so is to normalize the data before it is combined. We show that a normalization scheme should incorporate as much information as possible about the way in which the data was generated. By modeling the effects that deteriorate the data, seemingly uninformative data sets can once again become a rich source of information. As an example, we applied this approach to data sets that study the relationship between the transcriptome of stem cells and the effectiveness of these cells in bone regeneration.
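To make the idea of normalizing data before joining it concrete, the sketch below removes a purely additive batch effect by centering each gene expression batch on its own per-gene mean before the batches are combined. It is a minimal illustration only: the simulated batches, the additive-shift assumption and the centering step are simplifications, not the normalization scheme developed in the thesis, which incorporates far more information about how the data was generated.

    import numpy as np

    # Hypothetical example: two gene expression batches measuring the same genes,
    # where the second batch suffers from an additive batch effect.
    rng = np.random.default_rng(1)
    n_genes = 100
    batch_a = rng.normal(size=(20, n_genes))          # 20 samples, no shift
    batch_b = rng.normal(size=(15, n_genes)) + 2.0    # 15 samples, shifted batch

    # Model the batch effect as a per-gene shift and remove it by centering
    # each batch on its own per-gene mean before joining the data.
    def center_per_batch(X):
        return X - X.mean(axis=0, keepdims=True)

    joined = np.vstack([center_per_batch(batch_a), center_per_batch(batch_b)])

After this step the two batches share a common per-gene baseline and can be analyzed jointly, while biological signal expressed as deviations from the batch mean is preserved.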
Instead of integrating data for a single problem, one can also integrate data for a class of problems. We show that machine learning concepts can elegantly solve such integration problems: by making use of the similarities between the problem domains, the learning parameters can be restricted. This approach was applied in the analysis of `materiomics' data for the new TopoChip platform. Measurements that characterized the responses of cells to individual material surfaces were noisy, making it difficult to adequately compare the effects of these surfaces. However, by taking into account the similarities between surfaces, and by integrating data across these similar surfaces, the results improved significantly.

Finally, as an encompassing example of data integration, we show how a combination of integration methods can link two other integration levels: pattern recognition and causal model inference. In this example, numerous data sources are used to predict cause-effect relationships between genes in perturbation experiments. These data sources describe various aspects of proteins, protein-protein interactions and protein-DNA interactions. The descriptions of the physical components of the cell are related to cause-effect interactions between the genes in such a way that the data from the perturbation experiments is explained. We combine kernel-based integration methods with a method that constructs a causal model, showing that cause-effect relationships can be predicted accurately.

Taken together, this thesis explores several data integration levels and approaches. Given the complexity of biology, we believe that data integration will become ever more essential in bioinformatics, and that this dissertation has only taken the first steps on this road.
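As a toy illustration of how integrating data across similar problems can help, the sketch below replaces each surface's noisy measured response by a similarity-weighted average over surfaces with comparable design features. The feature descriptors, the Gaussian similarity and the weighting scheme are illustrative assumptions, not the analysis applied to the TopoChip data.

    import numpy as np

    # Hypothetical setup: each surface is described by a few design features
    # and has a single noisy measured cell response.
    rng = np.random.default_rng(0)
    n_surfaces = 50
    features = rng.random((n_surfaces, 4))        # surface design descriptors
    noisy_response = rng.normal(size=n_surfaces)  # noisy per-surface measurement

    # Similarity between surfaces, based on their design features.
    d2 = ((features[:, None, :] - features[None, :, :]) ** 2).sum(axis=2)
    similarity = np.exp(-5.0 * d2)

    # Integrate data across similar surfaces: each surface's value is shrunk
    # towards a similarity-weighted average of comparable surfaces.
    weights = similarity / similarity.sum(axis=1, keepdims=True)
    smoothed_response = weights @ noisy_response

In effect, information is borrowed from similar surfaces, which stabilizes the otherwise noisy per-surface estimates.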