Finding cancer genes in copy number data and insertional mutagenesis data

Klijn, C.N.

Title

Finding cancer genes in copy number data and insertional mutagenesis data

Author

Klijn, C.N.

Contributor

Reinders, M.J.T. (promotor)

Faculty

Electrical Engineering, Mathematics and Computer Science

Department

Pattern Recognition and Bioinformatics

Date

2011-09-09

Abstract

Cancer is a genetic disease. Step-wise alteration of genes that have a normal function in the cell can transform a healthy cell into a malignant cancer cell. Cancer genes confer several traits on the cell that allow it to become malignant. These traits have been studied for many years, and it is now well understood what has to change in a normal cell before a tumor can form. For example, cells must divide continuously, escape the immune system and induce the growth of new blood vessels, among other things. Many genes can drive these processes when deregulated, and each individual tumor alters a different combination of genes to acquire its tumorigenic traits. Knowing which combination of mutations a tumor has sustained is important, as it may make the tumor susceptible or resistant to certain treatments. Cancer genes can be mutated in many ways; this thesis studies two of them. The first type of mutation arises from gains and losses of DNA, called DNA copy number alterations (CNAs). These alterations occur because tumors generally lose the ability to correctly repair damage to their DNA. CNAs can alter the expression of cancer genes and thereby cause cancer. CNAs affect not only cancer-related genes; unrelated DNA can be damaged as well. The challenge is to separate the truly oncogenic CNAs from the non-oncogenic passenger CNAs, as the oncogenic CNAs point to novel cancer genes that can be new drug targets. This thesis introduces two methods for finding cancer genes by examining DNA copy number alterations. Multiple comparable tumor samples are used to detect regions of the DNA that are altered significantly more often than other regions, indicating that they are more important for tumor development and therefore probably causative.
Analogously, a novel method is introduced to find pairs of regions in the DNA that are preferentially lost or gained together (co-occurring) or preferentially not together (mutually exclusive). It is shown that co-occurring CNAs primarily target genes that are highly similar in function. A detailed analysis of a group of three mutually exclusive CNAs made it possible to associate a novel function with a known cancer gene. The second source of mutations is the insertion of viral or transposon DNA. These agents insert their DNA into the host genome, which can activate or inactivate host genes. Occasionally they perturb genes in a way that allows the cell to acquire cancer-related traits, and eventually these insertions cause a tumor. By carefully examining the tumor DNA, one can reconstruct which genes caused the cancer. Not all insertions are instrumental in the development of the cancer, so here too the passenger events have to be separated from the truly causal events. This thesis uses a novel approach called Shear-Splink that allows the relative number of integrations in a single tumor to be determined. Each tumor presents a variety of insertions, each with its own abundance. By examining this abundance it is possible to distinguish between insertions that happened early in tumor development (and are therefore highly abundant) and insertions that are simple passengers or essential for only a small number of cells in the tumor (which are of low abundance). This approach has been applied to a study of mouse mammary tumor virus (MMTV), a retrovirus that causes breast cancer in mice by inserting its DNA into the mouse genome. The results show that examining insertion abundance makes it possible to recover a model of how the tumor developed. Overall, this thesis contributes to the analysis of tumor-causing events, and especially to determining which combination of events is necessary to cause a tumor.
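
The idea of co-occurring versus mutually exclusive alterations can be made concrete with a small sketch. The code below is not the method developed in the thesis; it is a generic illustration that assumes each region has already been reduced to a yes/no alteration call per tumor, and it uses a standard one-sided hypergeometric (Fisher-type) tail probability. The function name `cooccurrence_pvalues` and the toy profiles are hypothetical.

```python
from math import comb


def cooccurrence_pvalues(a_altered, b_altered):
    """Tail probabilities for the overlap of two binary alteration profiles.

    a_altered, b_altered: equal-length sequences of booleans (or 0/1),
    one entry per tumor, marking whether region A or region B is altered.
    Returns (p_cooccur, p_exclusive): the chance of seeing at least, or at
    most, the observed number of doubly altered tumors if the two regions
    were altered independently (hypergeometric null model, an assumption
    of this sketch).
    """
    if len(a_altered) != len(b_altered):
        raise ValueError("profiles must cover the same tumors")
    n = len(a_altered)
    n_a = sum(1 for a in a_altered if a)
    n_b = sum(1 for b in b_altered if b)
    both = sum(1 for a, b in zip(a_altered, b_altered) if a and b)

    def pmf(k):
        # Probability that exactly k tumors carry both alterations by chance.
        return comb(n_a, k) * comb(n - n_a, n_b - k) / comb(n, n_b)

    smallest = max(0, n_a + n_b - n)  # fewest possible doubly altered tumors
    largest = min(n_a, n_b)           # most possible doubly altered tumors
    p_cooccur = sum(pmf(k) for k in range(both, largest + 1))
    p_exclusive = sum(pmf(k) for k in range(smallest, both + 1))
    return p_cooccur, p_exclusive


# Toy data (made up): ten tumors, two regions never altered together.
region_a = [True] * 5 + [False] * 5
region_b = [False] * 5 + [True] * 5
p_co, p_ex = cooccurrence_pvalues(region_a, region_b)
```

A small `p_exclusive` suggests the two regions are altered together less often than chance would predict, and a small `p_cooccur` suggests the opposite. A genome-wide analysis of all region pairs would additionally need multiple-testing correction and a treatment of continuous copy number values, which this sketch leaves out.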

Subject

cancer
computational biology
genetics
bioinformatics
pattern recognition
machine learning
biology

To reference this document use:

http://resolver.tudelft.nl/uuid:1d00019e-3d2e-4719-9d9f-650dad2fae0d

Embargo date

2011-09-09

ISBN

9789461082046

Part of collection

Institutional Repository

Document type

doctoral thesis

Rights

Files

PDF

Thesis_Klijn_final_.pdf

15.14 MB