The human genome contains the blueprint of human life, yet the function of the vast majority of its roughly three billion nucleotide bases has remained unknown. The Encyclopedia of DNA Elements (ENCODE) project has systematically mapped regions of transcription, transcription factor association, chromatin structure and histone modification. These data allowed the researchers to assign biochemical functions to about 80 percent of the genome, particularly in regions outside the protein-coding genes. Many of the candidate regulatory elements discovered are physically associated with one another and with expressed genes, which sheds light on several mechanisms of gene regulation. The elements identified in the project also show a statistical association with sequence variants linked to human disease, so they can help in interpreting this variation. Overall, the project gives new insight into the organization and regulation of genes and the genome, and it serves as a resource of functional annotations for biomedical research.
The human genome sequence provides the underlying code for human biology. The genome has been studied intensively, particularly to identify protein-coding genes. Even so, understanding of the genome is far from complete, especially with regard to alternatively spliced transcripts, non-coding RNAs and regulatory sequences. Systematic analysis of transcripts and regulatory information is needed to identify genes and regulatory regions, and it supports the study of human biology and disease. Such analysis can also give a comprehensive view of the organization and variability of genes and regulatory information across cell types, species and individuals.
The ENCODE project aims to identify the functional elements of the human genome. A functional element is defined as a discrete genome segment that encodes a defined product, such as a protein or non-coding RNA, or that displays a biochemical signature, such as protein binding or a specific chromatin structure.
In this study, the researchers produced 1,640 data sets for a primary analysis of functional elements across the whole genome. They integrated results from diverse experiments within individual cell types, results from related experiments covering 147 cell types, and all ENCODE data together with other resources such as evolutionarily constrained regions and candidate regions from genome-wide association studies (GWAS). Together, these efforts established significant features of the structure and function of the human genome. The findings are summarized below.
- About 80.4 percent of the human genome participates in at least one biochemical RNA-associated or chromatin-associated event in at least one cell type. Much of the genome lies close to a regulatory event: 95 percent of the genome lies within 8 kilobases of a DNA-protein interaction, and 99 percent lies within 1.7 kilobases of at least one of the biochemical events measured by ENCODE.
- Primate-specific elements, and elements without detectable mammalian constraint, show evidence of negative selection when considered in aggregate.
- Classifying the genome into seven chromatin states yields 399,124 regions with enhancer-like features and 70,292 regions with promoter-like features, along with many thousands of quiescent regions. High-resolution analysis further subdivides the genome into numerous narrow states with distinct functional properties.
- RNA sequence production and processing can be quantitatively correlated with both chromatin marks and transcription factor binding at promoters, which indicates that promoter functionality can explain most of the variation in RNA expression.
- Many non-coding variants in individual genome sequences lie in functional regions annotated by ENCODE. These variants are at least as numerous as those that lie in protein-coding genes.
- Single nucleotide polymorphisms (SNPs) that genome-wide association studies have linked to disease are enriched in non-coding functional elements. Most of them lie in or near ENCODE-defined regions outside protein-coding genes. In many cases, the disease phenotype can be associated with a specific cell type or transcription factor.
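
A coverage figure such as the 80.4 percent above is, in essence, the share of bases that fall inside at least one annotated interval. The sketch below illustrates that arithmetic on toy data. It is not the ENCODE pipeline: the function names `merge_intervals` and `covered_fraction`, the half-open coordinate convention, and the toy coordinates are all assumptions made for illustration.

```python
from typing import Iterable, List, Tuple

# Assumption: an interval is half-open [start, end) on a single sequence.
Interval = Tuple[int, int]


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or touching intervals so no base is counted twice."""
    merged: List[Interval] = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Extend the previous interval instead of starting a new one.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


def covered_fraction(intervals: Iterable[Interval], sequence_length: int) -> float:
    """Fraction of the sequence that lies inside at least one interval."""
    covered = sum(end - start for start, end in merge_intervals(intervals))
    return covered / sequence_length


# Toy example (hypothetical coordinates, not real annotations).
toy_length = 1000
toy_annotations = [(0, 200), (150, 400), (600, 700)]
toy_fraction = covered_fraction(toy_annotations, toy_length)
print(f"{toy_fraction:.1%} of the toy sequence is annotated")
```

Merging before summing matters because annotations from different assays overlap heavily. Adding raw interval lengths would count shared bases more than once and overstate the coverage.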