In recent years, leading scientists, researchers and analysts worldwide have identified Big Data as a revolution in scientific research and one of the most promising trends in IT. This has given impetus to the intensive development of methods and technologies for processing such data and has led to the emergence of a new paradigm for scientific research, “Data-Intensive Scientific Discovery” (DISD).

The project involves theoretical study and experimental work to create an innovative intelligent method and tools for adaptive in silico knowledge discovery and decision making, based on the analysis of Big Data streams for scientific research that are accumulated through computer modeling and simulation experiments. The method is based on machine learning and on procedures for generating rules tailored to the target of the scientific research. Its major advantage is the automatic generation of hypotheses and decision options, with verification and validation performed using standard data sets and the expertise of scientists from the target scientific area. Consequently, researchers from a wide range of scientific areas will be able to apply the new DISD paradigm in their research, which in turn will stimulate scientific discovery and innovation. The tools for applying the method are a scalable framework and a scientific platform that provide access to the in silico knowledge base and the software tools, as well as opportunities to share knowledge, experience and best practices and to support knowledge and technology transfer.

The method will be applied to scientific investigation in the areas of molecular biology and medical genetics in two specific case studies:

(1) Identification of genetic regulatory elements in order to detect unknown genes in sequenced genomes and to support genomic mapping.

(2) Prediction of the type and malignancy of breast cancer based on information about mutations in the genes associated with it, their level of expression and the associated epigenetic information.

The driving force of the project is an interdisciplinary team that combines expertise in information science and technology and in the engineering foundations and technical deployment of software methods and tools with that of established scholars in molecular biology and medical genetics.