We aim to develop appropriate bioinformatics tools for managing and exploiting next-generation sequencing (NGS) data, and to make them available first to all partners of the ARCAD project and subsequently to an extended community. This will require exploring the best new technologies for:
- complex high-throughput sequence analysis, ranging from assemblers and multiple-alignment searches to SNP detection and phylogenetic analyses,
- several data models and databases to store heterogeneous data that will enable interoperability between systems,
- several Web interfaces to launch analyses and to synthesize, visualize, query and edit the results.
Our objectives are:
(i) to set up a collaborative development environment to avoid redundancy and to facilitate future bioinformatics developments across organizations,
(ii) to provide training in bioinformatics and support for bioinformatics projects hosted on the ARCAD platform,
(iii) to collaborate (sharing software, workshops, mailing lists, and good practices) with other national and international bioinformatics platforms,
(iv) to ensure quality control in bioinformatics research through a scientific user committee, documentation, data traceability and reliability, CeCILL licences, and indicator measurement.
The project makes use of computing facilities built around a 240-processor computer cluster, a low-latency InfiniBand network, and 65 TB of storage capacity. The amount and nature of the sequencing data raise many bioinformatics challenges in terms of algorithms for NGS analyses and data storage.