Main results

a)   We developed a single Web portal, South Green (http://southgreen.cirad.fr/), that gives access to tools and databases for managing genetic and genomic resources of tropical and Mediterranean plants, analyzing transcriptomes, predicting orthologs by phylogenomics, identifying SSRs and SNPs, analyzing genetic diversity data, and performing structural, functional and comparative annotation. The South Green portal currently hosts 20 information systems and tools and targets about 30 plant species. These tools are available online and are heavily used (50,000 queries per month).

 b)   We developed new workflows for the high-throughput analysis of NGS (Next Generation Sequencing) data, covering the following steps: cleaning, assembly, mapping, SNP detection, annotation, and phylogenetic analysis. We also developed a package that gathers the scripts used to analyze high-throughput sequencing data from the ARCAD project. These scripts were mainly used in support of SP1 and SP5. The package is available at https://github.com/SouthGreenPlatform/arcad-hts.

 c)   For SP1, we evaluated different methods for de novo short-read assembly using transcriptome data from two crops with reference genomes: grape and sorghum. We then selected the best-performing methods and parameters to release 29 new plant transcriptomes, including the key species and outgroups sequenced in SP1. We also produced functional annotations and detected SNPs for the different crop individuals. The data are available at http://arcad-bioinformatics.southgreen.fr/.

 d)   We implemented a Galaxy instance (http://gohelle.cirad.fr/galaxy). Galaxy is a workflow manager that allows users to run bioinformatic analyses through a simple Web interface (Figure SP4-1). The South Green Galaxy instance is open to anyone, but anonymous users are limited to 10 MB of data (Maillol, V., et al., 2012, Role of Galaxy in a bioinformatic plant breeding platform, 2012 Galaxy Community Conference). This instance contains a large collection of tools specific to the platform. Access to the workflows developed for the ARCAD SP1 project is currently restricted to users with a dedicated login.

 e)   In March 2013, we obtained the AFNOR certification (ISO 9001:2008) for the following activity: provision of bioinformatics software and equipment for the agronomy industry.

 f)   We developed a Web-based tool for quickly building search interfaces over existing databases without any programming. It is particularly suited to scientific data that can conveniently be displayed in tables. This application was used for the TropGeneDB information system, which will integrate part of the ARCAD data (Hamelin C., et al., 2013. TropGeneDB, the multi-tropical crop information system updated and extended. Nucleic Acids Research).

 g)   GreenPhylDB is a Web resource designed for comparative and functional genomics in plants. The database contains a catalogue of gene families based on complete genomes, covering a broad taxonomy of green plants. Version 3 of GreenPhylDB has been released (http://www.greenphyl.org).
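
To illustrate the kind of SSR (simple sequence repeat) identification mentioned in item a), here is a minimal Python sketch that scans a DNA sequence for perfect tandem repeats using a regular expression. It is not the method used by the South Green tools, which the text above does not describe. The function name `find_ssrs` and the thresholds (units of 2 to 6 bases, at least 4 copies) are assumptions chosen for the example.

```python
import re


def find_ssrs(seq, min_repeats=4, min_unit=2, max_unit=6):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    Hypothetical helper: thresholds are illustrative defaults, not values
    taken from the ARCAD or South Green pipelines.
    """
    seq = seq.upper()
    # Group 1 lazily captures the shortest candidate unit; the backreference
    # then requires at least (min_repeats - 1) further identical copies.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq):
        unit = match.group(1)
        # Skip homopolymer runs such as "AA" repeated, which are not
        # di- to hexanucleotide SSRs in the usual sense.
        if len(set(unit)) == 1:
            continue
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


if __name__ == "__main__":
    for start, unit, copies in find_ssrs("TTGACACACACACAGGT"):
        print(start, unit, copies)
```

Start positions are 0-based. The sketch reports only perfect repeats; imperfect or compound SSRs would need a dedicated tool.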
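
When comparing de novo assemblies as in item c), one commonly used summary statistic is N50: the contig length at which contigs of that length or longer hold at least half of the assembled bases. The text above does not state which criteria were used for the grape and sorghum evaluation, so the following Python sketch is only an illustration of that metric. The helper names `contig_lengths` and `n50` are hypothetical.

```python
def contig_lengths(fasta_text):
    """Return the length of each sequence in a FASTA-formatted string."""
    lengths = []
    current = None  # length of the record being read, None before first header
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if current is not None:
                lengths.append(current)
            current = 0
        elif current is not None:
            current += len(line)
    if current is not None:
        lengths.append(current)
    return lengths


def n50(lengths):
    """Return the N50 of a non-empty list of contig lengths."""
    if not lengths:
        raise ValueError("n50 requires at least one contig length")
    total = sum(lengths)
    running = 0
    # Walk contigs from longest to shortest until half the bases are covered.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    example = ">contig1\nACGTAC\n>contig2\nGG\n"
    print(n50(contig_lengths(example)))
```

N50 rewards contiguity only. For transcriptomes from species with reference genomes, it would normally be read alongside measures of accuracy against the reference.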