List of 4075 genes selected by D. Studholme at Exeter
Helder has just sent you a list of 4075 genes selected by D. Studholme because they show none of the following three obvious potential problems:
(1) Missing stop codon (i.e. the final predicted exon lacks an in-frame stop codon; given that no UTRs are annotated, I assume that the final codon should be an in-frame stop codon).
(2) Multiple possible ATG start codons. In other words, it is possible that the gene model has the wrong start codon.
(3) Very short first exon.
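The three checks above can be sketched in a few lines of Python. This is only an illustration, not the filter actually used at Exeter: the function names, the window of 30 codons used to look for a second in-frame ATG, and the 10 nt cut-off for a "very short" first exon are all assumptions made for the example, since the original thresholds are not stated here.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def has_terminal_stop(cds):
    """True if the CDS length is a multiple of 3 and its last codon is a stop."""
    cds = cds.upper()
    return len(cds) >= 3 and len(cds) % 3 == 0 and cds[-3:] in STOP_CODONS


def alternative_starts(cds, window_codons=30):
    """Offsets (0-based) of in-frame ATGs found after the annotated start codon.

    Only the first `window_codons` codons are scanned (assumed value).
    """
    cds = cds.upper()
    limit = min(len(cds) - 2, window_codons * 3)
    return [i for i in range(3, limit, 3) if cds[i:i + 3] == "ATG"]


def flag_gene_model(cds, exon_lengths, window_codons=30, min_first_exon=10):
    """Return the list of potential problems found in one gene model.

    cds          -- coding sequence, exons concatenated, 5' to 3'
    exon_lengths -- exon lengths in nt, in transcript order
    The two thresholds are hypothetical defaults.
    """
    flags = []
    if not has_terminal_stop(cds):
        flags.append("missing_stop")
    if alternative_starts(cds, window_codons):
        flags.append("multiple_atg")
    if exon_lengths[0] < min_first_exon:
        flags.append("short_first_exon")
    return flags


# A model with no terminal stop, a second in-frame ATG and a 2 nt first exon
print(flag_gene_model("ATGATGAAACCC", [2, 10]))
```

A gene model for which flag_gene_model returns an empty list would correspond to the kind of model retained in the list of 4075, under the assumptions stated above.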
This selection does not imply that these 4075 gene models are right, but there is no obvious evidence that they are wrong. Therefore, they were prioritised by Exeter for their ORFeome project.
We agreed in July to have a look at the first 600 of these genes (20 for each annotator) as a pilot study, in order to get your feedback on the current settings of the Zt Web Apollo site before opening it to the community. These genes are likely to be easier to check than a random selection.
When annotating, please check that the gene you are working on does not contain an open reading frame from a transposon using the track . Transposon ORFs are difficult to annotate and are therefore excluded from the community annotation process.
Use it for comparison with other well-annotated fungal genomes.
First use the bw tracks of the RR files. If these are not sufficient, find the best samples in other tracks.
Overall, select only a few RNAseq tracks, i.e. those in which your gene displays a sufficient number of reads (>20) and, if possible, RNAseq conditions in which the neighboring genes are not expressed. Indeed, the gene-coding part of the Zt genome is compact, and RNAseq reads from neighboring genes frequently overlap, making it difficult to define the start of a gene's 5’UTR and the end of its 3’UTR.
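For annotators who have per-gene read counts at hand, the selection rule above could be sketched as follows. This is a minimal illustration only: the function name, the input layout (a dictionary of read counts per track and per gene), the track and gene identifiers, and the limit of three tracks are assumptions made for the example; only the >20 reads threshold comes from the text.

```python
def rank_tracks(counts, gene, neighbours, min_reads=20, max_tracks=3):
    """Pick the RNAseq tracks best suited to annotate `gene`.

    counts     -- {track_name: {gene_id: read_count}} (hypothetical layout)
    gene       -- identifier of the gene being annotated
    neighbours -- identifiers of the flanking genes
    A track is kept only if the gene has more than `min_reads` reads.
    Kept tracks are ordered by fewest reads on the neighbours, then by
    most reads on the gene itself.
    """
    candidates = []
    for track, per_gene in counts.items():
        target = per_gene.get(gene, 0)
        if target <= min_reads:
            continue
        neighbour_reads = sum(per_gene.get(n, 0) for n in neighbours)
        candidates.append((neighbour_reads, -target, track))
    candidates.sort()
    return [track for _, _, track in candidates[:max_tracks]]


# Made-up counts, for illustration only
example_counts = {
    "track_A": {"gene_1": 50, "gene_0": 0, "gene_2": 0},
    "track_B": {"gene_1": 200, "gene_0": 30, "gene_2": 5},
    "track_C": {"gene_1": 10, "gene_0": 0, "gene_2": 0},
}
print(rank_tracks(example_counts, "gene_1", ["gene_0", "gene_2"]))
```

In this made-up example track_C is discarded (10 reads only), and track_A is preferred over track_B because the neighboring genes are silent in it, even though track_B has more reads on the gene itself.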
Gene naming (to give to the curated gene in the main track)
We agreed in July to use the number defined by WUR (track), using the prefix ZtWA for Zymoseptoria tritici Web Apollo.
Since the JGI annotation is wrong (introns not supported by RNAseq data in this case), the Wageningen model is chosen for the user-created annotation track and renamed ZtWA_66101.
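The naming rule can be written down as a tiny helper, which may be handy when checking names in bulk. The function names are hypothetical, and the format (prefix, underscore, then the WUR number as plain digits) is inferred from the single example ZtWA_66101 rather than from a formal specification.

```python
import re

PREFIX = "ZtWA"  # Zymoseptoria tritici Web Apollo
NAME_PATTERN = re.compile(r"^ZtWA_\d+$")  # format inferred from ZtWA_66101


def ztwa_name(wur_number):
    """Build the curated gene name from the number defined by WUR."""
    number = str(wur_number).strip()
    if not number.isdigit():
        raise ValueError("WUR number must contain digits only: %r" % wur_number)
    return "%s_%s" % (PREFIX, number)


def is_valid_ztwa_name(name):
    """True if `name` follows the assumed ZtWA_<number> format."""
    return bool(NAME_PATTERN.match(name))


print(ztwa_name(66101))  # ZtWA_66101
```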
This topic will be further discussed at Kiel in September 2017.
We need your feedback on the tracks displayed: are they all useful? Do we need additional ones? What problems have you encountered?
You can use the Google sheet sent by Helder to leave your comments online, gene by gene. We will open a dedicated mailing list early next week so that you can send questions/comments to all annotators.