Curated data sets for assessing predictions
Standard 1 (Adh.std1.gff) “conservative gene set”
- 43 gene structures (7 single-coding-exon and 36 multi-coding-exon genes)
- Criteria for inclusion:
  - >=95% (most >=99%) of the cDNA aligned to genomic DNA (using sim4)
  - “GT”/“AG” splice site consensus sequences
  - Splice site score from neural net
    - 5’ splice sites: >=0.35 threshold (98% True Positive score)
    - 3’ splice sites: >=0.25 threshold (92% True Positive score)
- Start codon and stop codon annotations from Standard 3 (derived from Adh paper)
- These 43 genes represent “typical” genes
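
The inclusion criteria above amount to a filter over candidate gene structures. The following is a minimal sketch of such a filter, not the procedure actually used to build Standard 1. The `Intron` and `GeneCandidate` classes, their field names, and the `passes_standard1` function are hypothetical; the sketch assumes that the cDNA alignment coverage and the neural net splice site scores have already been computed elsewhere, and that single-coding-exon genes (which have no introns) pass the splice site checks trivially. The start and stop codon annotations taken from Standard 3 are not modeled.

```python
from dataclasses import dataclass, field
from typing import List

# Thresholds taken from the criteria listed above.
MIN_CDNA_COVERAGE = 0.95   # fraction of the cDNA aligned to genomic DNA
MIN_DONOR_SCORE = 0.35     # 5' splice site neural net score
MIN_ACCEPTOR_SCORE = 0.25  # 3' splice site neural net score


@dataclass
class Intron:
    """Hypothetical record for one intron of a candidate gene."""
    donor_dinucleotide: str     # first two bases of the intron (5' splice site)
    acceptor_dinucleotide: str  # last two bases of the intron (3' splice site)
    donor_score: float          # precomputed 5' splice site score
    acceptor_score: float       # precomputed 3' splice site score


@dataclass
class GeneCandidate:
    """Hypothetical record for one candidate gene structure."""
    name: str
    cdna_coverage: float        # precomputed, e.g. from a sim4 alignment
    introns: List[Intron] = field(default_factory=list)


def passes_standard1(gene: GeneCandidate) -> bool:
    """Return True if the candidate meets all modeled inclusion criteria."""
    if gene.cdna_coverage < MIN_CDNA_COVERAGE:
        return False
    for intron in gene.introns:
        # Require the GT/AG splice site consensus.
        if intron.donor_dinucleotide.upper() != "GT":
            return False
        if intron.acceptor_dinucleotide.upper() != "AG":
            return False
        # Require both splice site scores to reach their thresholds.
        if intron.donor_score < MIN_DONOR_SCORE:
            return False
        if intron.acceptor_score < MIN_ACCEPTOR_SCORE:
            return False
    return True


# Illustrative candidates with made-up values (not data from the Adh region).
candidates = [
    GeneCandidate("geneA", 0.99, [Intron("GT", "AG", 0.90, 0.80)]),
    GeneCandidate("geneB", 0.97, [Intron("GC", "AG", 0.90, 0.80)]),  # non-consensus donor
    GeneCandidate("geneC", 0.99, [Intron("GT", "AG", 0.30, 0.80)]),  # weak 5' site
    GeneCandidate("geneD", 0.90, []),                                # low cDNA coverage
    GeneCandidate("geneE", 0.96, []),                                # single coding exon
]

kept = [g.name for g in candidates if passes_standard1(g)]
print(kept)
```

Because every check is applied to every intron, a single weak or non-consensus splice site is enough to exclude a gene, which matches the "conservative" intent of this set.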