EADGENE Oligo Sets Annotation Files

Version 4, Ensembl 54: June 2009

Version 3.2, Ensembl 52: 26th January 2009

Following the EADGENE workshop in Lelystad and the pipeline harmonization work done with the other teams involved in annotation, we have:

  • Changed from the Kane criteria to those of Zhili He et al.:
    Under the He et al. criteria, the threshold for a good hit moves from 75% to 85% identity and the threshold for a noise hit moves from 15 bp to 20 bp, because these criteria are adapted to 70-mers. As a result, fewer oligos are annotated, but the new annotation is more accurate.
  • Used a larger set of transcripts, including Ensembl ncRNA genes.
  • Added an "origin" column to the general file.
    By default the origin column is set to 'intron', which means the annotation comes from an Ensembl transcript. Some extra annotation is fetched using UniGene, which retrieves hits that fall in a UTR or in an intron according to Ensembl. In this case the origin value is set to "utr" or "intron".
  • Added a "reverse_transcript" column to the general file.
    Potential hybridisation on the opposite strand is reported in the "reverse_transcript" column.
    '' means the good hit is on the transcript strand, '1' means the only good hit is on the reverse transcript strand, and '2' means there is a good hit on the transcript strand and also one on the opposite strand. We chose not to treat this latter case ('2') as a multi (category 7); the gene given is the one from the transcript strand.
  • For the chicken and bovine gene annotations, we no longer provide the HGNC xref, since
    our data source, the Ensembl 52 API, no longer supplies it. However, the HGNC for the human, mouse and rat orthologues is still available.
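The three "reverse_transcript" values described above can be decoded with a small lookup. This is only a sketch: the function and dictionary names are illustrative, and it assumes the cell is read as a string in which an empty value stands for ''.

```python
# Meanings are taken from the release note above; all names are illustrative.
REVERSE_TRANSCRIPT_MEANING = {
    "": "good hit on the transcript strand",
    "1": "only good hit is on the reverse transcript strand",
    "2": "good hit on the transcript strand and on the opposite strand",
}


def describe_reverse_transcript(value):
    """Return a readable description of a reverse_transcript cell."""
    key = (value or "").strip()
    if key not in REVERSE_TRANSCRIPT_MEANING:
        raise ValueError(f"unexpected reverse_transcript value: {value!r}")
    return REVERSE_TRANSCRIPT_MEANING[key]


def has_opposite_strand_hit(value):
    """True when a good hit exists on the opposite strand ('1' or '2')."""
    return (value or "").strip() in {"1", "2"}
```

A filter such as `has_opposite_strand_hit` could be used to set aside oligos whose signal may come from the opposite strand before downstream analysis.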

Note: 11th February 2009

We chose to move from V3.1 to V3.2. First, the following two bugs were noticed and fixed:

  • The global identity percentage under the Zhili He et al. criteria was computed incorrectly; for some oligos the value was too high.
  • The UTR/intron extension using UniGene only considered genes on the plus strand.

Second, there are two improvements:

  • There is a new column in the general file: the gene description.
  • The GO annotation has been extended thanks to UniProt/Swiss-Prot data.

Note: 26th January 2009

We chose to move from V3 to V3.1 because the following three bugs were noticed and fixed:

  • In the general annotation file, the orthologue description is now given not only for rat but also for mouse and human.
  • All GO terms in the "GO annotation file" and the "GO matrices" are now validated against the original Gene Ontology database.
  • In the GO matrices, probes are now linked not only to their "terminal GO" but also to all of its ancestors.
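Linking a probe to all ancestors of its terminal GO term amounts to walking up the ontology graph. The sketch below illustrates the idea on a toy hierarchy. The identifiers are made up, the `{child: [parents]}` structure is an assumption, and this is not the code used to build the files.

```python
def ancestors(term, parents):
    """Return every ancestor of `term` in a graph given as {child: [parents]}."""
    seen = set()
    stack = list(parents.get(term, ()))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, ()))
    return seen


def propagate(probe_terms, parents):
    """Extend each probe's terminal terms with all of their ancestors."""
    expanded = {}
    for probe, terms in probe_terms.items():
        full = set(terms)
        for term in terms:
            full |= ancestors(term, parents)
        expanded[probe] = full
    return expanded


# Toy hierarchy with made-up identifiers (not real GO terms).
toy_parents = {"T:leaf": ["T:mid"], "T:mid": ["T:root"], "T:root": []}
toy_probes = {"probe_A": {"T:leaf"}, "probe_B": {"T:mid"}}
expanded = propagate(toy_probes, toy_parents)
```

Here `probe_A` ends up linked to the leaf, the intermediate term and the root, while `probe_B` is linked to the intermediate term and the root only.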

EADGENE Oligo Set Annotation Files, Version 2, Ensembl 50: 11th September 2008

Following the EADGENE and SABRE Annotation Workshop (12th Nov 2008, Lelystad), the annotation files below have been updated. For the EADGENE chicken, bovine, and pig chips, oligos are now blasted only on the 'plus' strand of the transcripts. Moreover, for the chicken only, we provide an annotation for oligos that (i) hit against UniGene and an extended Ensembl transcript linked to the UniGene hit; or (ii) hit against UniGene and an Ensembl "intron" linked to the UniGene hit. Extra annotation for the EADGENE pig and bovine chips will come soon.

Gene annotations have been modified as follows: 3031 in the chicken EADGENE annotation files, 830 in the bovine EADGENE annotation files, and 2357 in the pig EADGENE annotation files.

New features in Version 2 release:

1.  The "only one noise" category in the probe quality analysis has been split into two separate categories:

  • probes with one "noise" hit of more than 30 bp
  • probes with one "noise" hit of less than 30 bp

2.  The KEGG annotation file has been split into two files:

  • KEGGgene annotations 
    With the following fields:
    - probe_name,
    - ko_id,
    - species,
    - gene_name,
    - definition,
    - ec
  • KEGGpathway annotations
    - probe_name,
    - hgnc,
    - pathway_id,
    - pathway_description

3.  Four new "matrix of correspondence" files between probes and GO categories or pathways have been added. In each file:

  • the first line contains the GO definitions or the pathway definitions
  • the second line contains the GO IDs or pathway IDs
  • the first column contains the probe names
  • if a given probe is annotated with a given GO ID or pathway ID, the intersection of the probe's line and the ID's column contains '1'; if not, it contains '0'.

For pathways, there is one matrix file (...matrix_pathway...), and for GO there are three matrix files (...matrix_GObp..., ...matrix_GOcc..., ...matrix_GOmf...), respectively for biological process, cellular component and molecular function.
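A matrix file laid out as described above can be turned back into a probe-to-ID mapping. The sketch below makes several assumptions not stated in the release notes: the file is comma-separated, the first cell of each of the two header lines is a placeholder, and the sample identifiers are invented for illustration.

```python
import csv
import io


def read_matrix(handle, delimiter=","):
    """Parse a correspondence matrix into (id -> definition, probe -> set of ids)."""
    rows = list(csv.reader(handle, delimiter=delimiter))
    definitions, ids, data = rows[0], rows[1], rows[2:]
    # Skip the first cell of both header lines (assumed to be a placeholder).
    id_to_definition = dict(zip(ids[1:], definitions[1:]))
    probe_to_ids = {}
    for row in data:
        if not row:
            continue  # ignore blank lines
        probe, flags = row[0], row[1:]
        probe_to_ids[probe] = {
            term_id for term_id, flag in zip(ids[1:], flags) if flag.strip() == "1"
        }
    return id_to_definition, probe_to_ids


# Tiny in-memory sample with made-up identifiers.
sample = io.StringIO(
    ",first term,second term\n"
    ",ID:1,ID:2\n"
    "probe_A,1,0\n"
    "probe_B,1,1\n"
)
definitions, annotations = read_matrix(sample)
```

With the sample above, `annotations["probe_B"]` holds both IDs and `definitions["ID:2"]` gives the matching definition. For a real file, pass an open file handle instead of the `StringIO` object and adjust the delimiter if needed.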

Bugs fixed:

  • orthologous gene definition fields are no longer truncated to 64 characters
  • the "General annotation file" now has a header
  • files are now CSV-compliant
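Since the Version 2 files are CSV-compliant, the KEGGpathway annotations can be read with a standard CSV reader. The sketch below groups pathways by probe. It assumes the file has a header row using the field names listed above (probe_name, hgnc, pathway_id, pathway_description), is comma-separated, and holds one row per probe/pathway pair. The sample rows are invented.

```python
import csv
import io
from collections import defaultdict


def pathways_by_probe(handle):
    """Group (pathway_id, pathway_description) pairs by probe_name."""
    grouped = defaultdict(list)
    for row in csv.DictReader(handle):
        grouped[row["probe_name"]].append(
            (row["pathway_id"], row["pathway_description"])
        )
    return dict(grouped)


# Made-up rows using the documented field names.
sample = io.StringIO(
    "probe_name,hgnc,pathway_id,pathway_description\n"
    "probe_A,GENE1,path:1,first pathway\n"
    "probe_A,GENE1,path:2,second pathway\n"
    "probe_B,GENE2,path:1,first pathway\n"
)
grouped = pathways_by_probe(sample)
```

The same pattern would apply to the KEGGgene annotations by swapping in its field names.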

EADGENE Oligo Set Annotation Files, Version 1, Ensembl 49: 30th June 2008

Results of work from Work Package 1.3 (especially INRA and WUR)

The description file is a two-page document (PDF format) giving general information about the oligo set and figures about the annotation results.

The three other files correspond to the annotation results:

The general annotation file contains:

  • oligo ID
  • specificity category (1 to 6; see the PDF file)
  • oligo position (chromosome, start, end, strand)
  • gene name
  • Human orthologous gene ID, HGNC and description
  • Mouse orthologous gene ID, HGNC and description
  • Rat orthologous gene ID, HGNC and description
  • link to Uniprot/SWISSPROT
  • link to UniGene
  • link to RefSeq_peptide
  • link to HGNC

The GO annotation file contains:

  • oligo ID
  • GO ontology
  • GO ID
  • GO evidence code
  • GO description

The KEGG annotation file contains:

  • oligo ID
  • KO link
  • EC number
  • pathway ID
  • pathway description



EADGENE is a Network of Excellence supported by funding under the 6th Research Framework Programme of the European Union European Commission. EU Contract No. FOOD-CT-2004-506416. This website represents the views of the Authors, not the European Commission. The Commission is not liable for any use that may be made of the information.


© 2006 by EADGENE