How do we incorporate evidence codes?

From HAO Wiki

We should think about how to assign evidence codes (http://bioportal.bioontology.org/visualize/36625) to homology statements, and how to incorporate them. The Evidence Code Ontology (ECO) is also available as OBO.

The most relevant ones for us are likely:

id: ECO:0000001
name: inferred by curator
def: "To be used for those cases where an annotation is not supported by any evidence, 
but can be reasonably inferred by a curator from other GO annotations, for which evidence 
is available." [ECO:go]
id: ECO:00000067
name: inferred from electronic annotation
def: "Used for annotations that depend directly on computation or automated transfer of annotations 
from a database, particularly when the analysis is performed internally and not published. A key 
feature that distinguishes this evidence code from others is that it is not made by a curator; use IEA 
when no curator has checked the specific annotation to verify its accuracy. The actual method used 
(BLAST search, Swiss-Prot keyword mapping, etc.) doesn't matter." [GO:IEA]
id: ECO:0000012
name: inferred from functional complementation
def: "Used when an annotation is made based on a functional complementation assay in which a wild-type 
copy of the gene in question is inserted into a mutant background in the organism of the gene's origin 
or a heterologous organism with a mutation in the homologous gene." [TAIR:TED]
id: ECO:0000027
name: inferred from structural similarity
def: "Used when an annotation is made based on the structural similarity of the annotated gene/gene 
product to a single other gene or group of genes. In the case of a single gene, an accession for the 
related gene's sequence is entered in the evidence_with field." [TAIR:TED]
id: ECO:0000033
name: traceable author statement
def: "The TAS evidence code covers author statements that are attributed to a cited source. Typically this 
type of information comes from review articles. Material from the introductions and discussion sections of 
non-review papers may also be suitable if another reference is cited as the source of experimental work or 
analysis. When annotating with this code the curator should use caution and be aware that authors often cite 
papers dealing with experiments that were performed in organisms different from the one being discussed in 
the paper at hand. Thus a problem with the TAS code is that it may turn out from following up the references 
in the paper that no experiments were performed on the gene in the organism actually being characterized in 
the primary paper. For this reason we recommend (when time and resources allow) that curators track down the 
cited paper and annotate directly from the experimental paper using the appropriate experimental evidence code.
When this is not possible and it is necessary to annotate from reviews, the TAS code is the appropriate code 
to use for statements that are associated with a cited reference. Once an annotation has been made to a given 
term using an experimental evidence code, we recommend removing any annotations made to the same term using 
the TAS evidence code." [GO:TAS]
id: ECO:0000034
name: non-traceable author statement
def: "The NAS evidence code should be used in all cases where the author makes a statement that a curator 
wants to capture but for which there are neither results presented nor a specific reference cited in the 
source used to make the annotation. The source of the information may be peer reviewed papers, textbooks, 
database records or vouchered specimens." [GO:NAS]
id: ECO:0000035
name: no biological data
def: "Used for annotations when information about the molecular function, biological process, or cellular 
component of the gene or gene product being annotated is not available. Use of the ND evidence code indicates 
that the annotator at the contributing database found no information that allowed making an annotation to any 
term indicating specific knowledge from the ontology in question (molecular function, biological process, or 
cellular component) as of the date indicated. This code should be used only for annotations to the root terms, 
molecular function ; GO:0003674, biological process ; GO:0008150, or cellular component ; GO:0005575, which, when 
used in annotations, indicate that no knowledge is available about a gene product in that aspect of GO." [GO:ND]
id: ECO:0000057
name: inferred from phenotypic similarity
def: "Used when comparing organisms, in whole or in part, based on the outcomes of expressions of genotypes 
in their environments." [PhenoScape:IPTS]
id: ECO:0000060
name: inferred from positional similarity
def: "Used when an annotation is made based on the similarity of the location and/or arrangement of structures." 
[PhenoScape:IPS]
id: ECO:0000063
name: inferred from compositional similarity
def: "Used when an annotation is made based on the similarity of the histological makeup of structures." 
[PhenoScape:ICS]
id: ECO:0000071
name: inferred from morphological similarity
def: "Used when an annotation is made based on the similarity of the shape, structure or overall configuration 
of structures." [PhenoScape:IMS]
id: ECO:0000080
name: inferred from phylogeny
def: "Used when an annotation is made based on the common ancestry of structures on a particular phylogenetic
tree. Typically, other evidence (a type of similarity) supports a prior hypothesis of homology for these 
structures." [PhenoScape:IP]
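
The listing above follows a simple id/name/def pattern, with long definitions wrapped across several lines. A minimal Python sketch, assuming exactly that layout (no [Term] headers, each term starting at an "id:" line), could read such a listing into records. The function name parse_stanzas and the SAMPLE text are illustrative only and are not part of any existing HAO tooling.

```python
import re

# Matches the three tags used in the listing; anything else is treated
# as a continuation of the previous (wrapped) line.
TAG_LINE = re.compile(r"^(id|name|def):\s*(.*)$")


def parse_stanzas(text):
    """Split an id/name/def listing into a list of dicts, one per term."""
    terms = []
    current = None
    last_key = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = TAG_LINE.match(line)
        if match:
            key, value = match.groups()
            if key == "id":
                # Every "id:" line opens a new term record.
                current = {"id": value}
                terms.append(current)
            elif current is not None:
                current[key] = value
            last_key = key
        elif current is not None and last_key is not None:
            # Wrapped line: append it to the most recent tag's value.
            current[last_key] += " " + line
    return terms


SAMPLE = """id: ECO:0000071
name: inferred from morphological similarity
def: "Used when an annotation is made based on the similarity of the shape, structure or overall configuration
of structures." [PhenoScape:IMS]
id: ECO:0000080
name: inferred from phylogeny
"""

terms = parse_stanzas(SAMPLE)
for term in terms:
    print(term["id"], "-", term["name"])
```

A real OBO file has [Term] headers and more tags than this, so an established OBO parser would be the safer choice for the full ontology; the sketch only covers the excerpt format used on this page.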


There may also be some instances where the following ECO terms are appropriate:

id: ECO:0000006
name: inferred from experiment
def: "Used in an annotation to indicate that an experimental assay has been located in the cited reference, 
whose results indicate a gene product's function, process involvement, or subcellular location (indicated by 
the GO term). The IE code is the parent code for the IDA, IMP, IGI, IEP and IPI experimental codes. The IE 
evidence code can be used where any of the assays described for the IDA, IMP, IGI, IPI or IEP evidence codes 
is reported. However it is highly encouraged that groups should annotate to one of the more granular experimental 
codes (IDA, IMP, IGI, IPI or IEP) instead of IE, and all curators directly involved in the GO Reference Genome 
annotation effort are obliged to use these and not IE. The IE code exists for groups who would like to contribute 
high-quality GO annotations that are produced from directly associating GO terms to gene products by citing 
experimental published results, but where the group is unable to fit the appropriate specific experimental GO 
evidence codes to each annotation." [GOC:ecd]
id: ECO:0000015
name: inferred from mutant phenotype
def: "The IMP evidence code covers those cases when the function, process or cellular localization of a gene 
product is inferred based on differences in the function, process, or cellular localization between two different 
alleles of the corresponding gene. The IMP code is used for cases where one allele may be designated 'wild-type' 
and another as 'mutant'. It is also used in cases where allelic variation occurs naturally and no specific allele 
is designated as wild-type or mutant. Caution should be used when making annotations from gain-of-function 
mutations as it may be difficult to infer a gene's normal function from a gain of function mutation, although it 
is sometimes possible." [GO:IMP]
id: ECO:0000037
name: not_recorded
def: "Used for annotations done before curators began tracking evidence types (appears in some legacy
annotations). It may not be used for new annotations." [GO:NR]
id: ECO:0000067
name: inferred from developmental similarity
def: "Used when an annotation is made based on the similarity of embryological and/or post-embryonic 
origin of structures." [PhenoScape:IDS]
id: ECO:0000094
name: inferred from bioassay
def: "Used when an annotation is made based on assays using living organisms to measure the effect of a 
substance, factor, or condition." [TAIR:TED]
id: ECO:0000174
name: inferred from physiological response
def: "Used when an annotation is made based on the physiological response of a mutant to an external stimulus; 
for example, abnormal growth of the root in response to gravity, delay in flowering in response to varying 
light conditions." [TAIR:TED]
id: ECO:0000179
name: inferred from animal model system

Are any missing? What's the strategy for incorporating these codes? Do we create a homologous_to relationship that's stored elsewhere (i.e., in another database)?
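
One possible shape for an answer, purely as a sketch: keep each homology statement as its own record that pairs two term IDs with an evidence code, stored outside the ontology file. Everything below is hypothetical (the class name, the field names, and the placeholder term IDs HAO:term_A and HAO:term_B); only the ECO IDs come from the list above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HomologyStatement:
    """One homologous_to assertion plus the evidence supporting it."""
    subject: str          # ID of the first structure (placeholder IDs below)
    target: str           # ID of the structure it is homologous to
    evidence_code: str    # an ECO ID, e.g. "ECO:0000071"
    source: str = ""      # optional citation or curator note

    def pair(self):
        # homologous_to is treated as symmetric here, so the pair is
        # unordered: (A, B) and (B, A) refer to the same statement.
        return frozenset((self.subject, self.target))


def evidence_for(statements, a, b):
    """Return the sorted, de-duplicated ECO IDs recorded for the pair (a, b)."""
    wanted = frozenset((a, b))
    return sorted({s.evidence_code for s in statements if s.pair() == wanted})


statements = [
    HomologyStatement("HAO:term_A", "HAO:term_B", "ECO:0000071", "placeholder reference"),
    HomologyStatement("HAO:term_B", "HAO:term_A", "ECO:0000080"),
]

codes = evidence_for(statements, "HAO:term_A", "HAO:term_B")
print(codes)
```

Keeping the evidence on the statement rather than on the term lets the same pair of structures carry several independent lines of evidence, which is the case the similarity and phylogeny codes above seem to anticipate.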
