Knowledge Representation Resources for Animal Agricultural Researchers

Objective

For the first time in history, biologists have access to technologies that enable them to rapidly generate enormous amounts of data about the genomes of our agricultural species. However, researchers using these technologies now face a major bottleneck in deriving knowledge from data to use it for improving agricultural productivity. Our goal is to enable researchers to accelerate knowledge delivery from research investments by giving them the tools to avoid the current bottleneck. We will do this by linking existing information about how genes work to biological data; developing novel and improved methods for predicting links between our existing knowledge and biological data; and by providing new tools for viewing how biological data relates to different species. The tools, training and resources that we develop are easily extended to other species. Not only will we provide data and tools but we also provide integrated, practical training for the next generation of US researchers. This training ensures that researchers are able to use the resources we provide.

More information

NON-TECHNICAL SUMMARY:
For the first time in history, biologists have access to technologies that enable them to rapidly generate enormous amounts of data about the genomes of our agricultural species. However, researchers using these technologies now face a major bottleneck in deriving knowledge from data to use it for improving agricultural productivity. Our goal is to enable researchers to accelerate knowledge delivery from research investments by giving them the tools to avoid the current bottleneck. We will do this by linking existing information about how genes work to biological data; developing novel and improved methods for predicting links between our existing knowledge and biological data; and by providing new tools for viewing how biological data relates to different species. The tools, training and resources that we develop are easily extended to other species. Not only will we provide data and tools, but we will also provide integrated, practical training for the next generation of US researchers. This training ensures that researchers are able to use the resources we provide. Our training component also specifically targets traditionally under-represented minorities in science and technology. The outcome of this project is that researchers will be able to more effectively and efficiently convert the power of genomic research into gains for US agriculture and consumers. Overall, the impact of our work is the improved ability of researchers to benefit society through improved agricultural systems, renewable energy, aquaculture, human nutrition, food safety and biotechnology. The societal impact of our education initiative is recruitment of minorities to emerging areas of biology via novel education and training opportunities.

APPROACH:

Aim 1: Targeted GO biocuration for agricultural animals. Our strategy of targeting manual biocuration to specific gene products enables us to make the most efficient use of AgBase biocurator time to meet community needs. This aim's experimental design has three parts:
(A) Ranking gene products for targeted, manual biocuration.
(B) Targeted manual biocuration for key genes being intensively studied in each species.
(C) Providing continuing outreach and education to support functional modeling and undergraduate bioinformatics education.

Aim 2: Computational pipelines for generating rapid functional annotations. This aim addresses the critical need for new computational pipelines that rapidly provide the functional annotations required to keep pace with the increasing quantity of omics data. This need is supported by the fact that even for GO founder species, which have had manual biocuration of the literature for a decade, more than 90% of GO annotations are computationally derived. Our experimental design has two parts, corresponding to different types of transcriptional elements identified by functional genomics experiments such as RNA-Seq and microarray data.

(A) Known mRNA/genes. Previously annotated transcripts and genes (including 'predicted' genes computationally identified during genome sequencing and assembly) are identified in RNA-Seq data just as they are in microarray data. GO annotation proceeds exactly as we have done already for annotating microarrays. Briefly, gene products may be GO annotated using published functional literature (if available) or using computational annotation. In cases where literature is available, we will develop a GO annotation strategy that maps iTerms to GO Terms using names (and synonyms) and use this information to provide additional GO annotation.

(B) Novel transcripts. Sequences that map to the genome in regions not currently annotated as being transcribed can be GO annotated using existing computational pipelines (such as InterProScan and our ISO Pipeline), exactly as we do for gene products that have no functional literature. We will expand our efforts to provide GO annotations for novel sequences to include new agricultural animal species (pig, horse, catfish and then turkey and sheep). In addition, we will develop procedures for updating these annotations and link these update procedures to our regular AgBase updates (currently updated bimonthly). Many novel sequences may not contain recognizable functional motifs; in such cases we cannot provide GO annotation. To provide initial functional annotation in these cases, however, we will use the BRaunschweig ENzyme DAtabase (BRENDA) tissue ontology to annotate tissue expression for these gene products. We will make these data publicly available at AgBase.

Aim 3: Linking phenotypes and traits to functions. Our experimental design for this specific aim involves using comparative genomic mapping combined with iTerms to link genes to agricultural traits of interest. We will do this by developing (A) appropriate comparative genomic browsers to allow simultaneous visualization of genomic data across multiple species; and (B) new resources that use computational text mining and knowledge extraction tools to provide large-scale prediction of candidate genes for known QTLs.
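As an illustration of the name-and-synonym mapping of iTerms to GO Terms described under Aim 2 (A), the following is a minimal sketch and not the project's actual pipeline. It assumes that iTerms are available as plain text labels and that GO terms are supplied as a dictionary of ID to (name, synonyms); the function names and the toy GO entries are hypothetical and chosen only for the example.

```python
def normalize(label: str) -> str:
    """Lowercase and collapse hyphens, underscores and extra spaces so that
    trivially different spellings of the same label compare equal."""
    cleaned = label.lower().replace("-", " ").replace("_", " ")
    return " ".join(cleaned.split())


def build_go_index(go_terms):
    """Map every normalized GO name and synonym to its GO ID.

    go_terms: dict of GO ID -> (name, list of synonyms). This input shape is
    an assumption made for the sketch.
    """
    index = {}
    for go_id, (name, synonyms) in go_terms.items():
        for label in (name, *synonyms):
            # Keep the first GO ID seen if two terms share a label.
            index.setdefault(normalize(label), go_id)
    return index


def map_iterms_to_go(iterms, go_terms):
    """Return {iTerm: GO ID} for iTerms whose label exactly matches a GO name
    or synonym after normalization; unmatched iTerms are left out."""
    index = build_go_index(go_terms)
    mapping = {}
    for term in iterms:
        go_id = index.get(normalize(term))
        if go_id is not None:
            mapping[term] = go_id
    return mapping


# Toy entries for illustration only; verify IDs and synonyms against a
# current GO release before any real use.
TOY_GO = {
    "GO:0006915": ("apoptotic process", ["apoptosis"]),
    "GO:0006955": ("immune response", []),
}

if __name__ == "__main__":
    print(map_iterms_to_go(["Apoptosis", "immune-response", "feed efficiency"], TOY_GO))
```

Exact matching after normalization is the simplest possible choice; a labels that match no GO name or synonym (such as "feed efficiency" above) are simply left unmapped and would need manual review or a fuzzier matching step.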

Investigators
McCarthy, Fiona
Institution
University of Arizona
Start date
2012
End date
2015
Project number
ARZW-2013-06024
Accession number
1001965