Tutorial Abstracts

AM1. Computational analyses across the BioCyc collection of Pathway/Genome Databases
AM2. Synthetic biology with BioBrick parts
AM3. Flux-balance analysis of metabolic networks
PM1. The science behind 23andMe
PM2. Next generation sequencing technologies and applications
PM3. Decoding ENCODE and other genomics information at UCSC

AM1. Computational analyses across the BioCyc collection of Pathway/Genome Databases
Peter Karp, Bioinformatics Research Group at SRI International

BioCyc is a collection of 370 Pathway/Genome Databases for many organisms whose genomes have been completely sequenced. It is a large and comprehensive resource for systems biology research. We expect that many bioinformatics and computational biology researchers will be interested in computing with BioCyc to address global biological questions, such as studying the phylogenetic distribution and evolution of metabolic pathways. The goal of this tutorial is to provide researchers with the information they need to perform global analyses of BioCyc. The tutorial will cover the methodologies used to create BioCyc, a description of the database schema and ontologies that underlie BioCyc, and descriptions of the APIs (in Perl, Java, and Lisp) that are available to query BioCyc. The tutorial will also present the Pathway Tools semantic inference layer, a library of commonly used queries that we have encoded to save researchers time.

Expected outcomes and goals: Students will learn how to perform computational analyses across the large BioCyc collection of Pathway/Genome Databases.

Prerequisites: Basic familiarity with programming and databases, and basic familiarity with biological concepts in genomics and metabolic pathways.


AM2. Synthetic biology with BioBrick parts
Mackenzie Cowell and Jason Morrison, MIT

Synthetic biology is awesome! It's the application of engineering principles and processes to the design and assembly of biological systems, following a cycle of specification, design, modeling, implementation, and testing. It is enabled by augmenting the genetic engineer's toolbox (recombinant DNA, PCR, and automated sequencing) with nascent foundational technologies: automated construction, standardization, and abstraction.

One of the first goals of this growing field is to create a robust set of freely available standard biological parts - modular biological functions engineered to be easily combined while retaining predictable behavior - used to build novel biological devices and systems. The production of these interchangeable biological parts depends on the successful development of standards for both their structural and functional assembly, while their utility depends on the development of a usable abstraction hierarchy, in which complexity can be compartmentalized and hidden, greatly accelerating the design process of devices and systems.

Already over 2000 standard biological parts exist and are available for use from the Registry of Standard Biological Parts at MIT. Last year, the International Genetically Engineered Machine (iGEM) competition shipped over 82,000 of these BioBricks(tm) standard biological parts to more than 53 participating teams around the world. These teams used the parts to create a variety of novel devices and systems, and to create new parts, all of which were contributed back to the community.

In this tutorial we will go over in detail the principles and practice of Synthetic Biology, as described above, and will introduce you to some of the main community resources and participants in the field, such as the Registry of Standard Biological Parts, the iGEM competition, the BioBricks Foundation, SynBERC, and others.

The BioBricks Foundation (BBF) is a non-profit organization devoted to enabling the development of free collections of standard biological parts, currently by supporting an open technical standards setting process and developing a BioBricks(tm) legal scheme.


AM3. Flux-balance analysis of metabolic networks
Markus Covert, Stanford

This tutorial will demonstrate the power of linear optimization for large-scale modeling, particularly of metabolism. First, we will discuss the need for new approaches to model large biological networks. Next, we will describe linear optimization methods, and show how they greatly reduce the parameter problems that are associated with large-scale network modeling. We will then discuss how to reconstruct the stoichiometric matrix for a metabolic network and show examples of how the flux balance approach can drive an experimental discovery process.
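The flux-balance approach described above reduces to a linear program: choose a flux vector v that maximizes an objective flux, subject to the steady-state constraint S·v = 0 (where S is the stoichiometric matrix) and capacity bounds on each flux. As a minimal sketch, the following uses a generic LP solver on a hypothetical three-reaction toy network; the network, reaction names, and capacities are invented purely for illustration, and real reconstructions involve thousands of reactions.

```python
import numpy as np
from scipy.optimize import linprog

# Hypothetical toy network, for illustration only:
#   R1: -> A        (uptake, capacity 10)
#   R2: A -> B      (internal conversion)
#   R3: B ->        (export; the objective flux here)
# Rows = metabolites (A, B); columns = reactions (R1, R2, R3).
S = np.array([[ 1, -1,  0],
              [ 0,  1, -1]])

bounds = [(0, 10), (0, 1000), (0, 1000)]  # per-reaction flux capacities

# linprog minimizes, so negate the objective to maximize flux through R3.
c = [0, 0, -1]

res = linprog(c, A_eq=S, b_eq=np.zeros(2), bounds=bounds)
print(res.x)     # optimal flux distribution: [10. 10. 10.]
print(-res.fun)  # maximal export flux (10.0), limited by the uptake bound
```

Note that steady state forces all three fluxes to be equal, so the optimum is pinned by the tightest capacity; the same structure, scaled up, is what lets flux balance sidestep the kinetic-parameter problem of large-scale network models.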


PM1. The science behind 23andMe
Serge Saxonov and Brian Naughton, 23andMe

23andMe is a personal genomics startup focused on creating web-based tools and content that allow consumers to access and benefit from published scientific research related to their genetic information. 23andMe currently uses an Illumina BeadChip to assay a standard panel of 550K tagging SNPs and a custom panel of 30K targeted SNPs.

This tutorial will cover the following:
• pros and cons of various genotyping platforms, including the design of 23andMe's custom panel of SNPs
• finding, parsing and relating new genetic association research to the public
• designing tools and algorithms to help users explore their genomes
• the potential for new research discoveries with community-driven research

Participants will learn the lay of the land of the new personal genomics field; understand how SNP chips work and how they compare to present- and next-generation sequencing; and learn how now-ubiquitous genome-wide association studies, such as the Wellcome Trust Case Control Consortium's efforts, are performed, and how community-driven research could drive future discoveries.


PM2. Next generation sequencing technologies and applications
Nader Pourmand, Biomolecular Engineering, UCSC

Over the past few years, exciting advances in next-generation sequencing technology have produced commercial instruments capable of profound improvements in sequence throughput and economy. Early applications include cancer genome analysis, microbial genomics, and the study of the Neanderthal genome. Competition in the field is increasingly intense, with start-ups vying with established life-science instrument makers to provide the gold-standard technology that will carry us across the threshold of the "$1000 genome." A central challenge for next-generation sequencing systems has been assembling relatively short reads into complete sequences despite the complexity of nearly all genomes. We will discuss the development, chemistry, and applications of both the Genome Sequencer FLX system and the SOLiD™ System.


PM3. Decoding ENCODE and other genomics information at UCSC
Jim Kent, UCSC

The Encyclopedia of DNA Elements (ENCODE) Project contains a treasure trove of information on the function of various pieces of the genome, based on a variety of modern high-throughput experimental and computational methods. The project is especially interesting for biologists studying chromatin, transcription factors, and noncoding RNA, and for those interested in a very high-quality, curated collection of human coding genes. All of the ENCODE data is available at http://genome.ucsc.edu, though some, as a courtesy to the labs producing the data, should be considered pre-publication. The UCSC web site contains a number of useful tools for browsing and analyzing ENCODE and other genomics data. All the data is also downloadable in dense, easy-to-parse, and well-documented file formats for analysis on your own computers.

This tutorial will start with an overview of the ENCODE data. This includes chromatin immunoprecipitation (ChIP) of histones in various methylation and acetylation states, nuclease hypersensitivity experiments, and RNA expression assays on many cell lines. Together these experiments shed much light on the chromatin state of various cell types, and the implications for transcription. There is also ChIP data for a large number of transcription factors that control the expression of specific genes, and which together are helpful in determining the combinatorial logic of the regulation of transcription. Other experiments aimed at understanding transcriptional regulation include formaldehyde-assisted isolation of regulatory elements, and assays for the methylation status of CpG-rich regions. There is a sub-project, GENCODE, that aims to produce an exhaustive and high-quality list of functional human transcripts employing a variety of methods, including high-throughput sequencing of the ends of G-cap selected RNA, manual curation of existing RNA data, computational predictions, and reverse-transcriptase PCR confirmation of predictions. In addition to these genome-wide experiments, the ENCODE project includes smaller pilot projects to develop high-throughput technologies, both computational and experimental, to explore DNA methylation, RNA/protein interactions, long-range DNA/DNA interactions, and new DNA/protein interaction assays. In all, the ENCODE data is of interest to a wide variety of biomedical researchers, especially those interested in gene regulation.

The ENCODE data is stored and displayed, alongside other useful genomics data, at the UCSC Genome Informatics web site (http://genome.ucsc.edu). In this tutorial we'll explore some of the high points of the ENCODE data as it relates to some interesting genes using the UCSC Genome Browser. In the process we'll review the basic operations of the Genome Browser, and introduce some new and advanced features, such as sending a view by email so it can be shared with a colleague, and adding your own tracks that can be seen alongside the genes, comparative genomics, and ENCODE data built into the Genome Browser. We'll then shift to tools that are more useful for mining data from the genome as a whole rather than from a particular gene. We'll use the Gene Sorter to rapidly find and explore sets of genes that can be defined by many criteria, including tissue expression levels, protein domains, and Gene Ontology terms. We'll use the Table Browser to look at the databases underlying the genome.ucsc.edu site, and to combine annotation tracks to focus on particularly interesting regions of the genome - whether transcribed or regulatory in nature. Finally, we'll see how to download the data for further analysis on your own computers.
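To give a sense of how parse-friendly those download formats are, here is a minimal sketch that reads intervals in BED format, the tab-separated format used for many Genome Browser annotation tracks. Only the first three columns (chromosome, 0-based start, exclusive end) are required; the three records below are made up for illustration, not real ENCODE data.

```python
# Made-up example records in BED format (tab-separated):
# chromosome, start (0-based), end (exclusive), optional name.
sample = """chr1\t1000\t5000\texample_feature_1
chr1\t8000\t9000\texample_feature_2
chr2\t300\t4500\texample_feature_3
"""

def parse_bed(text):
    """Parse BED lines into (chrom, start, end, name) tuples."""
    records = []
    for line in text.splitlines():
        # Skip blank lines and browser/track definition lines.
        if not line or line.startswith(("#", "track", "browser")):
            continue
        fields = line.split("\t")
        chrom, start, end = fields[0], int(fields[1]), int(fields[2])
        name = fields[3] if len(fields) > 3 else None
        records.append((chrom, start, end, name))
    return records

features = parse_bed(sample)
total_bp = sum(end - start for _, start, end, _ in features)
print(len(features), total_bp)  # 3 features covering 9200 bases in total
```

The same few lines of parsing logic apply whether the file holds three intervals or the full genome-wide track downloaded from the site.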
