<OL> <LI> To collect the complete genomic sequence of six non-O157 STEC strains, one from each of the following serotypes: O26, O111, O103, O121, O45, and O145.
<LI> To search for serotype-specific DNA sequences that can be used to develop a single PCR-based assay for detecting the top six non-O157 STEC serotypes identified by the CDC.
</OL>
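<BR> As an illustration of the second objective, one simple way to shortlist candidate serotype-specific sequences is to compare short subsequences (k-mers) across the finished genomes and keep those found in only one serotype. The sketch below is a hypothetical example, not the project's stated method: the function names, the toy sequences, and the choice of k are all assumptions, and real candidates would still need checking against many more strains before primer design.

```python
from typing import Dict, Set

# Complement table for building reverse complements.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer: str) -> str:
    """Return the lexicographically smaller of a k-mer and its reverse complement."""
    reverse_complement = kmer.translate(_COMPLEMENT)[::-1]
    return min(kmer, reverse_complement)


def canonical_kmers(sequence: str, k: int) -> Set[str]:
    """Collect canonical k-mers, skipping any that contain non-ACGT characters."""
    sequence = sequence.upper()
    found = set()
    for start in range(len(sequence) - k + 1):
        kmer = sequence[start:start + k]
        if set(kmer) <= set("ACGT"):
            found.add(canonical(kmer))
    return found


def serotype_specific_kmers(genomes: Dict[str, str], k: int) -> Dict[str, Set[str]]:
    """Map each serotype to the k-mers present in its genome and in no other genome."""
    per_genome = {name: canonical_kmers(seq, k) for name, seq in genomes.items()}
    specific = {}
    for name, own in per_genome.items():
        others = set().union(*(s for n, s in per_genome.items() if n != name))
        specific[name] = own - others
    return specific


if __name__ == "__main__":
    # Toy sequences for illustration only; these are not real STEC sequences.
    toy_genomes = {"O26": "AAAACCCC", "O111": "AAAAGAGA"}
    for serotype, kmers in serotype_specific_kmers(toy_genomes, k=4).items():
        print(serotype, sorted(kmers))
```

Using canonical k-mers means a sequence and its reverse complement count as the same candidate, so a region is not reported as specific merely because two genomes were assembled on opposite strands.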
APPROACH: Individually sequence genomic DNA from one strain of each of the top six non-O157 STEC serotypes that cause disease in humans (CDC). <BR> 1. Generate random, sheared genomic libraries and sequence them on an in-house 454 Life Sciences GS FLX sequencer to approximately 20x genome coverage. Our pilot study using an E. coli O157:H7 isolate generated about 100 segments with a median size of 140 kilobases and an average size of 49 kilobases. <BR> 2. Use manual finishing techniques to join the segments (approximately 100) into a final, complete genome consensus for each serotype, using PCR technology and an in-house ABI 3730 sequencer.<BR> 3. Use both proprietary and publicly available software to annotate the genomic sequence with known genes (including putative toxin genes) and contrast gene content among serotypes. Deposit the fully annotated sequences in a public database.
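<BR> The coverage target in step 1 and the segment statistics quoted from the pilot study follow from simple arithmetic, sketched below. Average coverage is the number of reads times the mean read length, divided by the genome size. The genome size (5.5 megabases) and mean read length (250 bases) in the usage example are assumed values for illustration only and are not taken from this project; the function names are likewise hypothetical.

```python
import math
from statistics import mean, median
from typing import Dict, List


def reads_for_coverage(genome_size_bp: int, mean_read_length_bp: int,
                       target_coverage: float) -> int:
    """Reads needed so that reads * read_length / genome_size reaches the target."""
    return math.ceil(target_coverage * genome_size_bp / mean_read_length_bp)


def n50(lengths: List[int]) -> int:
    """Length of the segment at which the running total, longest first, reaches half the assembly."""
    if not lengths:
        raise ValueError("need at least one segment length")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length
    raise AssertionError("unreachable for non-empty input")


def summarize_segments(lengths: List[int]) -> Dict[str, float]:
    """Basic summary statistics for a set of assembled segment lengths (in bases)."""
    return {
        "count": len(lengths),
        "total_bp": sum(lengths),
        "mean_bp": mean(lengths),
        "median_bp": median(lengths),
        "n50_bp": n50(lengths),
    }


if __name__ == "__main__":
    # Assumed values for illustration: 5.5 Mb genome, 250-base reads, 20x target.
    print(reads_for_coverage(5_500_000, 250, 20))
    # Hypothetical segment lengths, not data from the pilot study.
    print(summarize_segments([10, 20, 30, 40]))
```

Reporting the mean, median, and N50 together is useful because a few very long segments or many very short ones can pull these figures far apart.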