New Bounds for Motif Finding in Strong Instances

  • Conference paper
Combinatorial Pattern Matching (CPM 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4009)

Abstract

Many algorithms for motif finding that are commonly used in bioinformatics start by sampling r potential motif occurrences from n input sequences. The motif is derived from these samples and evaluated on all sequences. This approach works extremely well in practice, and is implemented by several programs. Li, Ma and Wang have shown that a simple algorithm of this sort is a polynomial-time approximation scheme. However, in 2005, we showed specific instances of the motif finding problem for which the approximation ratio of a slight variation of this scheme converges to one very slowly as a function of the sample size r, which seemingly contradicts the high performance of sample-based algorithms. Here, we account for the difference by showing that, for a variety of different definitions of “strong” binary motifs, the approximation ratio of sample-based algorithms converges to one exponentially fast in r. We also describe “very strong” motifs, for which the simple sample-based approach always identifies the correct motif, even for modest values of r.
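
As a rough illustration of the sample-based approach described in the abstract, the Python sketch below enumerates every r-tuple of candidate windows, derives a binary motif from each tuple by column-wise majority vote, and evaluates that motif on all input sequences. Several details are assumptions made for illustration and are not taken from the paper: the function names, the fixed motif length `length`, breaking majority ties toward `0`, and scoring by total Hamming distance to the closest window in each sequence (the objective used in the Consensus-Pattern formulation). It is a minimal sketch, not the authors' implementation.

```python
from itertools import combinations_with_replacement


def hamming(a, b):
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def windows(seq, length):
    """All contiguous substrings of the given length."""
    return [seq[i:i + length] for i in range(len(seq) - length + 1)]


def majority_consensus(sample):
    """Column-wise majority over binary windows; ties go to '0' (assumption)."""
    out = []
    for j in range(len(sample[0])):
        ones = sum(w[j] == "1" for w in sample)
        out.append("1" if 2 * ones > len(sample) else "0")
    return "".join(out)


def cost(motif, sequences):
    """Sum over sequences of the distance from motif to its closest window."""
    return sum(
        min(hamming(motif, w) for w in windows(s, len(motif)))
        for s in sequences
    )


def sample_based_motif(sequences, length, r):
    """Try every r-tuple of windows (with repetition) and keep the best motif."""
    pool = sorted({w for s in sequences for w in windows(s, length)})
    best = None
    for sample in combinations_with_replacement(pool, r):
        motif = majority_consensus(sample)
        c = cost(motif, sequences)
        if best is None or c < best[1]:
            best = (motif, c)
    return best  # (motif, total cost)
```

A call such as `sample_based_motif(["0011010", "1101011", "0110100"], 4, 3)` returns a pair of the best motif found and its total cost. The exhaustive enumeration grows roughly as the number of distinct windows raised to the power r, which is why practical programs sample tuples rather than trying them all.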

References

  1. Brejová, B., Brown, D.G., Harrower, I.M., López-Ortiz, A., Vinař, T.: Sharper upper and lower bounds for an approximation scheme for Consensus-Pattern. In: Apostolico, A., Crochemore, M., Park, K. (eds.) CPM 2005. LNCS, vol. 3537, pp. 1–10. Springer, Heidelberg (2005)

  2. Hertz, G.Z., Stormo, G.D.: Identifying DNA and protein patterns with statistically significant alignments of multiple sequences. Bioinformatics 15(7-8), 563–577 (1999)

  3. Hoeffding, W.J.: Probability inequalities for sums of bounded random variables. Journal of the American Statistical Association 58, 713–721 (1963)

  4. Li, M., Ma, B., Wang, L.: Finding similar regions in many strings. Journal of Computer and System Sciences 65(1), 73–96 (2002)

  5. Liang, C.: COPIA: a new software for finding consensus patterns in unaligned protein sequences. Master’s thesis, University of Waterloo (October 2001)

  6. Liu, J.: A combinatorial approach for motif discovery in unaligned DNA sequences. Master’s thesis, University of Waterloo (March 2004)

  7. McDiarmid, C.: Concentration. In: Habib, M. (ed.) Probabilistic methods for algorithmic discrete mathematics, pp. 195–248. Springer, Heidelberg (1998)

  8. Panconesi, A., Srinivasan, A.: Randomized distributed edge coloring via an extension of the Chernoff-Hoeffding bounds. SIAM Journal on Computing 26, 350–368 (1997)

  9. Pevzner, P.A., Sze, S.: Combinatorial approaches to finding subtle signals in DNA sequences. In: Proceedings of the 8th International Conference on Intelligent Systems for Molecular Biology (ISMB 2000), pp. 269–278 (2000)

Copyright information

© 2006 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Brejová, B., Brown, D.G., Harrower, I.M., Vinař, T. (2006). New Bounds for Motif Finding in Strong Instances. In: Lewenstein, M., Valiente, G. (eds) Combinatorial Pattern Matching. CPM 2006. Lecture Notes in Computer Science, vol 4009. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11780441_10

  • DOI: https://doi.org/10.1007/11780441_10

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-35455-0

  • Online ISBN: 978-3-540-35461-1

  • eBook Packages: Computer Science, Computer Science (R0)
