A Polynomial-time Regular Expressions Implementation

Autores

  • Paulo Eustaquio Duarte Pinto Universidade do Estado do Rio de Janeiro
  • Juan Pedro Alves Lopes Universidade do Estado do Rio de Janeiro
  • Maria Alice Silveira de Brito Universidade do Estado do Rio de Janeiro

DOI:

https://doi.org/10.12957/cadinf.2016.13778

Resumo

Regular expressions are a notation to define regular languagesin terms of simple composable operations. They are equivalentto finite automata in expressive power. In practice, however, modernregular expressions implementations diverge from the original theory.Most changes are made to allow greater expressive power. This conveniencecomes at the cost of making language membership a harderproblem than it could be. In many modern languages, the regex languagemembership is a NP-complete problem. Besides, the way they areimplemented sometimes causes expressions that could be processed inlinear time to take exponential time. This fact may be seen as a securityrisk for many applications that use regular expressions. In this work, wesuggest a simple implementation (based on Thompson’s ConstructionAlgorithm) that has superior worst-case performance than many popularimplementations. We also introduce a didactic notation for automatacreated by this algorithm that makes the adopted implementation easierto understand.

Biografia do Autor

Paulo Eustaquio Duarte Pinto, Universidade do Estado do Rio de Janeiro

IME/DICC

Downloads

Publicado

2017-03-24

Como Citar

Pinto, P. E. D., Lopes, J. P. A., & Brito, M. A. S. de. (2017). A Polynomial-time Regular Expressions Implementation. Cadernos Do IME - Série Informática, 37(Junho), 22–36. https://doi.org/10.12957/cadinf.2016.13778