New Algorithms for δγ-Order Preserving Matching

Nuevos Algoritmos para Búsqueda de Orden δγ

  • Juan Mendivelso Universidad Nacional de Colombia
  • Rafael Niquefa Institución Universitaria Politécnico Grancolombiano
  • Yoan Pinzón Pontificia Universidad Javeriana. Cali - Colombia
  • Germán Hernández Universidad Nacional de Colombia. Sede Bogotá
Keywords: String searching, experimental algorithm analysis, strings similarity metric. (en_US)

Abstract (en_US)

Context: Order-preserving matching regards the relative order of strings. However, its application areas require more flexibility in the matching paradigm. We advance in this direction in this paper that extends our previous work [27].

Method: We define γ -order preserving matching as an approximate variant of order-preserving matching. We devise two solutions for it based on segment and Fenwick trees: segtreeBA and bitBA.

Results: We experimentally show the efficiency of our algorithms compared to the ones presented in [26] (naiveA and updateBA). Also, we present applications of our approach in music retrieval and stock market analysis.

Conclusions: Even though the worst-case time complexity of the proposed algorithms (namely, O(nm log m)) is higher than the Ѳ(nm)-time complexity of updateBA, their Ω (n log n) lower bound makes them more efficient in practice. On the other hand, we show that our approach is useful to identify similarity in music melodies and stock price trends through real application examples.

Abstract (es_ES)

Contexto: El emparejamiento de cadenas según el orden compara la estructura de las cadenas de texto. Sin embargo, sus áreas de aplicación requieren mayor flexibilidad en el criterio de comparación. Este artículo avanza en esta dirección al extender [27]. 

Método: Se define la búsqueda de orden – γ como una variante aproximada del problema de emparejamiento de cadenas según orden. Se proponen dos soluciones basadas en árboles de segmentos y árboles Fenwick: segtree BA and bit BA.

Resultados: La eficiencia de los algoritmos propuestos se muestra experimentalmente comparándolos con los algoritmos presentados en [26] (naive A y update BA). Además, se presentan aplicaciones.

Conclusiones: A pesar de que la complejidad en tiempo de peor-caso de los algoritmos propuestos (a decir, O (nm log m)) es mayor que la complejidad de update BA (Ѳ (nm)), su cota baja Ω(n log n) los hace más eficientes en la práctica. También se muestran aplicaciones del enfoque propuesto en recuperación de música y análisis del mercado de acciones con ejemplos reales.

Downloads

Download data is not yet available.

Author Biography

Juan Mendivelso, Universidad Nacional de Colombia

Ingeniero de sistemas, Msc, PhD. Grupo de investigación DiscreMath: Matemáticas Discretas y Ciencias de la Computación, Facultad de Ciencias, Departamento de Matemáticas, Universidad Nacional de Colombia, Bogotá, Colombia.

References

MIDI Association. Midi official webpage. [Online]. Available https://www.midi.org/".

Peter Brass. Advanced Data Structures. Cambridge University Press, 2008".

https://doi.org/10.1017/CBO9780511800191

Emilios Cambouropoulos, Maxime Crochemore, Costas Iliopoulos, Laurent Mouchard, and Yoan Pinzon. Algorithms for computing approximate repetitions in musical sequences. International Journal of Computer Mathematics, 79(11):1135–1148, 2002".

https://doi.org/10.1080/00207160213939

Tamanna Chhabra, M. Oguzhan Kulekci, and Jorma Tarhio. Alternative algorithms for order-preserving matching. In Jan Holub and Jan Zdárek, editors, Proceedings of the Prague Stringology Conference 2015, pages 36–46, Czech Technical University in Prague, Czech Republic, 2015".

Tamanna Chhabra and Jorma Tarhio. Order-preserving matching with filtration. In Proceedings of the 13th International Symposium on Experimental Algorithms - Volume 8504, pages 307–314, New York, NY, USA, 2014. Springer-Verlag New York, Inc.".

https://doi.org/10.1007/978-3-319-07959-2_26

Tim Crawford, Costas S. Iliopoulos, and Rajeev Raman. String-Matching Techniques for Musical Similarity and Melodic Recognition. Computing in Musicology, 11:71–100, 1998".

Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Wale´n. Order-Preserving Incomplete Suffix Trees and Order- Preserving Indexes, pages 84–95. Springer International Publishing, Cham, 2013".

https://doi.org/10.1007/978-3-319-02432-5_13

Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walen. Order-preserving suffix trees and their algorithmic applications. CoRR, abs/1303.6872, 2013".

Maxime Crochemore, Costas S. Iliopoulos, Tomasz Kociumaka, Marcin Kubica, Alessio Langiu, Solon P. Pissis, Jakub Radoszewski, Wojciech Rytter, and Tomasz Wale’n. Order-preserving indexing. Theor. Comput. Sci., 638(C):122–135, July 2016".

https://doi.org/10.1016/j.tcs.2015.06.050

Mark de Berg, Otfried Cheong, Marc van Kreveld, and Mark Overmars. More geometric data structures: Windowing. Computational Geometry: Algorithms and Applications, pages 219–241, 2008".

Simone Faro and M. Oguzhan Kulekci. Efficient algorithms for the order preserving pattern matching problem. CoRR, abs/1501.04001, 2015".

Peter M Fenwick. A new data structure for cumulative frequency tables. Software: Practice and Experience, 24:327–336, 1994".

https://doi.org/10.1002/spe.4380240306

Paweł Gawrychowski and Przemysław Uznanski. Order-preserving pattern matching with k mismatches. Theoretical Computer Science, 638:136 – 144, 2016". Pattern Matching, Text Data Structures and Compression Issue in honor of the 60th birthday of Amihood Amir".

Md Mahbubul Hasan, ASM Shohidull Islam, Mohammad Saifur Rahman, and M Sohel Rahman. Order preserving pattern matching revisited. Pattern Recognition Letters, 55:15–21, 2015".

https://doi.org/10.1016/j.patrec.2014.11.013

Md Mahbubul Hasan, ASM Sohidull Islam, Mohammad Saifur Rahman, andMSohel Rahman. Order preserving prefix tables. In International Symposium on String Processing and Information Retrieval, pages 111–116. Springer, 2014".

Inc IMDb. John Williams imdb profile. [Online]. Available http://www.imdb.com/name/

Inc IMDb. Star Wars: Episode V - The Empire Strikes Back (original title) imdb page. [Online]. Available http://www.imdb.com/title/tt0080684/".

Jinil Kim, Peter Eades, Rudolf Fleischer, Seok-Hee Hong, Costas S. Iliopoulos, Kunsoo Park, Simon J. Puglisi, and Takeshi Tokuyama. Order-preserving matching. Theoretical Computer Science, 525:68 – 79, 2014. Advances in Stringology".

https://doi.org/10.1016/j.tcs.2013.10.006

Marcin Kubica, Tomasz Kulczynski, Jakub Radoszewski, Wojciech Rytter, and Tomasz Walén. A linear time algorithm for consecutive permutation pattern matching. Information Processing Letters, 113(12):430–433, 2013".

https://doi.org/10.1016/j.ipl.2013.03.015

Inbok Lee, Juan Mendivelso, and Yoan J. Pinzón. γδ– Parameterized Matching, pages 236-248. Springer Berlin Heidelberg, Berlin, Heidelberg, 2009".

Juan Mendivelso. Definition and solution of a new string searching variant termed γδ -parameterized matching. Master's thesis, National University of Colombia, Bogota, Colombia, 2010".

Juan Mendivelso, Inbok Lee, and Yoan J. Pinzón. Approximate Function Matching under δ- and γ- Distances, pages 348–359. Springer Berlin Heidelberg, Berlin, Heidelberg, 2012".

Juan Mendivelso, Camilo Pino, Luis F. Ni-o, and Yoan Pinzón. Approximate Abelian Periods to Find Motifs in Biological Sequences, pages 121–130. Springer International Publishing, Cham, 2015".

https://doi.org/10.1007/978-3-319-24462-4_11

Juan Mendivelso and Yoan Pinzón. A novel approach to approximate parikh matching for comparing composition in biological sequences. In Proceedings of the 6th International Conference on Bioinformatics and Computational Biology (BICoB 2014), 2014".

Rafael Niquefa. Definition and solution of a new approximate variant of the order preserving matching problem. Master's thesis, Universidad Nacional de Colombia, 2017".

Rafael Niquefa, Juan Mendivelso, Germán Hernández, and Yoan Pinzón. Order Preserving Matching under δγ-approximation. In Congreso Internacional de Ciencias Básicas e Ingeniería, 2017".

Rafael Niquefa, Juan Mendivelso, Germán Hernández, and Yoan Pinzón. Segment and fenwick trees for approximate order preserving matching. In Workshop on Engineering Applications, pages 131–143. Springer, 2017".

https://doi.org/10.1007/978-3-319-66963-2_13

Thomas Scarff. MIDI note numbers. [Online]. Available http://www.electronics.dit.ie/staff/

How to Cite
Mendivelso, J., Niquefa, R., Pinzón, Y., & Hernández, G. (2018). New Algorithms for δγ-Order Preserving Matching. Ingeniería, 23(2), 190-202. https://doi.org/10.14483/23448393.13248
Published: 2018-05-31
Section
Computational Intelligence