TIME SERIES DATA CLUSTERING: STATE OF THE ART
Keywords: time series data, time series clustering.
Time series clustering has been an important research field over the last decade, providing useful and effective information in diverse domains. Owing to strong interest from the data mining community, numerous studies have proposed new algorithms and methodologies for identifying clusters in time series data. To provide an overview, this paper surveys and summarizes works that investigate time series data clustering in diverse application fields. The basic concepts of time series clustering are presented, and the surveyed works are organized into three groups: temporal-proximity-based, model-based, and representation-based. The application areas are summarized with a brief description of the data used. The characteristics and particularities of some works are discussed.
This work is licensed under a Creative Commons Attribution 4.0 license.