Information Retrieval Systems

  • Sonia Ordoñez Salinas, Universidad Distrital Francisco José de Caldas and Universidad Nacional de Colombia
  • Fabio A. González O., Universidad Nacional de Colombia and University of Memphis
Keywords: Information Retrieval Systems, document representation, document categorization. (es_ES)

Abstract (es_ES)

This document reviews the main contributions made to the field of Information Retrieval (IR) systems. Because the efficiency and performance of these systems depend on several subsystems, each of which has grown and changed independently, this review divides information retrieval systems into four broad topics: representation of documents and queries, data structures, selection of relevant documents, and efficiency of retrieval systems. Finally, drawing on the documents studied, future work based on semantics is proposed.

Abstract (en_US)

This document presents a survey of the main contributions that have been made in the field of Information Retrieval (IR) systems. Since the efficiency and performance of these systems depend on several subsystems, each of which has grown and changed independently, this survey is organized around four broad thematic areas: representation of documents and queries, data structures, selection of relevant documents, and efficiency of retrieval systems. Finally, IR systems based on semantics are discussed.


How to cite
Ordoñez Salinas, S., & González O., F. A. (2003). Sistemas de recuperación de información. Ingeniería, 9(1), 57-62. https://doi.org/10.14483/23448393.2743
Published: 2003-11-30
Section
Science, research, academia and development
