Padmini Srinivasan, Professor

Tippie Research Fellow

The University of Iowa

School of Library and Information Science

Department of Management Sciences

Computer Science Department (courtesy)

College of Nursing (courtesy)

Phone: 319-335-5708  Fax: 319-335-5374

padmini-srinivasan@uiowa.edu

Education:

M.Sc. (Hons) Biological Sciences, Birla Institute of Technology & Science, Pilani, India, 1978;
Ph.D. Information Studies, Syracuse University, 1985

Research Interests:

Text mining, web mining, topical crawlers, text categorization, information filtering, formal models for information retrieval (with special emphasis on biomedical applications).

Text Retrieval & Text Mining Reading Group: Current Semester

Ph.D Students:

Miguel Ruiz, Associate Professor, University of North Texas. PhD. 2001

Gautam Pant, Assistant Professor, The University of Utah. PhD. 2004

Xin Ying Qiu, Assistant Professor, Christopher Newport University. PhD. 2007

Aditya Sehgal, Research Scientist, Parity Computing, San Diego.

Ha Thuc Viet (current)

Hudong Wang (current)

Brian Almquist (current)

Prospective Ph.D Students:  If you are interested in working with me on text retrieval and text mining research, the following three options are available: (1) Ph.D in Informatics (subtracks available in Health Informatics and in Information Science), (2) Ph.D in Management Sciences, and (3) Ph.D in Computer Science.  Please follow the appropriate application guidelines.

Prospective MA SLIS Students:

If you are interested in applying for the IMLS-funded fellowships in Digital Librarianship, please see the SLIS IMLS Website for application details.

Current Funded Projects:

P.I. Recruiting and Educating the 21st Century Digital Librarian. Institute of Museum and Library Services. December 2006 - December 2009. Co-PIs Jim Elmborg and Paul Soderdahl.  UI Press Release SLIS IMLS Website

Recently Funded Projects:

P.I. Text Metadata Mining: Extending the Frontiers of Text-Based Applications in Biomedicine. National Science Foundation. August 2003 - July 2007. Manjal - our Prototype Text Mining System

Sample Papers

Text Mining and Allied Problems:

Prototype Text Mining System: Manjal: Mining MEDLINE using a single topic, a pair of topics or a larger group of topics. Implementation partly supported by NSF grant no. IIS-0312356.

  1. Ha-Thuc, V., Srinivasan, P. A Robust Learning Approach for Text Classification.  Text Mining Workshop. Eighth SIAM International Conference on Data Mining (SDM 2008). Atlanta, April 2008.
  2. Srinivasan, P. Adaptive classifiers, topic drifts and GO annotations.  Proceedings of the American Medical Informatics Conference. November 2007. word file (Distinguished Paper Award Finalist)
  3. Lee, W-J., Raschid, L., Srinivasan, P.,  Shah, N., Rubin, D., and Noy, N.  Using Annotations from Controlled Vocabularies to Find Meaningful Associations. Proceedings of the Workshop on Data Integration in the Life Sciences (DILS 2007). pdf file
  4. Ha-Thuc, V., Srinivasan, P. Exploiting synonym relationships in biomedical named entity matching. (short paper) BioLINK 2007 Workshop of ISMB 2007. word file
  5. Qiu, Xin Ying and Srinivasan, P. GO for Gene Documents. BMC Bioinformatics, 2007, 8(Suppl 9):S3. doi:10.1186/1471-2105-8-S9-S3.
  6. Sehgal, A. K., Qiu, X. Y., Srinivasan, P. Analyzing LBD Methods using a General Framework.  To appear in: Bruza, P. and Weeber, Marc (eds) Literature-Based Discovery. Springer's series on "Information Science and Knowledge Management". 2007.
  7. Sehgal, A.K. and Srinivasan, P. Retrieval with Gene Queries. BMC Bioinformatics (Research Paper), 7:220, 2006.
  8. Qiu, Xin Ying and Srinivasan, P. GO for Gene Documents. ACM First International Workshop on Text Mining in Bioinformatics (TMBIO). In conjunction with ACM 15th Conference on Information and Knowledge Management (CIKM 2006). November 2006. pdf file
  9. Sehgal A.K. and Srinivasan P. Improving Retrieval for Gene Queries. ACM SIGIR Workshop on Predicting Query Difficulty. Salvador, Brazil. August 2005. pdf file
  10. Zhou L. and Srinivasan P. Concept Space Comparisons: Explorations with Five Health Domains. Proceedings of the American Medical Informatics Conference. October 2005
  11. Sehgal A.K, Srinivasan P, Bodenreider O. Gene Terms and English Words: An Ambiguous Mix. SIGIR 2004 Workshop on Search and Discovery for Bioinformatics. July 29, 2004. pdf file
  12. Srinivasan P. and Libbus B. Mining MEDLINE for Implicit Links between Dietary Substances and Diseases. ISMB 2004 and in Bioinformatics (Supplement). pdf file
  13. Srinivasan P., Libbus B. and Sehgal A. K. Mining MEDLINE: Postulating a Beneficial Role for Curcumin Longa in Retinal Diseases. HLT Biolink 2004.  pdf file
  14. Light M., Qiu X.Y. and Srinivasan P. The Language of Bioscience: Facts, Speculations, and Statements in between. HLT Biolink 2004.  pdf file
  15. Catona E., Srinivasan P. and Street W. Protein Annotation with GO Codes. Poster. MEDINFO 2004, Sept. 7-11, 2004, San Francisco, California.
  16. Srinivasan P. and Hristovski D. Distilling Conceptual Connections from MeSH Co-Occurrences. MEDINFO 2004, Sept. 7-11, 2004, San Francisco, California.
  17. Srinivasan P. Text Mining: Generating Hypotheses from MEDLINE. JASIST 55(5), 396-413, March 2004. pdf file
  18. Srinivasan P. and Sehgal A.K. Mining MEDLINE for Similar Genes and Similar Drugs. Technical Report, Department of Computer Science, The University of Iowa, July 2003, TR# 03-02. pdf file
  19. McKnight L. and Srinivasan P. Categorization of Sentence Types in Medical Abstracts. Proceedings of the 2003 AMIA conference. November 2003, Washington D.C. word file
  20. Sehgal A. Qiu X.Y. and Srinivasan P. Mining MEDLINE Metadata to Explore Genes and their Connections. Proceedings of the SIGIR 2003 Workshop on Text Analysis and Search for Bioinformatics. July 2003. pdf file
  21. Srinivasan P. and Wedemeyer M. Mining Concept Profiles with the Vector Model or Where on Earth are Diseases being Studied? Proceedings of the Text Mining Workshop. Third SIAM International Conference on Data Mining. San Francisco May 2003. pdf file
  22. Srinivasan P. and Rindflesch T. Exploring Text Mining from MEDLINE. Annual Conference (2002) of the American Medical Informatics Association (AMIA 2002). pdf file.
  23. Srinivasan P. MeSHmap: A Text Mining Tool for MEDLINE. Proceedings of the Annual Conference (2001) of the American Medical Informatics Association (AMIA). March 2001.  word file

Web based research (Web Mining, Topical Crawlers etc.)

  1. Sehgal, A.K. and Srinivasan, P. Profiling Topics on the Web. WWW Conference Workshop on I3: Identity, Identifiers, Identification. Entity-Centric Approaches to Information and Knowledge Management on the Web (WWW 2007). May 2007. pdf file
  2. Pant G. and Srinivasan P. Link Contexts in Classifier-Guided Topical Crawlers. IEEE Transactions on Knowledge and Data Engineering, 18(1), 107-122, January 2006. pdf file
  3. Pant G. and Srinivasan P. Learning to Crawl: Comparing Classifier Schemes. ACM Transactions on Information Systems, 23(4), 430-462, 2005.  pdf file
  4. Srinivasan P. Menczer F. and Pant G. A General Evaluation Framework for Topical Crawlers. Information Retrieval 8(3): 417-447, 2005. pdf file
  5. Menczer F., Pant G., and Srinivasan P. Topic-driven crawlers: Machine learning issues. ACM Transactions on Internet Technology. Special Issue on Machine Learning for the Internet. 4(4), November 2004. pdf file
  6. Pant G., Srinivasan P. and Menczer F. Crawling the Web. in M. Levene and A. Poulovassilis, editors: Web Dynamics, Springer-Verlag, 2004. pdf file
  7. Srinivasan P., Menczer F. and Pant G. Defining Evaluation Methodologies for Topical Crawlers (Position paper). Proceedings of the SIGIR 2003 Workshop on Defining Evaluation Methodologies for Terabyte-Scale Collections. July 2003. pdf file
  8. Pant G., Srinivasan P. and Menczer F. Exploration versus Exploitation in Topic Driven Crawlers. WWW 2002 Workshop on Web Dynamics. pdf file
  9. Srinivasan P., Mitchell J., Bodenreider O., Pant G. and Menczer F. Web Crawling Agents for Retrieving Biomedical Information. Proceedings of NETTAB 2002 Workshop on Agents in Bioinformatics. Bologna, Italy, July 2002. pdf file
  10. Menczer F., Pant G., Ruiz M., and Srinivasan P. Evaluating Topic-Driven Web Crawlers. Proceedings of the 2001 Annual Conference of the Association for Computing Machinery, Special Interest Group in Information Retrieval, 241-249. New Orleans, September 2001. postscript file

Some of my Course Web Pages

6K:234 Information and Knowledge Management

6K:278 Web Mining

21:120 Computing Foundations

21:122 Conceptual Foundations

21:230/6K:233 Text Retrieval

21:226 Digital Libraries

21:259 Information Needs of Special Populations and for Disaster Recovery

21:224 Electronic Publishing

074:191 Medical Informatics and Networking

Thesaurus construction software

From the Thesaurus chapter in INFORMATION RETRIEVAL: Data Structures and Algorithms. Edited by William Frakes and Ricardo Baeza-Yates. Prentice-Hall, 1992. gzipped file