  • Research Article
  • Open Access

Question Processing and Clustering in INDOC: A Biomedical Question Answering System

EURASIP Journal on Bioinformatics and Systems Biology 2007, 2007:28576

  • Received: 12 April 2007
  • Accepted: 22 September 2007
  • Published:


The exponential growth in the volume of biomedical publications has made it impossible for any individual to keep pace with advances in the field. Although evidence-based medicine has gained wide acceptance, physicians are often unable to access the relevant information within the required time, leaving most of their questions unanswered. This accentuates the need for fast and accurate biomedical question answering systems. In this paper we introduce INDOC, a biomedical question answering system based on novel ideas for indexing documents and extracting answers to the questions posed. INDOC displays its results in clusters to help the user arrive at the most relevant set of documents quickly. The system was evaluated against the standard OHSUMED test collection. Our system achieves high accuracy and minimizes user effort.


  • Relevant Information
  • Exponential Growth
  • System Biology
  • Require Time
  • Wide Acceptance


Authors’ Affiliations

Department of Electronics and Computer Engineering, Indian Institute of Technology Roorkee, Roorkee, 247667, India




© Parikshit Sondhi et al. 2007

This article is published under license to BioMed Central Ltd. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.