Generative AI & RAG Mar 9, 2026 Published project
Asthma RAG Diagnostic Assistant

Evidence-grounded diagnostic reasoning experiment

This project studies how retrieval and prompt design affect asthma-related reasoning in a controlled setting. It compares zero-shot, chain-of-thought, and neutral chain-of-thought prompts over synthetic patient cases, using local retrieval and automatic evaluation metrics.

Python, LangChain, FAISS, Hugging Face, Qwen2.5, RAG

Challenge

  • Clinical-style reasoning can be biased by leading prompts.
  • Asthma-like symptoms need comparison against alternative explanations such as COPD-like cases.
  • A controlled experiment needs synthetic cases, retrieved context, and metrics that reveal both fluency and repetition.

System architecture

Synthetic case input → FAISS retrieval → Prompt variant → Local Qwen response
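
The four stages above can be read as a single function call per case and prompt style. The following is a minimal sketch of that flow, not the project's code: `run_case`, `CaseResult`, and the three `fake_*` helpers are hypothetical names, and the stand-in components replace the FAISS index and the local Qwen model so the example runs on its own.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class CaseResult:
    case_id: str
    prompt_style: str
    context: List[str]
    prompt: str
    response: str


def run_case(
    case_id: str,
    case_text: str,
    prompt_style: str,
    retrieve: Callable[[str, int], List[str]],
    build_prompt: Callable[[str, str, List[str]], str],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> CaseResult:
    """Run one synthetic case through retrieval, prompting, and generation."""
    context = retrieve(case_text, top_k)                     # FAISS retrieval in the project
    prompt = build_prompt(prompt_style, case_text, context)  # prompt variant
    response = generate(prompt)                              # local Qwen response in the project
    return CaseResult(case_id, prompt_style, context, prompt, response)


# Stand-in components (assumptions) so the sketch runs without a model or an index.
def fake_retrieve(query: str, k: int) -> List[str]:
    passages = ["passage A", "passage B", "passage C", "passage D"]
    return passages[:k]


def fake_build_prompt(style: str, case_text: str, context: List[str]) -> str:
    return f"[{style}]\nContext:\n" + "\n".join(context) + f"\nCase:\n{case_text}"


def fake_generate(prompt: str) -> str:
    return f"(model output for a {len(prompt)}-character prompt)"


result = run_case(
    "case-01", "Synthetic case text.", "zero_shot",
    fake_retrieve, fake_build_prompt, fake_generate, top_k=2,
)
print(result.context)
```

Passing the retriever, prompt builder, and generator as arguments keeps the case and the retrieved context fixed while only the prompt style changes, which is the comparison the experiment is about.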

Data and inputs

The inputs are a custom asthma knowledge base split into recursive chunks, sentence-transformer embeddings of those chunks stored in a FAISS vector store, and 10 synthetic patient cases that include both positive and negative examples.
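
To make the chunk-and-retrieve idea concrete, here is a dependency-free sketch under stated assumptions. `recursive_split` is a simplified recursive splitter without overlap (LangChain ships a recursive character splitter that works along similar lines; whether the project uses it is not stated here). `embed` is a toy bag-of-words stand-in for a sentence-transformer model, and `top_k` does the exhaustive similarity search that a flat FAISS index performs over dense vectors. All names and the sample texts are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence

SEPARATORS = ("\n\n", "\n", ". ", " ")  # coarse to fine


def recursive_split(text: str, chunk_size: int,
                    separators: Sequence[str] = SEPARATORS) -> List[str]:
    """Split text into chunks of at most chunk_size characters, preferring
    coarse separators (paragraphs) before finer ones (words)."""
    text = text.strip()
    if not text:
        return []
    if len(text) <= chunk_size:
        return [text]
    if not separators:
        # Nothing left to split on: hard cut every chunk_size characters.
        return [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]

    sep, finer = separators[0], separators[1:]
    chunks: List[str] = []
    current = ""
    for part in text.split(sep):
        part = part.strip()
        if not part:
            continue
        if len(part) > chunk_size:
            # Too long for one chunk: flush what we have, split with finer separators.
            if current:
                chunks.append(current)
                current = ""
            chunks.extend(recursive_split(part, chunk_size, finer))
            continue
        candidate = part if not current else current + sep + part
        if len(candidate) <= chunk_size:
            current = candidate          # still fits, keep merging
        else:
            chunks.append(current)
            current = part
    if current:
        chunks.append(current)
    return chunks


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a stand-in for a sentence-transformer embedding."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0


def top_k(query: str, chunks: List[str], k: int = 2) -> List[str]:
    """Exhaustive search: score every chunk against the query, keep the best k."""
    q = embed(query)
    return sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


# Made-up snippets for illustration only, not the project's knowledge base.
docs = ["wheeze and cough at night", "smoking history and chronic cough", "inhaler technique"]
print(recursive_split("aaa bbb ccc", 7))
print(top_k("night wheeze", docs, k=1))
```

Chunk size trades off retrieval precision against context: smaller chunks match queries more narrowly but carry less surrounding text into the prompt.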

Technical approach

  • Build a local retrieval index for asthma-related context.
  • Compare zero-shot, chain-of-thought, and neutral chain-of-thought prompting.
  • Evaluate responses with BLEU, ROUGE-L, METEOR, Distinct-2, perplexity, and Self-BLEU.
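
The three prompting styles can be pictured as templates that share the same case and retrieved context and differ only in the instruction. The wording below is hypothetical, since the project's actual prompt texts are not reproduced on this page; the sketch assumes that "neutral" means the instruction does not name the suspected diagnosis, which is one way to avoid a leading prompt.

```python
from typing import Dict, List

# Hypothetical wording: these are illustrative templates, not the project's prompts.
_SHARED = "Context:\n{context}\n\nPatient case:\n{case}\n\n"

PROMPT_TEMPLATES: Dict[str, str] = {
    # Direct question, no reasoning requested.
    "zero_shot": _SHARED + "Does this case suggest asthma? Answer briefly.",
    # Same question, with explicit step-by-step reasoning.
    "chain_of_thought": _SHARED
    + "Does this case suggest asthma? Think step by step, then give your answer.",
    # Step-by-step reasoning without naming a diagnosis in the instruction.
    "neutral_chain_of_thought": _SHARED
    + "List the key findings, weigh them against the plausible explanations "
      "without assuming any one of them, reason step by step, "
      "then state the most likely explanation.",
}


def build_prompt(style: str, case: str, context: List[str]) -> str:
    """Fill one template with the case text and the retrieved passages."""
    if style not in PROMPT_TEMPLATES:
        raise ValueError(f"unknown prompt style: {style!r}")
    context_block = "\n".join(f"- {passage}" for passage in context)
    return PROMPT_TEMPLATES[style].format(context=context_block, case=case)


for style in PROMPT_TEMPLATES:
    print(style, "->", len(build_prompt(style, "Synthetic case text.", ["passage A"])), "chars")
```

Keeping the shared prefix identical across styles means any difference in the responses can be attributed to the instruction rather than to the context layout.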

Evaluation and results

Key indicators

  • 10 synthetic patient cases
  • 3 prompting styles
  • BLEU / ROUGE-L / METEOR / Self-BLEU

  • Neutral chain-of-thought improved objectivity in a COPD-like negative case.
  • Perplexity, Self-BLEU, and Distinct-2 added useful signals beyond lexical overlap metrics.
  • The project highlights that lexical metrics alone are not enough for judging reasoning quality.
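
The three non-overlap signals mentioned above have compact definitions, sketched here in plain Python. This is an illustration under assumptions, not the project's evaluation code: tokenization is whitespace splitting, `bleu` is an unsmoothed BLEU limited to unigrams and bigrams, and `perplexity` expects per-token natural-log probabilities that a language model would supply. All function names are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple


def ngrams(tokens: Sequence[str], n: int) -> List[Tuple[str, ...]]:
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def distinct_n(responses: List[str], n: int = 2) -> float:
    """Unique n-grams divided by total n-grams over all responses (higher = less repetition)."""
    grams = [g for r in responses for g in ngrams(r.lower().split(), n)]
    return len(set(grams)) / len(grams) if grams else 0.0


def perplexity(token_logprobs: List[float]) -> float:
    """exp of the negative mean natural-log probability per token (lower = more expected text)."""
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def bleu(hyp: List[str], refs: List[List[str]], max_n: int = 2) -> float:
    """Unsmoothed BLEU with uniform weights up to max_n and the standard brevity penalty."""
    if not hyp or not refs:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp_counts = Counter(ngrams(hyp, n))
        # Clip each n-gram count by its maximum count in any single reference.
        max_ref: Counter = Counter()
        for ref in refs:
            for gram, count in Counter(ngrams(ref, n)).items():
                max_ref[gram] = max(max_ref[gram], count)
        clipped = sum(min(c, max_ref[g]) for g, c in hyp_counts.items())
        if clipped == 0:
            return 0.0  # no smoothing: any zero precision gives a zero score
        log_precisions.append(math.log(clipped / sum(hyp_counts.values())))
    # Brevity penalty against the reference length closest to the hypothesis length.
    ref_len = min((len(r) for r in refs), key=lambda length: (abs(length - len(hyp)), length))
    bp = 1.0 if len(hyp) > ref_len else math.exp(1 - ref_len / len(hyp))
    return bp * math.exp(sum(log_precisions) / max_n)


def self_bleu(responses: List[str], max_n: int = 2) -> float:
    """Average BLEU of each response against all the others (higher = more similar outputs)."""
    if len(responses) < 2:
        return 0.0
    toks = [r.lower().split() for r in responses]
    scores = [bleu(h, toks[:i] + toks[i + 1:], max_n) for i, h in enumerate(toks)]
    return sum(scores) / len(scores)


print(distinct_n(["a b c", "a b d"]))        # 3 unique bigrams out of 4
print(perplexity([math.log(0.5)] * 4))       # uniform probability 0.5 per token
print(self_bleu(["a b c d", "a b c d"]))     # identical responses
```

None of these needs a reference answer: Distinct-2 and Self-BLEU measure repetition within and across responses, and perplexity measures how expected the text is to a language model. That is why they can add information that BLEU, ROUGE-L, and METEOR against a reference do not.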

Implementation and code

Implementation focus

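The implementation runs as one structured workflow: prepare the asthma knowledge base and retrieval index, generate a response for each synthetic case under each prompt style with the local Qwen model, then score the outputs and interpret the metrics side by side.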

Source code

The code is available for exploring the implementation details and extending the experiment when needed.

Scope and responsible use

This project demonstrates modeling and evaluation on synthetic, health-related cases and is not intended for clinical decision-making. Any clinical use would require external validation, expert review, calibration, and regulatory oversight.

Future development

  • Add stronger clinical reasoning rubrics for evaluation.
  • Compare more retrieval strategies and larger local models.
  • Separate citation support from final-answer fluency.

Technical contribution

The project connects RAG, prompt design, diagnostic-style reasoning, and evaluation discipline in a safety-sensitive setting.