Portfolio

Symptom-Disease Healthcare Chatbot

Project Description

The Symptom-Disease Healthcare Chatbot is a Python-based tool designed to predict potential diseases based on user-reported symptoms. Using a Naive Bayes classifier, the chatbot provides immediate diagnosis suggestions, helping users understand possible health conditions. The project involves data preprocessing, model training, and evaluation, all managed through a simple command-line interface.
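
The prediction step described above can be illustrated with a minimal sketch. It assumes symptoms are encoded as 0/1 indicators and uses scikit-learn's BernoulliNB; the symptom vocabulary, the toy records, and the helper names encode and predict_ranked are hypothetical and do not come from the project's dataset or code.

```python
import numpy as np
from sklearn.naive_bayes import BernoulliNB

# Hypothetical symptom vocabulary and toy training data (illustration only).
SYMPTOMS = ["fever", "cough", "headache", "nausea", "rash"]

# Each row is one record: 1 if the symptom is present, 0 otherwise.
X_train = np.array([
    [1, 1, 0, 0, 0],  # flu
    [1, 1, 1, 0, 0],  # flu
    [0, 0, 1, 1, 0],  # migraine
    [0, 0, 1, 0, 0],  # migraine
    [1, 0, 0, 0, 1],  # measles
    [0, 0, 0, 0, 1],  # measles
])
y_train = np.array(["flu", "flu", "migraine", "migraine", "measles", "measles"])

# alpha=1.0 applies Laplace smoothing so unseen symptom/disease pairs
# do not receive a probability of exactly zero.
model = BernoulliNB(alpha=1.0)
model.fit(X_train, y_train)


def encode(reported):
    """Turn a list of symptom names into a single 0/1 feature row."""
    reported = {s.strip().lower() for s in reported}
    return np.array([[1 if s in reported else 0 for s in SYMPTOMS]])


def predict_ranked(reported):
    """Return (disease, probability) pairs, most likely first."""
    probs = model.predict_proba(encode(reported))[0]
    order = np.argsort(probs)[::-1]
    return [(str(model.classes_[i]), float(probs[i])) for i in order]


ranked = predict_ranked(["fever", "cough"])
for disease, prob in ranked:
    print(f"{disease}: {prob:.2f}")
```

Returning a ranked list rather than a single label lets the interface present several candidate conditions with their estimated probabilities.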

Role and Contributions

Developed the idea for a healthcare chatbot to assist users in identifying possible diseases based on reported symptoms.
Collected and curated a dataset of symptoms and associated diseases.
Performed data preprocessing to clean and prepare the data for model training.
Implemented a Naive Bayes classifier to train the model on the symptom-disease dataset.
Created Python scripts for data loading, model training, and prediction functionalities.
Developed a command-line interface to interact with the chatbot and provide symptom-based predictions.
Conducted rigorous testing to ensure the accuracy and reliability of the model predictions.
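
As a rough illustration of the command-line interaction, the sketch below parses a comma-separated symptom line and hands the recognized symptoms to a prediction callback. The function names parse_symptoms and run_cli, the prompt text, and the injected input_fn and output_fn parameters are assumptions made for this example, not the project's actual interface.

```python
def parse_symptoms(text, vocabulary):
    """Split a comma-separated line into recognized and unrecognized symptoms."""
    recognized, unknown = [], []
    for raw in text.split(","):
        # Normalize case and spacing so "Sore Throat" matches "sore_throat".
        name = raw.strip().lower().replace(" ", "_")
        if not name:
            continue
        if name in vocabulary:
            recognized.append(name)
        else:
            unknown.append(name)
    return recognized, unknown


def run_cli(predict, vocabulary, input_fn=input, output_fn=print):
    """Read symptom lines until the user types 'quit' or 'exit'.

    `predict` is any callable that maps a list of symptom names to a
    disease name; input and output are injectable so the loop can be tested.
    """
    output_fn("Enter symptoms separated by commas, or 'quit' to exit.")
    while True:
        line = input_fn("> ").strip()
        if line.lower() in {"quit", "exit"}:
            break
        recognized, unknown = parse_symptoms(line, vocabulary)
        if unknown:
            output_fn("Unrecognized symptoms ignored: " + ", ".join(unknown))
        if not recognized:
            output_fn("No known symptoms entered.")
            continue
        output_fn("Possible condition: " + predict(recognized))
```

Passing input_fn and output_fn as parameters keeps the loop testable with pytest, since a test can supply scripted input and capture the printed lines without a terminal.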

Outcomes and Results

Successfully developed a chatbot that provides disease predictions based on user-reported symptoms.
Integrated a Naive Bayes classifier to analyze symptoms and suggest possible diseases with reasonable accuracy.
Improved the accuracy of the final model over the initial version by expanding the dataset and tuning the classifier.
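
One way accuracy of this kind could be measured is with a held-out test set and a confusion matrix. The sketch below uses scikit-learn's accuracy_score and confusion_matrix on hand-made toy records; the data and labels are hypothetical and do not reflect the project's reported results.

```python
import numpy as np
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.naive_bayes import BernoulliNB

# Hypothetical toy data. Columns: fever, cough, headache, nausea, rash.
X_train = np.array([
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [1, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
])
y_train = ["flu", "flu", "migraine", "migraine", "measles", "measles"]

# Held-out records that the model never saw during training.
X_test = np.array([
    [1, 1, 0, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 1, 0, 0],  # a flu case presenting with headache only
])
y_test = ["flu", "migraine", "measles", "flu"]

model = BernoulliNB(alpha=1.0).fit(X_train, y_train)
y_pred = model.predict(X_test)

labels = ["flu", "measles", "migraine"]
accuracy = accuracy_score(y_test, y_pred)
# Rows are true diseases, columns are predicted diseases, in `labels` order.
matrix = confusion_matrix(y_test, y_pred, labels=labels)

print(f"accuracy: {accuracy:.2f}")
print(matrix)
```

Off-diagonal entries in the matrix show which diseases the model confuses with each other, which is often more informative than the single accuracy figure.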

GitHub Repository

Technologies Used

Python: Used for developing the chatbot, implementing machine learning models, and data processing.
scikit-learn: For building and evaluating the Naive Bayes classification model.
pandas: For data manipulation and preprocessing.
NumPy: For numerical operations and handling arrays.
pytest: Used for writing and running unit tests to validate the chatbot’s functionality.

Challenges Faced and Solutions

Challenge: The initial dataset used for training the model had limited examples and did not cover a broad range of symptoms and diseases, leading to poor model performance and inaccurate predictions.
Solution: Expanded the dataset with a more comprehensive list of symptoms and diseases, giving the model a richer set of examples. Improved data preprocessing to clean and standardize the data, making it more suitable for training.
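
The cleaning and standardization step might look something like the pandas sketch below. The column names disease and symptoms, the sample rows, and the helpers clean_records and to_indicator_matrix are assumptions for illustration; the project's real schema is not shown in this write-up.

```python
import pandas as pd

# Hypothetical raw records with inconsistent casing, spacing, and a missing label.
raw = pd.DataFrame({
    "disease": ["Flu", "flu ", "Migraine", "Migraine", None],
    "symptoms": ["Fever, Cough", "fever,cough", "Headache , Nausea", "headache", "rash"],
})


def clean_records(df):
    """Drop incomplete rows, normalize text, and remove duplicate records."""
    df = df.dropna(subset=["disease", "symptoms"]).copy()
    df["disease"] = df["disease"].str.strip().str.lower()
    # Split the symptom string, normalize each name, and sort for a stable order.
    df["symptoms"] = df["symptoms"].apply(
        lambda text: sorted(
            {s.strip().lower().replace(" ", "_") for s in text.split(",") if s.strip()}
        )
    )
    # Lists are not hashable, so deduplicate on a joined string key.
    df["symptom_key"] = df["symptoms"].apply(",".join)
    df = df.drop_duplicates(subset=["disease", "symptom_key"]).drop(columns="symptom_key")
    return df.reset_index(drop=True)


def to_indicator_matrix(df):
    """Build one 0/1 column per symptom, suitable as classifier input."""
    return df["symptoms"].str.join("|").str.get_dummies(sep="|")


cleaned = clean_records(raw)
features = to_indicator_matrix(cleaned)
print(cleaned)
print(features)
```

Normalizing names before deduplication matters here: "Fever, Cough" and "fever,cough" would otherwise be treated as different records and different features.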

Challenge: The Naive Bayes model initially showed low accuracy, as evidenced by its performance metrics and confusion matrix.
Solution: Fine-tuned the model parameters and re-evaluated the training process to improve accuracy. Investigated other classification algorithms, such as Logistic Regression and Support Vector Machines (SVM), as potential ways to improve performance.
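
A comparison of this kind can be sketched with cross-validation in scikit-learn. The example below scores Naive Bayes, Logistic Regression, and a linear SVM on synthetic symptom data; the data generator, the candidate settings, and the 5-fold setup are assumptions chosen for illustration, not the configuration used in the project.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import BernoulliNB
from sklearn.svm import SVC

# Synthetic data: each of 3 hypothetical diseases has its own probability
# of showing each of 8 symptoms. Seeded so the run is reproducible.
rng = np.random.default_rng(0)
n_classes, n_symptoms, per_class = 3, 8, 30
profiles = rng.uniform(0.1, 0.9, size=(n_classes, n_symptoms))
X = np.vstack([
    rng.random((per_class, n_symptoms)) < profiles[c] for c in range(n_classes)
]).astype(int)
y = np.repeat(np.arange(n_classes), per_class)

candidates = {
    "naive_bayes": BernoulliNB(alpha=1.0),
    "logistic_regression": LogisticRegression(max_iter=1000),
    "svm": SVC(kernel="linear"),
}

# Mean accuracy over 5 stratified folds for each candidate model.
scores = {
    name: float(cross_val_score(model, X, y, cv=5, scoring="accuracy").mean())
    for name, model in candidates.items()
}

for name, score in scores.items():
    print(f"{name}: {score:.3f}")
```

Scoring every candidate on the same folds gives a fairer comparison than a single train/test split, particularly when the dataset is small.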
