Project Name

Last updated: 2/26/18, 5:57 PM


Clinicians who are experts in medical procedures often cannot easily select and analyze their own data, because they are not familiar enough with the technical tools needed to do so. In addition, when planning a procedure, physicians often make educated guesses about the outcome of radiation dosing in a particular portion of the lung, since there is no easy way for them to see the statistically expected outcome of radiation at a particular location.

To address these problems, we hope to design an easy-to-use machine learning pipeline that will:

  • Allow users to easily select and query both features and outcomes related to lung cancer treatment for analysis
  • Run a variety of machine learning models to analyze the selected data
  • Find linkages between clinical patterns and practices and treatment outcomes

Project Members:

  • Students: Sunny Thodupunuri and Archan Patel
  • Mentor(s): Dr. Todd McNutt and Pranav Lakshminarayanan

Project Components

User Interface

  • Longitudinal Feature Selection UI
  • Radiation Dose Features
  • Outcome Selection
  • Longitudinal Outcome Selection UI

Other outcomes

  • E.g. Survival at 5 years, pulmonary function, etc.

Machine Learning Method

  • Select an effective method
  • Run method with selected features and outcomes
  • Validation Model

Background, Specific Aims, and Significance

The feature and outcome selector will let users choose features and outcomes through an easy-to-use UI. We hope to build this part of the project as a web application using Python and Flask. The UI will consist of charts and longitudinal selectors that we hope to create with Chart.js and DataTables.js. Once the user selects features or outcomes, the application will query a database of lung cancer treatment data, and we hope to design efficient SQL queries for this purpose. The query results will be shown to the user in chart form and will also be output in text form as feature and outcome vectors. These vectors can then be used in the machine learning portion of the project.
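
As a rough illustration of the query step, the sketch below turns a feature and outcome selection into feature and outcome vectors. It is only a sketch under stated assumptions: the table names, column names, and demo rows are hypothetical stand-ins (the real Oncospace schema will differ), and an in-memory SQLite database is used in place of the actual database so that the example is self-contained.

```python
import sqlite3

# Hypothetical schema for illustration only; the real Oncospace tables differ.
SCHEMA = """
CREATE TABLE patients (patient_id INTEGER PRIMARY KEY, age INTEGER, mean_lung_dose_gy REAL);
CREATE TABLE outcomes (patient_id INTEGER, name TEXT, value REAL);
"""

# Column names cannot be bound as SQL parameters, so user-selected features
# are validated against this whitelist before being placed in the query text.
ALLOWED_FEATURES = {"age", "mean_lung_dose_gy"}


def fetch_vectors(conn, features, outcome_name):
    """Return (feature_vectors, outcome_vector) for patients with the outcome."""
    if not features:
        raise ValueError("at least one feature must be selected")
    unknown = set(features) - ALLOWED_FEATURES
    if unknown:
        raise ValueError(f"unsupported features: {sorted(unknown)}")
    columns = ", ".join(f"p.{name}" for name in features)
    query = (
        f"SELECT {columns}, o.value FROM patients AS p "
        "JOIN outcomes AS o ON o.patient_id = p.patient_id "
        "WHERE o.name = ? ORDER BY p.patient_id"
    )
    # The outcome name is user input, so it is passed as a bound parameter.
    rows = conn.execute(query, (outcome_name,)).fetchall()
    feature_vectors = [list(row[:-1]) for row in rows]
    outcome_vector = [row[-1] for row in rows]
    return feature_vectors, outcome_vector


def build_demo_db():
    """Create an in-memory database filled with made-up demo rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.executemany(
        "INSERT INTO patients VALUES (?, ?, ?)",
        [(1, 64, 12.5), (2, 71, 18.0), (3, 58, 9.75)],
    )
    conn.executemany(
        "INSERT INTO outcomes VALUES (?, ?, ?)",
        [(1, "survival_5yr", 1.0), (2, "survival_5yr", 0.0), (3, "fev1_change", -0.2)],
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    X, y = fetch_vectors(conn, ["age", "mean_lung_dose_gy"], "survival_5yr")
    print(X, y)
```

In a Flask application, a function like `fetch_vectors` could be called from a route handler, with the selected features and outcome taken from the request.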

For the machine learning portion of the project, we hope to build a neural network using Keras (a Python library) or TensorFlow. We will test different types of networks to see which works best on the data that we will be using. We will train each network on the data set and assess the model by monitoring the loss of the network. We will also develop a validation framework that performs k-fold cross-validation on our model: it will evaluate the model on data held out from training and report the results to users.
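
The cross-validation idea can be sketched in plain Python as follows. This is a minimal illustration, not the project's validation framework: the function names are hypothetical, and a trivial majority-class predictor stands in for the neural network, which would be plugged in through the `fit` and `predict` callables.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_idx, test_idx) pairs that split range(n_samples) into k folds."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must be between 2 and n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Every k-th shuffled index goes to the same fold, so fold sizes differ by at most 1.
    folds = [indices[i::k] for i in range(k)]
    for i, test_idx in enumerate(folds):
        train_idx = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train_idx, test_idx


def cross_validate(features, outcomes, k, fit, predict):
    """Return the accuracy on each held-out fold for a model given by fit/predict."""
    scores = []
    for train_idx, test_idx in k_fold_indices(len(features), k):
        model = fit([features[i] for i in train_idx], [outcomes[i] for i in train_idx])
        predictions = predict(model, [features[i] for i in test_idx])
        correct = sum(p == outcomes[i] for p, i in zip(predictions, test_idx))
        scores.append(correct / len(test_idx))
    return scores


def fit_majority(train_x, train_y):
    """Stand-in model: remember the most common training label."""
    return max(set(train_y), key=train_y.count)


def predict_majority(model, test_x):
    """Predict the remembered label for every test sample."""
    return [model for _ in test_x]


if __name__ == "__main__":
    xs = [[i] for i in range(10)]
    ys = [1, 1, 1, 0, 1, 1, 0, 1, 1, 1]
    print(cross_validate(xs, ys, 5, fit_majority, predict_majority))
```

Each sample is used for testing exactly once, so the per-fold scores together describe performance on data the model did not see during training.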


  • Minimum: (Expected by Mid-April)
    1. We will create a user interface to retrieve statistical data over a custom time window.
    2. We will develop an application programming interface (API) to provide access to frequently used data queries.
    3. We will create an interface to train a neural network using Keras (a Python library) to associate data features with outcomes.
  • Expected: (Expected by End of April)
    1. We will add a view for selecting, plotting, and filtering longitudinal data using Dash.
    2. We will add multiple machine learning models to the training pipeline and perform cross-validation.
  • Maximum: (End of Semester)
    1. We can enable users to customize parts of the data retrieval and machine learning pipeline to suit their needs.
    2. We can add representation learning to patient data to determine the most important factors in patient outcome.
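
To make the first minimum deliverable concrete, the sketch below summarizes dated measurements that fall within a user-chosen time window. It is a hedged illustration only: the function name, the `(date, value)` input layout, and the choice of summary statistics are assumptions, not a description of the actual interface.

```python
from datetime import date
from statistics import mean, median


def summarize_window(measurements, start, end):
    """Summarize (date, value) pairs whose date lies in the inclusive window [start, end]."""
    if start > end:
        raise ValueError("start must not be after end")
    values = [value for day, value in measurements if start <= day <= end]
    if not values:
        # No measurements in the window: report an empty summary instead of failing.
        return {"count": 0, "mean": None, "median": None, "min": None, "max": None}
    return {
        "count": len(values),
        "mean": mean(values),
        "median": median(values),
        "min": min(values),
        "max": max(values),
    }


if __name__ == "__main__":
    # Made-up demo values, not patient data.
    demo = [(date(2018, 1, 5), 2.0), (date(2018, 2, 1), 4.0), (date(2018, 3, 10), 9.0)]
    print(summarize_window(demo, date(2018, 1, 1), date(2018, 2, 28)))
```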

Technical Approach



  • Keras, Flask, and Dash Python Libraries. These libraries are freely available to use on Unix.
  • Treatment and outcome data from Oncospace.
  • Computing resources for hosting application and training model. We have access to free education AWS credits and $300 in Google Cloud Credits that we can use for this project.

Milestones and Status

  1. Milestone name: Access database, set up credentials, run sample queries
    • Planned Date: 02/25/18
    • Expected Date: 02/26/18
    • Status: Done.
  2. Milestone name: Write queries to get required data from database
    • Planned Date: 03/05/18
    • Expected Date: 03/05/18
    • Status: Done.
  3. Milestone name: Create Python web server UI
    • Planned Date: 03/12/18
    • Expected Date: 03/15/18
    • Status: Done.
  4. Milestone name: Begin work on the machine learning pipeline. Determine which models are most useful to researchers and report findings.
    • Planned Date: 04/16/18
    • Expected Date: 04/17/18
    • Status: In Progress.
  5. Milestone name: Use representation learning to determine important features which relate to patient outcome.
    • Planned Date: 04/30/18
    • Expected Date:
    • Status: Pending.
  6. Milestone name: Run integration tests on machine learning pipeline and write final report
    • Planned Date: 05/03/18
    • Expected Date:
    • Status: Pending.

Reports and presentations

Project Bibliography

  • W.H.S.D Gunarathne, K.D.M Perera, K.A.D.C.P Kahandawaarachchi, “Performance Evaluation on Machine Learning Classification Techniques for Disease Classification and Forecasting through Data Analytics for Chronic Kidney Disease (CKD)”, 2017 IEEE 17th International Conference on Bioinformatics and Bioengineering (BIBE), pp. 291-296, 2017, ISSN 2471-7819.
  • A. Charleonnan, T. Fufaung, T. Niyomwong, W. Chokchueypattanakit, S. Suwannawach and N. Ninchawee, “Predictive analytics for chronic kidney disease using machine learning techniques,” 2016 Management and Innovation Technology International Conference (MITicon), Bang-San, 2016, pp. MIT-80-MIT-83.
  • Yoo, Kyung Don, et al. “A Machine Learning Approach Using Survival Statistics to Predict Graft Survival in Kidney Transplant Recipients: A Multicenter Cohort Study.” Scientific Reports, vol. 7, no. 1, 2017, doi:10.1038/s41598-017-08008-8.
  • Kang, John, et al. “Machine Learning Approaches for Predicting Radiation Therapy Outcomes: A Clinician's Perspective.” International Journal of Radiation Oncology*Biology*Physics, vol. 93, no. 5, 2015, pp. 1127-1135, doi:10.1016/j.ijrobp.2015.07.2286.

Other Resources and Project Files


courses/456/2018/456-2018-17/project-17.txt · Last modified: 2019/08/07 12:01 (external edit)