Presenter Abstracts – C.3: Educational Approaches to Data Science

Session Chairs
Dr. Homay Valafar, University of South Carolina (SC INBRE)
Dr. Luis Vázquez Quiñones, Inter American University of Puerto Rico - Arecibo Campus (Universidad Interamericana de Puerto Rico - Recinto de Arecibo) (PR INBRE)

Dr. Joel De La Cruz-Oller, Universidad Ana G. Méndez

National and international stochastic events of Covid-19 pandemic, Ukrainian war, and natural disasters, in the increase in the prices of medical care, consumer goods and services in Puerto Rico

Joel De La Cruz-Oller1 and Angel Ojeda-Castro2
1Department of Health Sciences, Universidad Ana G. Méndez, Gurabo Campus, PR, 2Business Department, Universidad Ana G. Méndez, Gurabo Campus, PR

Introduction/Background. Rising prices for healthcare, consumer goods, and services in Puerto Rico challenge the sustainability of healthcare access and the affordability of resources for patients. A generalized and sustained increase in the prices of these services constitutes inflation. Inflation erodes the value of money, which negatively affects the medical and social services available to citizens. Higher prices leave citizens with less money to cover human needs, in which the satisfaction of the most basic or subordinate needs gives rise to the successive generation of higher or superordinate needs. This scenario is used as a practical and theoretical case study in a statistics course for undergraduate students.

Hypothesis/Goal of Study. H1: Consumer goods prices differ before and after stochastic events. The goal of this research was to create a learning environment based on real-life events, in which undergraduate students could perform scientific data analysis and interpret the effects of national and international stochastic events. Students examine how the prices of healthcare and consumer services in Puerto Rico changed before and after the Covid-19 pandemic, the war in Ukraine, Hurricane Fiona, and rising oil prices. Finally, each student creates and uses a database for descriptive and inferential statistical analysis.

Methods and Results. The creation of a database in Excel will be explained, using ten consumer goods variables and six healthcare services variables drawn from the Consumer Price Index (CPI) of Puerto Rico. After the database is created, the observations for each selected variable will be aligned with the periods before and after each national or international event. For example, if a variable has six observations from the time of the event through February 2023, those six form the "after" sample, and the "before" sample consists of the six observations of the same variable immediately preceding the event. The following analyses will then be carried out in the learning environment: first, descriptive statistics using SPSS version 29, followed by other analyses such as normality tests and paired-sample t-tests. Finally, the results will be interpreted and explained.
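The before/after windowing and the paired-sample t-test described above can be sketched in a few lines of Python. This is only an illustration and not the course's SPSS workflow: the function names, the index values, and the choice to pair observations by their position within each window are all assumptions made for the example, and the series is invented rather than taken from the CPI of Puerto Rico.

```python
import math
import statistics


def split_before_after(series, event_index, window=6):
    """Return (before, after) windows of equal length around an event.

    'after' starts at the event observation; 'before' holds the `window`
    observations immediately preceding it.
    """
    if event_index < window or event_index + window > len(series):
        raise ValueError("not enough observations around the event")
    before = series[event_index - window:event_index]
    after = series[event_index:event_index + window]
    return before, after


def paired_t_statistic(before, after):
    """Return the paired-sample t statistic and its degrees of freedom."""
    diffs = [a - b for a, b in zip(after, before)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample standard deviation
    return mean_diff / (sd_diff / math.sqrt(n)), n - 1


# Hypothetical monthly index values, NOT real CPI data.
cpi = [100.0, 100.4, 100.9, 101.1, 101.8, 102.0,
       103.5, 104.1, 104.4, 105.2, 105.9, 106.1]

before, after = split_before_after(cpi, event_index=6)
t_stat, df = paired_t_statistic(before, after)

# Two-sided critical value of Student's t at alpha = 0.05 with df = 5.
T_CRIT = 2.571
significant = abs(t_stat) > T_CRIT
```

Comparing the statistic with a tabulated critical value avoids the need for a t-distribution routine. A statistics package such as SPSS reports the exact p-value instead, and a normality check on the differences would normally come first.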

Discussion/Conclusions. The outcome of this study demonstrates that natural, man-made, and circumstantial events all influence the increase in prices of healthcare and consumer services in Puerto Rico. The government, in alliance with private organizations and the community, must establish the necessary preventive measures to avoid a decline in the value of money and in the quality of health services. Teaching approaches that combine data science with statistical tests can prepare students to interpret and answer questions about today's healthcare circumstances and economics more accurately and efficiently. The evidence of a relationship between national and international stochastic events and rising prices shows that, in addition to telephone and electricity, most variables increase to a statistically significant degree, depending on the moment and the event. Sustainability equates to a good economic environment that favors access to quality healthcare and consumer services in Puerto Rico. The use of statistical analysis and artificial intelligence allows knowledge to be shared so that citizens can guard against the consequences of rising prices and meet their human and health needs. Of 45 students, 30 (66.67%) were positively influenced and successfully completed the project.

Grant/Funding Support. The PR-INBRE program is supported by an Institutional Development Award (IDeA) from the National Institute of General Medical Sciences of the National Institutes of Health under grant number P20GM103475.

Dr. Nicole L. Garrison, West Liberty University

Three Year Report on the WV-INBRE Bioinformatics Bootcamp Initiative

Nicole L. Garrison and James Denvir
Department of Biomedical Sciences, West Liberty University, West Liberty, WV and Department of Biomedical Sciences, Marshall University, Huntington, WV

Introduction/Background. The West Virginia IDeA Network of Biomedical Research Excellence has held a three- to six-week bioinformatics bootcamp at Marshall University and West Liberty University in each of the last three summers, culminating in student presentations at the WV-INBRE Annual Summer Research Symposia.

Hypothesis/Goal of Study. This intensive workshop was designed to advance the computational knowledge of biology and biomedical science undergraduates enrolled in WV-INBRE network institutions and enhance bioinformatics training resources.

Methods and Results. Students (n=16) went from having very little or no previous bioinformatics and computing experience to executing pipelines for SARS-CoV-2 variant analysis and mammalian RNA-seq data analysis by the end of week three. Students were prepared for increasingly difficult programming and critical thinking problems associated with bioinformatic analysis throughout the course. Independent and highly collaborative projects completed by bootcamp graduates have spanned a range of topics in biomedical science utilizing publicly available sequencing data.

Discussion/Conclusions. The experience and knowledge gained through piloting these camps will be translated into a series of learning modules that can be shared through other platforms and venues. In future iterations of the bootcamp, organizers will offer broadened participation across the Southeast IDeA network.

Grant/Funding Support. Supported by NIH Grant P20GM103434 to the West Virginia IDeA Network for Biomedical Research Excellence.

Dr. Frances Heredia-Negrón, University of Puerto Rico - Medical Sciences Campus

Enhancing Workforce Capacity for Hispanic Biomedical Researchers: A Course on Applying Artificial Intelligence and Machine Learning to Address Health Disparities

Frances Heredia-Negrón, Natalie Álamo-Rodríguez, Brenda Nieves, Emma Fernández-Repollet, and Abiel Roche-Lima
Research Centers in Minority Institutions-Center for Collaborative Research in Health Disparities Program (RCMI-CCRHD), University of Puerto Rico - Medical Sciences Campus, San Juan, PR

Introduction/Background. Artificial intelligence (AI) and machine learning (ML) are pivotal in advancing groundbreaking medical techniques. However, the existing biases inherent in AI and ML methodologies contribute to persistent health disparities among minority populations.

Hypothesis/Goal of Study. Addressing this issue necessitates the cultivation of a diverse workforce. To tackle this challenge, we designed the "Artificial Intelligence and Machine Learning Applied to Health Disparities Research (AIML + HDR)" course, which applies comprehensive Data Science (DS) approaches to examine health disparities, with a particular focus on Hispanic populations.

Methods and Results. The course covers technical aspects such as the Jupyter Notebook Framework, data manipulation using R and Python programming languages, and employing ML libraries to develop predictive models. Furthermore, it delves into health disparities topics such as Electronic Health Records, Social Determinants of Health, and Data Bias. The course was successfully delivered to a group of 34 carefully selected Hispanic participants, and their feedback was gathered through a survey using a Likert scale (ranging from 0 to 4).

Discussion/Conclusions. The survey results indicated high satisfaction, with over 80% of participants expressing agreement regarding the course organization, activities, and covered topics. The students strongly agreed that the activities were relevant to the course and enhanced their learning experience (3.71 ± 0.21). Moreover, the students strongly agreed that the course positively contributed to their professional development (3.76 ± 0.18). The open-ended question section was subjected to quantitative analysis, which revealed that 75% of the participants' comments confirmed their significant satisfaction with the course.
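Summary figures of this kind (a mean ± standard deviation and a percentage in agreement on a 0 to 4 Likert scale) can be computed as in the following minimal sketch. The responses are invented for illustration, the function name is hypothetical, and treating a score of 3 or higher as "agreement" is an assumption, since the abstract does not state the cutoff used.

```python
import statistics


def summarize_likert(responses, agree_threshold=3, scale=(0, 4)):
    """Return (mean, sample SD, percent agreeing) for Likert responses."""
    low, high = scale
    if any(r < low or r > high for r in responses):
        raise ValueError("response outside the Likert scale")
    mean = statistics.mean(responses)
    sd = statistics.stdev(responses)
    agreeing = sum(1 for r in responses if r >= agree_threshold)
    percent_agree = 100 * agreeing / len(responses)
    return mean, sd, percent_agree


# Hypothetical responses to a single survey item, NOT the course data.
responses = [4, 4, 3, 4, 4, 3, 4, 2, 4, 4]
mean, sd, percent_agree = summarize_likert(responses)
summary = f"{mean:.2f} ± {sd:.2f}, {percent_agree:.0f}% agreement"
```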

Citation/Acknowledgements. This project was supported by RCMI grant U54 MD007600 (National Institute on Minority Health and Health Disparities) from the National Institutes of Health.

Dr. Alan Woessner, University of Arkansas

Identifying and Training Deep Learning Neural Networks on Biomedical-related Datasets

Alan Woessner1,2, Usman Anjum3, Hadi Salman4, Jacob Lear4, JT Turner5, Ross Campbell5, Laura Beaudry6, Justin Zhan3, Lawrence Cornett7,8, Susan Gauch4, and Kyle Quinn1,2
1Arkansas Integrative Metabolic Research Center, University of Arkansas, Fayetteville, AR, 2Department of Biomedical Engineering, University of Arkansas, Fayetteville, AR, 3Department of Computer Science, University of Cincinnati, Cincinnati, OH, 4Department of Computer Science and Computer Engineering, University of Arkansas, Fayetteville, AR, 5Deloitte Consulting LLP, New York, NY, 6Google, Mountain View, CA, 7IDeA Network of Biomedical Research Excellence, University of Arkansas for Medical Sciences, Little Rock, AR, 8Department of Physiology and Cell Biology, University of Arkansas for Medical Sciences, Little Rock, AR

Introduction/Background. Biomedical-related datasets are widely used in both research and clinical settings. However, as the size and breadth of these datasets increase, interpreting them becomes difficult for professionally trained clinicians and researchers and prone to systematic or user bias. Artificial intelligence (AI), and specifically deep learning convolutional neural networks (CNNs), has recently become an important tool in novel biomedical research, but CNNs are hard to utilize because of their computational requirements and confusion regarding the different neural network architectures.

Hypothesis/Goal of Study. The goal of this learning module is to provide a gentle introduction to the types of deep learning neural networks and practices that are commonly used in biomedical research via public and published datasets. 

Methods and Results. This module is subdivided into four submodules that cover image classification CNNs, data augmentation, image segmentation CNNs, and regression CNNs. Each complementary submodule was written as a Jupyter notebook on the Google Cloud Platform (GCP) and contains detailed code and explanations, as well as quizzes and challenges to facilitate user training. 
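As an illustration of the data augmentation topic, the sketch below generates label-preserving variants of a small image by flipping and rotating it. It is a framework-free example written for this summary and is not taken from the module's notebooks, which may well rely on a deep learning library; the function names and the toy 2 x 2 "image" are assumptions.

```python
def hflip(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def vflip(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rot90(img):
    """Rotate the image 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def augment(img, label):
    """Return the original image plus three transformed copies.

    Every variant keeps the original label, which is what makes these
    transforms usable for enlarging a classification training set.
    """
    variants = [img, hflip(img), vflip(img), rot90(img)]
    return [(variant, label) for variant in variants]


# Toy grayscale image as nested lists; the label is hypothetical.
image = [[1, 2],
         [3, 4]]
augmented = augment(image, label="tissue")
```

Flips and right-angle rotations are only appropriate when orientation carries no diagnostic meaning, so the choice of transforms depends on the dataset.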

Discussion/Conclusions. Overall, the goal of this learning module is to enable users to identify and integrate the correct type of neural network with their data while highlighting the ease-of-use of cloud computing for implementing neural networks.

Citation/Acknowledgements. We would like to acknowledge the following funding grants for this work: 3P20GM103429-21S2, R01AG056560, R01EB031032, P20GM139768.