Episode 25
Episode 25: Public Health Informatics and Geospatial Analysis with Dr. Yan Li
In this episode on data analytics, informatics, and geospatial analysis, we speak to Dr. Yan Li, Associate Professor at the Center for Information Systems and Technology (CISAT) at Claremont Graduate University.
We discuss Dr. Li's recent publication on health informatics, SDoH, and the development of a Vulnerable Population Healthcare Accessibility Framework (VPHAF) to help improve accessibility and quality of healthcare.
Article Link:
An informatics-driven intelligent system to improve healthcare accessibility for vulnerable populations
https://www.sciencedirect.com/science/article/pii/S1532046422002015
Learn more about the Public Health Podcast and Media Network: publichealthpodcasters.com
Transcript
April Moreno 0:04
Welcome to the Public Health Networker, the official podcast of the Public Health Podcast Network. I'm your host, Dr. April Moreno. Join us as we speak to public health professionals around the country and around the world on global, community, and environmental health topics. Join us also as we speak to podcasters in this field of public health. To learn more about us, visit publichealthpodcasters.com. And in the meantime, enjoy the episode.
Thank you for joining us for this episode of the Public Health Networker. In this episode, we are speaking with Dr. Yan Li of Claremont Graduate University, where she works in the Center for Information Systems and Technology in Claremont, California. This is about biomedical informatics and the use of spatial analysis. Today we're talking about her most recent publication, entitled "An informatics-driven intelligent system to improve healthcare accessibility for vulnerable populations." She co-authored this paper with Abdulaziz Saad Albarrak of King Faisal University, and today we're going to talk about some of the research that was involved in this process. It's an excellent study, and I had to reach out to Dr. Li to hear about it and to discuss health informatics through the social determinants of health. So I hope you enjoy this episode.
Thank you for joining us today. We are here with Dr. Yan Li. She is part of Claremont Graduate University's Center for Information Systems and Technology, which is also where I graduated from. So I'm really excited to have this conversation with you today, Dr. Li. We're going to talk about the paper that you co-authored with Dr. Albarrak on an informatics-driven intelligent system to improve healthcare accessibility for vulnerable populations. A lot of this is in the spirit of the Public Health Networker and the Public Health Podcast Network. So thank you so much for joining us today.
Dr Yan Li 2:01
Thank you for inviting me.
April Moreno: Thank you. So please tell us a little bit more about yourself.
Dr Yan Li 2:05
Yes, I'm an associate professor at the Center for Information Systems and Technology (CISAT) at Claremont Graduate University. My research spans many areas of advanced analytics, data science, and data management. But because the nature of data science research is very cross-disciplinary, in the recent past, I would say five or six years, I have done quite a lot of work applying analytics in the health domain. I have worked with healthcare providers, with insurance payers, and also with pharmaceutical companies and a school of pharmacy, so it's a broad spectrum of healthcare. So somehow I got really attracted to a lot of health informatics work, although my original training is really more in information systems and technology.
April Moreno 3:14
Yeah, thank you so much for sharing your story with us. Information systems and technology can be so interdisciplinary, so it's very exciting. So this paper is addressing vulnerable populations, and it's using informatics for healthcare accessibility. The abstract talks about the broad disparities that continue to exist in accessibility to health care. You talked about the behavioral model of health services use, and you also worked on building a Vulnerable Population Healthcare Accessibility Framework. This is really exciting. Can you tell us more about this framework?
Dr Yan Li 3:57
Yes. More specifically, this framework really incorporates two main streams of research, or, I would say, three main streams. One stream of research is related to the social determinants of health. There's a lot of literature in public health and health care, and people understand that the social determinants of health, or SDOH, are a very important component of healthcare outcomes and of providing health care. These SDOH variables are the factors other than your biology; they are related to the environment where you live, where you are, how you grew up, your education, your language, your literacy. A lot of literature demonstrates that those factors account for 30 to 40% of overall health outcomes. However, it is always a struggle to incorporate that into our healthcare systems so that they can focus specifically on what we call vulnerable populations. Vulnerable populations are those at higher risk of exposure to those SDOH factors: people who are not economically stable, who have food insecurity or housing instability; people with low literacy or language barriers; people who live in a community with discrimination or without social cohesion. Those factors actually influence their health outcomes. And because of their risks, those people have high healthcare demand but low health access. So that's the component we tried to address. Another part of the literature, as you mentioned, is what this is built upon, called the behavioral model of health services use.
This behavioral model of health services use links SDOH with health access; it tries to explain how population characteristics influence healthcare access patterns. The third part of the literature we bring in is a body of literature called spatial healthcare accessibility, which has been investigated a lot in geography, information systems research, GIS research, and spatial healthcare research. Traditionally, if you're considering access, you just define access within an area: how many providers are serving how many people. That's a simple ratio approach, population divided by providers, right? But in this school of thought, which we call floating catchment methods, spatial healthcare accessibility uses very advanced GIS algorithms to build a supply and demand model. The idea is that on the supply side, a provider is able to cover a certain area, called a catchment area, and a boundary defines that catchment area. For example, one definition is driving within 25 minutes, so this provider can cover the needs of the population within that catchment. On the other hand, from the demand side, as someone in the population living at my location, I can also reach the suppliers within my 25-minute driving radius. So that supply and demand interaction is realized through the algorithm and through GIS mapping; the GIS tool can do that. So in this research, we say, let's bring this together.
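To make the contrast between the simple ratio and the floating catchment idea concrete, here is a minimal Python sketch of the basic two-step floating catchment area (2SFCA) calculation. It is not the algorithm from the paper, which adds SDOH-based weights; the function names, clinic and tract labels, populations, and drive times are all made up for illustration, and only the 25-minute threshold comes from the conversation.

```python
# Illustrative only: toy supply, demand, and drive times, not data from the study.

def ratio_access(providers, population):
    """Simple ratio approach: providers per person inside one fixed area."""
    return providers / population


def two_step_floating_catchment(supply, demand, travel_min, threshold=25):
    """Basic 2SFCA accessibility score for each demand location.

    supply:     {provider_id: capacity, e.g. number of physicians}
    demand:     {tract_id: population}
    travel_min: {(tract_id, provider_id): drive time in minutes}
    """
    # Step 1 (supply side): capacity divided by the population each provider can reach.
    provider_ratio = {}
    for provider, capacity in supply.items():
        reachable = sum(
            pop for tract, pop in demand.items()
            if travel_min[(tract, provider)] <= threshold
        )
        provider_ratio[provider] = capacity / reachable if reachable else 0.0

    # Step 2 (demand side): each tract sums the ratios of providers it can reach.
    return {
        tract: sum(
            ratio for provider, ratio in provider_ratio.items()
            if travel_min[(tract, provider)] <= threshold
        )
        for tract in demand
    }


supply = {"clinic_a": 2, "clinic_b": 1}
demand = {"tract_1": 1000, "tract_2": 3000, "tract_3": 1000}
travel_min = {
    ("tract_1", "clinic_a"): 10, ("tract_1", "clinic_b"): 40,
    ("tract_2", "clinic_a"): 20, ("tract_2", "clinic_b"): 15,
    ("tract_3", "clinic_a"): 45, ("tract_3", "clinic_b"): 20,
}

simple_ratio = ratio_access(sum(supply.values()), sum(demand.values()))
scores = two_step_floating_catchment(supply, demand, travel_min)
```

The single ratio gives every tract the same number, while the catchment scores differ by location: in this toy data, tract_2 can reach both clinics within 25 minutes and scores highest, and tract_3 can only reach the smaller clinic and scores lowest.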
Let's bring in what the health behavior model has told us about how those SDOH characteristics impact access, and then use data to inform us exactly what those impacts are. Traditionally, the impact is set only by expert rules, saying income is 10%, education is 10%, and so on. But that's all by expert rules. In this one, when we're building a system for a specific organization, we say, let's have your data tell us exactly which attributes, or which SDOH factors, impact access. And of course we focus on those vulnerable population characteristics. Then we find those weights and integrate them into the floating catchment method, that GIS-based method, to create this framework. Then we calculate what we call an accessibility index. That index incorporates the SDOH factors for the vulnerable population, and we have that index displayed on the map. So that's the framework. But the framework itself is not enough, because the index is very complex and its components interact with each other. I can show the index on the map, but the decision makers and users come in and ask, how am I going to make sense of it? So we do so by creating a very intelligent spatial decision support system, where you can just hover over the system to understand, for a specific area, what the population characteristics are and how those SDOH vulnerable population characteristics impact access. Then, a step further, we provide scenario analysis. For decision makers, if we want to improve access in this area by adding physicians or providers, how should I do so? I could come in and just add a number of physicians to see how I can bring accessibility in those low-access areas to a higher level. So that's how decision support comes into play.
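The scenario analysis described here can be sketched as a before-and-after comparison of an index in which demand is scaled by a vulnerability weight. This is a simplified illustration under stated assumptions, not the system from the paper: the `vulnerability` multipliers, the `reach` sets (which tracts fall inside each provider's catchment), and all numbers are hypothetical.

```python
# Illustrative only: hypothetical weights, catchments, and counts.

def weighted_access(capacity, population, vulnerability, reach):
    """Floating-catchment style index with demand scaled by a vulnerability weight.

    capacity:      {provider: number of physicians}
    population:    {tract: residents}
    vulnerability: {tract: multiplier, larger means higher expected demand}
    reach:         {provider: set of tracts inside its catchment}
    """
    # Supply side: physicians per unit of weighted demand in each catchment.
    ratio = {}
    for provider, tracts in reach.items():
        weighted_demand = sum(population[t] * vulnerability[t] for t in tracts)
        ratio[provider] = capacity[provider] / weighted_demand if weighted_demand else 0.0

    # Demand side: each tract sums the ratios of the providers that cover it.
    return {
        tract: sum(ratio[p] for p, tracts in reach.items() if tract in tracts)
        for tract in population
    }


capacity = {"clinic_a": 2, "clinic_b": 1}
population = {"tract_1": 1000, "tract_2": 1000}
vulnerability = {"tract_1": 1.0, "tract_2": 2.0}
reach = {"clinic_a": {"tract_1"}, "clinic_b": {"tract_2"}}

baseline = weighted_access(capacity, population, vulnerability, reach)

# Scenario: what if two physicians are added at clinic_b?
scenario_capacity = dict(capacity, clinic_b=capacity["clinic_b"] + 2)
scenario = weighted_access(scenario_capacity, population, vulnerability, reach)

improvement = {t: scenario[t] - baseline[t] for t in population}
```

In this toy setup the more vulnerable tract_2 starts with the lower index, and the added physicians raise its score while tract_1 is unchanged, which is the kind of comparison a decision maker would look at on the map.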
So the basic idea is that the algorithm is very complex and incorporates all the crucial components of SDOH, healthcare access, and provider availability. But then we build in decision support processes and functions so that policymakers can make decisions and satisfy different users' needs.
April Moreno: Thank you. And I'm looking here at Figure 2, the Vulnerable Population Healthcare Accessibility Framework. So you have these different steps in the process, from the data elements and data collection to data understanding and preparation, analytical modeling, weight assessment, and then the accessibility index calculation. Can you tell us more?
Dr Yan Li: Yes. First of all, in order to use a data-driven approach to identify the weights and calculate that index, the first step is to ask, how do I find the data to do that? That's why we use the behavioral model to help us identify those factors. Those factors include the characteristics of the population, as we said earlier, like age groups and gender distribution. Then there's also data related to health behavior. Some examples of health behavior are how often you exercise and your smoking or drinking behavior. Then the third big component is called consumer satisfaction. That is a huge component in the behavioral model, pretty much saying that whether users are satisfied with the services really drives the utilization of the services. We have often seen this: doctors with high ratings, for example primary care doctors, don't accept new patients because they are in high demand, right? So often, although they are in your area, you can't find a primary care doctor that you like. That often happens, right? So with consumer satisfaction, the question is how you rate a provider, with the assumption that a provider who has high consumer satisfaction usually has higher demand, so more people would want them. So you kind of want to attract high-quality providers. Then of course there are other SDOH factors, such as environmental scores, like the air quality and water quality in your area. Once we identify the elements of interest and say, those are what we're interested in, the next step is to ask, where do we find this data? There are many things you could want, but you may not be able to find the data. So I can give an example. With population characteristics, in this specific case, we worked with a regional health care plan.
Within their data warehouse, they have patient age, ethnicity, where patients live, and what language they speak, so that is data we could get from internal organizational data. Then for health behavior, again, we are taking a population health management, or public health management, approach, so we don't really care about individual characteristics. We're looking at a population-based, location-based level, so in this study we looked at specific census tracts. Within a tract, what is the general health behavior? For that, like tobacco use, diet, exercise, or drug use, we used tract-level data from the SimplyAnalytics database, which has curated data that is regularly updated. Then for consumer satisfaction, since we know the plan's providers, we crawled Yelp review data. There are quite a few websites with provider data, for example Healthgrades, but certain datasets don't let you crawl; you have to buy their data. So in our case we used Yelp's data. There's also another one called the CAHPS survey, which is about provider quality. So based on the service location, for each provider we got their Yelp score as a proxy for their quality. Then other SDOH factors, such as environmental, water, and air quality, we brought in from what's called OEHHA, which is environmental and health data. Then there are some other SDOH factors, such as crime index, employment rates, and income distribution, which we brought in from Esri Demographics. So that's how we brought the data together. Of course, anyone who understands analytics or data science knows the next step is very important, in the sense that before you do any modeling, you have to understand your data, clean it, and prepare it, right? So I'm not going to go into that detail.
But generally you have some data cleaning to do and some data transformation to do. Specific to our data, we have data from the organization at the individual level and data at the tract level, so we need a data aggregation method to bring everything to the same level before we can do modeling. For the modeling, we build different kinds of models, and we use these models to determine weights. More specifically, we use decision trees, random forests of decision trees, to build a very robust model. But we're not using the tree results themselves; we're using variable importance to help us identify those weights. Very similarly, we use odds ratios from logistic regression to estimate the group weights within each factor. I'll give you an example. In the paper we show that gender is the most important discriminating factor for healthcare access. It's very important, but within gender we have male and female, right? So we know gender counts overall for probably 30 or 40% of the demand. But then we ask, how exactly does that work within gender, where we have a group of males and a group of females? Using odds ratios, we know that, say, healthcare demand for males is twice as much as for females, or vice versa, and then based on the population distribution, we can assign different levels of demand from the population perspective. So that's our weight assignment. The factor weights are related to the SDOH factors. Then we have subgroup weights, like the male and female example I described; within each factor you have multiple subgroups. Then we also need to give provider quality weights, right? Provider quality is based on the reviews. Eventually, all the weights have to be normalized, because in any weighted factor approach they have to add up to one.
So the subgroup weights within a factor add up to one, and the factor weights add up to one. Afterwards, the algorithm itself is in the paper, as Equation 1; we just use that algorithm to calculate an index. That index is supply and demand based, and the index calculation is actually done through the GIS system. After this, we show the index on the map.
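The weight assignment described above (factor weights from variable importance, subgroup weights from odds ratios, everything normalized to sum to one) can be sketched in a few lines of Python. This is a hedged illustration, not Equation 1 from the paper: the importance values, the utilization counts, the neutral `group_a` / `group_b` labels, and the choice to use the odds ratio directly as a relative subgroup weight are all assumptions made for the example. In practice the importances would come from a fitted random forest and the odds ratios from a logistic regression.

```python
# Illustrative only: hypothetical importances and counts, not results from the study.

def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {name: value / total for name, value in weights.items()}


def odds_ratio(a_used, a_not_used, b_used, b_not_used):
    """Odds ratio of using care for group A relative to group B (2x2 counts)."""
    return (a_used / a_not_used) / (b_used / b_not_used)


# Factor weights: normalize hypothetical variable importances.
importance = {"gender": 0.30, "age_group": 0.15, "language": 0.05}
factor_weights = normalize(importance)

# Subgroup weights within one factor: the reference group gets 1.0,
# the other group gets its odds ratio, then both are normalized.
ratio_a_vs_b = odds_ratio(a_used=200, a_not_used=100, b_used=100, b_not_used=100)
subgroup_weights = normalize({"group_a": ratio_a_vs_b, "group_b": 1.0})

# Contribution of this factor to one tract's demand, given the tract's
# population shares in each subgroup.
tract_shares = {"group_a": 0.5, "group_b": 0.5}
factor_demand = factor_weights["gender"] * sum(
    tract_shares[g] * subgroup_weights[g] for g in tract_shares
)
```

With these made-up numbers the odds ratio is 2, so the subgroup weights come out as two thirds and one third, which mirrors the "twice as much" example in the conversation.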
April Moreno: That's fascinating. So I see here your Figure 5, the map of the results with the framework. Okay, so let's talk about the framework here. We have the highest levels in the darkest green. What is this data telling us?
Dr Yan Li: So usually the darkest green areas are high-access areas, meaning that generally people there have adequate access to health care. Again, this only uses the data of the health care plan. So for the health care plan, in those areas we are good; we probably have sufficient providers to address our population's needs in that area. But those yellow areas are the areas of concern. Those are very low-access areas. If you look at Figure 5, you probably realize that generally rural areas have low access, because it's just very far for people there to reach a healthcare provider. But more importantly, and more interesting if you drill down, is that in a lot of urban areas, within the urban areas, you have pockets of low access. Those are areas where you really need to go in, take a look, and try to improve. That's really what, from the point of view of a health care plan, a healthcare insurance provider, they are really interested in. They really want to see which areas they are short in, so that for those shortage areas they can put in more resources to try to recruit providers. Because insurance pays, especially for primary care providers, through the pay structure, for shortage areas they can provide more incentives to really attract providers to join that network.
April Moreno: Mm hmm. Yeah. It's fascinating, because yes, there are these portions in the dark green, with high utilization and high-quality access to health care: the military base at Twentynine Palms, for example, and then some portions of probably the Loma Linda region are green. However, like you mentioned, there are also urban areas that are not in the dark green, for example, like salsa, Pomona, Corona. Corona is a highly populated region.
Dr Yan Li: Yeah, because they don't have enough providers in that area, right? So for those areas, the health plan is going to ask, how can we actively recruit good providers to join our network? Through the pay structure, for example.
April Moreno: Thank you. So tell us about some of the implications. But before we continue with that part, I also appreciate the mixed methods approach that you used here. As you mentioned, you are looking at the population level, at more of the quantitative, numeric clumps of data, right? However, you also zoomed in and shared some of the descriptive qualitative information, which I think has so much depth and so much story to it. Can you tell us a little bit more about some of the qualitative findings?
Dr Yan Li: Yeah. So basically, whenever we build a system or build an algorithm, we call that an artifact, right? In information systems, we call them artifacts because they are artificial things, but they are actually built to solve a problem. So this artificial thing, like a system, you can think about almost like a physical thing you build. If you build a cell phone, for example, let's say an iPhone, there are a lot of quality checks before they can say, oh, that's a quality phone that can go out. Very similarly, we can design a framework and build a system, but we have to evaluate it: does it do what it was intended to do? Is it useful for the end users? So that's where we used a mixed method to evaluate our framework and our system. One way to do it is a very quantitative approach: we compared it with other methods using statistical analysis, and we showed that our method is statistically different from other methods. But different doesn't mean better, right? So then we drilled down and did a qualitative comparison of the maps generated by our method versus the other methods, and showed that in our method SDOH actually has a huge impact on accessibility. That's the qualitative evaluation. More importantly, again, we built a system. The system does what it's supposed to do, but how do end users use it? So we actually implemented the system within the organization. We did interviews and focus groups with real users in the organization, and they provided a lot of feedback. Basically, they said it's very useful, but they also provided feedback on how to improve it. That's iterative improvement: we got feedback and continued to improve the tool.
The end goal, again, is that for whatever we do to make an impact, people have to use it, right? We can show organizations the tool, but if they don't use it, and they don't implement policy interventions to improve access, they are still not making an impact. It would just be a beautiful tool sitting on their desktop. We really want them to say, okay, it's useful, I'm going to go out and really help address the healthcare access needs of those vulnerable populations.
April Moreno: And I like what one of these quotes is saying about how the data doesn't just provide information, but actionable information on how people can actually take next steps towards more beneficial results. So tell us a little bit more about the implications and what the conclusion was. Really, what are the next steps in this framework, in this process?
Dr Yan Li: Yes. As we say, we definitely feel that our framework is unique and novel in the sense that we provide a systematic way for organizations to incorporate social determinants of health, especially focusing on vulnerable population characteristics, when they try to assess healthcare access. And secondly, I think the real contribution is that we did not stop at just providing an algorithm and showing its improvement. In order to make the output actionable, we implemented it in a spatial decision support system, right? And we call it intelligent because it really gives you recommendations. We don't just show the map and say, oh, this is a statistic; we give you interactive scenario analysis: what if I do A, B, C, D? That scenario analysis can really play into policy implications for how we're going to improve healthcare access. I think for us the biggest implication is that we have a very systematic way that policymakers can use, and not only healthcare insurance providers, right? It could also be government, because a lot of programs like Medicare and Medicaid also want to improve healthcare access. Government agencies, state and regional health care organizations, public health workers, and public health policymakers could all utilize tools like this. Again, the framework is very generalizable, right? You can pick the data characteristics that you think are relevant within your domain, go through the algorithm, and put the result on a map. Then you can add different kinds of scenario analysis that fit your organizational needs, which will help you design policy interventions.
April Moreno: Thank you so much for this publication, this article, and the research that you've done. We really appreciate the use of GIS and health informatics for improving health outcomes and access to quality care. So thank you so much for sharing this with us today, Dr. Li.
Dr Yan Li: Thank you.