Why is data important? Because it makes people count.



While I was at a facility in Lilongwe this week, piloting a study about data use for the Kuunika project, I was inspired to create a little photo essay. I asked some of the facility staff a very simple question: “Why is data important?”

Each person’s eyes lit up when I asked them this question. Having just participated in the pilot for our survey, they were very engaged and really wanted to discuss the importance of data and how to improve data systems.  

So, I wrote down the first thing they said in response to my question and asked them to hold it up for a photo:

Facility staff in Lilongwe holding up why they think data is important. Staff include nurses, clinical associates, data clerks, health surveillance assistants, and our data collection staff.


Although I’m typically sitting in a room full of people who all care deeply about data, getting to why they care can be more complex and nuanced than you might think.  I’ve worked in global health for several years now and have been focused on data systems, but I’m guilty of taking this basic question for granted.  So, this also got me thinking about the question. Why do I think data is important?

For me, data is important because it makes people count.

It makes the patients who come to the facility and are entered into the register or electronic medical record system count. It makes the work of the healthcare workers count. They spend hours caring for patients and tallying the numbers of patients they’ve seen. The tally sheets are sent to decision-makers who are challenged with figuring out how to treat large numbers of people in an equitable way despite resource constraints. It makes the decision makers at the district, national, and international levels accountable to the people whose lives they are trying to make better, people who might otherwise not be counted.

Data is important because it represents people who are important. At the end of the day, behind those numbers are people that matter. People who come to the health care clinic for services that they desperately need to stay alive. These people are important, and that is why data is important.

-- Andrea




Why Open Data Isn't Changing the World - Yet



Admittedly, I was a little late to jump on the podcast bandwagon. It took NPR’s Serial to get me on board, but now I can’t get enough. As a visual learner, I also didn’t think listening to people talk about data and data science would be that interesting…man, was I wrong.

Today I tuned into what is quickly becoming a favorite: The Digital Analytics Power Hour (highly recommend). The topic was “open data,” something we pay a lot of lip service to in international development programs but that never seems quite well executed. “Open data” simply means publicly sharing information you collect and store, and there’s a huge push now to get governments to open up information like never before (partially thanks to our friends over at Development Gateway). Though that push is important, the power of open data can really be felt only when everyone, from a PhD student to a frontline global NGO worker, knows how to use it.

The guests of the podcast, Jon Loyens and Brett Hurt (veterans of several data-driven, private sector companies), flagged some concepts that resonated and got me thinking about some of the major issues with fully unlocking the potential of data for development.

Issue #1: “Data is tribal”

Jon Loyens’ use of this phrase struck me because it accurately describes how often information about development data stays within small groups. Even if some data see the light of day, few people know what to do with it because they don’t understand what it is.

For example, in Malawi, health facility data collection happens all the time. Health workers are surveyed, facilities are assessed, staff are monitored, and health outputs are tracked and uploaded to repositories, but each of these activities is administered by different groups with different goals. If demographic information is collected (e.g. “education”), the responses will vary because standards for how these data are captured don’t exist. Some responses might be “secondary” while others say “MSCE.”  Are these the same or mutually exclusive? Ultimately, when looking through results reports or data files (if available), there is little to no documentation of the criteria used to define these categories, so comparability is limited.

The same is true for health facility names. In one dataset a site might be named “Monkey Bay District Hospital” and in another, “Monkey Bay Hospital.” Are these the same site? The only way you could know for sure is to ask the group who collected it.
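One low-cost way to surface pairs like these for human review is to compare the words in each name. Here is a minimal Python sketch, assuming two hypothetical lists of site names (every name other than the two Monkey Bay examples is made up). It only flags candidates; it cannot confirm that two names refer to the same site.

```python
def tokens(name):
    """Lowercase a site name and split it into a set of words."""
    return set(name.lower().split())


def candidate_matches(names_a, names_b):
    """Return name pairs where one name's words are all contained in the other's.

    These are candidates for review only; confirming a match still
    requires asking whoever collected the data.
    """
    pairs = []
    for a in names_a:
        for b in names_b:
            ta, tb = tokens(a), tokens(b)
            if ta <= tb or tb <= ta:
                pairs.append((a, b))
    return pairs


# Hypothetical site lists from two separate datasets.
dataset_1 = ["Monkey Bay District Hospital", "Example Health Centre"]
dataset_2 = ["Monkey Bay Hospital", "Sample Clinic"]

print(candidate_matches(dataset_1, dataset_2))
```

A word-subset rule is deliberately crude. It will miss misspellings and can over-match short names, which is one more argument for a shared identifier over name matching.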

Knowledge about the data—the context, definition, how to interpret—stays within the “tribe” that collected it. This phenomenon doesn’t happen because people don’t want to be collaborative. I think it happens because developing adequate documentation is super time consuming and no one forces organizations to do it. According to the podcast guests (and I agree), 80% of analytics is janitorial. Not fun, but necessary.

Once the study is over, the report is written, and the funding has dried up, the data are filed away to collect dust in the silo. Meanwhile, in the silo next door, another organization is creating their own data collection tool from scratch with similar, but slightly different, categories.

Imagine a world where all the information collected at the site level in Malawi over the past decade suddenly has the same linked “key,” like an official site ID, and you could compare vast data from multiple sectors over time for a single site.  <sheds tear>
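To make the idea concrete, here is a small Python sketch of what a shared key makes possible. The field name `site_id`, the IDs, and all of the records are hypothetical, made up purely for illustration; they are not drawn from any real registry or dataset.

```python
from collections import defaultdict

# Hypothetical records from two unrelated data collection efforts.
# "site_id" stands in for an official registry ID; all values are made up.
staffing_survey = [
    {"site_id": "S001", "year": 2015, "nurses": 4},
    {"site_id": "S002", "year": 2015, "nurses": 2},
]
hiv_testing = [
    {"site_id": "S001", "year": 2015, "people_tested": 310},
    {"site_id": "S001", "year": 2016, "people_tested": 355},
]


def link_by_site(*datasets):
    """Group records from any number of datasets under their shared site ID."""
    linked = defaultdict(list)
    for dataset in datasets:
        for record in dataset:
            linked[record["site_id"]].append(record)
    return dict(linked)


site_history = link_by_site(staffing_survey, hiv_testing)
print(len(site_history["S001"]))
```

With a common ID, linking is a few lines of code. Without one, it is a round of emails to every group that ever collected data at the site.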

Note: I’m certainly not picking on Malawi. This is a massive problem in every country and sector. It just happens to be what’s fresh on my mind.  Malawi is currently developing a nationwide site registry that will link all sites with a common ID, so kudos to them on that front.

Issue #2: “People won’t understand the nuance”

Exceptionalism. <big sigh>  

I put issue #2 in quotes because, unfortunately, I said it in the past. When I worked for PEPFAR, we created a data stream that generates massive amounts of information on US government expenditures linked to program outputs (e.g. expenditure per person tested for HIV). It covers 58 countries, geographic regions within countries, thousands of implementing partners, and tons of indicators.  The short story is, it’s a big and detailed dataset with many dimensions and provocative information. Once we had the data we used it extensively within PEPFAR, but refused to release the contents publicly. Our justification? “It’s incredibly nuanced and there is potential for people to misuse it.”

I certainly get why this is a problem (I tend to be more of a sharer than a keeper), but there was a real fear at the time that these data could damage perceptions of the program or reputations of our partners. Not because there was evidence of any glaring malfeasance, but because it shed light on areas where PEPFAR really needed to do better with the money available.

Our mistake in the above example came from a generally good, if misguided, motivation: a fear that can be ameliorated with better documentation and tools. There are, however, shadier situations where failure to share information is due to a fear that people will find evidence of fraud or information mishandling.

In either case, “nuance” is not a valid excuse to keep data locked up tight. On the contrary, choosing not to share information that could be better used by someone else to improve development programs should be viewed as potentially damaging and a challenge to progress.

Issue #3: Data Quality

Though “misuse” can be invoked as a justification for not sharing data, “data quality” is by far the most ubiquitous excuse. This warrants a whole discussion on its own, which I won’t get into here. I will just say 3 things:

  1. People are generally nervous about how more information about their activities will impact them, which causes them to recoil at the notion of open data
  2. This fear is compounded when financial information is involved
  3. “All decisions are made on the basis of incomplete data, so either learn to live with this fact or get out of the game.” – Robert Townsend

Issue #4: We don’t understand the power of the semantic web

The semantic web is basically a concept and set of tools for better data documentation and standards that enhance congruency of data sources across the web. Seems basic, but we can’t even fathom the power this unsexy and unglamorous work unlocks.

Data janitorial work—data hygiene as I call it—has the power to democratize big data. We don’t all need to be data scientists to enjoy the insights better linked data will produce. We just need to line things up more effectively and let some amazing learning tools and bright minds step in.

As Brett Hurt states in the podcast, “The NSA gets it. Palantir gets it.  Facebook, Google, they get it.” But the rest of us aren’t there yet. Further, we can’t expect machines to really make our lives easier until we make it easier for machines to understand our data.

For this to work, the development community has to rally around data hygiene as a first principle and actually mean it. Then we have to get over our hesitance to share our work, warts and all.

I look forward to the day when data is communal, instead of tribal. 

- Tyler Smith



Put your data where your decisions are! A systematic analysis

We’re gearing up to launch the Kuunika: Data for Action project in Malawi. To date, we’ve had the pleasure of working with some of Malawi’s finest across government, private, and not-for-profit sectors.

As we work on the project’s implementation plan, a gap in the landscape became clear.  Kuunika’s overall goal is to increase access to and use of high-quality health data at multiple levels. However, given the vast array of actors, skillsets, and data systems, where do we start?  We want the project to target those users, systems, and activities we expect will have the greatest return on investment for improving HIV outcomes.  As such, we realized we need better information on what critical decisions are made that lead to HIV program outputs and outcomes.  In particular:  Who are the decision-makers and what do we know about them?  What HIV-related data is being used for decision making and where does it reside?  What information is missing? 

We could answer these questions anecdotally, but didn’t understand the full picture. 

It turns out, little has been done (or written about) to systematically document the critical decisions at various levels of the health system and indicate how users, data, and systems interact to produce action.  If our goal is for clinicians and policy makers to make more informed decisions using empirical data (which it should be), we need a better way of cataloguing the decisions and events where the right data need to be at the right users’ fingertips.

Cooper/Smith is helping to answer these questions in the Malawi context. We undertook a rapid-fire study: Strengthening Routine Use of Information to Improve HIV and Health Outcomes in Malawi: Systematic analysis of key data users and decision points.  We know, it’s a mouthful, so we will refer to it going forward as the Data Users Study.  Our objectives were twofold:

  1. Systematically document, relate, and validate assumptions for key data elements (indicators), users, and systems that manage Malawi’s HIV response
  2. Identify the critical decisions/events encountered by decision-makers and the information used or needed to improve HIV program effectiveness

It took a couple of months from conception to execution to obtain the data needed from communities, service facilities, districts, and central offices.  We collected information from a wide array of actors, coded/analyzed responses, and extracted some initial gems to inform Kuunika implementation design.

For example, study respondents identified a total of 335 unique decisions typically made in their job functions.  We grouped these into 85 common categories.  Of those 85, the top 5 categories accounted for over 40% of all decisions identified.  These categories included drug supply, treatment initiation, defaulter follow-up, program performance and referrals. If we want to maximize return on investments in health information systems in Malawi, we should prioritize based on which decisions occur most frequently, which data are most valued, and which systems can/should be linked to efficiently produce this information when needed.
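The kind of tally behind a figure like "the top 5 categories accounted for over 40%" can be sketched in a few lines of Python. This is an illustration only: the function name `top_k_share` is hypothetical, and the coded responses below are made up, not the study data.

```python
from collections import Counter


def top_k_share(category_labels, k):
    """Return the fraction of all decisions that fall in the k most common categories."""
    counts = Counter(category_labels)
    top = counts.most_common(k)
    return sum(n for _, n in top) / len(category_labels)


# Made-up coded responses, one label per decision; not the study data.
coded_decisions = (
    ["drug supply"] * 5
    + ["treatment initiation"] * 3
    + ["referrals"] * 2
    + ["staffing", "outreach"]
)

print(top_k_share(coded_decisions, k=2))
```

In this toy list, the two most common categories cover 8 of the 12 decisions. The same calculation on real coded responses shows how concentrated decisions are, and therefore where investment in data systems might matter most.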

Please take a moment to look through the initial findings from the study.  We will continue to add to this analysis over time. We are also working on a Phase 2 of the study, which will examine health worker preferences for different incentives associated with promoting data use. 

As always, we love to talk about data!  Please share any thoughts or comments below or email us. 



The Evidence Heat Map on Data Use Is Now Available!

Which interventions are effective in stimulating health sector data use?  This is a question we've been asking for quite some time as we gear up the Kuunika Project: Data for Action in Malawi.  Turns out, there's not a great deal of evidence on data use and incentives, especially related to increasing access to information, capacity for analysis, and data-driven decisions.  

With a global push towards an information revolution in health, program planners need to create the value proposition for health workers to focus more on data.  Frontline workers have to see better use of data as a way to save time and improve quality of care.  Health managers need to see how applying routine data can increase program output and cost-effectiveness of limited budgets.  Choosing interventions for promoting data use must be aligned to both personal and program incentives if they are to be effective, as well as tailored to the target groups and context. 

In the lead up to the project design in Malawi, we completed a rapid lit review to catalogue evidence available on data use and incentive programs.  Realizing that others may be grappling with the same question, we assembled this review into a database and dashboard for easy access and navigation. This list is by no means exhaustive, but we are hoping this will be a good start and help promote collaboration and sharing of ideas for those working in health data. 

We hope this tool can be expanded and look to you to fill in any observed gaps.  Eventually, this may prove to be a useful platform for coordinating projects within countries and regions, sharing lessons learned, and discussing common challenges.  Please let us know what you think in the comments section.  If you have any requested additions to the evidence base or feedback on the design, please contact us.

Special thanks to Roberta Makoko, Megan Wolfe, and Sara Walker for their contribution to the lit review and our own Andrea Fletcher for building such a sleek dashboard.  

Check it out!



Steps to connect fragmented health data

What does it take to get to a unified system for monitoring and evaluation of health programs?  Cooper/Smith Co-Founder Tyler Smith explores this question in the Malawi context in a recent report commissioned by the Global Fund to Fight AIDS, Tuberculosis and Malaria.

Most donors, including Global Fund, are looking to optimize their health investments in partner countries as needs increase and global health investments plateau.  Optimizing programs means employing high-quality and relevant data to plan programs that will have maximal returns on health benefits.  Having the best data means investing in monitoring and evaluation (M&E) systems that are efficient, well-coordinated, easily accessible, and promote a culture of data use at all levels.  Something much easier said than done, especially considering the ballooning reporting requirements piled on by donors.

Though tragically underfunded, improvements to health information systems (HIS) are gaining traction as the world embraces digital solutions (e.g., mobile tech) and as the tidal waves of data collected by health and development programs swell at break-neck speed.  Simply put, without better data management in health, countries won’t be able to effectively store or access the information collected, let alone use it for routine decision-making.  And if we aren’t going to use it, why are we distracting frontline workers from patient care to collect it? 

All that said, our staff have worked on the flip side for many years and understand the inherent dilemma.  Within PEPFAR, we pushed actively for more data from countries to increase accountability and allocate resources to better match disease burden.  Though effective for our program at the time, we know these increased requirements created strain on health workers and competition across programs for high-quality information. Aid earmarked for specific programs, tied to specific reporting requirements, isn’t likely to disappear anytime soon; it’s how we learn to manage and leverage these data streams that will ultimately determine if the data we collect are for show or truly used to improve health outcomes.

The Malawi report discusses gaps in performance and limitations of M&E systems in the country, barriers to data access and use, barriers to systems integration, and recommendations to address the principal gaps and barriers.  Further, a landscape analysis is included that can serve as a reference point or primer to those less familiar with the country context.  The analysis brings together findings from previous assessments (with hyperlinks to full text) and maps the flow of data for HIV, TB, and malaria programs in Malawi. 

We hope you find the report useful and would love your thoughts.  Please email us (contact@coopersmith.org) or post your comments below or on our twitter feed.   Don't forget to subscribe to our blog to get the latest as we disseminate interesting findings in the near future.  

Thanks for tuning in! - Tyler 



Speaking Through Photos

An update on our work in Malawi: The Kuunika Project, Data for Action