Experiments Posts

Interoperability Judo in the Aid Sector

Photo Courtesy of Yayasan Damai Olahraga Bali

Ten years ago I nearly set an Italian hotel on fire. I’d plugged an American fan into a European electrical socket, and after about 30 seconds I had a shower of sparks landing on the curtains. What I’d forgotten to account for, of course, was the difference between the 120 volt standard that the fan was expecting and the 240 volts that the outlet was producing. Just because the connection worked physically didn’t mean it would work practically. Contrast this with last month, when I’d brought my laptop and charger on a trip to India but again forgot a power converter. Thankfully, Apple has an elegant solution to the problem of different electrical standards. Rather than trying to convince every country to use the same voltage in its wall sockets, they’ve just built a charger that can handle a range of inputs. They accept the complexity that happens when large groups of people try to collaborate and work with it, not against it. They’ve taken an obstacle and turned it into an opportunity. It’s design judo.

When it comes to financial flows in the aid sector, standards are more complicated than deciding what plug to use. With so many governments, organizations, and companies sending billions of dollars to support global development, communicating the details of these relationships and transactions in a shared framework becomes a herculean task. The International Aid Transparency Initiative (IATI) has made progress in establishing a standard for the sector to describe funding and implementing relationships consistently. The list of 480+ entities that publish IATI-compliant data understates the standard’s reach. Most of the 29 member countries of the Development Assistance Committee (DAC) report IATI data about their aid spending, and the funds sent by these governments represent about 95% of total DAC expenditures. It’s hard to estimate an exact number, but it’s safe to say that the IATI standard describes a significant majority of the world’s aid dollars.

Still, there are some challenges to using IATI-compliant data to get a precise understanding of how the aid sector is actually organized. Despite IATI’s thoroughness, organizations still interpret the requirements differently, leading to the same data fields containing multiple types of information. This can make seemingly simple tasks, like identifying a unique organization consistently, very difficult in practice. Similarly, there aren’t strict validations or requirements preventing organizations from omitting data or inadvertently hiding important outcome data in a pages-long list of transactions. Organizations that don’t share their data are left out entirely, even if they’re mentioned frequently by organizations that do report. All this can make it hard for aid professionals like funders, program implementers, or researchers to extract useful conclusions from IATI data.

So what should the sector do about this? One approach might be to double down on the rules associated with our data standards and try to force everyone to provide clear, accessible, and organized data. This would be similar to convincing all countries to share the same voltage standard; it’s not a practical option. The alternative is the judo method: work with the challenges inherent in the IATI standard instead of trying to regulate them away. Some friends and I recently tried to do just that as our capstone project for the UC Berkeley Master of Information and Data Science degree.

The end result is AidSight, a platform that provides easy-to-use tools for the aid sector to search IATI data, explore relationships between organizations (including those that don’t report their data directly), and validate the likely usefulness of their results. For example, imagine you’re an aid agency that needs to report on the current state of the water sector in Ghana. First, AidSight enables you to query all IATI data in plain English instead of through a complex search interface or a code-heavy API call. Your results appear as a network diagram that maps the relationships between the organizations that meet your search criteria, whether they report to the IATI standard or not. Here’s our result for the Ghanaian water sector – note that we’re mapping just the organizations and relationships, not their real-world locations or relative sizes:

Network map of organizations in the Ghanaian water sector

The green dots represent organizations that report data to IATI directly, the red dots are organizations that are implied in the data that other organizations report, and the width of the lines connecting them indicates the strength of the relationship. This approach takes the data reported by 484 organizations and turns it into results for tens of thousands. In this example, there are two “hubs” of reporting organizations on the right side of the map that work with 5-7 non-reporting organizations at varying levels of connection. In contrast, there’s another hub organization (GlobalGiving itself) towards the bottom left that works with many more organizations, but with the same level of connection to all of them. Using this method, users can quickly spot the key players in any sector and explore the strength of their collaborations.
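
To make the mapping concrete, here is a minimal sketch of how a relationship graph like this could be assembled once IATI activity files have been parsed into reporter/partner pairs. The organization names, field names, and weighting below are illustrative placeholders, not AidSight’s actual pipeline.

```python
import networkx as nx

# Illustrative input: (reporting org, partner org, shared activity count) tuples
# extracted from IATI activity files. All names and numbers are placeholders.
edges = [
    ("Water Ministry", "Village Water Co-op", 7),
    ("Water Ministry", "WellBuilders NGO", 2),
    ("GlobalGiving", "Village Water Co-op", 1),
]
publishers = {"Water Ministry", "GlobalGiving"}  # organizations that publish IATI data directly

G = nx.Graph()
for reporter, partner, n_shared in edges:
    G.add_node(reporter, reports_iati=True)                  # rendered as green dots
    G.add_node(partner, reports_iati=partner in publishers)  # red if only implied by others
    G.add_edge(reporter, partner, weight=n_shared)           # line width ~ relationship strength

# Reporting "hubs" ranked by how many partners they connect to
hubs = sorted((n for n, d in G.nodes(data=True) if d["reports_iati"]), key=G.degree, reverse=True)
print(hubs)
```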

Understanding these connections is important, but what if the report needs more granular results? Before downloading and analyzing the raw data, you’d want to know if you’re likely to be able to draw meaningful conclusions from the results we’ve found. To make this easy, AidSight contains a data quality dashboard that uses heuristics to estimate how useful each organization’s data is likely to be and summarizes it with a simple letter grade.

Example AidSight Data Quality Dashboard

Now, anyone at an aid agency can measure IATI data quality with a few clicks and free their data science teams to focus on only the most useful datasets. We can also use this approach to establish valuable benchmarks for the aid sector as a whole. The average grade of C- suggests that there’s lots to be done to improve the quality of development data reporting, but having a framework to measure progress makes it possible to consider how we might get there.
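
This post doesn’t spell out the exact heuristics behind the letter grades, but the general pattern of rolling a few completeness checks up into a grade can be sketched as follows. Every check, weight, and cutoff below is a hypothetical stand-in, not AidSight’s actual scoring.

```python
def grade_dataset(activities):
    """Summarize an organization's IATI data quality as a letter grade.

    `activities` is a list of dicts parsed from the organization's IATI
    activity file. The checks and grade cutoffs are illustrative only.
    """
    checks = [
        sum(bool(a.get("transactions")) for a in activities) / len(activities),  # has transactions
        sum(bool(a.get("sector")) for a in activities) / len(activities),        # has a sector code
        sum(bool(a.get("results")) for a in activities) / len(activities),       # reports outcomes
    ]
    score = 100 * sum(checks) / len(checks)
    for cutoff, letter in [(90, "A"), (80, "B"), (70, "C"), (60, "D")]:
        if score >= cutoff:
            return letter
    return "F"

print(grade_dataset([{"transactions": [1], "sector": "14030"}, {"sector": "14030", "results": [1]}]))
```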

Currently, AidSight is a minimum viable product, so there are many improvements to make. Still, solutions that focus on data interoperability without trying to fight the natural complexity of the aid sector represent exciting opportunities for us to bring enhanced accessibility and understanding to our work in a democratic way. Taking the judo approach to development data means that a growing number of inventive, creative, and driven users will be able to discover new solutions to the aid world’s challenges.


Special thanks to the other members of the AidSight team: Natarajan Krishnaswami, Minhchau Dang, and Glenn Dunmire, as well as Marc Maxmeister for his feedback on this work. Explore IATI data yourself at aidsight.org or download the open source code on GitHub.

Is Overhead All In Your Head? How Cognitive Psychology (and Font Colors) Can Drive Donations

Nick Hamlin, GlobalGiving’s Senior Business Intelligence Analyst, shares results of a recent experiment on the GlobalGiving website. (Photo courtesy of The Muse)


No one likes worrying about the overhead costs associated with nonprofit work, and rightly so! For years, overhead ratio has been one of the only metrics that donors could use to compare philanthropic choices. More recently, conversations like The Overhead Myth have pointed out that the world’s best businesses need operating capital to innovate and succeed, so why should nonprofits be any different? Even though better measures of impact and effectiveness are increasingly available and accepted, a typical donor’s natural reaction when they see a percentage come up in a conversation about nonprofit fees is to interpret it as an overhead ratio. And most donors still don’t like overhead.

For us at GlobalGiving, this presents a challenge. While we retain a 15% fee on donations through our website, our actual administrative overhead ratio is around 2%. Despite testing several different ways of demonstrating and explaining the difference between our fee and our overhead, we still get lots of questions about our fees from users who assume that the two are the same. To help fix this, we recently asked ourselves: what if it’s not the explanation text that’s the problem, but how users are experiencing and processing the information it contains?

For inspiration, we turned to the world of cognitive psychology.  In his famous Thinking Fast and Slow, Nobel laureate Daniel Kahneman describes how we all have two systems at work in our brains.  System 1 is our intuitive, quick-reacting, subconscious mind, while System 2 is analytical, logical, and methodical.  He mentions a 2007 study that tried to use the interaction between these two systems to improve scores on the “cognitive reflection test”.  This short quiz consists of questions that seem simple at first, but have a “wrinkle” that makes them more complex than they appear (try them yourself). Half the participants in the study took the test normally, while the other half took the test under a cognitive load, meaning the questions they received were written in a lighter font that made them slightly harder to read. The researchers found that the second group performed much better on the test, presumably because the cognitive load caused their analytical System 2 processes to take over from their more reactionary System 1 minds.  Once in this “more logical” frame of mind, they were much better equipped to tackle the tricky problems.

After reading about this study, I wondered if we could replicate the results on GlobalGiving to help donors process the explanation of our fee and the accompanying invitation to ‘add-on’ to their donation to cover this fee on behalf of their chosen nonprofit. Our hypothesis was that donors usually use System 1 when thinking about our add-on ask; they quickly assume that the 15% represents overhead and they’re less inclined to donate additional funds to cover it. But, if they engage their System 2 mindset that makes them process the text more analytically, hopefully they’ll find the explanation more convincing and be more likely to add-on. To find out if this would work, we planned a simple test in which a subset of users would be randomly chosen to see a slightly modified version of the add-on page during their checkout process.  This page would have exactly the same text, just shown in a slightly lighter font that, we’d hope, would trigger the cognitive load and drive extra add-on contributions.

Users in the control group saw this unmodified add-on prompt.


The test group received this add-on prompt with a decreased font contrast to create cognitive load.


The plan made sense in theory, but we had to be careful as we put it into practice.  First, we needed to make sure that the random assignment process, made possible by our Optimizely A/B testing framework, was running correctly and that all the data we would need to analyze the results was logged properly in our database.  Even more importantly, we have an obligation to our nonprofit partners to make sure we’re doing everything possible to maximize the funds they can raise by offering a seamless website experience for donors.  If this experiment caused users in the treatment group to become less likely to complete their donation, we’d need to know right away so we could stop the test.

We set up a pilot study where we closely monitored whether the cognitive load caused by the change in font color would cause potential donors to leave the checkout process prematurely. We also kept a close eye on post-donation survey feedback to see if anyone mentioned the changed font color. Fortunately, there was no difference in donation rates or feedback during this initial test, and we felt comfortable continuing with the larger experiment, which ran for two weeks at the end of July (just before the launch of our new website). In the end, we collected results from about 700 eligible users.

So what did we find? 49.4% of our control group chose to contribute towards the fee, compared to 56.8% of users who saw the lighter font. This suggests users were engaging their System 2 brains and processing the request for an additional donation more deliberately. But it would be premature to declare success without additional analysis. Specifically, we wanted to make sure there wasn’t another explanation for the difference in add-on rates.
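
As a first check, a simple two-proportion test gives a feel for whether a 49.4% vs. 56.8% split could plausibly be noise. The counts below are reconstructed from the reported rates and the roughly 700 eligible users, assuming an even split between arms, so they are approximations rather than our actual data.

```python
from statsmodels.stats.proportion import proportions_ztest

# Approximate counts: ~350 users per arm at the reported add-on rates.
addons = [173, 199]      # control (~49.4%), lighter font (~56.8%)
totals = [350, 350]

z_stat, p_value = proportions_ztest(addons, totals)
print(f"z = {z_stat:.2f}, two-sided p = {p_value:.3f}")  # roughly 0.05 before any adjustment
```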

For example, it’s possible that users who were new to GlobalGiving would be less familiar with our fee and therefore less likely to want to add-on to their donation to offset it. Similarly, donors contributing during a matching campaign might be especially inclined to make sure that the most money possible went to their favorite organization and, as a result, would add-on more often. So, in our analysis, we statistically controlled for these factors, along with the size and geographic origin of each donation, to get the cleanest possible estimate of the effect of the cognitive load.
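
One standard way to do that kind of adjustment is a logistic regression of the add-on decision on the treatment assignment plus the control variables. The sketch below follows that approach; the file name and column names are placeholders, not our actual schema.

```python
import pandas as pd
import statsmodels.formula.api as smf

# One row per eligible checkout; file and column names are illustrative:
#   added_on (0/1), lighter_font (0/1 treatment flag), new_donor (0/1),
#   matching_active (0/1), log_amount, region
df = pd.read_csv("addon_experiment.csv")

model = smf.logit(
    "added_on ~ lighter_font + new_donor + matching_active + log_amount + C(region)",
    data=df,
).fit()

# Average marginal effect of the treatment, comparable to a percentage-point lift
print(model.get_margeff().summary())
```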

The final result was a 7.8 percentage point increase in add-on rates with a p-value of 0.046. This means that, if the font change actually had no effect, there would be only a 4.6% chance of seeing a difference at least this large by chance. If we take this increase and estimate what might happen if we made the change on the whole site, we expect we’d see around another $27,000 in additional funding created for our project partners over the course of a year. That may not sound like much in the context of the $35M+ that will be donated through the site in 2015, but it’s not a bad return for our partners for just changing a font color!

These are exciting results that suggest the possibility of a new way of thinking about how we present our fee, and there’s still plenty of work to be done.  Longer runtimes and larger sample sizes would give us even more confidence in our results and let us explore other potentially important factors, like seasonal effects.  Thinking about how we might integrate these results into our new website also presents opportunities for follow-up experimentation as we continue to Listen, Act, Learn, and Repeat on behalf of our nonprofit partners around the world.

 

Special thanks to my classmates Vincent Chio and Wei Shi in the UC Berkeley Master of Information and Data Science program for their help with this analysis and to Kevin Conroy for his support throughout the project.

How We’re Building GG Rewards Together

Next week GlobalGiving will be launching the new GG Rewards Program. Here’s a post by Marc Maxmeister that provides a sneak peek into the work that’s gone into conceptualizing, building, and launching the program. 

_______

GlobalGiving‘s goal is to help all organizations become more effective by providing access to money, information, and ideas.

That is a lofty, aspirational goal. To everyone else, it might look like all we do is run a website that connects donors to organizations. But internally, I serve on a team that has met every week for the past 3 years to pore over the data, to find an efficient way to help organizations become more effective. We call ourselves the iTeam (i for impact).

GlobalGiving’s iTeam. We try not to take ourselves too seriously.

It is hard to move thousands of organizations in one shared community forward. We use gamification, incentives, and behavioral economics to encourage organizations to learn faster and listen to the people wherever in the world they happen to operate.

Before 2014 we used just six criteria to define “good,” “better”, and “best.” If an organization exceeded the goals on all six, they were Superstars. If they met some goals, they were Leaders. The remaining 70% of organizations were permanent Partners – still no small feat. Leaders and Superstars were first in line for financial bonuses and appeared at the top of search results.

In 2014 we unveiled a more complete effectiveness dashboard, tracking all the ways we could measure an organization on its journey to Listen, Act, Learn, and Repeat. We believe effective organizations do this well.

But this dashboard wasn’t good enough. We kept tweaking it, getting feedback from our users, and looking for better ways to define learning.

What is learning, really?

How do you quantify it and reward everyone fairly?

The past is just prologue. In 2015, GlobalGiving’s nonprofit partners  will earn points for everything they do to listen, act, and learn.

The Listen, Act, Learn, Repeat cycle (2015)

This week I put together an interactive modeling tool to study how GlobalGiving could score organizational learning. When organizations do good stuff, they should earn points. If they earn enough points, they ought to become Leaders or Superstars. But how many points are enough to level up? That is a difficult question. We worked with our nonprofit partner Leadership Council to get their ideas, and we also created some data models to help us decide.
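
The modeling tool itself isn’t shown in this post, but its core idea is simple: adjustable weights applied to counts of each behavior, plus thresholds that map total points to Partner, Leader, or Superstar. Here is a minimal sketch of that idea; the behaviors, weights, and thresholds are made up for illustration and are not GlobalGiving’s actual model.

```python
# Every behavior, weight, and threshold here is a placeholder for illustration.
DEFAULT_WEIGHTS = {
    "donor_responses": 2,
    "reports_posted": 3,
    "community_feedback_collected": 5,
    "fundraising_milestones": 1,
}
THRESHOLDS = [(100, "Superstar"), (40, "Leader"), (0, "Partner")]

def learning_points(activity_counts, weights=DEFAULT_WEIGHTS):
    """activity_counts: dict of behavior -> count for one organization."""
    return sum(weights[k] * activity_counts.get(k, 0) for k in weights)

def status(points):
    return next(label for cutoff, label in THRESHOLDS if points >= cutoff)

org = {"donor_responses": 12, "reports_posted": 4, "community_feedback_collected": 3}
print(status(learning_points(org)))   # swap in a different weights dict to model a new rule set
```

Rerunning a model like this with a different weights dict over every organization is all that “adjust the weights and rerun the calculations instantly” amounts to.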

Here is the data; the current distribution of scores for our thousands of Partners, Leaders, and Superstars looks like this:

learning_default_model

How to read this histogram

On the x-axis: total learning points that an organization has earned.

On the y-axis: number of organizations with that score.

There are three bell curves for the three levels of status. The important thing to notice is that these bell curves overlap. It means that some Superstar organizations under our old definition of excellence are not so excellent under the new set of rules. Other Partner organizations are actually far more effective than we thought; they will be promoted. Some of the last will be first, and some of the first will be last.

The histogram shown mostly reflects points earned from doing those six things we’ve always rewarded. But in the new system, organizations are also going to earn points for doing new stuff that demonstrates learning:

new_learning_points

And that will change everything. “Learning organizations” will leapfrog over “good fundraising organizations” that haven’t demonstrated that they are learning yet.

old_vs_new_learning_points_model

Not only will different organizations level up to Leaders and Superstars, but everyone’s scores will likely increase. We’ll need to keep “moving the goal posts.” Otherwise the definition of a Superstar organization will be meaningless.

The reason this is a modeling tool and not an analysis report is that anyone can adjust the weights and rerun the calculations instantly. Here I’ve increased the points that organizations earn for raising money relative to the points they earn for listening to community members and responding to donors:

fundraising_focused_points_model

This weighting would run contrary to our mission. So obviously, we’re not doing that. But we also don’t want to impose rules that would discount the efforts organizations have made to become Superstars under the old rules.

So I created another visualization of the model that counts up gainers and losers and puts them into a contingency table. Here, two models are shown side by side. Red boxes represent the number of organizations that are either going to move up or down a level in each model:

status_change_table
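
Once each organization’s status has been computed under both rule sets, a table like this is essentially a one-liner. The DataFrame below uses made-up statuses purely to show the mechanics.

```python
import pandas as pd

# One row per organization with its level under the old and new rules;
# the statuses here are fabricated examples, not our real results.
df = pd.DataFrame({
    "old_status": ["Partner", "Partner", "Leader", "Superstar", "Superstar"],
    "new_status": ["Leader", "Partner", "Leader", "Leader", "Superstar"],
})

print(pd.crosstab(df["old_status"], df["new_status"]))

# How many organizations would drop a level under the new rules?
rank = {"Partner": 0, "Leader": 1, "Superstar": 2}
print((df["new_status"].map(rank) < df["old_status"].map(rank)).sum())
```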

We’d like to minimize disruption during the transition. That means getting the number of Superstars that would drop to Partner as close to zero as possible. It also means giving everybody advance warning and clear instructions on how to demonstrate their learning quickly, so that they don’t drop status as the model predicts. (We’ve talked this over with representatives from our Project Leader Leadership Council to get ideas about how to best do this.)

This is a balancing act. Our definition of a Learning Organization is evolving because our measurements are getting more refined, but we acknowledge they are a work in progress. We seek feedback at every step so that what we build together serves the community writ large, and not just what we think is best.

We’ll share more about the launch of our GG Rewards platform next week. This post is just the story of how we used data and feedback to get where we are. Here are a few lessons of what we’ve learned along the way:

Lessons:

  • Fairness: It is mathematically impossible to make everybody happy when we start tracking learning behavior and rewarding it.
  • Meritocracy: We will need to keep changing the definition of Superstar organizations as all organizations demonstrate their learning, or else it will be meaningless. The best organizations would be indistinguishable from average ones.
  • Crowdsourcing: The only fair way to set the boundaries of Partner, Leader, and Superstar is to crowdsource the decision to our community, and repeat this every year.
  • Defined impact: We can measure the influence of our system on organizational behavior by comparing what the model predicts with what actually happens. We define success as seeing every organization earn more points each year than in the previous year, and as seeing a normal distribution (a bell curve) of overall scores.
  • Honest measurement: I was surprised to realize that without penalties for poor performance, it is impossible to see what makes an organization great.
  • Iterative benchmarking: We must reset the bar for Leader and Superstar status each year if we want it to mean anything.
  • Community: We predict that by allowing everyone a say in how reward levels are defined, more people will buy into the new system.
  • Information is Power: By creating an interactive model to understand what might happen and combining it with feedback from a community, we are shifting away from what could be contentious and toward what could inspire a stronger community.

We were inspired by what others at the World Bank and J-PAL did to give citizens more health choices in Uganda. What the “information is power” paper finds is that giving people a chance to speak up alone doesn’t yield better programs (the participatory approach). Neither does giving them information about the program alone (the transparency approach). What improves outcomes is a combination of a specific kind of information and true agency – the power to change the very thing about a program that, based on their interpretation of the data, they believe isn’t working.

The model I built can help each citizen of the GlobalGiving community see how a rule affects everyone else, and hence understand the implications of their choice, as well as predict how they will fare. If we infuse this information into a conversation about what the thresholds for Partner, Leader, Superstar ought to be each year (e.g. how much learning is enough?), this will put us in the “information is power” sweet spot – a rewards paradigm that maximizes organizational learning and capacity for the greatest number of our partners.

I predict that giving others this power (to predict and to set standards) will lead to a fairer set of rules for how learning is measured and rewards doled out. It ain’t easy, but it is worthy of the effort.

Using Data to Drive Donations: key findings from our work with DataKind

By Alison Carlman, in partnership with Miriam Young from DataKind

Recently we worked with DataKind to analyze project data from our website to learn what our nonprofit partners can do to maximize their potential for donations.

A project page on GlobalGiving.org

 

If you’ve ever visited more than a few pages on GlobalGiving, you’ll know that our project pages are the main hub of all fundraising activity on the web platform. Project pages are where organizations describe their needs and give their best pitch to attract potential donors. We recently worked with a team of DataKind volunteers to analyze our data, helping us identify what impacts a nonprofit’s fundraising success.

How can organizations maximize their donations on GlobalGiving?

We already use data to drive our work (after all, our chief core value is Listen, Act, Learn. Repeat.), but we wanted to go deeper using data science (and some excellent data scientists) to uncover what leads to nonprofits successfully reaching their fundraising goals. We hope to use this information as we refine our search algorithm to help donors find projects they’re most interested in and also help nonprofits maximize their ability to attract donors.

Data science uses statistical and computational analysis to turn unwieldy amounts of data into actionable information to guide organizational decision making. Think of the many online services you use like LinkedIn, Netflix, or Amazon. These companies already use data generated by users on their sites to better serve their customers – making recommendations to help you use their services more effectively. We’re doing the same thing, using the same data science techniques that companies use to boost profits to advance our mission.

We first participated in a DataKind weekend DataDive, supported by Teradata, last October to do initial analysis of our project data to determine what factors led to projects being successfully funded. The team then handed off its findings to another team of DataKind volunteers – Jon Roberts, Ana Areias, Tim Rich, and Nate MacNamara – for a multi-month project to uncover insights about donor behavior that would help optimize our search ranking algorithm.

So what do nonprofits that fundraise successfully on GlobalGiving have in common? Many things: they get high traffic on their project pages, they have a strong social media presence, and they have a broad base of followers outside GlobalGiving. We wanted to home in on the component we could influence the most – the project page. Improving the project page itself with even minor tweaks, or providing nonprofits with tips backed by data, can have a huge impact on fundraising success over time.

The DataKind volunteer team worked closely with our tech team to analyze which aspects of the project page led to higher conversion rates for donors. Looking at data from more than 4,000 project pages that had at least 100 visitors each, the volunteers looked for patterns and useful insights that could help us guide partners on best practices for maximizing donations.
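
The framing of that analysis (filter to pages with enough traffic, compute a conversion rate, and compare it across page features) looks roughly like the sketch below. The file name and columns are assumptions made for illustration, not GlobalGiving’s actual schema.

```python
import pandas as pd

# One row per project page; file and column names are assumptions:
#   visitors, donors, has_cta (0/1), summary_words, goal
pages = pd.read_csv("project_pages.csv")

pages = pages[pages["visitors"] >= 100].copy()           # keep pages with at least 100 visitors
pages["conversion_rate"] = pages["donors"] / pages["visitors"]

# Example cut: conversion with vs. without a call-to-action in the summary
print(pages.groupby("has_cta")["conversion_rate"].mean())
```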

Key learnings

The DataKind team looked at a variety of features of the project page, including project title, funding amount, number of donors, photos, length and content of project summaries. What impact, if any, did these things have on the project reaching its funding goal? The team found a few factors that had a clear influence on a project’s conversion or donation rate:

1. A “call-to-action” in the project summary 

There is a 14% higher conversion rate for projects that included a call-to-action in the project summary. Surprisingly, however, putting a call-to-action in the project title did not appear to make an impact on a project’s conversion rate. Titles may be important for getting traffic to a project, but it appears the project summary is king when it comes to inspiring people to give on GlobalGiving.

2. Longer project summaries (30-35 words)

Going against the traditional wisdom that short and sweet is always best, the team found that a project’s conversion rate actually increased with project summary length – up to a point. The sweet spot is 30-35 words; summaries longer than that encountered diminishing returns.

3. Specific language

At the DataDive, volunteers did text analysis of various project pages and found a correlation between specificity of language and a nonprofit’s project fundraising success. For example, nonprofits raised less money when they used generic wording like funding for “the arts” versus a specific project like “a photography exhibit.”

4. Higher fundraising goals ($25,000-$50,000)

Fundraising goals of $25,000-$50,000 seem to hit a sweet spot correlated with increased conversion rates. This implies organizations should set their project goal in this range where possible and, if more funds are needed, launch a second project in the same range instead of simply increasing the original project’s requested amount.
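
Findings 2 and 4 are both “sweet spot” results, the kind that typically come from binning a variable and comparing conversion rates across the bins. A rough sketch of that approach, reusing the same hypothetical columns as above:

```python
import pandas as pd

pages = pd.read_csv("project_pages.csv")                  # same illustrative columns as before
pages = pages[pages["visitors"] >= 100].copy()
pages["conversion_rate"] = pages["donors"] / pages["visitors"]

# Finding 2: conversion rate by project-summary length
length_bins = pd.cut(pages["summary_words"], [0, 15, 25, 30, 35, 45, 200])
print(pages.groupby(length_bins, observed=True)["conversion_rate"].mean())

# Finding 4: conversion rate by fundraising goal
goal_bins = pd.cut(pages["goal"], [0, 10_000, 25_000, 50_000, 100_000, 1_000_000])
print(pages.groupby(goal_bins, observed=True)["conversion_rate"].mean())
```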

Now, as any good stats student knows, correlation is not causation. All of these findings were based on inferential analysis of GlobalGiving’s existing data, which means we don’t know if these factors actually caused increased conversion rates. Nevertheless, the findings offer powerful information for our team to experiment with as we make recommendations for our partners going forward.

Start your journey

This project might also get you thinking about what hidden learnings are in your data. Data is everywhere. Your organization may have a web platform like ours where you’re constantly generating data, or other sources like program intake forms, surveys, or social media analytics. And don’t forget the wide range of publicly available data provided by government agencies and others that can shed light on how your organization can maximize its impact.

If you’re interested in learning how your organization can tap the power of data science to improve your efforts, check out NTEN’s Data Community of Practice, Data Analysts for Social Good or reach out to the DataKind team at contact@datakind.org for advice on how to get started. If you think a data science project might help you scale your work, apply on the DataKind website for support!

All data science journeys begin with a question. What question will help your organization move the needle on the issue you care most about? DataKind would love to help you answer it.

The Popup Experiment: Effects from Unexpected Places

Part 1: Testing Upworthy’s Technique

I spend a lot of time on the Internet and I’m always looking for inspiration. In late 2013 I clicked a link to an article on Upworthy, a site that’s always at the forefront of testing stickiness and psychological hooks. When the page loaded I was presented with this statement:

Upworthy's Modal Dialog

After clicking “I Agree” (because who wouldn’t?), I was asked to sign up for the site’s mailing list.

This type of interface element is called a “modal dialog,” and I generally find them to be annoying, distracting, and intrusive. In this case however, the dialog made me smile, and even got me to click just to see what would happen.

That experience stuck with me and over the next few weeks I designed a way to take advantage of a similar psychological cue on GlobalGiving.

Our initial experiment (a minimum viable product)

We’re always looking for ways to build our newsletter list, and I saw an opportunity to let visitors to our nonprofit partners’ project pages subscribe to their quarterly email updates. I liked the fact that I was giving users a way to express their interest in and support for the project even if they weren’t ready to give monetarily.

Because of my dislike for modal dialogs (unless the message is critical), I looked for a way to catch the user’s attention without monopolizing it. Eventually I decided to have a small, unobtrusive box slide out from the page’s lower corner a few seconds after the page loaded and make a similar appeal. Here’s my first attempt:

This project is doing great work

If the user chose to click “I Agree,” I offered them the opportunity to sign up for the project’s email updates and for GlobalGiving’s newsletter. I launched that to the site and saw a modest level of signups, so I knew I was onto something. But wanting to make the most of the opportunity, and doing my best to live our “Never Settle” core value, I decided to try to maximize the signup rate by testing a few of the assumptions I had made.

Experimental Variables

Variable 1 (Teaser): The point of this experiment was to catch people’s attention with friendly language and a statement nobody could possibly disagree with, then make the “big” ask of signing up for our mailing lists (using what’s known as the Foot-in-the-Door technique). Hopefully, by first agreeing with the modest statement, the user would feel more inclined to sign up. But would people feel misled by this? Was I undermining their trust by posing a seemingly harmless question, then asking them for their email address? Would people not understand that they were being offered the option to sign up for the mailing list, and ignore the coy initial question completely? I decided to test dropping the pretense and just offering the user the ability to sign up for the mailing lists directly.

Variable 2 (Language): Relatedly, how would the playful language affect the user’s understanding of the offer? Would they be more likely to sign up if the language were more straightforward? I decided to test the language above against the direct but rather flat “Sign up for updates about this project?” followed by the options “OK” and “No.”

Variable 3 (Timing): Finally, the duration of time I had chosen to wait before showing the offer was more or less arbitrary. I didn’t want the offer to pop up immediately, in order to give the user some time to digest the information they wanted in the first place, but could the time be too short, causing the offer to surprise, jar, or overwhelm the user? Could it be too long so that the user would already have left, or decided to take another action? I decided to test pauses of four and eight seconds before showing the offer.

Table defining the eight experimental cells

I divided our audience into eight segments, and showed each segment a unique combination of those three variables. Over the course of about a month, we showed the offer nearly 80,000 times, and gained more than 1000 new subscribers.
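
Those eight segments are simply every combination of the three two-level variables. Here is a small sketch of how the cells could be enumerated and visitors bucketed into them; the hash-based assignment is an illustration only, since the real experiment randomized visitors through our A/B testing framework.

```python
from hashlib import md5
from itertools import product

# 2 x 2 x 2 factorial design: every combination of the three variables
CELLS = list(product(
    ("teaser", "no_teaser"),   # Variable 1: show the foot-in-the-door question first?
    ("playful", "direct"),     # Variable 2: wording of the ask
    (4, 8),                    # Variable 3: seconds before the box slides out
))

def assign_cell(visitor_id: str):
    """Deterministically bucket a visitor into one of the eight cells.

    Hash-based bucketing is shown for illustration; the actual experiment
    assigned visitors through an A/B testing tool.
    """
    return CELLS[int(md5(visitor_id.encode()).hexdigest(), 16) % len(CELLS)]

print(assign_cell("visitor-12345"))
```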

Results

Analyzing the results, I learned that more people signed up for the mailing lists if they saw the “teaser” first; the Foot-in-the-Door technique worked! That was the only statistically significant result the experiment produced; waiting eight seconds slightly outperformed waiting for four, while the two different wordings were nearly a statistical toss-up.

Showing the teaser question increased the conversion rate from 1.10% to 1.57%
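
With roughly 80,000 impressions split across the arms, a gap between 1.10% and 1.57% sits far outside the range of chance. The counts below are approximations reconstructed from the round numbers in this post, not the exact experimental data.

```python
from statsmodels.stats.proportion import proportions_ztest

# Approximate counts: ~40,000 impressions per side at the reported signup rates.
signups = [628, 440]           # teaser (~1.57%), no teaser (~1.10%)
shown = [40_000, 40_000]

z_stat, p_value = proportions_ztest(signups, shown)
print(f"z = {z_stat:.2f}, p = {p_value:.2g}")   # a tiny p-value: the teaser effect is real
```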

My team and I decided to make the offer permanent with “This project is doing great work” showing up after eight seconds.

Part 2: Tweaking the results

A few months went by, and our communications manager came to me with two observations about the offer. First, she explained that many of our website visitors are learning about these projects and their work for the first time, so the user would not be equipped to weigh in on whether or not the project is doing “great work.” This might make them less likely to respond to the prompt. Second, the statement parses oddly: the projects themselves aren’t doing any work at all, it’s the people that work at our nonprofit partner organizations who do the work.

She suggested an alternative wording that would remedy these issues: “This project is important.” We fired up another test.

There was no statistical difference in the signup rates between the cells with the two language variations after a month…or two months…or three months. After three months and 300,000 views, the two wordings were at a statistical dead heat, so we ended the experiment. We decided to stick with the “important” language if for nothing else than better grammar.

Even failed experiments (those without significant results) can result in learning, so I was prepared to accept these results and go forward with the knowledge that neither a project doing “great work” nor being “important” was more persuasive to our users in terms of convincing them to sign up for a mailing list. Perhaps there is other wording that would be more persuasive; perhaps there is a more effective UI treatment. Opportunities for further experimentation abound.

Part 3: The twist ending

And that’s where things would have ended, except there were other effects to consider. Newsletter sign-ups are not the primary goal for our partners’ project pages; ultimately, we want to help our partners receive donations. I had an inkling that our intervention might have some effect on users’ donation rates, so I compared the funds raised in each of the two cells, and the results speak for themselves:

Revenue increased 9.5%

Sure enough, users who are asked whether or not they agree that a project is “important” donate nearly 10% more money than those that are asked if they agree that it is “doing great work!” This was a statistically significant result.

I walked away from this experience happy that I found a way to increase donations, and humbly surprised that the biggest gain came from a source I hadn’t even considered. This experiment, well over a year in the making, serves as a reminder that by continually testing we can be continually improving, and to always remain open to positive effects from unexpected places. Plus it’s always a good idea to use correct grammar.