If You Think All Bullet Lists Are Created Equal, This Data Will Change Your Mind

Do you have a list of bullet points located anywhere on your site?

Or perhaps a list of product or service features?

Better yet, a prominent list of benefits that help sell your wares?

When did you last work at improving your list of bullets – at refining it for maximum readability and persuasiveness?

A while back? Never?

I’m here to recommend spending some time working on your bullets… whether they’re a list of benefits… of reasons to believe… of key differentiators. No matter. It’s worth it if you want people to read them and take action.

(Psst… despite what copy and CRO gurus say… just because you create a list of bullets doesn’t mean people will read ‘em – scannable or not.)

The Bullet List A/B/n Test

The awesome team at Precision Nutrition (PN) wanted to test one of their more prominent bullet lists to see how it might impact the number of people who joined their Nutrition Certification Program presale list. (Over the course of the last 3 months, I’ve worked with PN to optimize many of their key landing pages. They’ve got a great testing culture.)

This particular test was on a page that gives interested fitness pros the ability to get early notification when the certification program opens for registration (there are 2 intakes per year):

[Image: DEFAULT LANDING PAGE]

We’d experimented with those big blue ‘reasons to believe’, as well as the sign-up form, and managed to produce some decent gains.

Two elements that we hadn’t yet touched were (1) the headline and (2) the bullets mid-way down the page.

After reviewing the page analytics and our timeline, we decided to run a 6-way split test, with the following creative:

Default landing page
VARIATION 1: Default bullets + new headline #1
VARIATION 2: Default bullets + new headline #2
VARIATION 3: Optimized bullets + default headline
VARIATION 4: Optimized bullets + new headline #1
VARIATION 5: Optimized bullets + new headline #2

The 6-way test enabled us to reliably isolate the headline changes from the bullet changes. You could set up a multivariate test to do the same, but I usually gravitate to A/B/C/D…/n tests due to their simplicity.
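
To make the isolation concrete, here’s a minimal sketch in Python – the visitor and conversion counts (and helper names) are made up for illustration, not our real data – showing how the 6 cells form a 2 × 3 grid (bullets × headline) that you can collapse by either factor:

    # The 6 cells form a 2x3 grid: bullet treatment x headline.
    # Visitor/conversion counts are made up for illustration only.
    cells = {
        ("default bullets",   "default headline"): (1000, 283),
        ("default bullets",   "new headline 1"):   (1000, 291),
        ("default bullets",   "new headline 2"):   (1000, 287),
        ("optimized bullets", "default headline"): (1000, 349),
        ("optimized bullets", "new headline 1"):   (1000, 354),
        ("optimized bullets", "new headline 2"):   (1000, 351),
    }

    def rate(pairs):
        visitors = sum(v for v, _ in pairs)
        conversions = sum(c for _, c in pairs)
        return conversions / visitors

    # Collapse across headlines to read the bullet effect on its own...
    for b in ("default bullets", "optimized bullets"):
        print(b, f"{rate([cells[k] for k in cells if k[0] == b]):.1%}")

    # ...and collapse across bullets to read the headline effect.
    for h in ("default headline", "new headline 1", "new headline 2"):
        print(h, f"{rate([cells[k] for k in cells if k[1] == h]):.1%}")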

To be clear, the majority of the page stayed the same; we preserved the logos, the 3-column reasons to believe, and the form. The headline and bullets changed. Here are the alternate variations:

[Image: VARIATION 1]

[Image: VARIATION 2]

The 2 new headlines were focused on program benefits. Our hypothesis was that a benefit-driven headline would persuade more visitors to submit their email address.

The remaining treatments paired the headlines from the default page and the first 2 variations with a single-column, copy-optimized list of replacement bullets:

[Image: VARIATION 3]

[Image: VARIATION 4]

[Image: VARIATION 5]

What’s different about the bullets?

  1. We reduced the number of bullets from 9 to 6
  2. We began each bullet with a graphical checkmark
  3. We placed the strongest benefits in the first and last bullets
  4. We improved the sight-line by going from 3 columns to 1
  5. We made the copy for each bullet more concise

So what’s YOUR personal take on how we treated the bullets? What else might you have tried?

The Results of the Bullet List + Headline Test

This test was set up in Optimizely and exposed to 100% of visitors, with equal traffic distribution across the 6 pages. The single success metric for this test was email address submissions.

About a week after launching, we saw 2 under-performing variations and “throttled down” traffic to them. Shortly thereafter, we paused those same 2 variations in Optimizely to allow more traffic to be exposed to the high-performing creative.

And at the 9-day mark, we throttled down traffic to the default page to allow even more visitors to see (and convert on!) the winning recipes.

The test had been running for 12 days when I grabbed the screenshot of the results:

[Image: TEST RESULTS]

What’d we learn?

  • The bullet list improvements had a massive effect on visitor behavior. A 25% increase in opt-ins means thousands of additional email marketing opportunities per year for Precision Nutrition.
  • This visitor segment (fitness pros looking to be certified) is already a motivated bunch, but the new bullets amplified that motivation.
  • It was a “clean” set of results. You’ll notice that the 3 winning recipes all include the new layout for the bullets.
  • The headlines appear to have had little impact on these visitors: the spread in conversion rate from the lowest- to the highest-performing winning recipe was just 3.3%. Time to test more headlines!

Would you make any additional observations about the results?
Leave them in the comments below…

Finally, there are some things to be learned from how the Precision Nutrition team looks at testing:

  1. They understand the tremendous business value of good copy!
  2. They are willing to test messaging, layouts, and page functionality that sometimes make them uncomfortable
  3. They always want to iterate on tests… winners and losers
  4. They always roll out the winning creative to maximize their conversion rate
  5. They make creative resources available to jump on new ideas and experiments
  6. They are curious, customer-focused, very smart, and incredibly nice!

Those are 6 (well, technically 9) attributes I recommend you aspire to in your own business, particularly items 1 – 4 for your conversion optimization program.

Until next time!

~lance

Do your biz a favor: buy these ebooks, master copy fast, sell more

  • http://www.idlegrad.com Mike

    Interesting post Lance.

    I would be curious how mobile played into this. A single column seems to me to be the way to go for mobile landing pages. Did Optimizely segment that out at all?

  • Mark Hall

    Great stuff, Lance! Very useful takeaways. I’ll be curious to know if a) other headline variations and b) even fewer bullets (I’ve heard the “use a max of 5 bullets” maxim) further optimize the conversions. Cheers, Mark

    • http://www.userhue.com/ Lance Jones

      Thanks Mark! Your suggestions are on the testing roadmap, and I’ll be sure to share results if we learn something interesting!

  • http://www.toppingtwo.com/ Lance Jones

    Hey Beatrix — we intentionally designed that pop-up thingy such that you can’t select it as “annoying”. :-) It’s actually a new user feedback tool that we designed and built, and we’re currently using it to “eat our own dog food” on copyhackers.com. Feel free to send me any feedback you have about your experience… lance at copy hackers dot com.

  • Robert Campbell

    A lift is a lift, no matter how small. Besides, the new bullet treatment just looks better. That has to count for something. BTW, this is Lance’s post, but the pop-up feedback banner at the bottom is from Joanna. Are you two checking up on each other?

    • http://www.toppingtwo.com/ Lance Jones

      We thought putting Joanna’s name on the feedback request would get a higher response. :-) It’s a new content feedback tool that we designed and developed — and we just launched it on copyhackers.com yesterday!

  • http://www.theconversionstrategist.com/ Yassin Shaar

    Great post

    I would test adding a testimonial of a previous client on how taking the certification enhanced his/her credibility in the market.

    • http://www.toppingtwo.com/ Lance Jones

      Great suggestion, Yassin. We’ll definitely give that a try on our next round.

  • http://www.decalmarketing.com/adwords-book/ Iain Dooley

    I have a personal vendetta against 3 columns of text on landing pages … to me they look like an instruction to the “human computer” to skip that bit of the page and keep reading: http://glui.me/?i=l8bk1cp1cn4o726/2014-02-26_at_9.22_AM.png/ Designers I work with seem to love them, though. Is it still a pathological fear of scrolling? It’s always seemed to me that the most logical thing to do is keep someone scrolling, moving down the page until they get to your big ass button and do your bidding. Great to see this test as a concept – thanks for posting.

    • Joanna Wiebe

      Totally! If you can start them scrolling, you can keep them scrolling. Why try to cram everything into a little section of what could be a long, lovely, well-read page??

  • http://www.makementionmedia.com/ Jen Havice

    Hmm… Had no idea CRO could get so heated. The comments are almost as interesting as the post.

    Anyway, I really like what you all did with the single-column bullet points and the snazzy blue check marks. As we’re increasingly tasked with finding ways to present information and get it read – with everyone on overload – ideas on fresh alternatives are always welcome.

    Website visitors are becoming more and more discriminating while being less and less tolerant of sifting through a site to get what they want. And, too many business bottom lines are suffering for it. Being able to see how even simple changes can make a dramatic difference may get more people to understand how important this kind of process is.

    So, thanks for sharing your ongoing results. Always gives me new ideas to noodle on… and ways to improve on my copy.

    • http://www.toppingtwo.com/ Lance Jones

      :-) Hopefully not too heated. Some passion for sure.

      We want to share what we’re seeing… and no matter where you stand on the “future projectability of results” issue, you can learn from running a test like this one.

      Of course we hope you’ll let the test run as long as possible. But we also know that leaving a test open for an entire year is not realistic for many companies (and often not even feasible).

      The stats are not fail-safe, either. I have run tests for a full year (at Intuit, for TurboTax) and seen error margins below 1% in December move back up to 3% in March as the motivation of visitors shifts (with the tax deadline looming). It is impossible for any testing tool to account for such a shift.

      Of course, if you can’t rely 100% on the tools or the math, then what’s the alternative? If you’re truly a purist, then you wouldn’t run any tests. But is that realistic? Or desirable? Doubtful.

      Thanks, Jen!

      • http://www.makementionmedia.com/ Jen Havice

        At some point, if you’re seeing positive results with whatever your goals are – sign-ups, trials, purchases – that you can attribute to the changes, does it really matter? I realize you want to feel certain about what’s effecting change and that you’ve given the testing long enough, but as you said, that might not always be the realistic choice. Pragmatism will only get you so far. Although, I guess it depends on who you talk to.

  • http://copygrad.com/ Will Hoekenga

    Lance and Joanna,

    Love seeing these (in progress) results and the thought process behind it all! But what I really like about this post is the bit at the end about being a great client. You should strongly consider writing a whole post on that topic.

    It’s incredible that most people do not realize how beneficial it can be for ALL involved when you know how to be a great client.

    A mentor of mine who works in the publishing biz likes to say that, although he is technically the publisher’s client, he treats them like THEY are HIS client. Wouldn’t ya know that they just seem to treat him better, respond quicker, and work harder for him than a lot of their other clients? Funny, right? :-)

    • Joanna Wiebe

      Amen, Will! We are super-blessed to have clients that are insanely wonderful to work with, with Precision Nutrition being an awesome example. (And Neil Patel being another. And Dave and Amy at Positive Parenting Solutions. And all our past clients. And and and and.) Maybe we should do a post on that!

      • http://copygrad.com/ Will Hoekenga

        Very cool to hear y’all have had such good experiences with so many people. +1 to my faith in humanity for the day. Would be interesting to hear both their perspective and your perspective on what goes into building a great relationship.

  • http://www.toppingtwo.com/ Lance Jones

    Hey Daniel, pausing a variation mid-stream in Optimizely – when you’re using redirects to display alternate variations – means the paused variation/creative will no longer be shown to visitors.

    • danielgonzalez

      Are ONLY new visitors allowed to enter the test to begin with?

      • http://www.toppingtwo.com/ Lance Jones

        New visitors to what? The site? The page? :-)

        Remember that every visitor is considered “new” by your testing tool when you first install the code snippet on your site. Every single visitor.

        And if any visitor removes her cookies, she is considered new again. So “first time/new” versus “repeat” visitor is already tremendously flawed as far as testing tools go.

        The page we’re testing here really only has “new visitors” (not necessarily as assessed by the testing tool), because once you decide to opt-in, you’ll likely never return to this page. If you choose not to opt-in, you could return, but we’re talking about very small numbers (and the certification program only runs twice per year).

        With every test there is going to be some “noise,” but this test has over 2,000 conversions, so the signal-to-noise ratio is very high.

      • danielgonzalez

        You’re a ninja Lance :)

      • Craig Sullivan

        There is a further twist to this, although as Lance states – in this example – you’re likely not to return to this page.

        It’s worth mentioning an effect I’ve seen in tests that Lance alludes to – the snipping of datapoints at the start and the end of any test period.

        On an e-commerce site, for example, you start a new landing page test. All the people who saw page A last week – when they return, 50% of them are going to see a new one. There is always this residual effect at work – visitors arriving before (and after) the experiment start time. Of course it affects the outcome – one group seeing what they saw before, and another seeing the new page! That’s something which drives additional noise into the start of testing, so be very aware of the repeat-visit cycle and how long it takes for your path to purchase to complete.

        Now the second part – when you finish testing, you’re cutting off people who’ve been exposed to the experiment but didn’t get a chance to convert in the fullness of time. If your average purchase cycle is 6 weeks and you run a test for two, you’re cutting out a chunk of valuable data.

        Sometimes figuring out confidence, error bars, and sample sizes isn’t enough – you need to be aware of the start- and end-of-test clipping issue… At least figure out your purchase cycle distribution, for example.

      • danielgonzalez

        Craig Sullivan,

        Yeah, I sort of get around that by tapping the Google Analytics cookie in VWO; it looks like this: http://www.screencast.com/t/pkuW1USjM0 – that allows me to admit only new visitors according to the GA cookie.

        And then, when I wind a test down, I close it to new visitors, allowing it to run for returning visitors only, for as long as is necessary based on the time to purchase report in GA.

      • http://www.toppingtwo.com/ Lance Jones

        An excellent point, Craig. And interestingly, one that I believe most marketers won’t factor into their test plans. I oversee 12 senior testing consultants at Adobe, and roughly 130 clients, and this is the type of stuff that prompts clients to ask why it has to be so complicated… as their eyes glaze over. That doesn’t mean we shouldn’t acknowledge the pitfalls of testing, I know…

      • Craig Sullivan

        Lance – maybe this explains why some companies have resources that go after dormant or low-testing client accounts, to try and wake them back up into testing again.

        If you’re giving them things that make their eyes glaze over or contain ambiguities in the inferences that can be drawn, you’ve lost a battle and maybe the entire war in that second. You’ve got to be the bridge to clarity and simplicity for the client, regardless of the underlying complexity.

        A perfect example is when you say “You’ll notice that the 3 winning recipes all include the new layout for the bullets.” – that’s confirmation bias (or just a hypothesis, still) – it’s not a proof based on that data. Injecting that into a client’s brain will make them think “change all bullet styles to the new one, across the entire site” when in fact the test data did not say this.

        The problem is if we replace simplicity and clarity with ambiguity, heavy statistics and confirmation bias – the clients will lose faith, the tests will not teach the client true ‘insights’ and the directionality and performance of tests will suffer. What then happens is the results suck and people stop testing.

        C.

  • http://conversionxl.com/ Peep Laja

    Guys – bad science here. Look at the error margins. You don’t have a winner here.

    Original: 28.3% +/- 3.59%. So it could be 31.89%.
    Default bullets: 31.1% +/- 4.69%. So it could be 35.79% (winner?)
    And your “winner” is 35.4% +/- 2.84%, so it could also be 32.56%.

    Even 31.89 vs 32.56 is such a small difference that you’d need to keep the test running for at least another week (3 full weeks total).

    This test is not cooked, keep at it.

    • http://www.toppingtwo.com/ Lance Jones

      Hey Peep, the test is actually still running… we’re on Day 18 now, and those numbers have held steady (actually improving slightly) while the error margins have narrowed.

      • http://conversionxl.com/ Peep Laja

        Great! So why post half-cooked results to begin with? We should preach good science and not give people the wrong idea that they can call tests early, the mother of all optimization fuck ups.

      • Joanna Wiebe

        I dunno – I take away from all of our posts that you should always be testing… not that you must do this one thing. So even if we’re showing super-strong trending winners, it’s great for feeding ideas to our readers about tests. (And for PN, Lance is actually seeing massive increases on opt-ins on the back end. So we have a lot of supporting data, but we can only show so much.)

        To poke you back a bit: when’s the last time you posted a case study of your own test, Peep? Our site’s filled with them. :) Can’t find many on yours…

      • http://www.toppingtwo.com/ Lance Jones

        I’m not much on preaching — and I’ve been doing this for a very long time, Peep (since the Offermatica days).

        They’re not half-cooked results, either. The error margins indicate the expected range of possible conversion rates in the future for this creative. But for the period of this test, we have in fact raised the number of opt-ins for PN by 25% on the leading treatment.

        This test result is based on 2000+ conversions, too. How many tests do you see run with those kinds of numbers (or more) before they’re called? I suspect not many…

        Those error bars may never move to <1%, but PN is willing to bet — as am I — that the difference in the presentation of those bullets is real. That may make us bad scientists in your book, but the business results (not just Optimizely reported results) for their optimization program strongly suggest otherwise.
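
        For anyone who wants to check the math on those +/- figures: they’re consistent with a simple 95% normal-approximation interval. A quick sketch in Python – the per-variation visitor counts below are guesses, since we haven’t published the raw sample sizes:

            # 95% normal-approximation ("Wald") interval for a conversion rate.
            # Visitor counts are guesses -- illustration only.
            import math

            def interval(conversions, visitors, z=1.96):
                p = conversions / visitors
                margin = z * math.sqrt(p * (1 - p) / visitors)
                return p, margin

            for label, conv, n in [("default", 170, 600), ("winner", 390, 1100)]:
                p, m = interval(conv, n)
                print(f"{label}: {p:.1%} +/- {m:.2%} ({p - m:.1%} to {p + m:.1%})")
            # -> roughly 28.3% +/- 3.61% and 35.5% +/- 2.83%,
            #    close to the figures Peep quotes above.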

      • Craig Sullivan

        One last thing Lance – beware of altering the test composition as you’re running it. I always weeded out low performers in waves (by stopping all tests and restarting).

        What happens is that if you ramp up the traffic percentages in a test tool, or disable one or more variants as it’s running, you’re altering the proportions of the test. There’s a great explanation of it in this paper – but the best advice is: when you’re altering the composition of traffic in the test by removing variants or adjusting traffic percentages, beware of Simpson’s Paradox:

        http://robotics.stanford.edu/users/ronnyk/2010-12ExPUnexpectedSIGKDD.pdf
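
        A toy illustration of the flip (made-up numbers, nothing to do with the PN data): B beats A inside each period, but because B’s traffic share shifts toward the weaker period, the pooled totals reverse:

            # Toy Simpson's Paradox demo -- made-up numbers, not PN's data.
            #                 (visitors, conversions)
            week1 = {"A": (900, 360), "B": (100, 45)}   # A 40%, B 45% -- B wins
            week2 = {"A": (100, 10),  "B": (900, 108)}  # A 10%, B 12% -- B wins
            for v in ("A", "B"):
                n = week1[v][0] + week2[v][0]
                c = week1[v][1] + week2[v][1]
                print(v, f"pooled: {c / n:.1%}")
            # Pooled: A 37.0%, B 15.3% -- A "wins" overall, purely because
            # B's traffic mix shifted toward the weaker week.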

      • Craig Sullivan

        Interesting discussion, and you’re all correct, in different ways. Excuse me being long-winded, patronising and boring – I’m just trying to explain some shizzle.

        Peep is right to point out that your probability fuzziness is overlapping. The numbers are a guess at the range within which the true value might lie, and the testing software always shows a nice sweet point on a graph.

        However, it’s not a precise point – it’s a bell shaped curve probability region.

        When you see 3.5 +/- 0.1 it doesn’t quite mean that it could be 3.6 or 3.4 – it means it’s more likely to be in the middle (3.5), than out towards the edge but it’s *possible*, just less so the further you get from the middle.

        I hate tests where this happens – because depending on how the results overlap, it means you’re guessing about part or all of the result set. If there is a sizeable overlap, it means it’s entirely possible that the results are winners for any of the overlapping ranges.

        If just the edges of the intervals overlap (like 3.5 +/- 0.4 and 2.7 +/- 0.4), then it’s possible that the 2.7 one is the winner, just not as highly probable as the 3.5 one. With me so far?

        OK – so if the ranges overlap lots (like 3.5 +/-0.2 and 3.4 +/- 0.2) then the bell shaped curve massively increases the likelihood that any ‘winner’ you draw from the experiment is not a result, just a guess and could turn out to be wrong.

        You run the experiment and check the results every day and get a different ‘winner’. If you forgot to check the data movements, you might drop by and sample the winner occasionally. It could be any one of them if they’re still overlapping. Each time you look, the ordering might be different.

        This is why when one tests – you should always go for something that narrows in on the stuff that is shifting behaviour without overlaps. If I use a red button and that wins over a green button (and these two overlap heavily) – it just means there’s not enough difference to be sure. It does not mean that red works better. This is the problem with any inference drawn from overlapping stuff. We get drawn to the contents we put in the test and look for confirmation from the data. Uh oh. Bad.
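
        One way past the eyeballing: test the difference directly. A rough two-proportion z-test sketch (invented counts) – and note that two 95% intervals can overlap a little while the difference between them is still significant, which is exactly why the direct test beats comparing intervals by eye:

            # Rough two-proportion z-test sketch -- invented counts.
            import math

            def two_prop_z(c1, n1, c2, n2):
                p1, p2 = c1 / n1, c2 / n2
                pooled = (c1 + c2) / (n1 + n2)
                se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
                return (p1 - p2) / se

            z = two_prop_z(390, 1100, 170, 600)
            print(f"z = {z:.2f}")   # ~2.99; |z| > 1.96 is significant at 95%.
            # Heavy overlap between the two rates drags z toward 0.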

        When you have a lot of overlaps, you really end up testing marginal things. You’re doing what Google did with the shades of blue malarkey. If you keep the test running forever, every guess might turn out to be right along the way.

        And the worst bit is, if you pick an overlapping candidate as a winner, you may have picked the loser – next week’s data might have flipped things. If you pick an overlapping winner, it might go down in performance, leaving you to scramble to understand why or figure out what to do. I’ve done it for months on end and realised the futility of my stupidity in drawing useful stuff from overlapping results. You just end up chasing shadows, not being confident or really data-driven.

        As publishers of stuff, I’d just be cautious about showing test results where overlap is a big factor. It’s kinda hard to explain, and marketing folks take this at face value. It helps education if we don’t show results that lead some (less statistically minded) readers to irrevocable conclusions, when the result may not be showing this. I’m not maligning you here – just pointing out that the reader, the viewer, will make this leap, and it’s not your fault, just the way things are! There’s too much ambiguity lurking for the unwary in that results graph, for sure.

        Lance hits the nail on the head, though, with the entire point of the test – this is to be celebrated. You may not be sure of the top winner, but you’ve got more to iterate with than you had before. Opt-ins are up 25%. Rejoice! Onwards and upwards!

        Seriously though – why don’t we propose new test ideas rather than gnawing this one to death (craig says, quietly stepping away from a big bone)

        Lastly, what the fuck are you all still here arguing about things for – you’re all wasting valuable testing time (laughs). Go on – whittle down the candidates, inject new stuff and get testing again.

  • http://www.copy-cat.co/ Momoko Price

    +1 to spiffy checkmarks! I always recommend lovely, happy, bright checkmarks for all Johnson box lists and benefits lists :)

    • Joanna Wiebe

      +1 to your +1 — love me some checkmarks.

  • scott

    Interesting, but not really a fair conclusion, since you did not put the pizazz on the 3-column bullets – just on the centered version.

    • http://www.toppingtwo.com/ Lance Jones

      Hey Scott, you’ll recall that we actually changed 5 aspects of the original bullet list… including the layout and the copy itself. We’re not saying that everyone should go modify their bullet lists to be a single column. We’re saying that big gains can be achieved by making your bullets more compelling and ensuring that they get read. The best bullets in the world (copy-wise) won’t matter if people gloss over them…

  • http://webcopyservices.com/ Zafifi

    Great post. The bullet list in the treatment version is placed centrally, so visitors can take it in at a glance without having to scan from one side to the other as in the original version.

    • http://www.toppingtwo.com/ Lance Jones

      I think you’re right, Zafifi… with the new treatment, we almost force people to read down that single column of bullets. It requires very little work to read the list versus the default page’s 3 columns.