Tuesday, May 23, 2017

Please run for GNOME Board

You have two more days to announce your candidacy for the upcoming Board term.
Are you a member of the GNOME Foundation? Please consider running for Board.

Serving on the Board is a great way to contribute to GNOME, and it doesn't take a lot of your time. The GNOME Board of Directors meets every week via a one-hour phone conference to discuss various topics about the GNOME Foundation and GNOME. In addition, individual Board members may volunteer to take on actions from meetings—usually to follow up with someone who asked the Board for action, such as a funding request.

At least two current Board members have decided not to run again this year. (I am one of them.) So if you want to run for the GNOME Foundation Board of Directors, this is an excellent opportunity!

If you are planning on running for the Board, please be aware that the Board meets 2 days before GUADEC begins to do a formal handoff, plan for the upcoming year, and meet with the Advisory Board. GUADEC 2017 is 28 July to 2 August in Manchester, UK. If elected, you should plan on attending meetings this year on 26 and 27 July in Manchester, UK.

To announce your candidacy, just send an email to foundation-announce that gives your name, your affiliation (who you work for), and a few sentences about your background and interest in serving on the Board.

Friday, May 19, 2017

Can't make GUADEC this year

This year, the GNOME Users And Developers European Conference (GUADEC) will be hosted in beautiful Manchester, UK between 28th July and 2nd August. Unfortunately, I can't make it. I missed last year, too. The timing is not great for me.

I work in local government, and just like last year, GUADEC falls during our budget time at the county. Our county budget is set every two years. That means during an "on" year, we make our budget proposals for the next two years. In the "off" year, we share a budget status.

I missed GUADEC last year because I was giving a budget status in our "off" year. And guess what? This year, department budget presentations again happen during GUADEC.

During GUADEC, I'll be making our county IT budget proposal. This is our one opportunity to share with the Board our budget priorities for the next two years, and to defend any budget adjustment. I can't miss this meeting.

Wednesday, May 17, 2017

GNOME and Debian usability testing

Intrigeri emailed me to share that "During the Contribute your skills to Debian event that took place in Paris last week-end, we conducted a usability testing session" of GNOME 3.22 and Debian 9. They have posted their usability test results at Intrigeri's blog: "GNOME and Debian usability testing, May 2017." The results are very interesting and I encourage you to read them!

There's nothing like watching real people do real tasks with your software. You can learn a lot about how people interact with the software, what paths they take to accomplish goals, where they find the software easy to use, and where they get frustrated. Normally we do usability testing with scenario tasks, presented one at a time. But in this usability test, they asked testers to complete a series of "missions." Each "mission" was a set of two or more goals. For example:

Mission A.1 — Download and rename file in Nautilus

  1. Download a file from the web, a PDF document for example.
  2. Open the folder in which the file has been downloaded.
  3. Rename the downloaded file to SUCCESS.pdf.
  4. Toggle the browser window to full screen.
  5. Open the file SUCCESS.pdf.
  6. Go back to the File manager.
  7. Close the file SUCCESS.pdf.

Mission A.2 — Manipulate folders in Nautilus

  1. Create a new folder named cats in your user directory.
  2. Create a new folder named to do in your user directory.
  3. Move the cats folder to the to do folder.
  4. Delete the cats folder.

These "missions" take the place of scenario tasks. My suggestion to the usability testing team would be to add a brief context that "sets the stage" for each "mission." In my experience, that helps testers get settled into the task. This may have been part of the introduction they used for the overall usability test, but generally I like to see a brief context for each scenario task.

The usability test results also include a heat map, to help identify any problem areas. I've talked about the Heat Map Method before (see also “It’s about the user: Applying usability in open source software.” Jim Hall. Linux Journal, print, December 2013). The heat map shows your usability test results in a neat grid, coded by different colors that represent increasing difficulty:

  • Green if the tester didn't have any problems completing the task.
  • Yellow if the tester encountered a few problems, but generally it was pretty smooth.
  • Orange if the tester experienced some difficulty in completing the task.
  • Red if the tester had a really hard time with the task.
  • Black if the task was too difficult and the tester gave up.

The colors borrow from the familiar green-yellow-red scheme used in traffic signals, which most people associate with easy-medium-hard. The colors also suggest greater levels of "heat," from green (easy) to red (very hard) and black (too hard).

To build a heat map, arrange your usability test scenario tasks in rows, and your testers in columns. This provides a colorful grid. You can look across rows and look for "hot" rows (lots of black, red and orange) and "cool" rows (lots of green, with some yellow). Focus on the hot rows; these are where testers struggled the most.
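If you'd rather not eyeball the grid, here is a minimal Python sketch of the same idea. The task names and ratings are made-up sample data, and scoring each color from 0 to 4 and averaging across testers is my own assumption for ranking rows; the Heat Map Method itself only asks you to look at the colors.

```python
# Hypothetical results: rows are scenario tasks, columns are testers.
# Ratings use the five colors described above.
HEAT = {"green": 0, "yellow": 1, "orange": 2, "red": 3, "black": 4}

results = {
    "Task 1": ["green", "green", "yellow", "green"],
    "Task 2": ["red", "black", "orange", "red"],
    "Task 3": ["yellow", "orange", "green", "yellow"],
}

def row_heat(ratings):
    """Average heat of one task across all testers (0 = cool, 4 = hot)."""
    return sum(HEAT[color] for color in ratings) / len(ratings)

def hottest_first(results):
    """Return task names sorted from the hottest row to the coolest row."""
    return sorted(results, key=lambda task: row_heat(results[task]), reverse=True)

# Print the grid with the hottest rows at the top.
for task in hottest_first(results):
    print(f"{task:8} {row_heat(results[task]):.2f}  {' '.join(results[task])}")
```

With this sample data, "Task 2" sorts to the top, which is the row you would focus on first.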

Intrigeri's heat map suggests some issues with B1 (install and remove a package), C2 (temporary files) and C3 (change default video player). There's some difficulty with A3 (create a bookmark in Nautilus) and C4 (add and remove world clocks), but these seem secondary. Certainly these are issues to address, but the results suggest focusing on B1, C2 and C3 first.

For more, including observations and discussion, go read Intrigeri's article.

Saturday, May 6, 2017

Not running for Board this year

After some serious thinking, I've decided not to run for the GNOME Foundation Board of Directors for the 2017-18 session.

As the other directors are aware, I've over-committed myself. I think I did a good job keeping up with GNOME Board issues, but it was sometimes a real stretch. And due to some budget and planning items happening at work, I've been busier in 2017 than I planned. I've missed a few Board meetings due to meeting conflicts or other issues.

It's not fair to GNOME for me to continue to be on the Board if I'm going to be this busy. So I've decided not to run again this year, and to let someone with more time take my seat.

However, I do plan to continue as director for the rest of the 2016-17 session.

Thursday, May 4, 2017

How I found Linux

Growing up through the 1980s and 1990s, I was always into computers. As I entered university in the early 1990s, I was a huge DOS nerd. Then I discovered Linux, a powerful Unix system that I could run on my home computer. And I have been a Linux user ever since.

I wrote my story for OpenSource.com, about How I got started with Linux.

In the article, I also talk about how I've deployed Linux in every organization where I've worked. I'm a CIO in local government now, and while we have yet to install Linux in the year since I arrived, I have no doubt that we will someday.

Tuesday, April 18, 2017

A better March Madness script?

Last year, I wrote an article for Linux Journal describing how to create a Bash script to build your NCAA "March Madness" brackets. I don't really follow basketball, but I have friends that do, so by filling out a bracket at least I can have a stake in the games.

Since then, I realized my script had a bug that prevented any rank 16 team from winning over a rank 1 team. So this year, I wrote another article for Linux Journal with an improved Bash script to build a better NCAA "March Madness" bracket. In brief, the updated script builds a custom random "die roll" based on the relative strength of each team. My "predictions" this year are included in the Linux Journal article.
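The full script is in the Linux Journal article, not here. As a rough illustration of what a weighted "die roll" can look like, here is a short Python sketch (not the Bash script itself). The strength formula, 17 minus the team's rank, and the function names are assumptions made for this example.

```python
import random

def win_probability(seed_a, seed_b):
    """Chance that seed_a beats seed_b, assuming strength = 17 - seed."""
    strength_a = 17 - seed_a
    strength_b = 17 - seed_b
    return strength_a / (strength_a + strength_b)

def pick_winner(seed_a, seed_b, rng=random):
    """Roll a weighted 'die': each team's share equals its relative strength."""
    if rng.random() < win_probability(seed_a, seed_b):
        return seed_a
    return seed_b

# A rank 16 team is a long shot against a rank 1 team, but not impossible.
print(win_probability(16, 1))
print(pick_winner(1, 16))
```

Under this assumed weighting, a rank 16 team beats a rank 1 team one time in seventeen, so the upset can happen without dominating the bracket.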

Since the games are now over, I figured this was a great time to see how my bracket performed. If you followed the games, you know that there were a lot of upsets this year. No one really predicted the final two teams for the championship. So maybe I shouldn't be too surprised if my brackets didn't do well either. Next year might be a better comparison.

In the first round of the NCAA March Madness, you start with teams 1–16 in four regions, so that's 64 teams that compete in 32 games. In that "round of 64," my shell script correctly predicted 21 outcomes. That's not a bad start.

March Madness is single-elimination, so for the second round, you have 32 teams competing in 16 games. My shell script correctly guessed 7 of those games. So just under half were predicted correctly. Not great, but not bad.

In the third round, my brackets suffered. This is the "Sweet Sixteen" where 16 teams compete in 8 games, but my script only predicted 1 of those games.

And in the fourth round, the "Elite Eight" round, my script didn't predict any of the winners. And that wrapped up my brackets.

Following the standard method for how to score "March Madness" brackets, each round has 320 possible points. In round one, assign 10 points for each correctly selected outcome. In round two, assign 20 points for each correct outcome. And so on, double the possible points at each round. From that, the math is pretty simple.

round one:    21 × 10 = 210
round two:     7 × 20 = 140
round three:   1 × 40 =  40
round four:    0 × 80 =   0

My total score this year is 390 points. As a comparison, last year's script (the one with the bug) scored 530 in one instance, and 490 in another instance. But remember that there were a lot of upsets in this year's games, so everyone's brackets fared poorly this year, anyway.
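The scoring arithmetic is easy to check in a few lines of Python. This is just a sketch of the doubling rule described above; the function name is made up for the example.

```python
# Correct picks per round, taken from the per-round tallies above.
correct_picks = [21, 7, 1, 0]

def bracket_score(picks, first_round_points=10):
    """Score a bracket: points per correct pick double with each round."""
    return sum(count * first_round_points * 2 ** round_index
               for round_index, count in enumerate(picks))

print(bracket_score(correct_picks))  # 390
```

A perfect round of 64 would be 32 × 10 = 320 points, and a perfect round of 32 would be 16 × 20 = 320, which is why every round is worth the same maximum.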

Maybe next year will be better.

Did you use the Bash script to help fill out your "March Madness" brackets? How did you do?

Monday, April 3, 2017

How many testers do you need?

When you start a usability test, the first question you may ask is "how many testers do I need?" The standard go-to article on this is Nielsen's "Why You Only Need to Test with 5 Users" which gives the answer right there in the title: you need five testers.

But it's important to understand why Nielsen picks five as the magic number. MeasuringU has a good explanation, but I think I can provide my own.

The core assumption is that each tester will uncover a certain number of issues in a usability test, assuming good test design and well-crafted scenario tasks. The next tester will uncover about the same number of usability issues, but not exactly the same issues. So there's some overlap, and some new issues too.

If you've done usability testing before, you've observed this yourself. Some testers will find certain issues, other testers will find different issues. There's overlap, but each tester is on their own journey of discovery.

How many usability issues a single tester finds is up for some debate. Nielsen uses his own research and asserts that a single tester can uncover about 31% of the usability issues. Again, that assumes good test design and scenario tasks. So one tester finds 31% of the issues, the next tester finds 31% but not the same 31%, and so on. With each tester, there's some overlap, but you discover some new issues too.

In his article, Nielsen describes a function to demonstrate the number of usability issues found vs. the number of testers in your test, for a traditional formal usability test. Expressed as a share of all the usability issues, it is:

    1 - (1 - L)^n

where L is the share of issues one tester can uncover (Nielsen assumes L = 31%) and n is the number of testers.

I encourage you to run the numbers here. A simple spreadsheet will help you see how the value changes for increasing numbers of testers. What you'll find is a curve that grows quickly then slowly approaches 100%.

Note that at five testers, you have uncovered about 85% of the issues. Nielsen's curve suggests a diminishing return at higher numbers of testers. As you add testers, you'll certainly discover more usability issues, but the increment gets smaller each time. Hence Nielsen's recommendation for five testers.
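If you don't have a spreadsheet handy, a few lines of Python will run the numbers. This sketch uses Nielsen's formula as a share of all issues, 1 - (1 - L)^n, with his assumed L of 31%; the function name is just for illustration.

```python
def share_found(n, L=0.31):
    """Share of usability issues found by n testers: 1 - (1 - L)^n."""
    return 1 - (1 - L) ** n

# Print the curve for a few tester counts.
for n in (1, 2, 3, 5, 10, 15):
    print(f"{n:2d} testers: {share_found(n):.0%}")
```

At five testers the formula gives a little over 84%, which matches the "about 85%" figure, and each tester after that adds less than the one before.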

Again, five is a good number because of the overlap in results. Each tester will help you identify a certain number of usability issues, given a good test design and high quality scenario tasks. The next tester will identify some of the same issues, plus a few others. And as you add testers, you'll continue to have some overlap, and continue to expand into new territory.

Let me help you visualize this. We can create a simple program to show this overlap. I wrote a Bash script to generate SVG files with varying numbers of overlapping red squares. Each red square covers about 31% of the gray background.

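If you run this script, you should see output that looks something like this, for different values of n. Each image starts over; the iterations are not additive:

[Images: SVG output showing overlapping red squares on a gray background, one image per value of n]
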
As you increase the number of testers, you cover more of the gray background. And you also have more overlap. The increase in coverage is quite dramatic from one to five, but compare five to fifteen. Certainly there's more coverage (and more overlap) at ten than at five, but not significantly more coverage. And the same going from ten to fifteen.

These visuals aren't meant to be an exact representation of the Nielsen iteration curve, but they do help show how adding more testers gives significant return up to a point, and then adding more testers doesn't really get you much more.
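If you want to generate similar pictures yourself, here is a rough stand-in written in Python rather than Bash. It is not the script described above. The canvas size, the opacity, and the function name are arbitrary choices for this sketch; the only figure taken from the text is that each square covers about 31% of the background.

```python
import math
import random

def overlap_svg(n, size=300, share=0.31, seed=None):
    """Return SVG text: a gray canvas with n random red squares on top.

    Each square covers `share` of the canvas area, so its side is
    size * sqrt(share).
    """
    rng = random.Random(seed)
    side = size * math.sqrt(share)
    parts = [
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{size}" height="{size}">',
        f'<rect width="{size}" height="{size}" fill="gray"/>',
    ]
    for _ in range(n):
        # Keep every square fully inside the canvas.
        x = rng.uniform(0, size - side)
        y = rng.uniform(0, size - side)
        parts.append(
            f'<rect x="{x:.1f}" y="{y:.1f}" width="{side:.1f}" '
            f'height="{side:.1f}" fill="red" fill-opacity="0.5"/>'
        )
    parts.append("</svg>")
    return "\n".join(parts)

# Print one image; redirect the output to a .svg file to view it.
print(overlap_svg(5))
```

Call it with n set to 1, 5, 10 and 15 to compare the coverage. Because the squares land at random, every run looks a little different, which is also true of real testers.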

The core takeaway is that it doesn't take many testers to get results that are "good enough" to improve your design. The key idea is that you should do usability testing iteratively with your design process. I think every usability researcher would agree. Ellen Francik, writing for Human Factors, refers to this process as the Rapid Iterative Testing and Evaluation (RITE) method, arguing "small tests are intended to deliver design guidance in a timely way throughout development." (emphasis mine)

Don't wait until the end to do your usability tests. By then, it's probably too late to make substantive changes to your design, anyway. Instead, test your design as you go: create (or update) your design, do a usability test, tweak the design based on the results, test it again, tweak it again, and so on. After a few iterations, you will have a design that works well for most users.