Case Study: Evaluating Builtin.com
Over the Fall 2022 semester, as part of my Interaction Design & Usability Testing class at Claremont Graduate University, I planned, executed, and analyzed a usability test to evaluate Built In, a job board.
Note: This project is not affiliated with Built In.
Finding a new job is hard, but the process should be simple.
Built In (builtin.com) is a job board for tech and startup jobs. The website includes job postings for remote and in-office jobs across the US, information about tech companies, blog posts, salary information, and links to “tech hubs” for finding jobs in specific cities. While the content on the website is valuable to job seekers, the search functionality leaves something to be desired: several usability issues make the job search process even more frustrating than it already is (see Figure 1).
Figure 1. Built In search results and usability issues as of October 2022
My objective was to identify and recommend fixes to improve the search and filter workflow. Built In’s workflows should meet high standards of usability because looking for a job is already a time-consuming and often frustrating process; unusable websites add an extra burden for the job seeker. Making the act of searching for jobs easy and effective will enhance the Built In user’s experience and may increase the likelihood that users will return to Built In to search for jobs, which in turn may motivate more businesses to post their open positions on Built In.
I ran a usability test with 5 job seekers, leveraging quantitative and qualitative metrics to assess the usability of Built In's job search filters and participants' satisfaction with them.
I chose to focus on filter options, as those seemed to have the most usability issues. I designed a usability test with formative and summative elements to determine the overall usability of each filter and the specific usability issues that arose during the workflow. The test addressed the following research questions:
Is the current search and filter workflow usable?
Are users satisfied with the filter options when searching for jobs on Built In?
Participants were asked to interact with all 4 filters (Location, Experience, Category, and Industry) in an interactive prototype. Several quantitative and qualitative variables were measured: task completion rates, Single Ease Question (SEQ) ratings, filter satisfaction on a 7-point Likert scale, and themes from a think-aloud protocol. Five participants were recruited through convenience sampling to gather responses within limited time and resources; they were screened to ensure they had searched for tech jobs and/or internships online within the past year.
See the research plan for full details on variables, hypotheses, screeners, sampling, procedures, and tasks.
Methods
Hypotheses
I hypothesized that the usability of each filter would be below average and that users would not be fully satisfied with the filter options.
H1: The perceived usability (SEQ score) of the current filter workflow is less than the industry average of 5.5 (on a 7-point scale where 7 means a task was very easy).
Participants will likely be able to complete tasks, but perceived usability (SEQ) may be below average.
Note: For the statistical test, H1 is the alternative hypothesis; the null hypothesis is that the average SEQ score for each filter is equal to the industry average of 5.5.
H2: Users are not fully satisfied with filter options.
Participants will not be fully satisfied because they want more filter options and/or the ability to sort results.
Note: There is no traditional null and alternative hypothesis here because this hypothesis will be analyzed qualitatively and with confidence intervals rather than a statistical hypothesis test. It is a guess as to how users will react to the filter options, based on a review of the current site.
Analysis
Quantitative analysis involved benchmarking filter performance against an industry average, while qualitative analysis focused on identifying themes related to participant satisfaction with the filters.
Both quantitative and qualitative analysis techniques were used to paint a holistic picture of participant interactions with the Built In job search filters. H1 was analyzed primarily by benchmarking the SEQ scores for each filter against the industry average SEQ score of 5.5 (on a 7-point scale where 7 means a task is very easy) using a one-sample, one-tailed t-test (see Figure 2). Other measures were used to triangulate findings, including task completion rates with adjusted Wald confidence intervals, filter efficacy ratings with t-confidence intervals, and themes that arose during the think-aloud protocol.
Figure 2. Benchmarking SEQ scores in Excel
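The benchmark itself is straightforward to reproduce outside of Excel. Below is a minimal Python sketch of the same calculations, assuming hypothetical SEQ ratings and completion counts; the numbers are placeholders, not the study data.

```python
# Sketch of the H1 benchmark: one-sample, one-tailed t-test of SEQ ratings
# against the 5.5 industry average, plus an adjusted Wald interval for a
# task completion rate. All numbers are hypothetical placeholders.
from math import sqrt

from scipy import stats

INDUSTRY_AVG_SEQ = 5.5

# Hypothetical SEQ ratings (1-7 scale) from 5 participants for one filter.
seq_scores = [4, 5, 6, 3, 5]

# H0: mean SEQ = 5.5; H1: mean SEQ < 5.5 (one-tailed, "less").
t_stat, p_value = stats.ttest_1samp(seq_scores, INDUSTRY_AVG_SEQ, alternative="less")
print(f"t = {t_stat:.2f}, one-tailed p = {p_value:.3f}")

def adjusted_wald_ci(successes: int, n: int, confidence: float = 0.95):
    """Adjusted Wald CI for a completion rate: add z^2/2 successes and
    z^2 trials to the observed counts, then compute a Wald interval."""
    z = stats.norm.ppf(1 - (1 - confidence) / 2)
    n_adj = n + z ** 2
    p_adj = (successes + z ** 2 / 2) / n_adj
    half_width = z * sqrt(p_adj * (1 - p_adj) / n_adj)
    return max(0.0, p_adj - half_width), min(1.0, p_adj + half_width)

# e.g. 4 of 5 participants completing a task.
low, high = adjusted_wald_ci(successes=4, n=5)
print(f"completion rate 95% CI: {low:.0%} to {high:.0%}")
```

The adjusted Wald interval is preferred over the plain Wald interval at small sample sizes like n=5 because it stays inside [0, 1] and has better coverage.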
H2 was analyzed using thematic analysis along with satisfaction scores with t-confidence intervals. To begin the thematic analysis, I organized all raw data into a spreadsheet, then used color coding to highlight pain points, positive and negative sentiments, and comments related to participant satisfaction with the filters. Finally, I counted the frequency of recurring pain points, sentiments, and comments to identify themes and their prevalence.
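The H2 measures are similarly easy to sketch. Here is a brief Python example of a t-confidence interval for mean satisfaction and a frequency tally of coded comments, with hypothetical ratings and codes standing in for the study data.

```python
# t-confidence interval for mean satisfaction on the 7-point Likert scale,
# plus a frequency count of coded comments. All values are hypothetical
# placeholders, not the study data.
from collections import Counter
from math import sqrt
from statistics import mean, stdev

from scipy import stats

ratings = [5, 6, 4, 6, 5]  # hypothetical 7-point satisfaction ratings, n = 5
n = len(ratings)
t_crit = stats.t.ppf(0.975, df=n - 1)  # 95% two-sided critical value
margin = t_crit * stdev(ratings) / sqrt(n)
print(f"mean satisfaction: {mean(ratings):.1f} +/- {margin:.1f} (95% CI)")

# Tallying coded comments to surface recurring themes.
codes = ["wants salary filter", "wants sorting", "wants salary filter"]
print(Counter(codes).most_common())
```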
Take a look at the raw data and my analysis process here.
Findings & Recommendations
Most filters performed below average; however, results should be taken with a grain of salt given the small sample. Built In has since coincidentally implemented some of my recommendations.
Overall, participants were satisfied with the filters, but benchmarking showed insufficient evidence that the usability of the Location, Job Category, and Industry filters was above the industry average. Participants also experienced pain points using each filter. The Job Category filter can be improved by allowing users to select multiple categories and by moving “Internship” to the Experience filter; the Industry filter can be improved by moving it out of “More Filters.” Based on participant comments during the test, there is an opportunity for Built In to add filters for salary and company, and to add information about company ratings and the number of applicants for a job, to improve job seekers' experiences.
One limitation of this study was the small sample size (n=5), chosen because of time and resource constraints. Ideally, a larger sample (n=13+) would be used to measure the quantitative items more precisely; that target was calculated with an iterative sample size calculation method for summative tests, sketched below.
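The iterative method re-solves the standard confidence-interval sample size formula, substituting the t critical value (which itself depends on n) until the estimate stabilizes. A short sketch with hypothetical inputs; the SD and margin shown are placeholders, though they happen to land at n = 13.

```python
# Iterative sample size estimate for a confidence interval around a mean:
# start from a z-based guess, then substitute the t critical value (which
# depends on n) and repeat until n stops changing. All inputs here are
# hypothetical placeholders, not values from the study.
from math import ceil

from scipy import stats

def sample_size_for_ci(sd: float, margin: float, confidence: float = 0.95) -> int:
    alpha = 1 - confidence
    z = stats.norm.ppf(1 - alpha / 2)
    n = max(2, ceil((z * sd / margin) ** 2))  # initial z-based estimate
    for _ in range(100):  # cap iterations in case the estimate oscillates
        t_crit = stats.t.ppf(1 - alpha / 2, df=n - 1)
        n_next = max(2, ceil((t_crit * sd / margin) ** 2))
        if n_next == n:
            break
        n = n_next
    return n

# e.g. an SD of 1.2 rating points and a desired margin of error of +/- 0.75.
print(sample_size_for_ci(sd=1.2, margin=0.75))  # -> 13 for these inputs
```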
Full findings and recommendations can be found in the written report.
Built In actually updated their filters while this study was running! Consistent with my recommendations, they removed "More Filters" so the Industry filter is more easily accessible (see Figure 3).
Figure 3. Built In search results as of November 2022