Research Portfolio

Xavier Hills

Staff UX Researcher

I lead mixed-methods research programs that move strategy, pairing rigorous qualitative depth with statistical analysis to turn ambiguous business questions into decisions teams act on. Across Okta, Spotify, Affirm, and Meta, my work has shaped product roadmaps, go-to-market strategy, and how multi-disciplinary teams understand their users.

Mixed methods · Survey & statistical analysis · Journey mapping · Generative & evaluative · Research strategy
10+ yrs · Leading UX research
4 · Flagship programs
$5.8M · Annual revenue influenced (Affirm)
B2B + B2C · Consumer to enterprise

Research that earns a seat in the decision

My through-line is triangulation: I rarely rely on a single method. I combine secondary research, qualitative depth, and quantitative validation so that recommendations survive scrutiny from product, design, engineering, data science, and leadership alike.

Methodological range

From 55-participant in-depth interview programs and 6-month diary studies to n=30 surveys analyzed with cross-tabulation and chi-squared testing, I match the method to the decision, not the other way around.

Triangulation by default

I deliberately pair qualitative and quantitative streams, for example validating interview themes against survey statistics, so findings are both deep and defensible.

Cross-functional influence

I run stakeholder workshops, build the shared artifacts (journey maps, segmentation, dashboards) teams rally around, and translate insight into roadmap and GTM commitments.

Strategic framing

I connect user behavior to business levers like acquisition, retention, ROI, and launch readiness, so research changes what teams prioritize, not just what they know.

In-depth interviews · Surveys · Cross-tab & chi-squared analysis · Journey mapping · Diary studies · Eye tracking · Focus groups · Competitive analysis · Exploratory data analysis · Concept evaluation · International field research · A/B test partnership · Stakeholder workshops · Persona development

Four flagship programs

Each pairs methodological rigor with measurable strategic impact. Recent work is presented first.

Okta 2025 · Mixed-methods program

Understanding the ISV Developer Ecosystem

A two-part research program, qualitative journey discovery validated by a quantitative decision-maker survey, to shape Okta's strategy for growing its Integration Network (OIN).

Lead UX Researcher · partnered with Product (Jason Teller, Jeff Taylor, Tova Lam), Design (Shirley Wang), and ISV Ecosystem Growth
The challenge

Okta needed to grow the breadth and depth of integrations independent software vendors (ISVs) build for the OIN, but lacked a clear picture of how ISVs decide what to build, what blocks them, and what would make Okta a strategic partner rather than just a platform. The stakes spanned product roadmap, developer experience investment, and the go-to-market case for emerging capabilities like Cross-App Access.

Approach: qual then quant

I designed a sequential mixed-methods program so each phase sharpened the next:

  • Phase 1: Developer journey (qualitative). 9 in-depth interviews across software, medtech, and fintech, spanning ISVs who had built for Okta and those who build for competing platforms. I mapped the end-to-end developer journey from awareness through build, submission, and depth.
  • Phase 2: Decision-maker survey (quantitative). A blind survey of 30 ISV decision-makers (VPs, directors, C-level, PMs), analyzed with cross-tabulation and chi-squared tests, reporting only statistically significant relationships.
In-depth interviews (n=9) · Journey mapping · Blind survey (n=30) · Cross-tab analysis · Chi-squared testing · Segmentation
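As a rough illustration of the cross-tabulation and chi-squared step named in Phase 2, the sketch below runs a Pearson chi-squared test of independence on a small contingency table. It is an assumption-laden example, not the study's analysis code: the function names (`chi2_survival`, `chi_squared_test`) are hypothetical, the counts are invented and are not survey data, and no continuity correction is applied. It also reports the smallest expected cell count, because with samples around n=30 the common rule of thumb (expected counts of at least 5) is easy to violate.

```python
import math
from typing import List, Tuple


def chi2_survival(x: float, dof: int) -> float:
    """Upper-tail probability P(X >= x) for a chi-squared variable (dof >= 1)."""
    if x <= 0:
        return 1.0
    # Closed forms for dof = 1 and dof = 2, then the recurrence
    # Q(x; k + 2) = Q(x; k) + (x/2)^(k/2) * exp(-x/2) / Gamma(k/2 + 1).
    if dof % 2 == 1:
        q, k = math.erfc(math.sqrt(x / 2)), 1
    else:
        q, k = math.exp(-x / 2), 2
    while k < dof:
        q += (x / 2) ** (k / 2) * math.exp(-x / 2) / math.gamma(k / 2 + 1)
        k += 2
    return q


def chi_squared_test(table: List[List[int]]) -> Tuple[float, int, float, float]:
    """Pearson chi-squared test of independence on a cross-tab.

    Returns (statistic, degrees of freedom, p-value, smallest expected count).
    Assumes every row and column total is non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    stat = 0.0
    min_expected = math.inf
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of the two variables.
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
            min_expected = min(min_expected, expected)

    dof = (len(table) - 1) * (len(table[0]) - 1)
    return stat, dof, chi2_survival(stat, dof), min_expected


# Hypothetical cross-tab (invented counts, NOT survey data):
# rows = respondent role, columns = ranked a given factor first (yes / no).
example = [[10, 5], [5, 10]]
stat, dof, p, min_expected = chi_squared_test(example)
print(f"chi2={stat:.3f}, dof={dof}, p={p:.4f}, min expected={min_expected}")
if min_expected < 5:
    print("Warning: small expected counts; consider an exact test instead.")
```

In practice a statistics library would usually handle this. Note that some libraries apply a continuity correction to 2x2 tables by default, so their p-values can differ from this uncorrected sketch.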
ISV developer journey: the end-to-end path from awareness through build, submission, and depth.
What we learned
  • Customer demand drives every build decision. ISVs prioritize integrations to protect existing revenue first. A request from a large existing customer ranked as the #1 roadmap-prioritization factor, well above security improvements or competitor parity.
  • The developer has more influence than Okta believed. Leadership surfaces the need, but developers decide complexity and steer toward whichever platform is easiest to build for. Survey respondents independently weighted technical complexity heavily for identity integrations, validating the interview finding.
  • Submission friction caps depth. A lengthy, ambiguous validation process after submission was the single biggest deterrent to building deeper integrations beyond basic SSO/SAML.
  • Cross-App Access is a latent business story. Decision-makers saw it as both cost-saving and revenue-generating (most expecting 6–20% on each), and tied it to the urgent need to secure AI agents.
Decision-maker survey (n=30): ranked factors driving integration prioritization.
"I never submitted those drafts because… when I discovered the amount of work to do to create the documentation and the validation scenarios, it took me a long time to do the one… and I just didn't have time anymore." ISV Developer, journey interviews (P7)
Impact
4 · Strategic recommendations adopted across product & GTM
2 phases · Triangulated qual + quant for defensible findings
New study · Seeded a follow-on GenAI Actions Platform study

The program reframed Okta's ISV strategy around customer-driven value rather than technical messaging, repositioned Cross-App Access as a business and AI-security enabler, and prioritized fixing the submission experience to unlock integration depth. Design ran a findings workshop, and the work directly scoped a downstream study on using GenAI to remove development barriers.

Staff lens

This is the work I'm proudest of as a systems thinker: I owned a multi-phase program end to end, chose a sequential design where qualitative discovery defined what the survey needed to prove, and used statistical validation to give a strategic narrative the credibility it needed with senior leadership. The finding that developers, not just decision-makers, gatekeep platform choice changed who Okta needed to influence.

Okta Jan 2025 · Evaluative / AI product

Log Investigator: Evaluating an AI Product Before GA

A limited early-access study to decide whether Okta's generative-AI log analysis tool was ready for general availability, and where to invest next.

Lead UX Researcher · with PM Cagatay Berilgen and Designer Maggie Cai
The challenge

Log Investigator, an LLM-powered tool that turns plain English into log queries, had been announced at Oktane but never validated with real customers. Leadership needed to know whether its capabilities matched user expectations, where the usability and utility gaps were, and whether to expand access, hold, or invest more before GA.

Approach

I designed a limited early-access (LEA) study with deliberately minimal instructions to avoid biasing how participants explored the tool, pairing in-product feedback and an exit survey with 30-minute customer interviews. Success was framed against five categories: ease of use, perceived utility, intent to keep using, satisfaction, and query quality.

Limited early access (LEA) · In-depth interviews · Exit survey · In-product feedback · Persona analysis
What we learned
  • Value depended entirely on experience level, which surfaced two personas. Less-experienced admins found real value (query generation removed the barrier to getting started); experienced admins expected ChatGPT-like conversational depth and felt the tool fell short. As a result, the target audience was genuinely unclear.
  • Expectations were set externally. The Oktane announcement and familiarity with other LLMs created an expectation of conversational, follow-up-capable interaction that a one-prompt/one-response tool couldn't meet.
  • Summarization needed depth, not just retrieval. Users wanted explanations tied to specific policies and the events leading up to a log (connecting multiple events to understand a root cause), not just faster log-finding.
Two personas emerged, defined by participants' prior experience querying logs.
"I thought it was gonna be very similar to… you can have a conversation with it… it does seem like a 1-to-1, like one prompt to one response type of thing at the moment." Participant "Mason," experienced admin
Impact & recommendations

I recommended defining a target persona aligned to current capabilities, adding educational content to set correct expectations, exploring conversational interaction so users could dig into issues without rewriting queries, and extending the LEA over several months to observe real adoption. The study gave the team a clear-eyed read on GA readiness and a prioritized roadmap of capability gaps; Design ran a follow-on workshop to act on the findings.

Staff lens

Evaluating an AI product means separating the tool's real ceiling from expectations set by the broader LLM market. The most valuable output here wasn't a usability bug list. It was reframing an ambiguous "is it ready?" question into a positioning and persona problem, which is a more durable and strategic answer than a feature punch-list.

Spotify Multi-method program · Advertising

Advertiser Acquisition & Retention

A large, multi-phase program to understand why enterprise advertiser growth lagged internal goals, and to give Product, Sales, and Marketing a shared, actionable picture of the advertiser journey.

UX Researcher · partnered with Product, Product Marketing, Sales, Design, Data Science, and senior advertising leadership (VP Global Advertising, VP Ads Products)
The challenge

Despite rising ad revenue, Spotify's ads growth fell short of goals. Roughly 80% of income came from direct sales, which made advertiser acquisition, onboarding, and support expensive. Enterprise onboarding took 2–3 months, and campaign setup often needed engineering. The team needed to understand what drives an advertiser's decision to start, and to stay.

Approach: six methods, sequenced
  • Literature review to establish what was known and expose the research gap on awareness and retention.
  • Focus groups with the Sales org across four markets (US, EU, APAC, LATAM) to surface advertiser segments.
  • Competitive analysis of Meta, Google, iHeartRadio, X, and Pinterest ad platforms to benchmark onboarding and education.
  • Exploratory data analysis with Data Science to build behavioral segments and identify churn signals.
  • 55 hour-long in-depth interviews with enterprise advertisers unfamiliar with Spotify, to capture the non-user perspective.
  • Sales enablement survey to locate training gaps affecting conversion.
Literature review · Focus groups · Competitive analysis · EDA · IDIs (n=55) · Survey · Journey mapping · Stakeholder workshops
What we learned & built
  • A statistically significant link between Sales' ability to explain features and advertiser spend, pinpointing a training gap, not just a product gap.
  • Behavioral segments predicting retention vs. churn, operationalized into an internal tool and a Tableau dashboard that flagged at-risk advertisers for proactive outreach.
  • An interactive advertiser journey map with risk points that gave Product, Sales, and Marketing a single shared reference.
Advertiser segments journey map: retention-growth versus churn paths drawn from the 55 IDIs.
Impact
3 orgs · Sales, Marketing & Product changed roadmaps/plans
At-risk · Dashboard enabling proactive churn intervention
12 markets · Insight depth leadership needed for global strategy

A stakeholder workshop converted findings into commitments: Sales invested in more training, Marketing built awareness content, and Product teams adjusted 2023–2024 roadmaps to close experience gaps. The work produced lasting infrastructure (segmentation, a churn dashboard, and a journey map) that outlived the study itself.

Staff lens

This program shows the scale and orchestration I operate at: sequencing six methods across four markets, partnering deeply with Data Science on EDA and tooling, and, critically, converting insight into organizational commitments and durable artifacts through a structured workshop. I also navigated real constraints honestly (excluding unreliable self-serve data; managing a 3-month EDA and costly recruitment), which is part of operating at staff scope.

Affirm ~3-month program · Support experience

Guided Help Experience ("Project Clippy")

Turning a flood of negative App Store reviews into a redesigned support experience, and a clear, quantified business case.

UX Researcher · with Design, Content, 2 Foundations PMs, 6 engineers, Data Science, and Operations leadership, with CEO visibility from escalations
The challenge

Affirm faced rising negative App Store reviews, many from customers saying they wouldn't use Affirm again. The product team wanted to improve perception of the support experience; my longer-term research goal was to address the underlying inquiry drivers inside the product. The issue was severe enough to reach the CEO's inbox.

Approach

Over a roughly three-month timeline I researched the drivers of support frustration, surfacing the emotional cost of a support experience that intentionally hid contact paths to deflect inquiries. I partnered closely with Operations, whose staffing realities shaped what was feasible, and helped shape and validate a new guided help concept.

Qualitative interviews · Inquiry-driver analysis · Concept validation · Cross-functional partnership
"I exhausted all my possibilities before reaching out… After doing all of that, I still didn't get the help I wanted. So I disputed the charge… they didn't even try to ask or were apologetic about the issue. So that was a problem and frustration for me." Affirm customer, support interviews
Concept for the guided help experience, recommending the right support channel by issue.
Impact
+72% · Users using the recommended servicing channel
+40.9% · CSAT from contact experiences
$5.8M · Added annual sales from improved retention

The redesign also drove an 8% decrease in interactions needed for resolution and helped Affirm meet contractual servicing obligations for customers previously unable to reach support. A retention analysis found higher repeat-customer rates among those who engaged with support.

Staff lens

I include the honest postscript here because it's the most instructive part: making help easier to find increased case volume (contact info had been deliberately hidden to deflect), and Ops had to temporarily roll back before re-enabling with more staffing. Owning that second-order effect, and the lesson that research must account for the operational system, not just the user, is exactly the systems-level judgment I bring at staff level.

Meta Foundational craft · Facebook & Instagram

Facebook Navigation & Instagram Inform Treatment

Two long-running programs at consumer scale that show the depth of method craft underpinning my strategic work.

UX Researcher · Product Navigation Team and Integrity/Misinformation, with Design, PM, Data Science, Engineering
Facebook: product awareness & user control

To improve product awareness and adoption within a navigation system limited to ~5 surfaced products, I ran a global survey to quantify awareness, followed by international in-home field research that uncovered deeper causes (unclear iconography, missing signage, confusing cross-surface navigation). I validated changes with recurring relevance surveys against usage data, and ran a 6-month diary study to learn how long navigation changes take to be noticed and which education channels (interstitials, push, in-app, badges) worked. Follow-up in-lab interviews and an A/B test partnership with Data Science shaped a more flexible, customizable navigation, covered by Adweek, TechCrunch, CNET, and others.

Global survey · International field research · 6-month diary study · In-lab interviews · A/B test partnership
Instagram: informing users about false content

To deter the spread of fact-checked false information, I researched where users encountered it and how to warn them without degrading the experience. Qualitative research showed a warning alone wasn't enough: users wanted sources and the choice to view content, leading to a redesigned treatment with fact-checker context and a "See Why / See Post" pattern. I ran an eye-tracking study confirming key information sat where users' gaze naturally fell, plus international research across two countries. Follow-up work uncovered that users wanted to warn others, reshaping the sharing flow. Covered by The Verge, Mashable, and Engadget.

Qualitative concept testing · Eye tracking · International research · Iterative validation
Staff lens

These programs are the craft foundation beneath everything above: eye tracking, multi-month diary studies, international fieldwork, and disciplined iterate-and-validate loops at billions-of-users scale. Staff researchers earn strategic influence by being unimpeachable on method first; this is where that came from.