Xavier Hills

Okta 2025 · Mixed-methods program

Understanding the ISV Developer Ecosystem

A two-part research program, qualitative journey discovery validated by a quantitative decision-maker survey, to shape Okta's strategy for growing its Integration Network (OIN).

Lead UX Researcher · partnered with Product, Design, and ISV Ecosystem Growth Teams

The challenge

Okta needed to grow the breadth and depth of integrations independent software vendors (ISVs) build for the OIN, but lacked a clear picture of how ISVs decide what to build, what blocks them, and what would make Okta a strategic partner rather than just a platform. The stakes spanned product roadmap, developer experience investment, and the go-to-market case for emerging capabilities like Cross-App Access.

Approach: qual then quant

I designed a sequential mixed-methods program so each phase sharpened the next:

Phase 1: Developer journey (qualitative). Nine in-depth interviews across software, medtech, and fintech, spanning ISVs that had built for Okta and those that build for competing platforms. I mapped the end-to-end developer journey from awareness through build, submission, and depth.
Phase 2: Decision-maker survey (quantitative). A blind survey of 30 ISV decision-makers (VPs, directors, C-level, PMs), analyzed with cross-tabulation and chi-squared tests, reporting only statistically significant relationships.
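
To illustrate the kind of test used in Phase 2, a cross-tabulation can be checked for independence roughly as follows. This is a minimal sketch, not the study's analysis code: the counts, the row and column labels, and the 0.05 threshold are invented for the example, and the p-value shortcut applies only to a 2×2 table (one degree of freedom).

```python
import math


def chi_squared_2x2(table):
    """Pearson chi-squared test of independence for a 2x2 table (df = 1)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the row and column variables were independent.
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # With one degree of freedom, P(chi2 > x) equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical cross-tab of 30 respondents (illustrative numbers only):
# rows    = has built for the platform / has not
# columns = rates technical complexity as a top factor / does not
crosstab = [[12, 3],
            [5, 10]]

statistic, p_value = chi_squared_2x2(crosstab)
is_significant = p_value < 0.05  # assumed reporting threshold
```

With a sample of 30, it is worth confirming that expected cell counts are not too small before relying on the chi-squared approximation.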

In-depth interviews (n=9) · Journey mapping · Blind survey (n=30) · Cross-tab analysis · Chi-squared testing · Segmentation

ISV developer journey: the end-to-end path from awareness through build, submission, and depth.

What we learned

Customer demand drives every build decision. ISVs prioritize integrations to protect existing revenue first. A request from a large existing customer ranked as the #1 roadmap-prioritization factor, well above security improvements or competitor parity.
The developer has more influence than Okta believed. Leadership surfaces the need, but developers decide complexity and steer toward whichever platform is easiest to build for. Survey respondents independently weighted technical complexity heavily for identity integrations, validating the interview finding.
Submission friction caps depth. A lengthy, ambiguous validation process after submission was the single biggest deterrent to building deeper integrations beyond basic SSO/SAML.
Cross-App Access is a latent business story. Decision-makers saw it as both cost-saving and revenue-generating (most expecting 6–20% on each), and tied it to the urgent need to secure AI agents.

Decision-maker survey (n=30): ranked factors driving integration prioritization.

"I never submitted those drafts because… when I discovered the amount of work to do to create the documentation and the validation scenarios, it took me a long time to do the one… and I just didn't have time anymore." ISV Developer, journey interviews (P7)

Impact

4

Strategic recommendations adopted across product & GTM

2 phases

Triangulated qual + quant for defensible findings

New study

Seeded a follow-on GenAI Actions Platform study

The program reframed Okta's ISV strategy around customer-driven value rather than technical messaging, repositioned Cross-App Access as a business and AI-security enabler, and prioritized fixing the submission experience to unlock integration depth. Design ran a findings workshop, and the work directly scoped a downstream study on using GenAI to remove development barriers.

Staff lens

This is the work I'm proudest of as a systems thinker: I owned a multi-phase program end to end, chose a sequential design where qualitative discovery defined what the survey needed to prove, and used statistical validation to give a strategic narrative the credibility it needed with senior leadership. The finding that developers, not just decision-makers, gatekeep platform choice changed who Okta needed to influence.

Okta Jan 2025 · Evaluative / AI product

Log Investigator: Evaluating an AI Product Before GA

A limited early-access study to decide whether Okta's generative-AI log analysis tool was ready for general availability, and where to invest next.

Lead UX Researcher · with PM and Design

The challenge

Log Investigator, an LLM-powered tool that turns plain English into log queries, had been announced at Oktane but never validated with real customers. Leadership needed to know whether its capabilities matched user expectations, where the usability and utility gaps were, and whether to expand access, hold, or invest more before GA.

Approach

I designed a limited early-access (LEA) study with deliberately minimal instructions to avoid biasing how participants explored the tool, pairing in-product feedback and an exit survey with 30-minute customer interviews. Success was framed against five categories: ease of use, perceived utility, intent to keep using, satisfaction, and query quality.

Limited early access (LEA) · In-depth interviews · Exit survey · In-product feedback · Persona analysis

What we learned

Value depended entirely on experience level, which surfaced two personas. Less-experienced admins found real value because query generation removed the barrier to getting started; experienced admins expected ChatGPT-like conversational depth and felt the tool fell short. The target audience was genuinely unclear.
Expectations were set externally. The Oktane announcement and familiarity with other LLMs created an expectation of conversational, follow-up-capable interaction that a one-prompt/one-response tool couldn't meet.
Summarization needed depth, not just retrieval. Users wanted explanations tied to specific policies and the events leading up to a log (connecting multiple events to understand a root cause), not just faster log-finding.

Two personas emerged, defined by participants' prior experience querying logs.

"I thought it was gonna be very similar to… you can have a conversation with it… it does seem like a 1-to-1, like one prompt to one response type of thing at the moment." Participant "Mason," experienced admin

Impact & recommendations

I recommended defining a target persona aligned to current capabilities, adding educational content to set correct expectations, exploring conversational interaction so users could dig into issues without rewriting queries, and extending the LEA over several months to observe real adoption. The study gave the team a clear-eyed read on GA readiness and a prioritized roadmap of capability gaps; Design ran a follow-on workshop to act on the findings.

Staff lens

Evaluating an AI product means separating the tool's real ceiling from expectations set by the broader LLM market. The most valuable output here wasn't a usability bug list. It was reframing an ambiguous "is it ready?" question into a positioning and persona problem, which is a more durable and strategic answer than a feature punch-list.

Spotify Multi-method program · Advertising

Advertiser Acquisition & Retention

A large, multi-phase program to understand why enterprise advertiser growth lagged internal goals, and to give Product, Sales, and Marketing a shared, actionable picture of the advertiser journey.

UX Researcher · partnered with Product, Product Marketing, Sales, Design, Data Science, and senior advertising leadership (VP Global Advertising, VP Ads Products)

The challenge

Despite rising ad revenue, Spotify's ads growth fell short of goals. ~80% of income came from direct sales, making advertiser acquisition, onboarding, and support expensive. Enterprise onboarding took 2–3 months and campaign setup often needed engineering. The team needed to understand what drives an advertiser's decision to start, and to stay.

Approach: six methods, sequenced

Literature review to establish what was known and expose the research gap on awareness and retention.
Focus groups with the Sales org across four markets (US, EU, APAC, LATAM) to surface advertiser segments.
Competitive analysis of Meta, Google, iHeartRadio, X, and Pinterest ad platforms to benchmark onboarding and education.
Exploratory data analysis with Data Science to build behavioral segments and identify churn signals.
55 hour-long in-depth interviews with enterprise advertisers unfamiliar with Spotify, to capture the non-user perspective.
Sales enablement survey to locate training gaps affecting conversion.

Literature review · Focus groups · Competitive analysis · EDA · IDIs (n=55) · Survey · Journey mapping · Stakeholder workshops

What we learned & built

A statistically significant link between Sales' ability to explain features and advertiser spend, pinpointing a training gap, not just a product gap.
Behavioral segments predicting retention vs. churn, operationalized into an internal tool and a Tableau dashboard that flagged at-risk advertisers for proactive outreach.
An interactive advertiser journey map with risk points that gave Product, Sales, and Marketing a single shared reference.

Advertiser segments journey map: retention-growth versus churn paths drawn from the 55 IDIs.

Impact

3 orgs

Sales, Marketing & Product changed roadmaps/plans

At-risk

Dashboard enabling proactive churn intervention

12 markets

Insight depth leadership needed for global strategy

A stakeholder workshop converted findings into commitments: Sales invested in more training, Marketing built awareness content, and Product teams adjusted 2023–2024 roadmaps to close experience gaps. The work produced lasting infrastructure (segmentation, a churn dashboard, and a journey map) that outlived the study itself.

Staff lens

This program shows the scale and orchestration I operate at: sequencing six methods across four markets, partnering deeply with Data Science on EDA and tooling, and, critically, converting insight into organizational commitments and durable artifacts through a structured workshop. I also navigated real constraints honestly (excluding unreliable self-serve data; managing a 3-month EDA and costly recruitment), which is part of operating at staff scope.

Affirm ~3-month program · Support experience

Guided Help Experience ("Project Clippy")

Turning a flood of negative App Store reviews into a redesigned support experience, and a clear, quantified business case.

UX Researcher · with Design, Content, 2 Foundations PMs, 6 engineers, Data Science, and Operations leadership, with CEO visibility from escalations

The challenge

Affirm faced rising negative App Store reviews, many from customers saying they wouldn't use Affirm again. The product team wanted to improve perception of the support experience; my longer-term research goal was to address the underlying inquiry drivers inside the product. The issue was severe enough to reach the CEO's inbox.

Approach

Over a roughly three-month timeline I researched the drivers of support frustration, surfacing the emotional cost of a support experience that intentionally hid contact paths to deflect inquiries. I partnered closely with Operations, whose staffing realities shaped what was feasible, and helped shape and validate a new guided help concept.

Qualitative interviews · Inquiry-driver analysis · Concept validation · Cross-functional partnership

"I exhausted all my possibilities before reaching out… After doing all of that, I still didn't get the help I wanted. So I disputed the charge… they didn't even try to ask or were apologetic about the issue. So that was a problem and frustration for me." Affirm customer, support interviews

Concept for the guided help experience, recommending the right support channel by issue.

Impact

+72%

Users using the recommended servicing channel

+40.9%

CSAT from contact experiences

$5.8M

Added annual sales from improved retention

The redesign also drove an 8% decrease in interactions needed for resolution and helped Affirm meet contractual servicing obligations for customers previously unable to reach support. A retention analysis found higher repeat-customer rates among those who engaged with support.

Staff lens

I include the honest postscript here because it's the most instructive part: making help easier to find increased case volume (contact info had been deliberately hidden to deflect inquiries), and Ops had to temporarily roll back the change before re-enabling it with more staffing. Owning that second-order effect, and the lesson that research must account for the operational system, not just the user, is exactly the systems-level judgment I bring at staff level.

Meta Foundational craft · Facebook & Instagram

Facebook Navigation & Instagram Inform Treatment

Two long-running programs at consumer scale that show the depth of method craft underpinning my strategic work.

UX Researcher · Product Navigation Team and Integrity/Misinformation, with Design, PM, Data Science, Engineering

Facebook: product awareness & user control

To improve product awareness and adoption within a navigation system limited to ~5 surfaced products, I ran a global survey to quantify awareness, followed by international in-home field research that uncovered deeper causes (unclear iconography, missing signage, confusing cross-surface navigation). I validated changes with recurring relevance surveys against usage data, and ran a 6-month diary study to learn how long navigation changes take to be noticed and which education channels (interstitials, push, in-app, badges) worked. Follow-up in-lab interviews and an A/B test partnership with Data Science shaped a more flexible, customizable navigation, covered by Adweek, TechCrunch, CNET, and others.

Global survey · International field research · 6-month diary study · In-lab interviews · A/B test partnership

Instagram: informing users about false content

To deter the spread of fact-checked false information, I researched where users encountered it and how to warn them without degrading the experience. Qualitative research showed a warning alone wasn't enough: users wanted sources and the choice to view content, leading to a redesigned treatment with fact-checker context and a "See Why / See Post" pattern. I ran an eye-tracking study confirming key information sat where users' gaze naturally fell, plus international research across two countries. Follow-up work uncovered that users wanted to warn others, reshaping the sharing flow. Covered by The Verge, Mashable, and Engadget.

Qualitative concept testing · Eye tracking · International research · Iterative validation

Staff lens

These programs are the craft foundation beneath everything above: eye tracking, multi-month diary studies, international fieldwork, and disciplined iterate-and-validate loops at billions-of-users scale. Staff researchers earn strategic influence by being unimpeachable on method first; this is where that came from.

Personal Project 2026 · macOS desktop app

Exposé: AI-Powered Photo Culling for Photographers

A native macOS app that uses Claude AI vision to analyze, rate, and tag RAW photos, cutting the tedious first-pass cull from hours to minutes.

Solo developer · Electron + React + TypeScript + Claude Code (Anthropic)

The problem

After any substantial shoot, photographers face the same bottleneck: manually reviewing hundreds of RAW files to find the keepers before any creative work can begin. Existing tools (Lightroom, Capture One) require human eyes on every frame. I wanted to automate that first pass so the camera roll arrives pre-sorted.

What it does

Exposé scans a folder of RAW photos (Sony ARW and others), converts each image locally using macOS's built-in sips tool, and sends a resized preview to Claude's vision API for analysis. Claude rates each shot on sharpness, exposure, composition, and subject quality, and the app writes those ratings (star scores and color labels) directly back into the image's XMP metadata. The ratings appear immediately in Lightroom or Capture One with no extra steps. A persistent cache means re-analyzing a large folder skips images that haven't changed, keeping re-runs fast.
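
The skip-unchanged-images step and the local preview conversion can be sketched as follows. The app itself is written in TypeScript, so this Python version only illustrates the idea; it is not the app's code. The function names, the size-plus-modification-time fingerprint, the JPEG preview format, and the 1024-pixel preview size are all assumptions made for the example. The `sips` flags used (`-s format`, `-Z`, `--out`) are standard options of the macOS tool.

```python
from __future__ import annotations

from pathlib import Path


def fingerprint(path: Path) -> str:
    """Cheap change detector: file size plus modification time (assumed scheme)."""
    stat = path.stat()
    return f"{stat.st_size}:{stat.st_mtime_ns}"


def images_needing_analysis(
    folder: Path,
    cache: dict[str, str],
    extensions: tuple[str, ...] = (".arw",),
) -> list[Path]:
    """Return RAW files whose fingerprint is missing from or differs from the cache."""
    pending = []
    for path in sorted(folder.iterdir()):
        if path.suffix.lower() not in extensions:
            continue  # ignore sidecars, exports, and other non-RAW files
        if cache.get(path.name) != fingerprint(path):
            pending.append(path)
    return pending


def mark_analyzed(path: Path, cache: dict[str, str]) -> None:
    """Record the current fingerprint so the next run skips this file."""
    cache[path.name] = fingerprint(path)


def sips_preview_command(source: Path, preview: Path, max_pixels: int = 1024) -> list[str]:
    """Build (but do not run) a sips call that writes a resized JPEG preview."""
    return [
        "sips",
        "-s", "format", "jpeg",   # convert to JPEG
        "-Z", str(max_pixels),    # cap the longest edge, preserving aspect ratio
        str(source),
        "--out", str(preview),
    ]
```

A run would call `images_needing_analysis`, pass each command from `sips_preview_command` to `subprocess.run` on macOS, and call `mark_analyzed` once a rating has been written. Persisting the cache dictionary between runs (for example as JSON) is left out of the sketch.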

Electron + React · TypeScript · Claude AI vision · XMP metadata write-back · RAW/ARW processing · CSV export

Why I built it

Partly to solve a real personal pain point as a hobbyist photographer, and partly as a deliberate learning project: shipping a full desktop app end to end, from AI integration and native file I/O to packaging a signed macOS DMG. It's the kind of side project that makes the human factors of AI-assisted workflows concrete: where does automation earn trust, and where does the photographer still need to be in the loop?

Research that earns a seat in the decision

Methodological range

Triangulation by default

Cross-functional influence

Strategic framing

Four flagship programs

Understanding the ISV Developer Ecosystem

Log Investigator: Evaluating an AI Product Before GA

Advertiser Acquisition & Retention

Guided Help Experience ("Project Clippy")

Facebook Navigation & Instagram Inform Treatment

Exposé: AI-Powered Photo Culling for Photographers