Claude Fable 5 real UX/UI design test
Full analysis of what AI is actually GREAT at when it comes to UX.
- What I did
- One shot and the lazy trap
- – The lazy trap
- Start with context or start with aesthetics
- – Useful design starts with context
- Project one, Vetidesk
- – Key visual
- – Copy, text hierarchy
- – CTA buttons
- – Outcome
- FX Pro
- – Key visual
- – Copy, text, hierarchy
- – CTA buttons
- – Second section
- The junior portfolio problem
- The real test
- – How I use AI for my work
- How well Fable did
- – The Flow Diagram
- – The LoFi prototype
- HTML, MCP, High Fidelity UI
- Conclusion
Real Workflow Case Study
Most AI testing for UX/UI design is done in a pretty odd way. The usual idea is to show a one-shot UI design of a website hero section.
My test is NOT about showing "generative landing pages" without context.
Because if you prompt it just right, you can get a result that's impressive at first glance and could be a nice enough Dribbble shot. But in most cases those sites completely fail on a second look.

I actually tested Fable on three real projects. Two purely UI ones and one complex, multi-step UX platform.
The results are interesting. They prove AI has a place in professional design workflows, but likely NOT in the area that most people find the most impressive.
What I did
I picked two projects that were over two years old. That way the AI has to rely on knowledge it averaged from the internet, which is likely between one and two years old as well.
Those two were already completed, one of them with measurable results (almost 3x conversion growth YoY). The third project is ongoing.

One shot and the lazy trap
I purposefully wanted to see what a "one-shot" would look like. That's a conscious methodological choice. Of course when you reprompt, modify things and improve, your results will be better.
But that completely changes the premise. If you need to reprompt, edit, point out mistakes, you're actively using AI as a tool with human guidance. What most AI accelerationists push is exactly that one-shotting part. They also get angry when you show that it can completely miss the mark.

The lazy trap
One-shotting is also connected to the biggest issue real UX designers face when using AI. Humans are inherently lazy. If we can optimize a process and stress our brain less, we do it. Even if it's detrimental to the final result.
We will cut corners if it lowers our cognitive load even slightly.
The problem with generative AI tools is that if you use them across multiple steps of the process and you miss mistakes or hallucinations early on, those issues will compound.

Once you get to the end of the project, the problems will be huge, hard to fix without full rework, and sometimes even invisible. AI is very good at convincing you that:
You're absolutely right!
Start with context or start with aesthetics
Most AI testing on UX/UI design happens with aesthetics in mind. I get it. A beautiful design on a thumbnail gets more clicks. Showing something objectively pretty "generated by AI" also pushes the hype-train further.
The problem is that real websites, ones with a purpose, are never built from the aesthetics standpoint. Yet that "impressiveness" of UI design seems to have taken over most of the design discourse.

Since when is UX/UI design degraded to just making "Framer templates" for landing pages? Because that's essentially what gets tested and shown when evaluating AI use in design.
Design Twitter especially seems to be only about landing page related content, mostly with eerily similar AI generated visuals.
Useful design starts with context
Instead of specifying something like:
"Add a beautiful gradiented light-leak effect, animated using GSAP in the landing page header and highlight the words 'seamlessly empower' with a shifting gradient"
I actually gave the model full context of each project. I gave it the PRD file from the client, I gave it my annotated version of that PRD with ideas, styleguide settings (colors, general font-style but not specific typefaces), tone of voice and visuals, and initial copy.

On top of that I also attached a "goal" document that outlines what the intended outcome of the site is when it comes to a specified target demographic.
In the case of the third project I gave it a lot more information, but we'll get to that when talking about that specific project, as it's way more complex.

Project one, Vetidesk
Vetidesk launched over two years ago. The result was a 2.9x increase in signups and higher user trust compared to the previous landing page. The client has since changed the site and iterated on it on their own to further test what may work even better.
Fable got all the context, text, PRD and goal of the website at the start. I also told it to ask me questions if anything was unclear. It took around 20 minutes to complete.
Let's talk about the result.

Key visual
The most impressive thing about Fable on this project is that it correctly understood the intricacies of what Vetidesk actually does. It grasped that it's an appointment booking and VoIP service and presented the key visual in context, matching the product almost perfectly.
When I talked about it in a YouTube video, many commenters accused me of "prompting it badly" on purpose, as the result is NOT impressive. They probably expected some gradient blobs, light-leaks or 3D transformations. I get it. A key visual made of just boxes showing parts of an interface may feel boring.

And while metric theater is not necessarily the best way to illustrate a project, it's better than those "fancy" visuals that are commonly associated with AI landing pages. This is actually a win, even though there are lots of inconsistencies in the icons, spacing and styling of these elements.
They are already animated however, which is another benefit.

Copy, text hierarchy
The copy is similar. In a few places AI decided to tweak it, not necessarily for the better. It added a kicker that's repetitive and doesn't contribute anything in this case. It just repeats what's already in the H1.
Why did it add a kicker? Easy. Because nearly ALL landing page templates shared online now have a kicker. So AI assumes every website should have one no matter what. There was no kicker mentioned in the spec though, it just added it anyway.
Then there's the odd highlighting of the labels and not the numbers in social proof. I really have no idea why, but it's not a big deal if you audit it afterwards.

There's also a tendency for the H1 to be bold. Very bold. Once again an artifact from training, and a trend that is kind of going away now. Maybe not back to the ultra-light text of the 2014 era, but towards regular or medium weights.
The hierarchy uses the same spacing value (X) between all the elements except for the social proof, which uses 1.5X. This is safe but a little boring visually.
CTA buttons
The main thing with the CTA is the shape and color. The site is heavily blue with pill shapes for both buttons and some of the decoration. That makes our brain process the buttons slightly slower. Not the best choice. But once again this is a choice that stems from MOST landing pages out there having pill-shaped buttons.
Our original green button totally stands out on the page and instantly grabs attention. Blue? Not so much.

Outcome
The outcome is neither good nor bad. It's average, just as expected. What was impressive is the ability of Fable to understand the context of the product enough to make an actually fitting key visual.
Predictable, but fitting. In comparison, our key visual was actually based on talking with the clinics and hearing how they are a bit overwhelmed with complex dashboards and metrics. This is why in our version we went with a smiling doctor and a dog. The floating windows around are all small and purposefully not data-heavy.
This was outlined in the context document Claude got, too.
The real website got 2.9x more registrations and a lot of praise from the customers.

FX Pro
We also worked with FX Pro two years ago. It wasn't a redesign, but rather a direction exploration for their future websites. In this case I picked a tablet breakpoint.
That's because tablet layouts are the rarest of all in online projects, which means there's less training data on them. But for most projects we still have to make that breakpoint. So yes, technically I picked an AI blind-spot, but the tablet version wouldn't be much different from the desktop one.

Key visual
The structural similarities only show that the spec I uploaded to AI was detailed and thorough. Of course it couldn't use a photo, so instead it tried to generate one with SVG. I'm not going to pick on that part though. It's non-essential and easily replaceable with a photo.
It was close enough to the spec, as the idea was to make a simple, photo-oriented visual. Nothing extravagant or overly visual, to show professionalism rather than chase visual trends.

Copy, text, hierarchy
Once again it put the kicker social proof inside pills above the heading, not considering any other option for this element. They're also different heights.
The H1 is super-bold and huge. Another AI-trope of trying to adhere to average landing page design rules.
One thing it did that we missed is making the 7M customer number bold. Good catch!
One thing I noticed is that BOTH designs use almost exactly the same spacing between elements, even though they're really different. At 1:1 rendering it's down to maybe a single pixel difference.

CTA buttons
Here we have even more duplication of the button shapes, but also the exact same color pattern repeats at the top. That takes the attention away from the main CTA. They should be more muted and only highlight more when we scroll past the hero section if we go with a sticky header.
They're also using almost the exact same colorful shadow as AI picked on Vetidesk. Only a different hue. That kind of shows how AI recycles a couple of the same patterns.

Second section
The second section came out as an ultra generic carousel, without one of the items being cut off to hint at scrolling. On a tablet, a cut-off item is a way better pattern than an arrow and a link.
The boxes themselves are an entirely typical AI pattern too: rectangles with colorful rounded squares inside, which in turn hold icons in a darker shade of the same color. Sometimes those rounded squares also have a brighter outline in the same color.
This pattern right now is a pure definition of slop.

In the original design, the images weren't "generated". They were assembled by hand from separate elements, so that they could be individually animated on scroll. Because some were blurred to show depth, the parallax effect could look very nice and subtle.

Claude has no image generation built in, so it went with what it knows. The icons. A skilled designer can then take this V1 and greatly improve it by adding assets and explaining the desired outcome in more visual detail.
But then it stops being a one-shot. It becomes a prototyping tool that still requires someone with skill to operate.

The junior portfolio problem
To learn UX a little deeper than just Dribbble visuals, AI has to train on case studies. And what kind of case studies are the most abundant out there? Of course the junior designer ones, from people who apply for jobs.
The problem with these, aside from sometimes questionable quality, is that they usually pick a part of a flow to illustrate that's detached from the rest.

And because of time constraints they don't think of that part HEAVILY in the context of the rest. That leads to gaps and omissions, which are fine when a UX recruiter is assessing skill, but not that great as training data.
Think of it like a tower of blocks. If your foundation is wonky, or even plain missing, what you build on top can look nice and still collapse under its own weight.
AI mostly trained on these portfolios because they're the most available ones. Sure, some pro agencies do these comprehensive breakdowns online, but they're not the majority and almost never show the entire process either.

In a way, for a pro agency that's how they protect their know-how and patterns. Nobody shares this kind of stuff online. So for the training data on actual UX to be good, it has to be fine-tuned by hand by skilled designers working for those AI companies.
That leaves a big long-term gap in learning. It's a lot easier to learn from "popular Dribbble shots" on how to make an "impressive" landing page hero. But building a complex multi-step project is a different thing.
And the argument of "scraping Mobbin" also falls apart, as it assumes two things:
- All projects can be extrapolated from existing ones
- Big companies ALWAYS know what they're doing
We all know neither of these is true. Even the largest, most profitable multi-billion dollar brands make botched or plain bad design choices. Want an example? Meta implemented an AI chatbot that allowed people to prompt-inject their way into accounts of other users and hack them.
Who thought it was a good idea to give an AI full user-rights permission? Someone designed it, someone tested it, and then it went live.
When faced with complexity AI fails even more, but funnily enough I believe this is where the future of AI for design is.
Let me explain.

The real test
Due to the project being ongoing, I can't share specific details from it yet, just the high-level facts.
It's a complex platform. In Poland alone it has 5 competitors, worldwide it's likely around 3 to 5 per country. But this platform redesign is breaking out of the norms of how ALL of the similar platforms are built.
It uses patterns that are NOT in the industry yet. No case studies to learn from. No flows and no success stories.
It's a mix of uncharted territory, user research (positive so far) and experimentation. Oh, and a strategy for when to dial the innovation back too.

How I use AI for my work
As I mentioned before, it has to start with the proper foundation. That foundation needs to be the most scrutinized and hand-held of all the processes.
The workflow looks like this. We go from a PRD and extra context to a modifiable flow diagram. That diagram is heavily tweaked by hand. It results in a JSON export of all the connections and also WHAT UI elements are on each screen. Then we use that with basic lo-fi rules to build a no-color HTML+CSS clickable prototype.
In some cases it has validation built in. That depends on the project.
For now, the HTML files are then MCP'd back into our design tool for a high-fidelity polish. That high-fidelity step also leads to a lot of the ideas being changed from the low-fidelity. Not just visually, but often structurally. Change is inevitable so it's important to be ready for it.
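
To make the "JSON export" step more concrete, here is a minimal sketch of what such a file could contain and how it could be loaded. The schema is an assumption for illustration only: the field names (`nodes`, `connections`, `ui`, `section`) and the sample screens are hypothetical and not the format of the internal diagramming tool.

```python
import json
from dataclasses import dataclass, field

# Hypothetical export: field names and screens are illustrative only,
# not the schema of the internal tool described in the article.
FLOW_JSON = """
{
  "nodes": [
    {"id": "login", "section": "auth", "title": "Log in",
     "ui": ["email input", "password input", "submit button"]},
    {"id": "forgot", "section": "auth", "title": "Forgot password",
     "ui": ["email input", "submit button"]},
    {"id": "dashboard", "section": "app", "title": "Dashboard",
     "ui": ["navigation", "summary list"]}
  ],
  "connections": [
    {"from": "login", "to": "dashboard", "label": "valid credentials"},
    {"from": "login", "to": "forgot", "label": "forgot password link"},
    {"from": "forgot", "to": "login", "label": "reset email sent"}
  ]
}
"""


@dataclass
class Screen:
    id: str
    section: str
    title: str
    ui: list = field(default_factory=list)  # "what's on this screen"


@dataclass
class Flow:
    screens: dict  # screen id -> Screen
    edges: list    # (from_id, to_id, label) tuples


def load_flow(raw: str) -> Flow:
    data = json.loads(raw)
    screens = {n["id"]: Screen(**n) for n in data["nodes"]}
    edges = [(c["from"], c["to"], c.get("label", "")) for c in data["connections"]]
    # Fail early on connections that point at screens that do not exist.
    for src, dst, _ in edges:
        if src not in screens or dst not in screens:
            raise ValueError(f"connection {src} -> {dst} references an unknown screen")
    return Flow(screens, edges)


flow = load_flow(FLOW_JSON)
```

Keeping the connections and the per-screen UI elements in one file is what lets the same export feed both the hand review of the diagram and the later lo-fi generation step.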

How well Fable did
On understanding the PRD and context it did fairly well. Then it inserted the structure into our self-made internal tool for diagramming. It created all the nodes and flows and divided them into clear sections.
Now when you look at the completed diagram, it really IS impressive. AI comes up with practically all edge-cases. Some of them are known but often skipped due to time and budget constraints.
This time we had them all.
Very nice!

The natural reaction of many designers is to browse the generated diagram and look for flaws or issues. But this can very easily lead to the laziness trap. If something at first glance feels comprehensive enough, we put our guard down.
The main problem with the flow was those logical gaps I mentioned. I used to review hundreds of junior portfolios each year and I recognize some of those patterns. AI definitely has trained on at least those case studies posted publicly as blogs or articles. It's not that the results are completely wrong. It's that they're a little bit off.

There is a tendency to expand the number of screens or steps. Things that can be merged are separated, often with splits that are not beneficial to the project.
I ran the same test on Opus 4.8 and to be fair, Fable had around 30% fewer of these errors per screen. However, the oversimplification and logical gaps were present on basically every single part of the flow except for the most common ones (like forgot password or login).
The Flow Diagram
When Fable exported the JSON to our internal tool, I loaded it up and went through EVERY single node of the flow by hand, rewriting it to make sense. I removed the nodes that didn't make sense and merged others where necessary.
Normally building a diagram like that from scratch took 2–3 days. With Fable I managed to save a day. The generating part itself took maybe two hours. The rest was the adjustments and manual tweaks.

Those tweaks still took a lot of time, as in many cases they were simply substantial.
However, for many designers a lot of these omissions or gaps can go unseen. Then they proceed to the next steps with a lot of UX AI Debt.
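
Some of the purely structural issues can be flagged mechanically before the manual pass. The sketch below is one possible helper, not part of the workflow described here: it assumes the flow is available as a list of screen ids and `(from, to)` connections (all names are made up), and it reports unreachable screens, dead ends, and pairs of screens that always follow each other and so might be merge candidates. It cannot judge whether a step makes sense, so it does not replace going through every node by hand.

```python
from collections import defaultdict, deque

# Hypothetical flow: screen ids and connections are invented for illustration.
NODES = [
    "start", "signup_email", "signup_password", "signup_name",
    "dashboard", "settings", "logout", "orphan_help",
]
EDGES = [
    ("start", "signup_email"),
    ("signup_email", "signup_password"),
    ("signup_password", "signup_name"),
    ("signup_name", "dashboard"),
    ("dashboard", "settings"),
    ("settings", "dashboard"),
    ("dashboard", "logout"),
]


def build_adjacency(edges):
    outgoing, incoming = defaultdict(list), defaultdict(list)
    for src, dst in edges:
        outgoing[src].append(dst)
        incoming[dst].append(src)
    return outgoing, incoming


def unreachable_screens(nodes, edges, entry):
    """Screens that cannot be reached from the entry screen (breadth-first search)."""
    outgoing, _ = build_adjacency(edges)
    seen, queue = {entry}, deque([entry])
    while queue:
        for nxt in outgoing[queue.popleft()]:
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
    return sorted(set(nodes) - seen)


def dead_ends(nodes, edges):
    """Screens with no way out; some are intentional, each one deserves a look."""
    outgoing, _ = build_adjacency(edges)
    return sorted(n for n in nodes if not outgoing[n])


def merge_candidates(edges):
    """Pairs where A leads only to B and B is reached only from A."""
    outgoing, incoming = build_adjacency(edges)
    return sorted(
        (src, dst) for src, dst in edges
        if len(outgoing[src]) == 1 and len(incoming[dst]) == 1
    )


report = {
    "unreachable": unreachable_screens(NODES, EDGES, "start"),
    "dead_ends": dead_ends(NODES, EDGES),
    "merge_candidates": merge_candidates(EDGES),
}
```

In this made-up flow the three signup screens show up as a chain of merge candidates, which is the kind of over-splitting described earlier. Whether they should actually be merged is still a design decision.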
The LoFi prototype
Since all nodes have a "what's on this screen" section (structure and specific UI components to handle specific tasks), the result of the modified flow diagram is another JSON file.
This one is fed back into Fable with an internal .md ruleset to generate a no-color, no-graphics clickable prototype. In this case we also added requirements for error messages, so the forms actually need to be filled in.
This stage, once the initial tweaks were in, went much better, but it still required a couple of extra hours of feedback to get just right. And once again, at first glance it looked exactly as it should.
Falling into the laziness trap in these stages is extremely easy.
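
For a sense of what "no-color, no-graphics, clickable, with forms that must be filled in" can mean in practice, here is a minimal sketch that renders one screen to a static HTML page. In the workflow above this step is done by Fable following the internal ruleset; the node shape (`fields`, `submit`, `target`) and the styling rules below are assumptions made up for this example. It relies on the standard HTML `required` attribute, so the browser itself blocks submission of an empty field.

```python
from html import escape

# Hypothetical screen node: "fields", "submit" and "target" are illustrative names.
SCREEN = {
    "id": "signup",
    "title": "Create account",
    "fields": [
        {"name": "email", "label": "Email", "type": "email", "required": True},
        {"name": "clinic", "label": "Clinic name", "type": "text", "required": False},
    ],
    "submit": {"label": "Continue", "target": "confirm"},
}


def render_screen(screen: dict) -> str:
    """Render one flow node as a grayscale, clickable lo-fi HTML page."""
    rows = []
    for fld in screen["fields"]:
        label = escape(fld["label"])
        name = escape(fld["name"])
        kind = escape(fld["type"])
        # The standard "required" attribute makes the browser enforce the field.
        required = " required" if fld.get("required") else ""
        rows.append(f'<label>{label} <input name="{name}" type="{kind}"{required}></label>')
    fields_html = "\n    ".join(rows)
    title = escape(screen["title"])
    target = escape(screen["submit"]["target"])
    button = escape(screen["submit"]["label"])
    return f"""<!doctype html>
<html lang="en">
<head>
  <meta charset="utf-8">
  <title>{title}</title>
  <style>
    /* Lo-fi rules: black on white, no imagery, no brand colors */
    body {{ font-family: sans-serif; color: #111; background: #fff; }}
    label {{ display: block; margin: 1em 0; }}
    input, button {{ border: 1px solid #111; background: #fff; padding: 0.5em; }}
  </style>
</head>
<body>
  <h1>{title}</h1>
  <form action="{target}.html">
    {fields_html}
    <button type="submit">{button}</button>
  </form>
</body>
</html>"""


page = render_screen(SCREEN)
```

Submitting the form navigates to the next screen's file (`confirm.html` here), which is enough to make a set of static pages feel like a clickable flow.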

HTML, MCP, High Fidelity UI
The last step is pushing those HTML files to our design tool via MCP. We use Sketch, but it works the same way with Figma. We tried "applying a design system" to the low-fidelity prototype and manually making it look final in code, but at least for now that wasn't viable.
With generic, boxy UIs it may be possible. We try to design custom-built elements and components, often breaking some conventional rules.
The HTML files were roughly converted to Sketch artboards and then tweaked screen by screen, by hand, to look exactly how we need them to look. In that stage we also came up with lots of changes to the structure of the screens, prompted by how they looked.
But for these changes to make sense time-wise, they needed to be based on stable foundations first.
Conclusion
For now our high fidelity is still fully done in a design tool. But I do see more and more AI use (fully guided) in the design process.
I just don't believe landing pages are the best use-case. Generative landings end up looking the same as everyone else's, which will greatly decrease their actual value.
And that will decrease sales big time.
Because the value is NOT how good they look on a Dribbble page or on social media. The value is whether they bring in trust and conversion from the end-users.
Flow-to-Low-Fidelity is actually a great use for AI.
When fully guided and thoroughly edited by a skilled human, you save maybe 20–30% of the time and you get extra edge cases to design for.
Those JSON files are also helpful if the devs want to generate anything frontend related later on using another AI tool. It's a bridge between two worlds.
When a designer uses it without falling into the lazy trap, I can totally see a huge benefit from AI use in UX. Let's just hope more of us will actually put in that effort and won't be satisfied with the V1 flow AI generates.