AI Tools SME Testing Methodology
Lili Marocsik, owner of AI Tools SME
Real Testing, Real Results
We pride ourselves on testing every tool by hand. Unlike competitors who claim to test thousands of tools in short timeframes, we prioritize quality over quantity.
Our Testing Approach
Hands-on evaluation: Lili personally tests most of the tools featured on our website
Expert contributors: For specialized areas like data analytics or CRM, we bring in subject matter experts
Consistent methodology: All tools receive the same standardized prompts for direct comparison
Default settings focus: We avoid custom styling to evaluate each tool's default capabilities
Standard Test Prompts
Video Generators (No Audio)
Create a video of 2 people looking at each other and shaking hands, one being an AI robot, the other a woman. The background is space and the mood is friendly.
Video Generators (With Audio)
Base prompt above, plus:
Make the two protagonists look at each other. The woman says: "So nice to finally meet you" and the cyborg says: "Same here."
AI Image Generators
We use three standardized prompts to evaluate different capabilities:
Baseline Comparison Test:
Create an image of 2 people looking at each other and shaking hands, one being an AI robot, the other a woman. The background is space and the mood is friendly.
Creativity Assessment:
Create a mars landscape with chrome design elements
Detail Following Test:
Create an image of an older lady with natural wrinkles and grey hair laying tarot cards. We see her from the front as she holds one card up. Her look is mysterious, she is wearing a veil and the background is a dark blue velvet curtain with golden stitchings of stars, the moon and star constellations. The style is somewhat between Dune (the movie) and Aladdin, with a shiny gloss on it.
Presentation AI Makers
Open-ended Capability Test for the AI generation feature:
AI Tools for SMEs
This deliberately broad prompt helps us evaluate what each tool can generate on its own, without detailed guidance.
Why This Methodology Works
✅ Consistent comparison across all tools in each category
✅ Real-world testing by actual users, not automated systems
✅ Quality focus over quantity claims
✅ Expert validation in specialized domains
✅ Default performance evaluation without bias toward specific styles