Vision testing promises scale and access, but in regulated medical domains, speed and iteration introduce silent risk. In this interview, Feyenally founder Piotr Kruszynski explains why eyeTEST is designed to refuse unsafe results and why physical constraints and traceability matter more than model confidence.
1. You moved from running a high-frequency, low-margin logistics platform at PizzaPortal into a domain where error tolerance is near zero and validation cycles are long. What is the most important business belief you had to unlearn in that transition, and where did your audit background from Deloitte and EY directly change a technical or regulatory decision at Feyenally?
Piotr Kruszynski: The biggest thing I had to unlearn was the idea that speed and iteration can compensate for uncertainty. In marketplaces, you assume some level of error and you design the system so volume and optimization wash it out. In medical technology, that logic is simply wrong. An error isn’t noise, it’s a potential harm. That changes everything: how you design, how you validate, and how you talk about what the product does. My audit background shows up very concretely in how we make decisions at Feyenally. I have a very low tolerance for claims that cannot be backed by documentation, traceability, and evidence. If we can’t explain and document why a decision was made, it doesn’t belong in a regulated product.
2. Feyenally appears to have originated from a very specific failure point in eyewear e-commerce rather than from a traditional medical research pathway. To what extent was eyeTEST initially designed to solve conversion friction for brands like Gepetto, and how do you decide when retail usability conflicts with clinical edge-case precision?
Piotr Kruszynski: Feyenally didn’t come out of a lab in the traditional medical sense, that’s true. It came out of seeing a very specific failure in how vision correction is accessed, especially online. Through eyewear e-commerce we saw that people were blocked not by willingness, but by access to eye tests. So yes, retail friction was an important lens early on. But from the start we were very clear that if we were going to touch refraction at all, it had to be done as an objective measurement, not as a usability trick. If the physics or the signal isn’t there, we don’t try to smooth it over with UX. We either constrain the experience or we don’t return a result.
3. You chose photorefraction, a method that is physically sensitive to camera geometry, distance, pupil size, and ambient light. Under what concrete conditions does Feyenally refuse to return a refractive result, even if the model could technically output one?
Piotr Kruszynski: There are very clear situations where the system simply refuses to produce a refractive result, even if a model could technically output a number. If the image quality is insufficient, or if the environment violates the assumptions of the measurement, the system stops. That’s not a failure, that’s a safety mechanism. The guiding principle is that the ability to compute does not equal the right to report. Refusal is often the most responsible output.
4. Smartphone hardware varies widely in flash placement, sensor noise, and optical quality. When the system encounters a low-end device or degraded signal quality, does the model default to rejection, reduced confidence, or inferred compensation, and how do you prevent statistical inference from masquerading as optical measurement?
Piotr Kruszynski: When it comes to device variability, this is exactly why we currently support only recent iPhone models and iOS in general. Smartphone hardware differences are not a theoretical issue for us, they directly affect the physics of the measurement. Rather than trying to compensate statistically for weak or inconsistent signals, we made a deliberate decision to narrow the supported device set to hardware we understand and can characterize well. In parallel, we are actively working on the Android version, but with the same philosophy: support will expand only when we can enforce the same physical and quality constraints, not before.
5. When your model outputs a value such as −2.50 diopters, is that result grounded in interpretable optical features derived from retinal reflection, or is it primarily the outcome of learned image-level pattern recognition? If regulators asked you to justify that number beyond probability, what evidence could you point to?
Piotr Kruszynski: When the system outputs something like minus 2.50 diopters, the intent is that this value is grounded in optical information derived from an image captured under controlled conditions. Of course, neural networks are part of the pipeline, and learned patterns play a role. But if a regulator asked us to justify that number beyond probability, the justification would be the capture protocol, the physical basis of photorefraction, and the validation data showing that under those conditions, the system produces repeatable, clinically meaningful results.
6. Early photorefraction research assumes controlled lighting and fixed working distance, yet your product operates in uncontrolled home environments. What user behaviors or environmental variables proved most destructive in early trials, and how did you enforce physical constraints through product design rather than user instructions?
Piotr Kruszynski: The most destructive variables early on were exactly what you’d expect: uncontrolled light, wrong distance, incorrect angles and motion. We learned very quickly that instructions alone are not enough. So instead of telling users to behave perfectly, we enforce constraints through design. Automatic capture instead of manual photos, geometry checks, quality gates, and rejecting anything that doesn’t meet the criteria. The product is designed to say no more often than users might like, and that’s intentional.
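The constraint-by-design approach Kruszynski describes can be sketched as a simple gating pipeline. This is a hypothetical illustration only: the field names and thresholds below are assumptions for the sake of the sketch, not Feyenally's actual capture criteria.

```python
from dataclasses import dataclass

@dataclass
class Capture:
    """Hypothetical per-frame metadata from an automatic capture session."""
    sharpness: float    # 0..1 focus/quality score
    distance_m: float   # estimated camera-to-eye distance in metres
    ambient_lux: float  # estimated ambient light level
    motion_blur: float  # 0..1, higher means more blur

# Illustrative thresholds; real limits would come from optical validation.
MIN_SHARPNESS = 0.8
DISTANCE_RANGE = (0.9, 1.1)   # metres
MAX_AMBIENT_LUX = 50.0        # photorefraction assumes dim surroundings
MAX_MOTION_BLUR = 0.2

def gate(capture: Capture) -> tuple[bool, str]:
    """Refuse to measure unless every physical assumption holds.

    Returns (accepted, reason). Refusal is a safety outcome, not an error.
    """
    if capture.sharpness < MIN_SHARPNESS:
        return False, "image quality insufficient"
    lo, hi = DISTANCE_RANGE
    if not lo <= capture.distance_m <= hi:
        return False, "working distance out of range"
    if capture.ambient_lux > MAX_AMBIENT_LUX:
        return False, "ambient light violates measurement assumptions"
    if capture.motion_blur > MAX_MOTION_BLUR:
        return False, "too much motion"
    return True, "ok"
```

The point of the sketch is the shape of the logic: every check short-circuits to a refusal with a stated reason, and only a frame that passes all physical gates is ever handed to the model.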
7. Visibly’s experience shows that even subjective online refraction can be stalled for years by regulators. As a company pursuing objective measurement, are you planning around substantial equivalence via 510(k), or are you prepared for a De Novo path, and have you explicitly modeled the possibility that regulatory approval may not arrive before capital constraints do?
Piotr Kruszynski: On the regulatory side, we're being very pragmatic. While preparing for full medical certification, we are actively pursuing screening and pre-clinical use cases that may not require certification in the first place. There are meaningful opportunities where objective screening or triage can create value without making diagnostic or treatment claims, and it would be irresponsible not to explore those paths while certification is underway. We don't see this as an either-or decision: screening allows us to deploy earlier, learn, and build real-world experience, while certification is the long-term path that enables deeper clinical integration. What we are careful about is not letting early screening uses dilute or shortcut the rigor required for the certified product.
8. Organized optometry in the US has consistently argued that refraction without broader eye health assessment is ethically incomplete. How do you respond to the claim that tools like eyeTEST risk reinforcing a “vision without health” mindset, and do you see physiological screening becoming a requirement rather than an optional add-on?
Piotr Kruszynski: I understand the concern from organized optometry that refraction without broader eye health assessment is incomplete. I don't disagree with the premise. But I also think we have to be honest about the alternative, which for a huge part of the global population is no assessment at all. eyeTEST is not positioned as a replacement for comprehensive eye exams; it's a tool to close a very specific access gap. Long term, I do expect that physiological screening and broader health indicators will become increasingly important, and possibly required. But refusing to address refraction because it isn't the whole picture would, in my view, be a net harm.
9. In emerging markets with poor connectivity and low-cost devices, cloud inference and high-fidelity imaging assumptions break down. Have you already accepted the need for fully on-device inference with tighter confidence thresholds, and does that force you to maintain fundamentally different technical standards across markets?
Piotr Kruszynski: For emerging markets, connectivity is often unreliable, devices are often low-end, and expecting high-fidelity imaging everywhere would be unrealistic. Instead of forcing the same solution into an environment where it doesn't fit, we built a separate app designed specifically for these contexts. That app focuses on locally running survey-based screening functionalities and is able to work fully on low-end devices without relying on cloud inference or high-quality imaging. It's meant to address access and triage. We're planning to deploy this first in Ghana this year.
10. Remote diagnostics suffer from silent failure, where incorrect results rarely return as labeled feedback. What concrete mechanisms does Feyenally use to capture downstream failure cases, and how do you prevent the model from stagnating in the absence of reliable ground-truth loops?
Piotr Kruszynski: Silent failure is one of the hardest problems in remote diagnostics. The core principle we operate under is that absence of feedback is not proof of correctness. We assume that without deliberate mechanisms to capture outcomes and discrepancies, models will stagnate. That’s why we treat post-market data collection and validation as ongoing obligations, not one-off milestones.
11. Photorefraction behaves differently across iris pigmentation and eyelid morphology. How representative is your current training data across populations, and if accuracy or confidence intervals diverge meaningfully by phenotype, how do you decide between restricting use and scaling anyway?
Piotr Kruszynski: Photorefraction behaving differently across pigmentation and morphology is a real issue. The responsible approach is to measure performance across populations and then make decisions. If accuracy or confidence intervals diverge meaningfully, you either restrict the indicated use or you invest in fixing the data and revalidating. Scaling something you know performs unevenly is not acceptable just because the market is large.
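The decision rule Kruszynski outlines, measure per-population performance and restrict or revalidate when it diverges, can be sketched numerically. Everything below is an assumption for illustration: the group labels, the 0.25 D tolerance, and the use of a simple normal confidence interval on mean absolute error are not Feyenally's actual methodology.

```python
import math
from statistics import mean, stdev

def mae_ci(errors: list[float], z: float = 1.96) -> tuple[float, float]:
    """Mean absolute error with an approximate 95% normal CI half-width."""
    abs_err = [abs(e) for e in errors]
    m = mean(abs_err)
    half = z * stdev(abs_err) / math.sqrt(len(abs_err))
    return m, half

def diverging_groups(groups: dict[str, list[float]],
                     tolerance_d: float = 0.25) -> list[str]:
    """Flag phenotype groups whose MAE lower CI bound sits clearly above
    the best-performing group's MAE plus a clinical tolerance.

    `groups` maps a phenotype label to signed refraction errors (dioptres)
    against a clinical reference measurement.
    """
    stats = {g: mae_ci(e) for g, e in groups.items()}
    best = min(m for m, _ in stats.values())
    return [g for g, (m, half) in stats.items() if m - half > best + tolerance_d]
```

A flagged group would then trigger exactly the choice described in the answer: either restrict the indicated use for that population, or invest in data collection and revalidate before scaling.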
12. Moving from refraction into treatment via digitalOCULIST significantly raises both regulatory and clinical stakes. What specific evidence would convince you that Feyenally has fully earned the right to expand into treatment workflows, and what would cause you to deliberately delay that step even if market demand is strong?
Piotr Kruszynski: Moving from refraction into treatment is not just an expansion of scope, it’s an expansion of responsibility. For me, earning the right to do that means having very strong evidence that the foundational layer is clinically solid, validated, and appropriately certified. It also means having workflows where treatment decisions are integrated responsibly, not just digitized.
Editor’s Note
This interview exposes a core tension in medical AI: what a system can compute often exceeds what it is responsible to report.