Sergey Toshin on Why Mobile Security Breaks at Workflow Fit

In this interview, Sergey Toshin, founder of Oversecured, explains why mobile security tools fail when they do not fit how development teams actually work, and why dependable detection depends as much on workflow integration and reproducibility as on finding vulnerabilities.

When you turned your own mobile security research workflow into Oversecured, what was the hardest part to make dependable inside a product?

We had to go through a long process of building a product for development teams and understanding their workflows. At the very beginning we had a single offering, Single Scans: you upload an app and get a vulnerability report for it. But companies needed integrations with different services, embedding into their workflows, and broader process support. Even now, the whole team keeps working on this, making the product fit our clients’ team-based workflows as naturally as possible.

In a mobile app with a heavy third-party SDK stack, where does source-code review usually stop being enough? What does binary analysis show more clearly in practice?

From a vulnerability scanner’s perspective, it doesn’t really matter whether it’s an SDK or the app’s own code. Similar vulnerabilities appear in both, and our patterns detect them. Also, Oversecured doesn’t rely on pure binary analysis. On both Android and iOS, we scan code, decompiled or original, to find source-to-sink vulnerabilities through SAST taint analysis. On the DAST side for Android we use fairly high-level approaches as well, such as creating breakpoints based on Java class/method/field names.
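
To make the terminology concrete, here is a minimal sketch of the kind of source-to-sink flow such taint analysis is built to catch. The class and extra names are hypothetical; the pattern is untrusted Intent data (the source) reaching a file path (the sink) without validation.

```java
import android.app.Activity;
import android.os.Bundle;
import java.io.FileOutputStream;
import java.io.IOException;

// Hypothetical activity illustrating a source-to-sink taint flow.
public class ReportActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        // Source: attacker-controlled data arrives via the launching Intent.
        String fileName = getIntent().getStringExtra("report_name");

        // Sink: the tainted value reaches a file path unvalidated,
        // allowing path traversal such as "../shared_prefs/session.xml".
        try (FileOutputStream out =
                new FileOutputStream(getFilesDir() + "/" + fileName)) {
            out.write("report".getBytes());
        } catch (IOException ignored) {
        }
    }
}
```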

Which mobile attack surfaces still cause the most serious issues today?

We have a presentation on exactly this, with full lists of attack surfaces; even QR code scanning can be dangerous. But the most severe vectors are zero-click remote ones. Recent DNG exploits affecting both iOS and Android showed how shared libraries can enable remote code execution through a single message. A few years ago iPhones were getting completely bricked when the system tried to parse certain letters or syllables from an Indic script, a system-level denial of service. At the application level, deeplinks are usually the main vector. But if the app is a messenger, or has any functionality for transferring or receiving data from other users, that flow also needs to be treated as an attack vector.
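
As an illustration of the deeplink vector, here is a minimal sketch with a hypothetical scheme, parameter, and class name: an exported activity forwards an attacker-supplied URL straight into a WebView.

```java
import android.app.Activity;
import android.net.Uri;
import android.os.Bundle;
import android.webkit.WebView;

// Hypothetical exported deeplink handler: any app or web page that can
// fire myapp://open?url=... controls what this WebView loads.
public class DeeplinkActivity extends Activity {
    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);

        WebView webView = new WebView(this);
        webView.getSettings().setJavaScriptEnabled(true);
        setContentView(webView);

        Uri deeplink = getIntent().getData(); // e.g. myapp://open?url=https://evil.example
        if (deeplink != null) {
            String target = deeplink.getQueryParameter("url");
            if (target != null) {
                // Missing allowlist check: the attacker-supplied URL is
                // loaded directly, turning the deeplink into an entry point.
                webView.loadUrl(target);
            }
        }
    }
}
```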

A lot of security tools lose trust because they create too much noise. What did you have to get right to keep detection precise without making the product too narrow?

Right now we’re integrating AI for assessing vulnerability confidence. Before the AI era, I had to scan a huge number of applications and go through reports manually, figuring out whether each finding was real or a false positive. I spent a lot of time on that loop, fixing errors in the rules. I also built templates to make corrections easier. For example, if an operation writes data to a file, I describe a separate template covering all the possible issues that can occur with a file write, and then apply that template to specific Java/Swift methods. With this approach it became much simpler to manage rules, fix errors, and add new checks.
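
A hypothetical illustration of that template idea in Java: each call below performs the same abstract operation, writing data to a file, so a single template describing file-write issues (path traversal, world-readable output, overwriting executables) can be mapped onto every concrete method signature.

```java
import java.io.FileOutputStream;
import java.io.FileWriter;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;

// Hypothetical: one "file write" template covers all of these sinks.
public class FileWriteSinks {
    static void examples(String path, byte[] data) throws IOException {
        try (FileOutputStream fos = new FileOutputStream(path)) {
            fos.write(data);                 // java.io.FileOutputStream#write
        }
        try (FileWriter fw = new FileWriter(path)) {
            fw.write(new String(data));      // java.io.FileWriter#write
        }
        Files.write(Paths.get(path), data);  // java.nio.file.Files#write
    }
}
```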

How do you think about the boundary between static and dynamic analysis in mobile security today? Where does each one still matter most?

Mobile security still has a problem: not many people understand it deeply at a technical level — there are lots of attack vectors, mobile-specific issues, mobile-specific code, and so on. We built DAST primarily so we could generate proof-of-concepts. That way more people can actually grasp a vulnerability, because they can reproduce and test it right away. Our SAST only highlights the problematic lines of code, and before DAST the developer had to dig into a lot of technical nuance and try to reproduce findings on their own. We also noticed that SAST and DAST cover for each other: SAST sometimes misses vulnerabilities — mostly due to decompilation issues or overly complex dataflow — but DAST catches them. And sometimes DAST can’t validate a vulnerability because the conditions are too complex, though we’re putting real effort right now into reducing those cases. So the type of analysis matters less than people think — both are ways to surface the same problems.

Some mobile vulnerabilities matter less as isolated bugs and more as exploit chains. When you review real applications, how often does the real risk come from several smaller assumptions combining in the wrong way?

Very often. A couple of examples: file overwrite on its own isn’t too scary, but if you can replace the contents of an executable file, it turns into arbitrary code execution. And incorrectly implemented WebView handlers escalate the impact from “can open arbitrary URLs in this WebView” to token theft, access to arbitrary files, and so on.
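
A minimal sketch of the second pattern, with a hypothetical bridge and token name: a JavaScript interface that exposes a session token turns an arbitrary-URL-load bug in the same WebView into token theft.

```java
import android.webkit.JavascriptInterface;
import android.webkit.WebView;

// Hypothetical bridge: once an attacker-controlled page loads in the
// WebView, it can call window.Auth.getToken() and exfiltrate the token.
public class AuthBridge {
    private final String sessionToken;

    public AuthBridge(String sessionToken) {
        this.sessionToken = sessionToken;
    }

    @JavascriptInterface // exposed to every page loaded in the WebView
    public String getToken() {
        return sessionToken;
    }

    public static void attach(WebView webView, String token) {
        webView.getSettings().setJavaScriptEnabled(true);
        // Combined with an arbitrary-URL-load bug, this bridge escalates
        // a low-severity finding into credential theft.
        webView.addJavascriptInterface(new AuthBridge(token), "Auth");
    }
}
```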

Your MavenGate research pointed to a deeper trust problem in the software supply chain. What do you think most teams still misunderstand about dependency trust once code moves from the open-source ecosystem into real production environments?

It’s probably a variant of an old problem: developers don’t always know what to trust and what not to trust. If you’ve got legitimate repositories configured and your dependency integration is copy-pasted from the manual — what could go wrong? I think these kinds of problems should be solved by solutions like Oversecured. We should be the ones thinking about what can go wrong, how to detect it, and how to explain it clearly to teams. Most product teams should not need to reason through supply-chain edge cases themselves.

Oversecured often provides proof-of-concept evidence rather than only describing a finding. What changes when a security issue becomes concrete enough for an engineering team to reproduce and see for itself?

We’re rolling out AI functionality that also handles detailed vulnerability descriptions. It uses the same building blocks. The broader problem we’re trying to fix is that many security engineers are web specialists, not mobile. They don’t know the specifics of WebView, the risks around Android Intent, and so on. Mobile developers, on the other hand, know security only on the surface — for many of them it comes down to obfuscation, SSL pinning, and root detection. But if both groups can copy-paste and reproduce the issue themselves, the likelihood of it getting fixed goes up significantly — they don’t need to dig in and figure it out from scratch — and the speed of the fix goes up too.

You moved from top-tier vulnerability research into building a company around automation. What changed most in how you make decisions once the goal became continuous risk reduction rather than finding a single impressive bug?

In bug bounty you’re solving one task: find a vulnerability. When you’re running a B2B company around automated vulnerability discovery, the task list expands to include product development, brand awareness, sales, retention, and funding. So instead of while (true) findBug(); you end up doing planning and prioritization first, and only then getting to the actual work.

When you decide what should become a product capability, what usually matters most to you: exploitability, prevalence, remediation value, or something else?

At the scanner level, the goal is detecting as many vulnerabilities as possible. But at the product level, priorities shift from the number of findings to usability. New tasks come in — explaining the issue correctly, and generally making sure the specific security problem gets solved faster and more easily for our users.

As more mobile products add AI features, where do you think teams are most likely to underestimate security risk today?

I see AI as one more feature inside an app. So the attacks will likely stay the same, but there are extra things worth checking:

  1. whether an attacker can interact with the AI without the user knowing;
  2. if training happens locally, whether there’s a risk of that data being stolen;
  3. whether the integration with AI services is done correctly.

We’ve already found vulnerabilities where developers just hardcoded a cloud LLM token into the app, which let anyone read every conversation users had with the AI.
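
A minimal sketch of that anti-pattern, with a hypothetical endpoint and token value: the secret ships inside the client binary, so anyone who decompiles the app can extract and reuse it.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;

// Hypothetical anti-pattern: the cloud LLM token ships inside the app.
public class LlmClient {
    // Hardcoded secret: trivially recoverable from the APK or IPA.
    private static final String API_TOKEN = "sk-EXAMPLE-DO-NOT-SHIP";

    public static void ask(String prompt) throws IOException {
        URL url = new URL("https://llm.example.com/v1/chat"); // hypothetical endpoint
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setRequestMethod("POST");
        conn.setRequestProperty("Authorization", "Bearer " + API_TOKEN);
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            out.write(prompt.getBytes("UTF-8"));
        }
        // The safer design is to proxy LLM calls through a backend that
        // holds the token and enforces per-user auth, never the client.
        conn.getResponseCode();
    }
}
```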

Looking ahead, what shift are you paying closest attention to right now in security?

Vulnerability detection through AI, and replacing parts of traditional analysis with AI-driven approaches. That could mean generating payloads, or building a plan for walking through an app with AI — in DAST it’s important to trigger as much code as possible at runtime so you can actually check its security.

Editor’s Note

This interview examines a broader shift in mobile security from isolated vulnerability discovery toward continuous risk reduction, where product fit, exploit reproducibility, and developer adoption matter as much as technical depth.
