
What 25 Years of Software Development Teaches You About Testing and AI

Danny Preussler, a full-stack engineer at SoundCloud, on the Tech Effect podcast

The career path nobody plans for

Most developers do not set out to reinvent their careers halfway through. They specialize, they go deep, and one day they realize the pendulum has swung too far from the work they actually enjoy.

In a recent episode of the Tech Effect podcast, we sat down with Danny Preussler, a full-stack engineer at SoundCloud with over 25 years of experience in software development. Preussler's career has taken him from embedded Java on feature phones to Blackberry development, through a decade-long specialization in Android at companies including eBay, Groupon, and Viacom, and eventually to a full-stack role at SoundCloud, the online audio and music streaming platform used by millions to upload, share, and listen to music, podcasts, and DJ sets.

His journey was not linear. It followed what he calls "the pendulum": the cycle where you start as an engineer, grow a team, get promoted to management, realize you hate it, and switch back to coding. "At some point I realized what I really enjoy is coding. I hate my job. I should do something," Preussler says. When an opportunity opened up on SoundCloud's integrations team, he took it, knowing it would mean stepping from a senior Android role into what was effectively a junior position in backend and frontend development.

That willingness to reset seniority, learn from colleagues in unfamiliar domains, and accept the discomfort of being a beginner again is the thread that runs through everything Preussler has learned: about career transitions, about AI, and about how to test software that millions of people depend on.

TL;DR

30-second summary

What does 25 years of software development across mobile, full-stack, and streaming platforms teach you about testing, AI, and building a career that lasts?

According to Danny Preussler, full-stack engineer at SoundCloud, speaking on the Tech Effect podcast:

  1. Learn concepts, not libraries. Foundational knowledge, like asynchronous patterns, dependency injection, and UI composition principles, transfers across every platform and language. Specific library expertise does not. Developers who understand how things work under the hood adapt faster to new environments than those who only know the tools built on top of them.
  2. AI widens the horizontal bar of a T-shaped career. AI assistants allow developers to navigate unfamiliar codebases, analyze large volumes of test data, and contribute meaningfully in domains where they lack deep expertise. The developers who treat AI as a productivity multiplier, delegating repetitive work and keeping their focus on judgment and creativity, will outpace those who resist it or misuse it.
  3. AI-generated code increases the need for QA; it does not reduce it. Studies consistently show that AI-assisted development produces more code duplication and lower code quality. The demand for experienced testers who can review, validate, and guide AI-generated work is growing, not shrinking. Using AI to execute tests directly introduces non-determinism and makes already-flaky tests worse.
  4. Stop automating what keeps breaking. If a test consumes more time in stabilization than it saves in bug detection, delete it. Wasting engineering effort on tests that rarely catch real bugs is not rigor; it is waste. Cover the same behavior at a more reliable level of the test pyramid, or use a semi-automatic approach during release.
  5. Test behavior, not implementation. Tests coupled to internal implementation details break on every refactor and teach developers to hate testing. Tests coupled to observable behavior survive architectural changes, build confidence, and stay useful over time.

Bottom line: According to Danny Preussler on Tech Effect, the most durable lesson from 25 years of software development is that quality cannot be automated away — it has to be designed in. AI makes developers faster and broadens their reach, but it also introduces new categories of risk that only experienced engineers and QA professionals can manage. The teams that understand this will use AI as a multiplier. The ones that don't will ship faster and break more.

Why learning concepts beats learning libraries

One of the most transferable pieces of advice Preussler offers to developers considering a career shift, whether from mobile to full stack, or from one platform to another, is deceptively simple: learn the concepts, not the libraries.

"Developers often focus on, hey, there's this one library.". "Then you put them in a slightly different environment and they're completely lost." —Danny Preussler, full-stack engineer at SoundCloud

The reason this matters is that foundational patterns repeat across platforms. Android's coroutines and Go's goroutines are conceptually similar. Jetpack Compose is heavily inspired by Flutter and React. Swift and Kotlin are closer to each other than Objective-C and Java ever were. A developer who understands dependency injection as a concept can implement it in any language, even one that lacks a framework for it. A developer who only knows Dagger is stuck the moment Dagger is not available.

Preussler discovered this firsthand during his transition. Languages were not the challenge. "After a while, if you've seen one language you've seen them all," he says, with admitted exaggeration. The real surprises were environmental. Backend development turned out to be less about writing code and more about Kubernetes and Terraform. Frontend development felt like a "big mess" where every project uses different frameworks. And debugging in JavaScript involved literally typing the word "debugger" into the code, something that would never happen in Android Studio.

The practical takeaway for developers making a similar move: invest in understanding how things work under the hood. Learn how to write your own dependency injection framework, and you will understand what to do when you land on a platform that does not have one. Be curious, be humble about your new junior status, and appreciate the company that gives you the chance to grow into a broader role while still paying you as a senior.
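To make that concrete, here is a minimal constructor-injection sketch in Kotlin. It is illustrative only (the types are invented, not Preussler's or SoundCloud's code), but it shows the whole concept that a framework like Dagger automates: dependencies arrive from outside instead of being constructed inside.

    // A minimal hand-rolled dependency injection sketch (hypothetical types).
    interface TrackRepository {
        fun titleFor(id: Long): String
    }

    class HttpTrackRepository : TrackRepository {
        // A real implementation would call a backend; this stands in for one.
        override fun titleFor(id: Long) = "Track #$id"
    }

    // The class declares what it needs and never constructs it itself.
    class PlayerViewModel(private val repository: TrackRepository) {
        fun screenTitle(id: Long) = "Now playing: ${repository.titleFor(id)}"
    }

    // The "composition root": the one place where concrete types are wired together.
    fun main() {
        val viewModel = PlayerViewModel(HttpTrackRepository())
        println(viewModel.screenTitle(42))
    }

Swap HttpTrackRepository for a fake in a test and PlayerViewModel never notices, which is exactly why the pattern travels to platforms that have no DI framework at all.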

How AI changes the shape of a developer's career

Preussler's advice on AI adoption is direct: do not stay behind, do not oppose it, and do not confuse it with code completion.

"I use my AI assistant on the side. I use agents," he says. For someone navigating unfamiliar microservices, which is daily reality for a developer who recently transitioned to full stack, AI acts as a junior developer you can assign tasks to. 

"I'm opening a microservice I've never seen. I know there's some endpoint with this pattern somewhere. Then I ask the agent: can you try to find me all the points in the code where this is called? This is very helpful."

The broader insight is about what AI does to career shape. Preussler describes himself as a T-shaped engineer: deep expertise in one area (Android), with breadth across others. AI makes the horizontal bar of that T much wider. "All of a sudden you can dig into a language, into a project you've never seen before, which I think would be much harder before."

He draws a parallel to the music industry, where artists face the same fears about AI replacement. Their response is pragmatic: use AI for the parts you do not enjoy. If you do not like mastering, let AI do the mastering. If there is a repetitive task in your daily work, give it to the agent and keep focusing on the things that require your judgment and creativity.

The comparison to code completion is instructive. When code completion first appeared, nobody opposed it. AI is fundamentally the same kind of tool: it makes you more productive if you use it the right way. The key phrase is "if you use it the right way."

Why AI-generated code makes QA more important, not less

There is an emerging tension that Preussler highlights with characteristic directness: AI will help developers write more code, but the quality of that code will go down.

"All the studies show so far, the code quality will go down, the duplication will go up. You really need a lot of QA to test all the software that was written by AI".

This creates a specific challenge for testing. The initial instinct, "finally AI can write these tests for me," is, in Preussler's view, potentially the worst idea. LLMs are non-deterministic, which means using them directly for test execution would make tests even more flaky than they already are.

But there are legitimate, high-value uses. AI can generate test code that you then stabilize and maintain. It can analyze large volumes of test data, such as load test results and dashboards, and surface patterns that previously required significant manual effort to find. The key is using AI to assist the testing process, not to replace the human judgment that makes testing meaningful.

The broader implication is significant: AI enables people without computer science backgrounds to build applications, but those people have no idea what the generated code actually does. They lack the expertise to evaluate whether it is correct, secure, or maintainable. That means the demand for developers and QA professionals who can review, guide, and validate AI-generated work is growing, not shrinking.

What testing looks like in streaming and consumer-facing apps

Preussler's experience spans both major consumer-facing platforms (eBay, Groupon) and media streaming (Viacom, SoundCloud). The testing challenges across these industries share fundamentals but differ in critical ways.

The fundamentals are the same

Every product has KPIs that matter, such as time to first playback for streaming and checkout completion rates for e-commerce. You build dashboards, look for patterns, and react when something deviates. And the universal truth holds: an app that is crashing or laggy is a bad app. "It can be very shiny, but if it then takes 20 seconds to load, the user is probably already on TikTok and forgot that you opened your app," Preussler says.

Personalization makes testing harder than people expect

When every user sees a different screen based on their location, listening history, and subscription status, you cannot write deterministic end-to-end tests in the traditional sense. Server-driven UI means the client renders whatever the server sends, like a hero image, a carousel, or a track listing, without knowing in advance what the content will be. The testing approach shifts accordingly: verify that whatever comes is rendered correctly, regardless of whether it is a Billie Eilish track or a Metallica album. Test the behavior, not the specific content.
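As a rough sketch of what that looks like in practice (hypothetical types, not SoundCloud's code), the test below asserts on the structure of whatever the server sent rather than on any particular artist or track:

    import org.junit.Assert.assertEquals
    import org.junit.Test

    // Hypothetical server-driven UI types for illustration.
    data class ServerItem(val type: String, val title: String)
    data class RenderedCell(val title: String)

    // The client renders whatever arrives; it holds no opinion about the content.
    fun render(items: List<ServerItem>): List<RenderedCell> =
        items.map { RenderedCell(it.title) }

    class ServerDrivenUiTest {
        @Test
        fun `renders every item the server sends, whatever the content is`() {
            val payload = listOf(
                ServerItem("hero", "Some hero banner"),
                ServerItem("track", "Any track at all"), // Billie Eilish or Metallica: irrelevant
            )

            val cells = render(payload)

            // Assert on structure, not content: one cell per item, titles carried through.
            assertEquals(payload.size, cells.size)
            assertEquals(payload.map { it.title }, cells.map { it.title })
        }
    }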

Streaming adds network complexity

Unlike a mail app where a three-second delay is tolerable, streaming users expect uninterrupted playback on a train with patchy connectivity. Preloading, adaptive quality for video, and maintaining audio quality for music subscribers all require handling network conditions that most other app categories can largely abstract away.

Testing in production is not optional

Platforms with user-generated content (such as SoundCloud or eBay) have catalogs that change constantly and cannot be fully replicated in staging environments. "I always said, but what are you afraid of, that this little client can break your backend? Then I think there's something wrong," Preussler says. Developers need to use their own product, go through the pain of uploading a track or posting a listing, and experience both sides of the platform.

The automation trap: knowing when to stop

One of the most counterintuitive lessons Preussler shares is about knowing when not to automate.

At SoundCloud, the engineering team spent significant effort trying to stabilize flaky UI tests: Espresso tests on Android that sometimes passed, sometimes failed. They followed every best practice: stubbing network requests, removing arbitrary waits, investigating every flaky failure. A dedicated team triaged flaky tests. Eventually, they asked the uncomfortable question: when was the last time this test actually caught a real bug?

The answer was almost never.

So they deleted all of them. "It was the best thing we ever did," Preussler says.

The principle behind this decision applies broadly: automation should give you value. If you keep hitting a wall trying to make a test work, it may not be worth automating. Consider a semi-automatic approach: a script that a human runs and visually verifies during release week. Consider covering the same behavior with faster, more reliable unit tests lower in the pyramid. Consider whether the effort you are spending on stabilization would be better spent elsewhere.

This connects to a deeper misconception about testing that Preussler addresses: the belief that any code change should cause tests to fail. If you change the internal implementation of a system but the behavior stays the same (the same inputs produce the same outputs), your tests should still pass. If they do not, you have written tests that are coupled to implementation details rather than behavior. This is, in his view, why many developers end up hating testing: they have been taught to write tests the wrong way, testing how the system works rather than what it does.

He illustrates with a real example: a colleague rewrote an entire app's architecture from one pattern to Redux over a weekend. The tests needed minor adjustments but still passed, because they were testing behavior, not structure. If they had been traditional one-test-per-class unit tests, every single one would have been useless.
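A minimal Kotlin sketch of the distinction (an invented example, not the colleague's code): the test below pins down inputs and outputs, so the internals of format() can be rewritten freely without breaking it.

    import org.junit.Assert.assertEquals
    import org.junit.Test

    // Hypothetical class whose internals we want to stay free to rewrite.
    class PriceFormatter {
        fun format(cents: Long): String = "€%d.%02d".format(cents / 100, cents % 100)
    }

    class PriceFormatterTest {
        // Behavior-coupled: same inputs, same outputs. This test survives any
        // rewrite of format()'s internals unchanged.
        @Test
        fun `formats cents as euros`() {
            assertEquals("€10.00", PriceFormatter().format(1000))
            assertEquals("€0.99", PriceFormatter().format(99))
        }

        // An implementation-coupled version would instead mock an internal helper
        // and verify it was called, failing on every restructuring even when
        // nothing a user can observe has changed.
    }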

What scaling looks like from the inside

For consumer-facing apps in the streaming space, scaling introduces specific challenges that developers need to prepare for.

Peak traffic is real and predictable. A new BTS album drops and access goes through the roof. It is match day for a sports streaming platform and everyone hits play at the same time. These are the moments your infrastructure has been preparing for, and if it fails at that moment, you were not prepared well enough. Preussler draws the parallel to Amazon on Black Friday. If that is the moment you have been waiting for all year and everything goes down, something went fundamentally wrong in your preparation.

The client-side responsibility is specific. Implement proper retry patterns with backoff. When a backend goes down and comes back up, every client in the world retrying simultaneously can take it right back down. Exponential backoff and jitter are not academic concepts; they are the difference between a graceful recovery and a cascading failure.
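A sketch of what that can look like on the client, assuming Kotlin coroutines (the parameter values are illustrative defaults, not a recommendation for any particular service):

    import kotlinx.coroutines.delay
    import kotlinx.coroutines.runBlocking
    import kotlin.random.Random

    // Retry with exponential backoff and full jitter (illustrative sketch).
    suspend fun <T> retryWithBackoff(
        maxAttempts: Int = 5,
        baseDelayMs: Long = 500,
        maxDelayMs: Long = 30_000,
        block: suspend () -> T,
    ): T {
        var attempt = 0
        while (true) {
            try {
                return block()
            } catch (e: Exception) {
                if (++attempt >= maxAttempts) throw e
                // Exponential backoff: 500 ms, 1 s, 2 s, 4 s... capped at maxDelayMs.
                val backoff = (baseDelayMs shl (attempt - 1)).coerceAtMost(maxDelayMs)
                // Full jitter: pick a random point in [0, backoff] so a fleet of
                // recovering clients does not retry in lockstep and re-kill the backend.
                delay(Random.nextLong(backoff + 1))
            }
        }
    }

    fun main() = runBlocking {
        // Hypothetical usage: the lambda stands in for a real network call.
        println(retryWithBackoff { "stream manifest" })
    }

The jitter is the part teams most often skip, and it is precisely what prevents the synchronized retry stampede described above.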

And the economics of streaming have shifted fundamentally. There is less money in the industry than there once was. Margins are low, artists need to get paid fairly, and computing royalties at the individual listener level (what SoundCloud calls fan-powered royalties) was for a long time considered computationally impossible. It is now possible, and it changes the fairness equation: if you pay €10 and only listen to one artist, that money should go to that artist, not be pooled across the platform.
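A back-of-the-envelope illustration of that difference, with made-up numbers rather than SoundCloud's actual royalty model:

    // Fan-powered vs. pooled royalties, with made-up numbers (not SoundCloud's
    // actual model). One listener pays €10 and plays a single artist all month.
    fun main() {
        val subscription = 10.0 // this listener's monthly payment, in euros
        val playsByArtist = mapOf("OnlyArtist" to 50)

        // Fan-powered: the listener's own money is split across the artists they played.
        val totalPlays = playsByArtist.values.sum()
        for ((artist, plays) in playsByArtist) {
            println("$artist gets €${"%.2f".format(subscription * plays / totalPlays)}")
        }
        // Prints: OnlyArtist gets €10.00
        //
        // Under a pooled model, the same €10 would join every subscriber's payment
        // and be divided by platform-wide play share, so this listener's one artist
        // would receive only a tiny fraction of it.
    }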

For developers working in or considering the streaming industry, you need to enjoy consuming the content. You will watch a lot of streams. You will listen to a lot of music. You will test across phones, tablets, TVs, and Chromecasts. If you do not enjoy the medium, the daily reality of the work will wear on you.

See how this works in practice.

Our QA consulting and test automation services help engineering teams build testing strategies that scale—from mobile release optimization to AI-augmented quality assurance.

Key takeaways for developers and QA teams

Learn concepts, not libraries. Foundational knowledge, like asynchronous patterns, dependency injection, and UI composition principles, transfers across platforms. Specific library expertise does not. Developers who invest in understanding how things work under the hood adapt faster to new environments, languages, and frameworks.

Embrace AI as a productivity tool, not a threat. AI assistants help developers navigate unfamiliar codebases, analyze data, and extend their reach. The T-shaped engineer gets a wider T. Use AI for the tasks you do not enjoy, and focus your energy on the work that requires human judgment and creativity.

AI-generated code increases the need for QA; it does not decrease it. Code quality declines and duplication increases with AI-assisted development. The demand for experienced testers who can review, guide, and validate AI-generated work is growing. Do not assume AI will replace the need for quality engineering; it amplifies it.

Stop automating what keeps breaking. If a test consumes more time in stabilization than it saves in bug detection, delete it. Cover the behavior at a different level of the test pyramid, or use a semi-automatic approach. Automation should give you value.

Test behavior, not implementation. Write tests that verify what the software does, not how it does it internally. Tests coupled to implementation details break on every refactor and teach developers to hate testing. Tests coupled to behavior survive architectural changes and build confidence.

Testing personalized and streaming content requires pragmatism. Server-driven UI, user-generated content, and constantly changing catalogs mean traditional end-to-end testing breaks down. Test that rendering is correct regardless of content, handle network conditions explicitly, and do not be afraid to test in production.

Listen to the full conversation with Danny Preussler on the Tech Effect podcast

FAQ

Most common questions

Why does AI-assisted development increase the need for QA rather than reducing it?

Research consistently shows that AI-assisted development produces more code duplication and lower overall code quality than manually written code. AI also enables people without computer science backgrounds to build applications, meaning code is being written by people who lack the expertise to evaluate whether it is correct, secure, or maintainable. This increases the demand for developers and QA professionals who can review, validate, and guide AI-generated work. The volume of code going through QA pipelines is growing, and the judgment required to evaluate it is not something AI can replace.

When should engineering teams stop trying to automate a test?

When the effort spent stabilizing a test consistently outweighs the bugs it actually catches, it should be deleted or replaced. Flaky tests, those that sometimes pass and sometimes fail without a code change, erode confidence in the entire test suite and consume engineering time that could be spent elsewhere. The right question to ask is not "how do we make this test pass reliably?" but "when did this test last catch a real bug?" If the answer is rarely or never, consider covering the same behavior with faster, more reliable unit tests lower in the test pyramid, or handling it through a semi-automatic process during release.

What is the difference between testing behavior and testing implementation?

Testing implementation means writing tests that are coupled to how a system works internally: its class structure, method calls, or architectural patterns. These tests break whenever the internal structure changes, even if the external behavior stays identical. Testing behavior means verifying what the system does (given specific inputs, it produces the expected outputs) regardless of how it achieves that internally. Behavior-coupled tests survive architectural refactors, remain useful over time, and give developers genuine confidence rather than false security.

How should developers approach a career transition into an unfamiliar technical domain?

The most important investment is in foundational concepts rather than specific tools or libraries. Patterns like dependency injection, asynchronous execution, and UI composition repeat across platforms. A developer who understands them can apply them in any language or framework. Specific library knowledge does not transfer. Beyond concepts, the practical advice is to accept a temporary reduction in seniority, stay genuinely curious, learn from colleagues in the new domain, and use AI assistants to navigate unfamiliar codebases faster than would otherwise be possible.

What makes testing streaming and personalized consumer apps different from standard software testing?

Two factors make traditional testing approaches break down. First, personalization means every user sees a different interface based on their location, history, and subscription, so deterministic end-to-end tests that check for specific content become unreliable. The testing approach shifts to verifying that whatever content arrives is rendered correctly, regardless of what it is. Second, user-generated content and constantly changing catalogs cannot be fully replicated in staging environments, making some degree of production testing unavoidable. Network complexity (patchy connectivity, adaptive quality, retry patterns) adds a further layer that most other app categories do not face.

Is your QA strategy keeping up with flaky tests, AI-generated code, and faster releases?

As AI accelerates code output and raises the stakes for quality engineering, the teams that get ahead are the ones with testing strategies built for the pace they're actually shipping at, not the pace they were at two years ago. Let's talk about what you're doing to keep up.


Save your team from late-night firefighting

Stop scrambling for fixes. Prevent unexpected bugs and keep your releases smooth with our comprehensive QA services.

Explore our services