Code is money. It is a business asset with a measurable marketplace value in current cashflow and future earnings. It may contain business secrets and sensitive information. As a form of intellectual capital, code is created out of the time, knowledge, skills, research, innovation, and systems of your organization—everything that gives you a competitive advantage.
This is why code needs protection, as with any other business asset. Loss of code or control over it may result in loss of intellectual property, loss of revenue, loss of trust, and loss of reputation. These are disasters for any commercial enterprise. You need code security, now.
Code obfuscation helps protect your valuable code. In this guide, we will examine the purpose and uses of code obfuscation. We trace its place in application security and the software lifecycle. We provide guidance on best practice and success measurement. Finally, we finish with buyer advice and a look at future developments.
Obfuscation is a method of securing code recommended by OWASP. It protects source code by making it challenging for both humans and AI to parse or comprehend that code. The function of the code remains the same after the obfuscation process. The code still works, while appearing more complex and less comprehensible than before it was obfuscated.
Obfuscation is sometimes confused with other techniques. Distinguishing these techniques from obfuscation will help clarify what it means.
Data obfuscation is an attempt to hide sensitive data items. Code obfuscation is the narrower, more specialized attempt to hide the function of the code e.g., code flow obfuscation (see below). Here are examples of data obfuscation that are employed in cybersecurity.
Code obfuscation is often mentioned along with other security techniques and sometimes confused with them.
Code obfuscation reduces the readability and understandability of the source code. By making the code more complex, obfuscation slows down serious attackers while acting as a psychological barrier to prevent attacks from casual hackers and "script kiddies."
A major aim of code obfuscation is to make the malicious reverse engineering (RE) of program logic and source code more difficult. Reverse engineering attacks against applications may aim at IP theft, malicious changes, or data breaches. The first step in such an attack is to understand the application involved; only then can the attacker formulate a cyberattack plan. Code obfuscation makes the code harder to understand, which turns reverse engineering into a more time-consuming, frustrating, and painful process.
Code obfuscation has the potential security benefit of making vulnerabilities harder to find and exploit by attackers. Exploits often depend on minute details about program internals. Obfuscation deliberately introduces diversity and variability into deployed applications. This can make it more of a challenge for malicious actors to understand the code or construct exploits that reliably work against multiple targets on a grand scale.
Code obfuscation does not actively prevent modification. But it does give the indirect benefit of helping make software more impervious to unauthorized access and modification attempts. If the code is difficult to understand, that means it is more difficult to modify.
Obfuscation techniques differ from tamper-proofing mechanisms such as checksumming, which verifies the integrity of the code in order to detect modification. But the two types of techniques are often layered together to preserve the intended operation of an application’s code.
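To make the contrast concrete, here is a minimal sketch of checksumming, assuming the protected code is available as a byte string and that a reference digest was recorded at build time. The function names are hypothetical, and a production tamper-proofing scheme would also need to protect the check itself.

```python
import hashlib
import hmac


def code_digest(code: bytes) -> str:
    """Return the SHA-256 digest of a code blob as a hex string."""
    return hashlib.sha256(code).hexdigest()


def is_untampered(code: bytes, expected_digest: str) -> bool:
    """Compare the current digest with the one recorded at build time."""
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(code_digest(code), expected_digest)


# Recorded once when the application is built.
shipped_code = b"def pay(amount): return charge(amount)"
reference = code_digest(shipped_code)
```

A check such as `is_untampered(current_code, reference)` detects modification but does nothing to hide the code, which is why it is layered with obfuscation rather than used in its place.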
To gain a high-level understanding of how code obfuscation works, it is useful to look at how individual techniques are grouped together. Each obfuscation technique transforms one of these aspects of code.
Code obfuscation is widely used by high-security apps in multiple industries, including banking, gaming and streaming. It offers a cost-effective security boost that can be applied quickly and easily with suitable tools. When applied, code obfuscation is difficult to overcome and protects against a wide range of attacks.
But despite these strengths, there are limitations to the effectiveness of code obfuscation. Many of these weaknesses occur when obfuscation is used in isolation, overused, or used naively. The good news is that these drawbacks are substantially diminished if obfuscation is combined with other security techniques, and the least interfering obfuscation techniques are employed in a smart way. Good quality obfuscation tools should strive to achieve this.
Code obfuscation is often valued because it strikes a practical balance between security effectiveness and performance. But heavy obfuscation can negatively impact application performance, leading to poor UX. It can even lead to rejection from app stores.
One way to deal with this problem of overload is to prioritize obfuscating critical components that could be exploited rather than obfuscating non-critical code. Decide which code does (e.g., sensitive code) and does not (e.g., performance critical code, non-critical code) require obfuscation.
Code obfuscation provides little defence against runtime attacks e.g., dynamic instrumentation. It only deals with code behavior and doesn't specifically protect stored data (e.g., API keys, user credentials, tokens, local databases). And it is not a foolproof method against motivated, skilled, and time-rich attackers. It acts as a speed bump, not a wall. This is why it is often combined with runtime controls in the context of protecting mobile applications.
Obfuscation makes the code difficult to understand but it should not make it difficult to develop or debug. It is important to consider how the obfuscation tool fits into the software development lifecycle (SDLC) by evaluating its impact on testability and debuggability. A key feature of suitable tools is the generation of mapping files which allow debug stack traces to be mapped back to the original source code.
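The idea of a mapping file can be sketched as follows. The mapping format, the stack trace layout, and all class and method names here are assumptions made for illustration; each real tool defines its own file format and ships its own retrace utility.

```python
import re

# Hypothetical mapping written by the obfuscation tool at build time:
# obfuscated identifier -> original identifier.
mapping = {
    "a": "PaymentService",
    "b": "validateCard",
    "c": "CheckoutController",
    "d": "submitOrder",
}

# Matches frames of the assumed form "at <class>.<method>(".
FRAME = re.compile(r"at (\w+)\.(\w+)\(")


def retrace(stack_trace: str, mapping: dict[str, str]) -> str:
    """Replace obfuscated class and method names in each stack frame."""
    def restore(match: re.Match) -> str:
        cls = mapping.get(match.group(1), match.group(1))
        method = mapping.get(match.group(2), match.group(2))
        return f"at {cls}.{method}("
    return FRAME.sub(restore, stack_trace)


obfuscated_trace = (
    "NullPointerException\n"
    "  at a.b(Unknown Source)\n"
    "  at c.d(Unknown Source)"
)
readable_trace = retrace(obfuscated_trace, mapping)
```

Because the mapping reverses the renaming, it has to be kept private and stored per build, alongside the release it belongs to.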
Obfuscation techniques can be used by attackers to hide malicious code. Unethical uses of obfuscation techniques include the hiding of data collection, malicious features, back doors, and concealed vulnerabilities in code.
According to Gartner’s report on Avoiding Mobile Application Security Pitfalls, there is a five-stage path to building a secure app: threat modeling, secure development, testing, hardening, and anti-tampering. Code obfuscation has a central place in the last two of these stages, which together make up the area of application security known as in-app protection. In-app protection involves security features that are built into, or integrated with, mobile apps to prevent threats.
Application hardening is the process of strengthening an app’s security by reducing the size of its vulnerable attack surface. It achieves this by modifying an app’s source code, binary code or bytecode to make it more resistant to tampering and RE.
Some application hardening techniques predict, monitor, and detect attacks. Others - like code obfuscation and data encryption - help prevent and block them by increasing the difficulty for a malicious actor to run a static analysis or execute an attack.
Unlike application hardening, which is largely a preventative measure, application shielding can provide comprehensive protection for mobile apps against attacks and threats in real time. Along with code obfuscation, it includes data obfuscation techniques (like white-box cryptography), as well as runtime protection technology to protect the app’s entire ecosystem at multiple levels.
Runtime app shielding controls and monitors the process of an app running on a device to ensure it is clean and safe. This software security mechanism is called runtime application self-protection (RASP). Integrating static code obfuscation with dynamic RASP checks creates a more comprehensive security posture.
There are two ways of viewing where code obfuscation takes place in the application development process.
The software development lifecycle (SDLC) consists of the basic stages in building a software application: analysing, planning, designing, building, testing, deploying, and maintaining. Code obfuscation, along with other code hardening techniques, is crucial to the later part of this cycle.
Code obfuscation happens towards the end of the build stage to ensure that it does not impact code development. But code obfuscation must occur before functional and performance testing. This is to ensure that the application still behaves acceptably after obfuscation, and that the obfuscation process hasn’t introduced any defects.
Within this crucial build phase, obfuscation can be carried out before, during or after the compilation process (see below). After it, in the maintenance phase, obfuscation may be reapplied to new builds and updated to maintain security. This is where finding the balance between security and performance is crucial.
In a typical compilation process for a compiled language, there are three places where obfuscation can happen. These are the three levels of code obfuscation. It is important to consider the advantages of each option.
Source level obfuscators change the text source files before they are compiled. The disadvantages with code obfuscation here are:
Intermediate level obfuscators and most application security tools work at this point. The disadvantages with code obfuscation here are:
Post-build obfuscation does not need to take place during the development part of the application life cycle. Rather, it can take place after the product has been built and compilation is complete. This is generally the most convenient level at which to obfuscate. There are many advantages to a post-build, post-compilation obfuscation:
There are many best practice recommendations for code obfuscation. For example, code obfuscation practice should be kept up-to-date with the latest tools, tests, and attempts. Attention needs to be paid to the speed and size of the obfuscated code. And it is smart to employ variability and randomness so each app build has a unique binary.
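Per-build variability can be sketched with a renaming map derived from a build seed. The function name and the identifiers below are hypothetical; a real tool would draw its seed from a secure source and randomize many more aspects of the build than names.

```python
import random
import string


def build_rename_map(identifiers: list[str], build_seed: int) -> dict[str, str]:
    """Map each identifier to a random name that depends on the build seed."""
    rng = random.Random(build_seed)  # not cryptographically secure; sketch only
    used: set[str] = set()
    rename_map: dict[str, str] = {}
    for ident in sorted(identifiers):  # sorted so a seed always gives one result
        while True:
            candidate = "_" + "".join(rng.choices(string.ascii_lowercase, k=8))
            if candidate not in used:
                break
        used.add(candidate)
        rename_map[ident] = candidate
    return rename_map


identifiers = ["validate_card", "card_number", "session_token"]
build_a = build_rename_map(identifiers, build_seed=1)
build_b = build_rename_map(identifiers, build_seed=2)
```

Two builds with different seeds give different names for the same identifiers, so a script written against one release is less likely to work unchanged against the next, while a fixed seed keeps any single build reproducible.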
But the best in best practice for obfuscation lies in layering.
Obfuscation is an important element of app security, but it is not sufficient by itself, and isn't supposed to be. Each individual obfuscation technique typically offers relatively little benefit on its own. Employing multiple obfuscation types, techniques, and layers alongside other security measures is the best way to establish a solid protection shield.
Using a mixture of code obfuscation techniques creates layers of code complexity so that RE is as difficult and time-consuming as possible. A layered defence works by combining several different obfuscation techniques multiple times to maximize impact. The more techniques used in a smart way, the better code is protected.
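The layering idea can be sketched as a small pipeline of passes over Python's syntax tree. The two passes shown (identifier renaming and string encoding), the class names, and the sample function are illustrative assumptions rather than any product's actual transformations, and the sketch ignores cases such as f-strings.

```python
import ast


class RenameIdentifiers(ast.NodeTransformer):
    """Layer 1: replace chosen variable and argument names."""

    def __init__(self, rename_map: dict[str, str]):
        self.rename_map = rename_map

    def visit_Name(self, node: ast.Name) -> ast.Name:
        node.id = self.rename_map.get(node.id, node.id)
        return node

    def visit_arg(self, node: ast.arg) -> ast.arg:
        node.arg = self.rename_map.get(node.arg, node.arg)
        return node


class EncodeStrings(ast.NodeTransformer):
    """Layer 2: replace string literals with bytes([...]).decode() calls."""

    def visit_Constant(self, node: ast.Constant) -> ast.AST:
        if isinstance(node.value, str):
            encoded = list(node.value.encode())
            call = ast.parse(f"bytes({encoded!r}).decode()", mode="eval").body
            return ast.copy_location(call, node)
        return node


def obfuscate(source: str, rename_map: dict[str, str]) -> str:
    """Apply each layer in turn and return the transformed source."""
    tree = ast.parse(source)
    for layer in (RenameIdentifiers(rename_map), EncodeStrings()):
        tree = layer.visit(tree)
    return ast.unparse(tree)


SOURCE = (
    "def greet(user_name):\n"
    '    greeting = "Hello, "\n'
    "    return greeting + user_name\n"
)
protected_source = obfuscate(SOURCE, {"user_name": "_a", "greeting": "_b"})
```

Each pass alone is easy to undo. The value comes from stacking passes, since an attacker has to reverse every layer before the original intent becomes readable again.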
Here are some obfuscation techniques that are frequently employed together in a layered manner:
Code obfuscation complements other security techniques, enhancing rather than replacing them. This is especially important in areas that obfuscation doesn’t cover. For example, obfuscation alone doesn’t stop runtime attacks, API abuse, or data theft, so it must be combined with other defenses for maximum protection.
Here are some other security techniques that are commonly employed alongside code obfuscation:
The strength and quality of code obfuscation can be measured by a mix of technical metrics and performance testing. The first can provide some quantitative assurance for particular functions. However, in practical terms, measuring code obfuscation often comes down to a cost-effectiveness analysis. Reconstructing the original code should be much more costly and difficult for attackers than the initial obfuscation attempt.
The cost of mobile security must be balanced against the true cost of insecurity. However, just like any other security measure, code obfuscation introduces cost and "computational overhead" to the system. The higher the level of obfuscation, the greater the cost. In software terms, this cost can be measured by increases in execution time, code size, and resource consumption. In human terms, it can lead to increases in development and maintenance costs.
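The execution-time part of this overhead can be estimated by timing the same function before and after protection. The sketch below uses Python's standard `timeit` module; the two sample functions are hypothetical stand-ins, with the "protected" one simulating the extra work an obfuscated build might perform.

```python
import timeit
from typing import Callable


def seconds_per_call(fn: Callable[[], object], calls: int = 10_000) -> float:
    """Average wall-clock time of one call to fn."""
    return timeit.timeit(fn, number=calls) / calls


def relative_overhead(baseline: float, protected: float) -> float:
    """Overhead as a fraction of the baseline (0.25 means 25% slower)."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (protected - baseline) / baseline


def original() -> int:
    return sum(range(100))


def protected() -> int:
    # Stand-in for an obfuscated build: same result, extra decoy work.
    total = 0
    for i in range(100):
        total += (i ^ 0x5A) ^ 0x5A  # XOR twice restores i
    return total
```

Running `relative_overhead(seconds_per_call(original), seconds_per_call(protected))` on the target device gives a figure that can be tracked from build to build. Measurements vary between runs and devices, so repeated runs are advisable.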
The effectiveness of code obfuscation programs can be measured in different ways. These are two of the standard measurement criteria:
These are two other criteria that deserve attention.
There are several obfuscation products that provide similar sorts of capabilities for app security. It is best to focus on the practicality of implementing the tool and the implications for ongoing code maintenance.
So, the question is this: How practical is the package to use and how sophisticated are the tools supporting it? The whole package is what matters, not just the low-level features. For example, if the tool is difficult to use, coverage is likely to be sparse.
The general rules are these. An obfuscation product should be easy enough to operate that it does not need an expert user. It should be able to perform straight out-of-the-box. And any configuration work that is required should be made explicit.
When evaluating an application hardening tool or product that provides code obfuscation as one of its main offerings, ease of use is important. Some tools seem sophisticated in their configuration settings but are hard to use. This introduces higher performance costs and will likely lead to sparser coverage.
When choosing a tool, try to balance its security strengths with its performance limits. The benefit of obfuscation is that it adds protective value. But what is the impact on execution?
You might launch an application onto the App Store and receive reports that it is crashing on certain devices. Developers discover a bug, which they want to fix, so you can get another release out. But once the code is obfuscated, it is difficult to know what is happening.
Although obfuscation has a relatively long history, it continues to evolve in the face of new cybersecurity challenges.
The techniques mentioned so far are examples of static obfuscation. They transform the code at protection time and then the code is shipped in its transformed state. The code doesn’t alter again after this. Dynamic obfuscation is when some transformations are applied to the code at protection time, but the code also moves, mutates, and even self-repairs at runtime. This makes it a moving target to attack, which helps defend against AI attacks.
These two obfuscation approaches are related to two types of target analysis used when attacking a program. Static analysis examines the program at rest, as a file loaded off disk. An analyst would use a decompiler or disassembler to analyse the code. Dynamic analysis runs the code in an execution environment and observes how it functions at runtime.
Code obfuscation provides sound protection for applications against static analysis. But runtime controls are necessary to protect against dynamic analysis because dynamic analysis interferes with the process of running the application. Most static techniques are understood by pen testers and are increasingly vulnerable to automated machine learning (ML) attacks. New dynamic approaches help deal with this challenge, and are also better for customer ease of use and maintenance.
Virtual machine (VM) obfuscation works by transforming code into another code format, such as another bytecode or instruction type. VM instruction encoding uses real-time code generation with diversity to prevent scripting attacks. A hacker who stumbles across this code in an RE attempt cannot readily interpret its meaning or function because it is different from anything previously encountered. It runs in an interpreter at runtime, and so is considered a dynamic procedure.
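A toy version of this idea is sketched below: a four-instruction stack machine whose opcode numbers are shuffled per build. The instruction set, function names, and seed handling are all assumptions made for illustration; commercial VM obfuscators are far more elaborate and do not ship a readable opcode table.

```python
import random

OPS = ("PUSH", "ADD", "MUL", "RET")


def make_encoding(build_seed: int) -> dict[str, int]:
    """Assign each instruction a build-specific opcode byte."""
    rng = random.Random(build_seed)
    return dict(zip(OPS, rng.sample(range(256), len(OPS))))


def assemble(program: list[tuple], encoding: dict[str, int]) -> bytes:
    """Translate (op, *operands) tuples into custom bytecode."""
    out: list[int] = []
    for op, *operands in program:
        out.append(encoding[op])
        out.extend(operands)  # operands must fit in one byte in this sketch
    return bytes(out)


def run(bytecode: bytes, encoding: dict[str, int]) -> int:
    """Interpret the custom bytecode on a small stack machine."""
    decode = {code: op for op, code in encoding.items()}
    stack: list[int] = []
    pc = 0
    while pc < len(bytecode):
        op = decode[bytecode[pc]]
        pc += 1
        if op == "PUSH":
            stack.append(bytecode[pc])
            pc += 1
        elif op == "ADD":
            right, left = stack.pop(), stack.pop()
            stack.append(left + right)
        elif op == "MUL":
            right, left = stack.pop(), stack.pop()
            stack.append(left * right)
        elif op == "RET":
            return stack.pop()
    raise ValueError("program ended without RET")


# (2 + 3) * 4 expressed for the toy machine.
program = [("PUSH", 2), ("PUSH", 3), ("ADD",), ("PUSH", 4), ("MUL",), ("RET",)]
encoding = make_encoding(build_seed=1)
bytecode = assemble(program, encoding)
```

The bytes in `bytecode` mean nothing to a standard disassembler, and a different seed changes the opcode values, so an analyst has to recover the interpreter's instruction set before the protected logic can be read.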
Progressive decryption (PD) is another variant of dynamic code obfuscation. With PD, only the part of the application that is running gets decrypted. The remainder stays encrypted, making it difficult for an attacker to observe the entire code. Both code and data are decrypted ‘just-in-time’ for use as functions are entered, and the decrypted copies are destroyed after use.
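The just-in-time pattern can be sketched as a function that is stored only in encrypted form and decrypted for the duration of each call. The class name and key handling are hypothetical, and the repeating-key XOR is a placeholder for a real cipher; it offers no actual cryptographic protection, and Python cannot guarantee that the plaintext is wiped from memory.

```python
from itertools import cycle


def xor_bytes(data: bytes, key: bytes) -> bytes:
    """Toy repeating-key XOR; a stand-in for real encryption."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


class ProtectedFunction:
    """Keeps a function's source encrypted except while it is running."""

    def __init__(self, name: str, source: str, key: bytes):
        self._name = name
        self._blob = xor_bytes(source.encode(), key)  # only ciphertext is stored

    def __call__(self, key: bytes, *args):
        namespace: dict = {}
        try:
            plaintext = xor_bytes(self._blob, key).decode()  # decrypt on entry
            exec(plaintext, namespace)
            return namespace[self._name](*args)
        finally:
            namespace.clear()  # drop the decrypted function after use


AREA_SOURCE = "def area(width, height):\n    return width * height\n"
KEY = b"k3y"
area = ProtectedFunction("area", AREA_SOURCE, KEY)
```

Calling `area(KEY, 3, 4)` decrypts and runs only that one function, so a memory snapshot taken at any single moment exposes at most the code that is executing at that point.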
The future of Artificial Intelligence (AI) in cybersecurity involves many aspects. AI has been used in code deobfuscation trials. AI is moderately successful when dealing with each technique and level of obfuscation in isolation, but less so when they are layered together. Runtime dynamic techniques would further help against AI attacks.
Learn more about code obfuscation techniques or examples of how code obfuscation works in practice.
Protect your IP and app’s data by obfuscating the purpose of its code with Promon IP Protection Pro.
Book a meeting to ask us about Promon’s powerful RASP and code obfuscation that defends mobile apps against tampering, reverse engineering, and unauthorized access.
Let us protect you and your valuable code, starting today.