Python-compiled Bytecode Is Used by Attackers to Avoid Detection


Attackers have developed a new method for hiding malware from security scanners, manual review, and other forms of security analysis. They are targeting open-source package repositories such as PyPI, the Python Package Index.


In one instance, malicious code was found in a Python bytecode (PYC) file. Unlike source code files, which the Python runtime must first parse and compile, PYC files can be executed directly.


Source code versus compiled code

The great majority of packages in open-source repositories such as npm for JavaScript, PyPI for Python, and RubyGems for Ruby consist of archived plaintext source code files. Because these archives are easy to unpack and read, the security scanners for those repositories are built to handle this kind of packaging.


In the ongoing contest between attackers and security vendors, obfuscation is the most common evasion method for plaintext code. It relies on a programming language's built-in features, such as encoding, decoding, or eval, to keep the code functional while making it unreadable. For instance, attackers frequently encode malicious code in base64, but security tools can handle such encoding.
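To illustrate why base64 alone is a weak disguise, here is a minimal sketch of how a scanner might surface encoded payloads in source text. The function name find_base64_payloads, the 24-character threshold, and the sample snippet are assumptions made for this illustration and do not describe any vendor's tooling.

```python
import base64
import binascii
import re

# Runs of 24+ base64 characters (optionally padded) are treated as candidates.
# The threshold is an arbitrary assumption for this sketch.
B64_CANDIDATE = re.compile(r"[A-Za-z0-9+/]{24,}={0,2}")


def find_base64_payloads(source: str) -> list[str]:
    """Return the decoded text of every base64-looking literal in the source."""
    decoded = []
    for match in B64_CANDIDATE.finditer(source):
        token = match.group(0)
        try:
            raw = base64.b64decode(token, validate=True)
            decoded.append(raw.decode("utf-8"))
        except (binascii.Error, UnicodeDecodeError):
            # Not valid base64, or not text once decoded: ignore it.
            continue
    return decoded


# A harmless stand-in for an obfuscated line found in a package.
hidden = base64.b64encode(b"print('hello from hidden code')").decode()
sample = f"import base64\nexec(base64.b64decode('{hidden}'))\n"

for payload in find_base64_payloads(sample):
    print(payload)
```

The scanner never executes the sample; it only decodes the literal, which is enough to expose what the exec call would have run.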


The attackers behind the W4SP Stealer malware are well known in the PyPI ecosystem for using techniques such as LZMA compression, base64 encoding, and minification, which removes whitespace and comments to make code more compact but harder to read. To do this, the group relies on several third-party open-source tools, including pyminifier, Kramer, and Hyperion. In one variant of the W4SP attacks, the malicious code was pushed beyond the default screen width so that someone manually reviewing the source file would not notice it.
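The off-screen trick can be caught with a very simple check. The sketch below assumes the code is pushed out of view by a long run of spaces or tabs on the same line as visible code; the function name find_offscreen_code and the 40-character gap are hypothetical choices for illustration.

```python
import re

# A run of 40+ spaces or tabs separating code from more code on the same line.
# The width is an assumption; tune it to the editor widths you care about.
GAP = re.compile(r"\S[ \t]{40,}\S")


def find_offscreen_code(source: str) -> list[int]:
    """Return 1-based numbers of lines with code pushed far to the right."""
    flagged = []
    for number, line in enumerate(source.splitlines(), start=1):
        if GAP.search(line):
            flagged.append(number)
    return flagged


# Harmless sample: line 2 hides a second statement 200 columns to the right.
sample = "import os\n" + "x = 1;" + " " * 200 + "print('hidden')\n" + "y = 2\n"
print(find_offscreen_code(sample))
```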


PYC files are different. Unlike plaintext PY scripts, they are not human-readable. The Python interpreter creates PYC files when it imports a module: it compiles the source code to bytecode and caches the result. Because the code is already compiled, it does not have to be parsed and compiled again the next time it is loaded, which shortens load time. The most typical use for such files is in the distribution of Python modules.
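The relationship between a PY file and its PYC counterpart can be seen with the standard library alone. This is a minimal sketch that assumes Python 3.7 or later, where the PYC header is 16 bytes; the module name greet.py is made up for the example.

```python
import dis
import marshal
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    source_path = pathlib.Path(tmp) / "greet.py"
    source_path.write_text("def greet(name):\n    return 'hello ' + name\n")

    # Compile the source to a PYC file, much as the interpreter does on import.
    pyc_path = py_compile.compile(
        str(source_path), cfile=str(source_path) + "c", doraise=True
    )
    data = pathlib.Path(pyc_path).read_bytes()

# Since Python 3.7 the header is 16 bytes: magic number, flags, then either
# source mtime and size or a source hash. The rest is a marshalled code object.
header, body = data[:16], data[16:]
code = marshal.loads(body)

print(header[:4].hex())  # magic number, differs between Python versions
dis.dis(code)            # readable disassembly of the cached bytecode
```

The file is opaque in a text editor, but the bytecode it holds can still be disassembled by a tool built for the purpose.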


Most PyPI malware is designed to connect to an external URL and download the actual payload, typically an information stealer, which gives security tools another chance to spot unusual activity. With a PYC file, however, the complete malicious payload can be contained in the file itself, which makes it much harder to identify if the security tool is not built to decompile it. That was the case in the most recent incident, in which a package named fshec2 was found to contain a malicious PYC file.
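A tool does not need a full decompiler to get useful signals out of a PYC file. The sketch below walks the code objects in a PYC and collects string constants and referenced names. The helper names, the in-memory stand-in module, and the placeholder URL are all assumptions for illustration; none of them come from the fshec2 analysis. It also assumes the PYC was produced by the same Python version that runs the scan, since the marshal format is version-specific.

```python
import marshal
import types


def iter_code_objects(code: types.CodeType):
    """Yield a code object and every code object nested in its constants."""
    yield code
    for const in code.co_consts:
        if isinstance(const, types.CodeType):
            yield from iter_code_objects(const)


def summarize_pyc(data: bytes) -> dict[str, set[str]]:
    """Collect string constants and referenced names from raw PYC bytes.

    Assumes the 16-byte header used since Python 3.7.
    """
    code = marshal.loads(data[16:])
    strings, names = set(), set()
    for obj in iter_code_objects(code):
        strings.update(c for c in obj.co_consts if isinstance(c, str))
        names.update(obj.co_names)
    return {"strings": strings, "names": names}


# Build PYC-like bytes in memory from a harmless stand-in module.
source = (
    "import socket\n"
    "URL = 'http://example.invalid/cmd'\n"
    "def run():\n"
    "    return socket.gethostname()\n"
)
fake_pyc = b"\x00" * 16 + marshal.dumps(compile(source, "demo.py", "exec"))

report = summarize_pyc(fake_pyc)
print(sorted(s for s in report["strings"] if s.startswith("http")))
print(sorted(report["names"]))
```

Embedded URLs and names such as socket or gethostname are the kind of indicators a source-only scanner would miss when the logic ships only as bytecode.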


ReversingLabs found another new behavior in the fshec2 package that was probably also intended to avoid detection. A Python script normally loads a module with the import statement. In this case, the malicious PYC module was loaded with importlib, a separate module that implements the import functionality and is normally used only in special situations, such as when an imported library is modified dynamically upon import. Because the malicious PYC was not being modified, there is no technical reason to use importlib other than to avoid the standard import statement, probably to evade detection.
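For readers unfamiliar with importlib, the following sketch shows one standard way to load a compiled module by file path without any import statement naming it. The exact calls used in fshec2 are not given here, so this should not be read as a reconstruction of that package; the module name helper and its contents are invented for the example.

```python
import importlib.util
import pathlib
import py_compile
import tempfile

with tempfile.TemporaryDirectory() as tmp:
    src = pathlib.Path(tmp) / "helper.py"
    src.write_text("ANSWER = 42\n")
    pyc = pathlib.Path(tmp) / "helper.pyc"
    py_compile.compile(str(src), cfile=str(pyc), doraise=True)
    src.unlink()  # only the compiled file remains, as in a source-less package

    # No "import helper" statement appears anywhere: the module is loaded
    # from an explicit path, so a scanner looking for imports sees nothing.
    spec = importlib.util.spec_from_file_location("helper", str(pyc))
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

print(module.ANSWER)
```

A scanner that only follows import statements would have no reason to look at helper.pyc, which is why this loading style stands out when nothing is being modified at import time.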


It appears that the main objective is to steal credentials

Once executed on a machine, the fshec2 payload collects information about the system, including usernames, hostnames, and directory listings. It then sets up a cron job on Linux or a scheduled task on Windows to execute commands downloaded from a remote server. These commands allow the malware to update itself, so the attackers can deliver a new version along with additional payloads in the form of Python scripts.


While examining the command-and-control server, the ReversingLabs researchers found errors on it that gave them access to a limited amount of data. They learned, for instance, that each victim machine is assigned an incremental ID, and they were able to confirm that the malware had in fact been executed by a number of victims.


Based on some of the file names found on the server, the attackers may have installed keylogging software on some of the victim machines.


The PyPI security team removed the package after ReversingLabs reported the new attack vector, and said it had not seen this attack technique before. This does not rule out the possibility that other packages with similar functionality will also make their way into the repository.


To address these threats to the software supply chain, organizations need more than static code analysis tools. They need tools that can monitor sensitive development environments for file execution, the creation of suspicious processes, unauthorized URL access, information-gathering commands, and the use of functions that are easy to abuse, such as get_path or importlib.
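As a rough idea of what runtime monitoring can look like inside the interpreter itself, here is a minimal sketch built on Python's audit hooks (available since Python 3.8). The selection of events and the names WATCHED_EVENTS and audit_hook are assumptions for this example, not a complete monitoring policy or a description of any commercial product.

```python
import sys

# Audit events worth surfacing in a sensitive build environment.
# This selection is an assumption for the sketch, not a complete policy.
WATCHED_EVENTS = {"exec", "import", "subprocess.Popen", "socket.connect"}

observed = []


def audit_hook(event, args):
    """Record watched interpreter events; a real monitor would log or alert."""
    if event in WATCHED_EVENTS:
        observed.append((event, args))


# Audit hooks cannot be removed once installed.
sys.addaudithook(audit_hook)

# Harmless trigger: executing a code object raises the "exec" event.
# CPython's import machinery runs module bytecode through the same builtin.
exec(compile("value = 1", "<demo>", "exec"))

exec_files = [args[0].co_filename for event, args in observed if event == "exec"]
print("<demo>" in exec_files)
```

Because the hook sees the code object at the moment it is executed, it records activity regardless of whether the code arrived as plaintext source or as a PYC file.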
