
Abstract
Insecure deserialization is among the most critical and pervasive classes of software vulnerability: it appeared as its own category (A8) in the OWASP Top 10 2017 and was folded into ‘Software and Data Integrity Failures’ (A08) in the 2021 edition. These flaws let attackers execute arbitrary code, manipulate objects, bypass authentication, or launch injection attacks by exploiting the trust an application places in reconstructed data. This paper examines how deserialization vulnerabilities manifest across programming languages and frameworks, and explains the mechanisms behind common attack vectors, including remote code execution (RCE), denial of service (DoS), and privilege escalation. It then outlines mitigation strategies for developers, security architects, and IT professionals. Drawing on real-world incidents such as the critical CVE-2025-23120 in Veeam Backup & Replication, the analysis underscores the importance of adopting and rigorously enforcing secure deserialization practices to protect enterprise applications and critical infrastructure.
Many thanks to our sponsor Esdebe who helped us prepare this research report.
1. Introduction
In the intricate landscape of modern software architecture, the processes of serialization and deserialization are foundational pillars, enabling the seamless conversion of complex in-memory objects into a format suitable for persistent storage or transmission across network boundaries, and their subsequent faithful reconstruction. These processes are indispensable for a multitude of core functionalities, including data caching, inter-process communication (IPC), web services, message queuing, and distributed computing environments. However, despite their utility, these mechanisms introduce a significant and often underestimated attack surface when applications handle data originating from untrusted or unvalidated sources.
Insecure deserialization arises precisely when an application attempts to reconstruct an object from serialized data without adequate validation or integrity checks. This critical oversight can transform seemingly innocuous data streams into potent vectors for severe security breaches, ranging from devastating remote code execution (RCE) and crippling denial-of-service (DoS) attacks to insidious privilege escalation and data tampering. The core vulnerability lies in the fact that the deserialization process, by its very nature, often involves dynamically instantiating classes and invoking methods based on the incoming serialized data. If an attacker can control this data, they can effectively dictate which classes are loaded and which methods are invoked, leading to arbitrary code execution within the application’s context.
The OWASP Foundation, a leading authority on web application security, consistently highlights insecure deserialization as a severe vulnerability, defining it as a flaw that emerges when ‘untrusted data is used to abuse the logic of an application’s deserialization process, allowing an attacker to execute code, manipulate objects, or perform injection attacks’ (owasp.org). This vulnerability often stems from a fundamental misunderstanding of the trust model: developers frequently assume that deserialized data will always be benign, particularly if it originates from an internal or seemingly trusted system. This assumption is perilous, as attackers can often inject malicious serialized payloads at various points, including network requests, database entries, file uploads, or even through compromised internal systems.
Recent history is replete with examples of high-impact insecure deserialization vulnerabilities affecting widely deployed software. The critical CVE-2025-23120 vulnerability in Veeam Backup & Replication serves as a highly pertinent and recent illustration of the profound dangers associated with this class of flaw. This particular vulnerability allowed an authenticated domain user to execute arbitrary code on the backup server, potentially granting attackers full control over an organization’s crucial backup infrastructure and underlying data. Such incidents underscore the critical necessity for developers and security professionals to possess a deep and nuanced understanding of deserialization vulnerabilities, their technical underpinnings, their impact across diverse programming environments, and, most importantly, the robust and multi-layered strategies required for their effective mitigation.
This paper aims to provide precisely this comprehensive understanding, dissecting the anatomy of insecure deserialization, exploring its manifestations in popular programming languages, analyzing a significant real-world exploit, and proposing actionable, defense-in-depth mitigation strategies to fortify modern software systems against these pervasive and potent threats.
2. Understanding Deserialization Vulnerabilities
To fully grasp the intricacies of insecure deserialization, it is imperative to first establish a solid understanding of the fundamental processes of serialization and deserialization themselves, followed by a detailed examination of how these processes are subverted by malicious actors.
2.1 Serialization and Deserialization Processes
Serialization is the process of converting an object’s state (its data and, in some formats, metadata describing its class structure) into a format that can be easily stored or transmitted. This process is often referred to as ‘marshalling’ an object. The resulting format can vary widely, from a binary byte stream (common in Java’s native serialization or Python’s pickle) to human-readable text formats like JSON (JavaScript Object Notation), XML (Extensible Markup Language), or YAML (YAML Ain’t Markup Language). The primary goals of serialization include:
- Data Persistence: Allowing objects to be saved to disk, databases, or cloud storage for later retrieval, enabling applications to resume their state or data across sessions.
- Inter-Process Communication (IPC): Facilitating the exchange of complex data structures between different processes, which may even run on different machines or operating systems.
- Web Services and APIs: Standardizing data exchange formats for communication between clients and servers, or between different microservices within a distributed system.
- Caching: Storing complex objects in cache memory or distributed caches to improve performance.
During serialization, an object’s instance variables (fields) are typically converted into a sequence of bytes or a textual representation that captures their values and types. For complex objects, this often involves traversing the object graph, serializing all referenced objects recursively.
Deserialization is the precise reverse process: reconstructing the original object from its serialized form. This involves parsing the byte stream or textual data, instantiating the appropriate classes, and populating their fields with the values extracted from the serialized data. This process relies heavily on the runtime environment’s ability to locate and load the corresponding class definitions for the objects being reconstructed. It is this dynamic class loading and object instantiation that forms the crux of the insecure deserialization vulnerability. Without proper controls, the deserializer can be tricked into loading unexpected classes or invoking dangerous methods during the reconstruction phase.
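The round trip described above can be sketched in a few lines of Python. This is a minimal illustration, assuming a hypothetical `UserProfile` type and JSON as the wire format; neither is taken from any particular application.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class UserProfile:
    # Hypothetical example type used only for this illustration.
    username: str
    roles: list
    login_count: int


def serialize(profile: UserProfile) -> str:
    """Flatten the object's fields into a JSON text representation."""
    return json.dumps(asdict(profile))


def deserialize(text: str) -> UserProfile:
    """Parse the text and populate a new instance field by field."""
    data = json.loads(text)
    return UserProfile(
        username=data["username"],
        roles=list(data["roles"]),
        login_count=data["login_count"],
    )


original = UserProfile("alice", ["reader"], 3)
wire = serialize(original)      # text suitable for storage or transmission
restored = deserialize(wire)    # a new, equal object built from that text
```

Note that `deserialize` names the class it builds in code. The vulnerabilities discussed in this paper arise when that decision is instead delegated to the incoming data.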
Various serialization formats exist, each with different characteristics and inherent security implications:
- Binary Formats (e.g., Java’s `ObjectOutputStream`, Python’s `pickle`, .NET’s `BinaryFormatter`): These formats are highly efficient and language-specific, directly mapping object structures to byte streams. They often include metadata about the class definitions and can be highly powerful, allowing for the serialization of complex object graphs, including private fields and custom serialization logic (e.g., `readObject` in Java). This power, however, comes at a significant security cost, as they can embed executable logic or refer to internal methods that, when invoked during deserialization, can be abused to execute arbitrary code. They are generally considered unsafe for untrusted data.
- Text-based Formats (e.g., JSON, XML, YAML): These formats are human-readable, language-agnostic, and widely used for data exchange over the web. While primarily designed for data representation, they can still lead to deserialization vulnerabilities if the deserializer dynamically instantiates objects or invokes methods based on arbitrary type information embedded within the data (e.g., polymorphic deserialization where a type hint in the JSON/XML tells the deserializer which specific class to instantiate). YAML, in particular, has a history of vulnerabilities due to its ability to represent arbitrary objects and invoke methods during parsing if not handled carefully (e.g., PyYAML’s `yaml.load`).
2.2 Insecure Deserialization Explained
Insecure deserialization occurs when an application deserializes data from an untrusted source without performing adequate validation, integrity checks, or restricting the types of objects that can be instantiated. The vulnerability stems from the implicit trust an application places in the structure and content of the incoming serialized data. An attacker can exploit this trust by crafting a malicious serialized payload that, when deserialized by the application, triggers unintended and harmful behaviors.
The core mechanism behind most insecure deserialization attacks involves abusing the deserialization logic to:
- Instantiate Arbitrary Classes: The deserializer dynamically creates objects based on the type information present in the serialized data. If this type information can be controlled by an attacker, they can force the application to instantiate classes that were never intended to be created from external input.
- Invoke Malicious Methods (Gadget Chains): Many object-oriented programming languages allow classes to define custom logic that executes during the deserialization process (e.g., `readObject` in Java, `__unserialize` or magic methods like `__destruct` in PHP, `__reduce__` in Python, `ISerializable` in .NET). Attackers can identify existing code in the application’s classpath (libraries or application code) that, when chained together, can perform malicious actions. These chains of vulnerable methods, often within legitimate libraries, are known as ‘gadget chains.’ A single ‘gadget’ might perform an innocuous action, but when one gadget’s output becomes the input for another, a powerful chain of actions can be constructed, culminating in remote code execution.
- Manipulate Object State: Even without RCE, attackers can alter the properties of deserialized objects to gain unauthorized access, elevate privileges, or tamper with application logic. For example, modifying a session object to change a user’s role from ‘guest’ to ‘administrator’.
Consider the concept of ‘gadget chains’ more closely. A gadget is any method or constructor in the application’s classpath that can be triggered during deserialization and has a useful side effect (from an attacker’s perspective). These are often legitimate methods (e.g., a `toString()` method that executes external commands, or a `finalize()` method that deletes files). Attackers compile lists of such gadgets from common libraries (e.g., Apache Commons Collections in Java, widely used in many applications). They then craft a serialized payload that, when deserialized, instantiates a sequence of objects such that the deserialization of one object calls a method on another, which in turn calls a method on a third, and so on, until a dangerous operation (like executing a shell command) is performed. The discovery and exploitation of these gadget chains are highly sophisticated and often rely on automated tools like ysoserial for Java.
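The chaining idea can be made concrete with a toy model. The sketch below is not a real exploit: `CacheEntry`, `Task`, `run_command`, and `naive_deserialize` are all hypothetical names invented for illustration, and the ‘dangerous’ sink only records its argument instead of running anything.

```python
# Toy model of a gadget chain. Every name here is hypothetical and stands in
# for ordinary library code; the "dangerous" sink only records its argument.
EXECUTED = []


def run_command(cmd):
    """Stand-in for a dangerous sink such as a shell call; it only records."""
    EXECUTED.append(cmd)


class Task:
    """Gadget 2: a generic callback wrapper, harmless in normal use."""

    def __init__(self, func_name, arg):
        self.func_name, self.arg = func_name, arg

    def run(self):
        CALLABLES[self.func_name](self.arg)


class CacheEntry:
    """Gadget 1: refreshes itself right after being rebuilt from stored data."""

    def __init__(self, loader):
        self.loader = loader

    def after_load(self):  # lifecycle hook invoked by the deserializer
        self.loader.run()


CALLABLES = {"run_command": run_command}
CLASSES = {"Task": Task, "CacheEntry": CacheEntry}


def naive_deserialize(node):
    """Rebuild whatever class the data names, then call its load hook."""
    if isinstance(node, dict) and "type" in node:
        cls = CLASSES[node["type"]]
        kwargs = {k: naive_deserialize(v) for k, v in node["fields"].items()}
        obj = cls(**kwargs)
        if hasattr(obj, "after_load"):
            obj.after_load()
        return obj
    return node


# Attacker-controlled data: CacheEntry -> Task -> run_command("id")
payload = {"type": "CacheEntry", "fields": {"loader": {
    "type": "Task", "fields": {"func_name": "run_command", "arg": "id"}}}}
naive_deserialize(payload)
```

Neither class is malicious on its own. The sink is reached only because the payload wires one gadget's field to another and the deserializer runs a hook automatically during reconstruction.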
2.3 Common Attack Vectors and Consequences
Insecure deserialization vulnerabilities can be exploited through various vectors, leading to a spectrum of severe consequences:
- Remote Code Execution (RCE): This is the most critical impact. By crafting malicious serialized data, attackers can force the application to execute arbitrary commands on the underlying server. This is typically achieved by exploiting gadget chains that ultimately invoke system commands, load arbitrary classes, or manipulate class loaders. For instance, in Java, an attacker might craft a payload that, when deserialized, utilizes methods within the `Runtime` class or Spring Framework components to execute shell commands. In Python, a class can define a `__reduce__` method that the `pickle` module honors, causing arbitrary code to execute during unpickling. RCE grants attackers complete control over the compromised system, allowing them to install backdoors, steal data, or launch further attacks.
- Denial of Service (DoS): Maliciously crafted serialized data can be designed to consume excessive system resources (CPU, memory, disk I/O) during the deserialization process, leading to the application or server becoming unresponsive or crashing. Examples include:
  - Infinite Loops: Crafting objects that refer to themselves or create circular dependencies, leading to an infinite loop during deserialization graph traversal.
  - Excessive Object Creation: Forcing the deserializer to create an enormous number of objects, exhausting available memory.
  - CPU Exhaustion: Triggering computationally intensive operations within deserialization logic (e.g., complex string manipulations or cryptographic operations on very large inputs).
  - Deep Object Graphs: Creating extremely deep nested object structures, causing stack overflows during recursive deserialization.
- Privilege Escalation: By manipulating properties of deserialized objects, an attacker can alter the application’s behavior to gain elevated privileges or unauthorized access. This might involve changing user roles (e.g., from a standard user to an administrator), modifying security context objects, or altering access control lists managed by the application. For example, if a serialized session object contains an `isAdmin` boolean field, an attacker might flip this value to `true` to bypass authorization checks.
- Data Tampering and Data Exfiltration: Attackers can modify the data within serialized objects to alter application state or inject malicious data into databases or files. They might also extract sensitive information that is inadvertently included in serialized streams but not intended for external consumption. For instance, if an application serializes sensitive user data for caching, an attacker might intercept and deserialize this data to extract credit card numbers or personally identifiable information.
- Bypassing Authentication/Authorization: In scenarios where session tokens, authentication credentials, or authorization contexts are serialized and deserialized (e.g., in cookies or hidden form fields), an attacker can tamper with these objects to bypass login mechanisms or gain access to restricted functionalities without proper authentication.
- Injection Attacks (SQL, Command, LDAP): While less direct than RCE, insecure deserialization can sometimes be a precursor to other injection vulnerabilities. If a deserialized object’s properties are directly used in constructing database queries, shell commands, or LDAP queries without proper sanitization, the attacker can embed malicious payloads that lead to SQL injection, command injection, or LDAP injection.
- Cross-Site Scripting (XSS) and Server-Side Request Forgery (SSRF): Although less common as a direct consequence, deserialization can sometimes be leveraged to achieve these. For example, if a deserialized object contains a URL that is then accessed by the server without validation (leading to SSRF), or if a deserialized object’s string representation is rendered directly to a web page without encoding (leading to XSS).
The broad range and severity of these consequences underscore why insecure deserialization is consistently ranked among the most dangerous application security vulnerabilities. Its impact is often amplified by the fact that the underlying deserialization libraries are typically robust and feature-rich, providing attackers with powerful primitives to build their exploits.
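Several of the consequences above (privilege escalation, data tampering, authentication bypass) depend on the attacker being able to alter serialized state unnoticed. One common countermeasure is an integrity check over the serialized bytes. The following is a minimal sketch using an HMAC; the key, the token layout, and the function names are assumptions made for illustration, and a production system would load the key from a secret store and typically add expiry.

```python
import base64
import hashlib
import hmac
import json

# Assumption for the demo only; real keys belong in a secret store.
SECRET_KEY = b"demo-key-change-me"


def sign_session(session: dict) -> str:
    """Serialize the session and append an HMAC-SHA256 tag over the bytes."""
    body = base64.urlsafe_b64encode(json.dumps(session, sort_keys=True).encode())
    tag = hmac.new(SECRET_KEY, body, hashlib.sha256).hexdigest()
    return body.decode() + "." + tag


def load_session(token: str) -> dict:
    """Verify the tag first; only parse the body if it is authentic."""
    body, _, tag = token.rpartition(".")
    expected = hmac.new(SECRET_KEY, body.encode(), hashlib.sha256).hexdigest()
    # Constant-time comparison on bytes avoids leaking the tag via timing.
    if not hmac.compare_digest(tag.encode(), expected.encode()):
        raise ValueError("session token failed integrity check")
    return json.loads(base64.urlsafe_b64decode(body))


token = sign_session({"user": "bob", "isAdmin": False})

# An attacker flips the flag and re-encodes the body, but cannot recompute
# the tag without the key, so the old tag is reused.
forged_body = base64.urlsafe_b64encode(
    json.dumps({"user": "bob", "isAdmin": True}, sort_keys=True).encode()
).decode()
forged = forged_body + "." + token.rpartition(".")[2]
```

Calling `load_session(token)` returns the original session, while `load_session(forged)` raises `ValueError` before any parsing takes place. Signing protects integrity only; it does not make an unsafe deserializer safe if the key leaks.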
3. Case Study: CVE-2025-23120 in Veeam Backup & Replication
The CVE-2025-23120 vulnerability in Veeam Backup & Replication provides a compelling and contemporary illustration of the real-world impact of insecure deserialization. This incident highlights how even sophisticated enterprise software, widely used for critical data protection, can fall prey to such flaws.
3.1 Vulnerability Overview
CVE-2025-23120 is classified as a critical remote code execution (RCE) vulnerability that was discovered and publicly disclosed in March 2025. It specifically affected Veeam Backup & Replication versions 12.3.0.310 and earlier. This vulnerability was particularly concerning because it targeted backup servers that were joined to a Microsoft Active Directory domain, a common deployment scenario for enterprise environments. The flaw enabled authenticated domain users – meaning anyone with valid credentials to the domain, even those with low privileges – to execute arbitrary code on the Veeam backup server. This level of access could lead to catastrophic consequences for data integrity, availability, and confidentiality. Veeam promptly released a patch, addressing the vulnerability in version 12.3.1 (build 12.3.1.1139) (veeam.com).
Veeam Backup & Replication is a leading solution for data backup, recovery, and replication for virtual, physical, and cloud-based workloads. Its critical role in protecting an organization’s most valuable asset – its data – makes any vulnerability in the product, especially one leading to RCE, a severe threat. A successful exploit could allow an attacker to delete, encrypt, or exfiltrate all backed-up data, potentially facilitating ransomware attacks or extensive data breaches without affecting the primary production systems immediately.
3.2 Technical Details
The root cause of CVE-2025-23120 was identified as improper deserialization within specific .NET classes used by the Veeam Backup & Replication software. Specifically, the vulnerability was traced to the `Veeam.Backup.EsxManager.xmlFrameworkDs` and `Veeam.Backup.Core.BackupSummary` .NET classes. While the precise details of the exploit chain were not fully disclosed to limit further exploitation, the general mechanism aligns with typical .NET insecure deserialization patterns.
In the .NET ecosystem, deserialization vulnerabilities often stem from the use of insecure formatters such as `BinaryFormatter`, `SoapFormatter`, or `NetDataContractSerializer`. These formatters are powerful because they can serialize and deserialize arbitrary .NET object graphs, with the type of each object named in the payload itself. When these formatters are used to deserialize untrusted input, an attacker can supply a malicious payload that contains references to dangerous classes and methods already present in the application’s memory or on disk. Upon deserialization, the formatter attempts to reconstruct these objects, inadvertently invoking the dangerous methods or constructors specified by the attacker.
In the context of Veeam, an authenticated domain user could craft specialized requests containing a malicious serialized payload. These requests were likely directed at specific API endpoints or internal services that, unbeknownst to the developers, invoked the vulnerable deserialization logic within the `xmlFrameworkDs` or `BackupSummary` classes. The .NET deserialization process, lacking sufficient validation or a robust allow-list of permissible types, would then proceed to instantiate objects defined by the attacker’s payload. This could include objects from system libraries or third-party components that contained ‘gadget’ methods, leading to arbitrary code execution. For instance, common .NET deserialization gadgets involve types like `System.Data.DataSet` and `System.Data.DataTable`, or gadget chains such as `TypeConfuseDelegate`, that can be abused to invoke arbitrary methods. The fact that the attack required ‘authenticated’ access means the attacker needed valid domain credentials, but these could be low-privileged user accounts, making the vulnerability widely exploitable in typical enterprise environments (bleepingcomputer.com).
The vulnerability highlighted a critical oversight in the handling of internal or inter-component communication within the Veeam application, where serialized data, even if originating from an authenticated user, was not treated with the necessary skepticism and validation before deserialization. This reinforces the principle that ‘authenticated’ access does not equate to ‘trusted’ data when it comes to deserialization.
3.3 Impact Assessment
The exploitation of CVE-2025-23120 carried profound implications for organizations leveraging Veeam Backup & Replication. The ability for an authenticated domain user – potentially a standard employee account – to achieve arbitrary code execution on a backup server is devastating. The impact assessment includes:
- Total Compromise of Backup Data: An attacker gaining RCE on the backup server could access, modify, delete, or encrypt all backup files and configurations. This directly jeopardizes the integrity and availability of an organization’s most critical data recovery mechanism. In a ransomware scenario, attackers commonly target backup systems to prevent recovery, and this vulnerability provided a direct avenue for such an attack.
- Lateral Movement and Network Intrusion: The compromised backup server, being an integral part of the enterprise network, could serve as a beachhead for further lateral movement within the network. Attackers could use it to launch attacks against production servers, domain controllers, or other critical assets.
- Data Exfiltration: Sensitive data contained within backups (e.g., intellectual property, customer data, financial records) could be exfiltrated without detection, leading to severe data breaches, regulatory fines, and reputational damage.
- Loss of Operational Continuity: The disruption or destruction of backup services could severely impede an organization’s ability to recover from other incidents (e.g., hardware failures, accidental deletions), leading to significant downtime and business interruption.
- Compliance and Regulatory Implications: For organizations operating under strict data protection regulations (e.g., GDPR, HIPAA, PCI DSS), a compromise of backup data due to such a critical vulnerability could lead to severe non-compliance penalties.
This incident unequivocally underscores the critical need for secure deserialization practices in the development lifecycle of all software, particularly those handling sensitive data or operating in critical infrastructure roles. It also highlights that authentication alone is insufficient to guarantee the safety of deserialized data.
4. Deserialization Vulnerabilities Across Programming Languages
While the fundamental concept of insecure deserialization remains consistent, its manifestation and the specific exploitation techniques vary significantly across different programming languages and their respective serialization frameworks. Understanding these language-specific nuances is crucial for both identifying and mitigating the vulnerability.
4.1 Java
Java’s native serialization mechanism, provided by `java.io.ObjectOutputStream` and `java.io.ObjectInputStream`, has historically been a prime target for insecure deserialization attacks. This mechanism allows for the serialization of complex object graphs, including private fields and custom logic. The core of the vulnerability in Java lies with the `readObject()` method. When an `ObjectInputStream` deserializes an object, it invokes the `readObject()` method of each class in the object graph if that class implements `Serializable` and defines a custom `readObject()` method. Furthermore, even without a custom `readObject()`, the default deserialization process can still be dangerous if it instantiates classes that have vulnerable constructors or static initializers.
Attackers leverage this by constructing ‘gadget chains’ from commonly used third-party libraries that are present in many Java applications. Notorious examples include Apache Commons Collections, Spring Framework, Hibernate, and Groovy. These libraries contain legitimate classes with methods that, when called in sequence during deserialization, can lead to arbitrary code execution. For instance, the `InvokerTransformer` gadget in Apache Commons Collections can be used to call arbitrary methods via reflection. The `ysoserial` tool is widely used by security researchers and attackers to generate payloads for various Java gadget chains (owasp.org).
Mitigation in Java often involves:
- Avoiding `ObjectInputStream.readObject()` for untrusted data.
- Overriding `resolveClass()` in a custom `ObjectInputStream` subclass to restrict deserialized types to a strict allow-list.
- Using `ObjectInputFilter` (introduced in Java 9 and backported to Java 8u121+) to allow-list the specific classes that are permitted to be deserialized.
- Employing safer data formats like JSON or XML with secure parsing libraries (e.g., Jackson with polymorphic default typing left disabled, Gson, or JAXB with external entity resolution disabled for XML).
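The allow-list-at-class-resolution idea in the list above is not specific to Java. Because the examples in this paper use Python, the sketch below shows the analogous hook there: overriding `find_class` on `pickle.Unpickler`, which plays roughly the role that a class-resolution override or `ObjectInputFilter` plays for `ObjectInputStream`. The allow-list contents and the names `AllowListUnpickler` and `restricted_loads` are assumptions chosen for the demonstration, and this is an illustration of the pattern rather than a recommendation to unpickle untrusted data.

```python
import io
import pickle

# Assumption: the only non-primitive type this application expects to receive.
ALLOWED = {("collections", "OrderedDict")}


class AllowListUnpickler(pickle.Unpickler):
    """Refuse to resolve any global that is not explicitly allow-listed."""

    def find_class(self, module, name):
        if (module, name) in ALLOWED:
            return super().find_class(module, name)
        raise pickle.UnpicklingError(f"type {module}.{name} is not allowed")


def restricted_loads(data: bytes):
    """Deserialize with every class lookup routed through the allow-list."""
    return AllowListUnpickler(io.BytesIO(data)).load()
```

With this in place, a pickled `collections.OrderedDict` loads normally, while a stream that references any other global (for example `builtins.print`) is rejected with `pickle.UnpicklingError` before the referenced callable can be used.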
4.2 Python
Python’s `pickle` module, used for serializing and deserializing Python object structures, poses significant security risks when handling untrusted data. The official Python documentation explicitly warns against unpickling data from untrusted sources, stating, ‘It is possible to construct malicious pickle data which will execute arbitrary code during unpickling’ (acunetix.com).
The danger in `pickle` primarily stems from its ability to serialize and deserialize arbitrary Python objects, including instances of classes and even functions. During the unpickling process, `pickle` reconstructs the object graph by calling methods and constructors. Attackers can leverage the `__reduce__` method, a special method that allows a class to customize its pickling behavior. By defining a malicious `__reduce__` method, an attacker can specify a callable and its arguments that will be executed when the object is unpickled. This allows for direct execution of arbitrary Python code.
Another source of vulnerability in Python is the PyYAML library, particularly its `yaml.load()` function (in older versions, or whenever it is called with an unsafe loader instead of `yaml.safe_load()`). PyYAML can parse YAML documents into Python objects. If a YAML document contains specific tags (e.g., `!!python/object/apply:os.system`), an unsafe `yaml.load()` call can execute arbitrary functions during parsing.
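The `__reduce__` mechanism is easy to demonstrate with a harmless payload. In the sketch below, the class name `Exploit` is hypothetical and the callable is `eval` applied to a benign arithmetic expression; a real attacker would substitute something like `os.system` and a shell command.

```python
import pickle


class Exploit:
    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call eval('6 * 7')".
        # The class itself is never stored in the pickle stream.
        return (eval, ("6 * 7",))


payload = pickle.dumps(Exploit())

# The receiving side only calls pickle.loads; the callable chosen by the
# sender runs as a side effect of unpickling.
result = pickle.loads(payload)  # result is 42, an int, not an Exploit instance
```

The receiving process does not need to have the `Exploit` class defined at all, since the stream references only `builtins.eval` and its argument. This is why validating the type of the returned object after `pickle.loads` comes too late.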
Mitigation in Python includes:
- Never unpickle data from untrusted sources. This is the golden rule for `pickle`.
- For PyYAML, always use `yaml.safe_load()` instead of `yaml.load()` when parsing untrusted data.
- Consider using safer serialization formats like JSON or Protocol Buffers, which are data-centric and do not typically allow arbitrary code execution during deserialization unless custom insecure deserializers are implemented on top of them.
4.3 PHP
In PHP, the `unserialize()` function is the primary culprit in insecure deserialization attacks. Similar to Java’s `readObject()`, `unserialize()` can trigger ‘magic methods’ (methods starting with `__`) within PHP classes during the deserialization process. These magic methods are special methods that PHP calls automatically in certain circumstances.
Commonly abused magic methods include:
- `__wakeup()`: Called when an object is unserialized. Attackers might exploit vulnerabilities in this method or use it to bypass initial checks.
- `__destruct()`: Called when an object is garbage collected or goes out of scope. This is a common target for gadget chains, as the attacker can control the object’s properties that are then used by the `__destruct()` method, potentially leading to file deletion, arbitrary function calls, or command execution.
- `__toString()`: Called when an object is treated as a string.
- `__call()`, `__callStatic()`: Used for overloading methods.
Attackers perform ‘PHP Object Injection’ by manipulating the serialized string to control the properties of objects and trigger dangerous magic methods in an unexpected order, forming a gadget chain. PHP has also been exposed to Phar (PHP Archive) deserialization, where a crafted `phar` file containing serialized metadata could be processed by certain file operations (like `file_exists` or `file_get_contents`) through the `phar://` stream wrapper, without any explicit `unserialize()` call, leading to RCE (invicti.com).
Mitigation in PHP involves:
- Avoiding `unserialize()` with untrusted data. This is the most crucial step.
- If `unserialize()` is absolutely necessary for trusted data, implement strict input validation and ensure that the serialized data only contains expected values and types.
- Use `json_encode()` and `json_decode()` as preferred alternatives for data serialization, as JSON is generally safer.
- Regularly audit code for `unserialize()` calls and ensure they are appropriately protected.
- Be aware of `phar` deserialization vectors and avoid processing untrusted `phar` files with file system functions.
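The two middle recommendations (prefer a data-only format such as JSON, and accept only expected values and types) combine naturally. Since this paper's examples use Python, the sketch below shows the decode-then-validate pattern with Python's `json` module rather than PHP's `json_decode()`; the field names and the `load_preferences` function are hypothetical.

```python
import json

# Hypothetical schema: the exact fields and types this endpoint accepts.
EXPECTED_FIELDS = {"user_id": int, "theme": str, "newsletter": bool}


def load_preferences(text: str) -> dict:
    """Parse JSON, then reject anything that is not exactly the expected shape."""
    data = json.loads(text)  # yields only dicts, lists, strings, numbers, bools, None
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    if set(data) != set(EXPECTED_FIELDS):
        raise ValueError("unexpected or missing fields")
    for name, expected_type in EXPECTED_FIELDS.items():
        # bool is a subclass of int in Python, so compare exact types.
        if type(data[name]) is not expected_type:
            raise ValueError(f"field {name!r} has the wrong type")
    return data
```

Because the parser can only produce plain data values, no class is instantiated on the sender's behalf, and the explicit shape check rejects extra fields such as an injected `isAdmin` flag.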
4.4 .NET
The .NET framework offers several serialization mechanisms, and some have been historically prone to insecure deserialization. The Veeam incident (CVE-2025-23120) is a prime example of this. Similar to Java and Python, the danger in .NET arises when arbitrary types can be instantiated during deserialization from untrusted data.
Key vulnerable serializers/formatters in .NET include:
- `BinaryFormatter`: This is one of the most dangerous formatters. It can deserialize an entire object graph, including private members and properties, and execute arbitrary code by invoking constructors and methods of the deserialized types. It is considered highly unsafe for untrusted input.
- `SoapFormatter`: Similar to `BinaryFormatter` in its capabilities and vulnerabilities.
- `NetDataContractSerializer`: Can also be vulnerable if not restricted.
Safer alternatives, typically used for data-only serialization, include `DataContractJsonSerializer`, `XmlSerializer`, and `JavaScriptSerializer`. However, even with these, if custom `JsonConverters` or `XmlAttributeOverrides` are used to allow type specification within the data, vulnerabilities can still emerge.
Attackers often exploit well-known .NET ‘gadget chains’ involving types like `System.Data.DataSet`, `System.Data.DataTable`, and `ObjectDataProvider` (from `System.Windows.Data` in WPF), as well as the `TypeConfuseDelegate` chain, among others. These gadgets allow attackers to achieve RCE by abusing features intended for legitimate operations, such as loading external XML schemas or invoking delegates.
Mitigation in .NET focuses on:
- Avoiding `BinaryFormatter`, `SoapFormatter`, and `NetDataContractSerializer` for untrusted data. Microsoft strongly recommends against using them for security reasons.
- Using safer, data-only serializers like `DataContractJsonSerializer`, `XmlSerializer`, or `Newtonsoft.Json` (Json.NET) with `TypeNameHandling.None`; enable a setting such as `TypeNameHandling.Objects` only when strictly necessary, and then only with a custom `SerializationBinder` to restrict types.
to restrict types. - Implementing strict input validation and type checking on deserialized data.
- Applying the principle of least privilege to the deserialization process.
- Regularly patching and updating .NET frameworks and libraries.
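The type-restriction recommendation above amounts to never letting a type name in the payload resolve to an arbitrary class. The sketch below illustrates that pattern in Python (the language used for examples in this paper) rather than C#: a `$type` field is honored only if it maps to an entry in a fixed table, which is roughly the job a restrictive serialization binder performs in Json.NET. The `Circle` and `Square` classes, the `$type` convention as used here, and the function names are assumptions made for illustration.

```python
import json
from dataclasses import dataclass


@dataclass
class Circle:
    radius: float


@dataclass
class Square:
    side: float


# Hypothetical allow-list playing the role of a restrictive binder.
KNOWN_TYPES = {"Circle": Circle, "Square": Square}


def bind_type(type_name):
    """Map a type name from the payload to a class, or refuse."""
    try:
        return KNOWN_TYPES[type_name]
    except (KeyError, TypeError):
        raise ValueError(f"type {type_name!r} is not permitted") from None


def load_shape(text: str):
    """Deserialize a polymorphic value without trusting its type hint."""
    data = json.loads(text)
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    cls = bind_type(data.pop("$type", None))
    return cls(**data)  # unexpected fields raise TypeError
```

A payload such as `{"$type": "Circle", "radius": 2.0}` yields a `Circle`, while one naming `System.Diagnostics.Process` or any other unlisted type is refused before anything is instantiated.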
4.5 Other Languages and Frameworks
Insecure deserialization is not limited to Java, Python, PHP, or .NET. Other languages and frameworks also present similar risks:
- Ruby: The Marshal.load and YAML.load methods are known to be vulnerable if used with untrusted input. Marshal.load can execute arbitrary code by loading classes, while YAML.load (in older versions or without safe_load) can execute code via custom tags.
- Node.js/JavaScript: While JSON is the native data serialization format and is generally safer (as it’s data-centric and doesn’t directly support code execution), custom deserializers or libraries that mimic binary serialization (e.g., node-serialize) can introduce vulnerabilities. If an application uses Node.js’s vm module to execute deserialized code within a sandbox, insufficient sandboxing can lead to escape.
- Go: Go’s gob encoding is a binary serialization format that, while less commonly exploited for RCE due to Go’s strong typing and lack of runtime class loading by string name, can still be vulnerable to DoS attacks via recursive or deeply nested structures if not handled carefully.
The common thread across all these languages is the danger of allowing untrusted data to dictate the types of objects instantiated or the methods invoked during the deserialization process. This fundamental principle must guide mitigation strategies.
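To make this principle concrete, the following minimal Python sketch shows a serialized payload, rather than the receiving code, choosing which callable runs during deserialization. The class name Canary is hypothetical and the callable is deliberately harmless (the built-in len); it is an illustration of the mechanism only, not a reproduction of any specific exploit.

```python
import pickle


class Canary:
    """Hypothetical class whose pickled form names a callable to run on load."""

    def __reduce__(self):
        # Tells pickle: "to rebuild this object, call len('abc')".
        # Whoever builds the payload picks the callable and its arguments.
        return (len, ("abc",))


payload = pickle.dumps(Canary())

# The receiver only calls loads(), yet the callable named in the payload runs.
result = pickle.loads(payload)

# The result is whatever the callable returned, not a Canary instance.
print(type(result).__name__, result)
```

The receiving side never references len; the payload alone decided what was invoked, which is exactly the control that must be taken away from untrusted data.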
5. Mitigation Strategies
Mitigating insecure deserialization vulnerabilities requires a multi-layered, defense-in-depth approach. No single solution is foolproof, and a combination of strategies is necessary to effectively protect applications from these sophisticated attacks. The primary focus should always be on reducing the attack surface and minimizing the trust placed in external data during the deserialization process.
5.1 Avoid Deserializing Untrusted Data
The most fundamental and effective mitigation strategy is to completely avoid deserializing data from untrusted sources whenever possible (portswigger.net). This ‘golden rule’ acknowledges that any data originating from outside the application’s trusted boundary – including user input, network requests, files uploaded by users, or even data from potentially compromised third-party systems – should be considered hostile. If deserialization of external data is absolutely necessary, it must be treated as potentially malicious and subject to rigorous scrutiny.
Instead of full object deserialization, consider alternative, safer data exchange formats that are inherently data-centric rather than object-centric. Examples include:
- JSON (JavaScript Object Notation): A lightweight, human-readable format ideal for data exchange. Most JSON parsers do not support arbitrary code execution by default, making them generally safer for untrusted data, provided no custom deserialization logic introduces a vulnerability.
- XML (Extensible Markup Language): Another widely used format. XML parsers can be vulnerable to XXE (XML External Entity) attacks, but they generally carry less RCE risk than binary object deserializers, provided external entity resolution is disabled and no XML-to-object mapping mechanism introduces a deserialization gadget.
- Protocol Buffers (Protobuf), Apache Thrift, Apache Avro: These are language-agnostic data serialization formats that use schemas to define data structures. Their serialization and deserialization code is typically generated from the schema, is type-safe, and includes no mechanism for executing arbitrary code from the data itself. They enforce strict data contracts, making them much harder to exploit for object-related attacks.
When these safer formats are used, the application should only deserialize the data fields that are explicitly defined in the schema and required for application logic, ignoring or rejecting any extraneous or unexpected fields.
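As a minimal sketch of that rule, the Python example below parses JSON and accepts only the fields named in an explicit contract, rejecting anything extra. The UserProfile type, its two fields, and the parse_user_profile helper are hypothetical names chosen for illustration; a real application would substitute its own schema or a schema-validation library.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class UserProfile:
    """Hypothetical data contract: the only fields the application accepts."""
    user_id: int
    display_name: str


# Expected field names mapped to their required types.
_EXPECTED_FIELDS = {"user_id": int, "display_name": str}


def parse_user_profile(raw: str) -> UserProfile:
    """Parse JSON text into a UserProfile, rejecting anything off-contract."""
    data = json.loads(raw)  # raises a ValueError subclass on malformed JSON
    if not isinstance(data, dict):
        raise ValueError("top-level JSON value must be an object")

    unexpected = set(data) - set(_EXPECTED_FIELDS)
    missing = set(_EXPECTED_FIELDS) - set(data)
    if unexpected or missing:
        raise ValueError(
            f"unexpected fields: {sorted(unexpected)}, missing fields: {sorted(missing)}"
        )

    for name, expected_type in _EXPECTED_FIELDS.items():
        value = data[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise ValueError(f"field {name!r} must be of type {expected_type.__name__}")

    return UserProfile(user_id=data["user_id"], display_name=data["display_name"])
```

Because the object is built field by field from plain values, a type hint smuggled into the payload (for example a "$type" key) is rejected as an unexpected field instead of being interpreted.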
5.2 Implement Integrity Checks
Before any deserialization occurs, the integrity and authenticity of the serialized data must be rigorously verified. This ensures that the data has not been tampered with in transit or storage by an attacker. Common methods for implementing integrity checks include:
- Digital Signatures: A digital signature provides both authenticity and integrity. The serialized data is signed by a trusted entity (e.g., the server that serialized it) using a private key. Before deserialization, the application verifies this signature using the corresponding public key. Any alteration to the data will invalidate the signature, allowing the application to reject the payload. This is a strong defense, but requires robust key management.
- Cryptographic Hashes (HMAC): A Hash-based Message Authentication Code (HMAC) combines a cryptographic hash function with a secret key. The sender calculates an HMAC of the serialized data using a shared secret key and appends it to the data. The receiver then independently recalculates the HMAC using the same secret key and compares the two values, ideally with a constant-time comparison. If they match, the data’s integrity and authenticity are confirmed. Because the key is shared, any party able to verify a payload can also forge one, so HMAC offers weaker guarantees than digital signatures if a verifier or its key is compromised, but it is easier to implement for internal communication (invicti.com).
It is crucial that these integrity checks are performed before any attempt at deserialization. If the integrity check fails, the data should be immediately discarded, and an alert logged.
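The verify-before-deserialize order can be sketched in Python with the standard hmac module. The function names sign and verify_then_load, the choice of SHA-256, and the layout (tag appended after the payload) are assumptions made for this example; key storage and rotation are out of scope here.

```python
import hashlib
import hmac
import json

_TAG_LEN = hashlib.sha256().digest_size  # 32 bytes for SHA-256


def sign(payload: bytes, key: bytes) -> bytes:
    """Return the payload with its HMAC-SHA256 tag appended."""
    tag = hmac.new(key, payload, hashlib.sha256).digest()
    return payload + tag


def verify_then_load(message: bytes, key: bytes):
    """Check the tag first; parse the payload only if the check passes."""
    if len(message) < _TAG_LEN:
        raise ValueError("message too short to contain a tag")
    payload, tag = message[:-_TAG_LEN], message[-_TAG_LEN:]
    expected = hmac.new(key, payload, hashlib.sha256).digest()
    # compare_digest avoids leaking how many tag bytes matched via timing.
    if not hmac.compare_digest(tag, expected):
        raise ValueError("integrity check failed; payload discarded")
    return json.loads(payload)
```

The parser is never handed bytes that failed verification, so a tampered payload is rejected before any deserialization logic runs.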
5.3 Use Secure Serialization Libraries and Formats
Choosing the right serialization library and format is paramount. Developers should opt for libraries that are designed with security in mind and have a proven track record of addressing vulnerabilities. Critically, avoid libraries or features that allow for arbitrary object instantiation or code execution during deserialization when dealing with untrusted input.
- Avoid inherently insecure binary serializers: As discussed, Java’s ObjectInputStream, Python’s pickle, and .NET’s BinaryFormatter are highly dangerous for untrusted data. Migrate away from these if possible.
- Prefer data-centric formats: Prioritize formats like JSON, XML (with secure parsers), Protocol Buffers, or Avro, which are designed primarily for data exchange, not arbitrary object graphs.
- Use schema validation: For formats like JSON and XML, enforce strict schemas (e.g., JSON Schema, XML Schema Definition – XSD) to define the expected structure, data types, and constraints of the incoming data. This ensures that only well-formed and expected data is processed.
- Configure libraries securely: Even safe libraries can become vulnerable if misconfigured. For example, in Json.NET (Newtonsoft.Json), ensure TypeNameHandling is set to None; if a mode such as Auto is genuinely required, pair it with a custom SerializationBinder that restricts the resolvable types, to prevent type spoofing. In Java, use ObjectInputFilter to explicitly whitelist allowed classes for deserialization.
- Regularly update libraries: Keep all serialization and deserialization libraries, as well as the underlying language runtime, updated to their latest stable versions to benefit from security patches and improvements. Many known gadget chains are closed off by fixes in newer library versions.
5.4 Enforce Strict Input Validation
Implementing strict validation for all incoming serialized data is critical, even after integrity checks. This goes beyond simply verifying the format and delves into the actual content and structure of the deserialized objects.
- Whitelisting: Instead of blacklisting (trying to identify and block malicious inputs, which is often incomplete and bypassable), use a whitelisting approach. Define a strict list of allowed classes, properties, and values that the deserializer is permitted to handle. Any class or property not on this explicit allow-list should be rejected.
- Type Checking: Ensure that deserialized fields conform to expected data types (e.g., an integer should not be deserialized into a string or an object).
- Content Validation: Validate the actual values within the deserialized objects. For instance, if an object contains a file path, ensure it adheres to a safe directory, does not contain path traversal characters, and refers to an allowed file type.
- Size Limits: Impose strict size limits on incoming serialized data to prevent DoS attacks through excessive payload sizes.
- Depth Limits: Limit the nesting depth of deserialized object graphs to prevent stack overflow DoS attacks.
- Reflection Restrictions: If using reflection during custom deserialization, ensure that it is tightly controlled and only used for known, safe operations. Avoid dynamic method invocation based on untrusted input.
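Several of the checks above (size limits, depth limits, and content validation of a file path) can be sketched together in Python. The limits, the ALLOWED_ROOT directory, the allowed suffixes, and the expected "path" field are all hypothetical values chosen for illustration.

```python
import json
from pathlib import PurePosixPath

MAX_PAYLOAD_BYTES = 64 * 1024                      # assumed size limit
MAX_DEPTH = 16                                     # assumed nesting limit
ALLOWED_ROOT = PurePosixPath("/srv/app/uploads")   # hypothetical safe directory
ALLOWED_SUFFIXES = {".png", ".jpg"}                # hypothetical allowed types


def check_depth(value, limit: int = MAX_DEPTH) -> None:
    """Walk the parsed value iteratively and reject excessive container nesting."""
    stack = [(value, 1)]
    while stack:
        node, depth = stack.pop()
        if isinstance(node, dict):
            children = list(node.values())
        elif isinstance(node, list):
            children = node
        else:
            continue  # scalars do not add nesting
        if depth > limit:
            raise ValueError("payload exceeds nesting depth limit")
        stack.extend((child, depth + 1) for child in children)


def check_upload_path(path_text: str) -> PurePosixPath:
    """Accept only traversal-free paths under ALLOWED_ROOT with an allowed suffix."""
    path = PurePosixPath(path_text)
    if ".." in path.parts:
        raise ValueError("path traversal is not allowed")
    path.relative_to(ALLOWED_ROOT)  # raises ValueError if outside the root
    if path.suffix.lower() not in ALLOWED_SUFFIXES:
        raise ValueError("file type is not allowed")
    return path


def validate_upload_request(raw: bytes) -> dict:
    """Apply size, depth, type, and content checks, in that order."""
    if len(raw) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload exceeds size limit")
    try:
        data = json.loads(raw)
    except RecursionError as exc:
        raise ValueError("payload is nested too deeply to parse") from exc
    check_depth(data)
    if not isinstance(data, dict) or not isinstance(data.get("path"), str):
        raise ValueError("expected an object with a string 'path' field")
    check_upload_path(data["path"])
    return data
```

The size check runs before parsing so that oversized input never reaches the parser, and the depth walk is iterative so that the validator itself cannot be driven into a stack overflow.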
5.5 Apply the Principle of Least Privilege
The principle of least privilege dictates that any process, user, or system should operate with only the minimum necessary permissions required to perform its function. This applies directly to the deserialization process in several ways:
- Execution Environment: If possible, isolate the deserialization process in a separate, low-privilege execution environment, such as a dedicated microservice, a chrooted jail, or a container with limited capabilities. If an exploit occurs, its impact will be confined to this sandboxed environment, preventing it from affecting the entire application or underlying system.
- File System Access: Limit the file system access of the process performing deserialization to only directories and files that are absolutely necessary. Prevent it from writing to arbitrary locations or executing arbitrary binaries.
- Network Access: Restrict outbound network connections from the deserialization process to prevent attackers from using it to reach other internal systems or exfiltrate data.
- Database Access: Ensure the database user associated with the deserialization process has only the most restrictive permissions needed, preventing privilege escalation or data manipulation within the database.
- Deserialized Object Permissions: Ensure that any objects created during deserialization are instantiated with the lowest possible privileges and do not inherit elevated permissions based on the source of the serialized data.
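The isolation idea in the first bullet can be illustrated, in a very reduced form, by parsing untrusted bytes in a separate short-lived process and passing only plain data back. In this Python sketch the helper name parse_in_child and the timeout are assumptions, and the child’s JSON parsing merely stands in for whatever riskier parser an application actually needs. The sketch shows process separation only; running the child under a restricted account, container, or other OS-level sandbox is not shown and would still be required.

```python
import json
import subprocess
import sys

# Code run by the child interpreter: read bytes from stdin, parse them,
# and write the result back as ASCII-only JSON text.
_CHILD_CODE = (
    "import json, sys\n"
    "data = json.loads(sys.stdin.buffer.read())\n"
    "sys.stdout.write(json.dumps(data))\n"
)


def parse_in_child(raw: bytes, timeout_s: float = 5.0):
    """Parse untrusted bytes in a separate, short-lived interpreter process."""
    try:
        completed = subprocess.run(
            [sys.executable, "-I", "-c", _CHILD_CODE],  # -I: isolated mode
            input=raw,
            capture_output=True,
            timeout=timeout_s,
            check=False,
        )
    except subprocess.TimeoutExpired as exc:
        raise ValueError("parser process timed out") from exc
    if completed.returncode != 0:
        raise ValueError("parser process rejected the payload")
    # Only plain data crosses the process boundary back to the caller.
    return json.loads(completed.stdout)
```

If the parsing step crashes, hangs, or is subverted, the failure is contained in the child process, and the parent sees only a timeout or a non-zero exit status.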
5.6 Monitor and Log Deserialization Activities
Proactive monitoring and comprehensive logging of deserialization events are crucial for detecting, investigating, and responding to attempted exploits. By establishing a baseline of normal deserialization activity, anomalies can be quickly identified.
What to log:
- Source of Deserialization: IP address, user ID, API endpoint, or file path from which the serialized data originated.
- Type of Deserialized Object(s): Record the names of the classes being instantiated during deserialization. Unexpected class names are a strong indicator of an attack.
- Size of Payload: Log the size of the incoming serialized data. Unusually large payloads could indicate a DoS attempt.
- Deserialization Errors: Log any errors or exceptions that occur during the deserialization process. High rates of deserialization failures could indicate active probing or exploitation attempts.
- Timestamp: Record the exact time of the event.
- Contextual Information: Include relevant application context, such as session IDs, request IDs, or process IDs.
What to monitor for:
- Unusual Class Loads: Alerts should be triggered if the deserializer attempts to load classes that are not part of an allow-list of expected types.
- High Error Rates: A sudden surge in deserialization errors.
- Resource Consumption Spikes: Abnormal spikes in CPU, memory, or disk I/O correlated with deserialization events.
- Unexpected File System/Network Activity: If the sandboxed deserialization process attempts to access unauthorized files or make outbound network connections.
Integrate these logs with a Security Information and Event Management (SIEM) system for centralized analysis and alerting. Regular review of these logs can help in identifying both successful and unsuccessful exploitation attempts.
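A minimal Python sketch of such an audit record follows, using the standard logging module and emitting one JSON line per deserialization attempt. The logger name, the field names, and the load_with_audit helper are hypothetical; a real deployment would align them with the log schema its SIEM expects.

```python
import json
import logging
import time

logger = logging.getLogger("deserialization.audit")  # hypothetical logger name


def load_with_audit(raw: bytes, *, source: str, request_id: str):
    """Parse JSON and emit one structured audit record per attempt."""
    record = {
        "timestamp": time.time(),
        "source": source,
        "request_id": request_id,
        "payload_bytes": len(raw),
    }
    try:
        value = json.loads(raw)
    except ValueError as exc:
        # Failures are logged at WARNING so error-rate alerts can key on them.
        record.update(outcome="error", error=type(exc).__name__)
        logger.warning(json.dumps(record))
        raise
    record.update(outcome="ok", top_level_type=type(value).__name__)
    logger.info(json.dumps(record))
    return value
```

The success records are emitted at INFO level, so the logger (or its handlers) must be configured to that level for the baseline of normal activity to be captured.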
5.7 Deserialization Sandboxing and Custom Logic
Beyond basic logging, more advanced techniques involve actively controlling the deserialization process at a lower level:
- Custom Deserialization Logic: For complex scenarios where full object graph deserialization is unavoidable, developers can implement custom deserialization methods (e.g., providing a custom readObject() in Java) to manually control which fields are deserialized and how. This allows for fine-grained validation and the rejection of unsafe inputs.
- Whitelisting/Blacklisting Classes at Runtime: Some frameworks or languages provide mechanisms to register or filter allowed/disallowed classes during deserialization. For example, Java’s ObjectInputFilter (since Java 9) allows setting a filter on an ObjectInputStream to accept or reject classes based on patterns or an explicit allow-list. While blacklisting is generally discouraged, in some legacy systems, it might be a temporary measure for known vulnerable classes.
- Look-ahead Deserialization: In Java, a technique known as ‘look-ahead deserialization’ involves inspecting the class descriptors in the serialized stream before any objects are instantiated, typically by overriding ObjectInputStream.resolveClass(). Each class name can then be checked against an allow-list before the deserializer commits to instantiating it, which would invoke constructors and methods.
5.8 Regular Security Audits and Penetration Testing
Proactive security measures are indispensable. Regular security audits, code reviews, and penetration testing specifically targeting deserialization vulnerabilities are critical. These activities can help identify potential weaknesses that automated tools might miss.
- Code Reviews: Manual review of code sections that handle serialization and deserialization, paying close attention to input sources, the choice of serialization library, and any custom deserialization logic.
- Static Application Security Testing (SAST): Utilize SAST tools that can identify known insecure deserialization patterns in source code.
- Dynamic Application Security Testing (DAST) / Penetration Testing: Employ DAST tools and engage ethical hackers to perform black-box or white-box penetration tests, actively attempting to craft and send malicious serialized payloads to identify exploitable deserialization vulnerabilities in running applications. Tools like ysoserial can be used to generate payloads for testing.
5.9 Patch Management and Threat Intelligence
Staying informed about new vulnerabilities and ensuring all software components are up to date is a continuous and vital mitigation step.
- Software Updates: Regularly apply security patches and updates to operating systems, application servers, development frameworks, and all third-party libraries. Many deserialization gadget chains are discovered in older versions of common libraries.
- Threat Intelligence: Subscribe to security advisories and vulnerability databases (e.g., CVE, NVD, OWASP) to stay abreast of newly discovered deserialization vulnerabilities and exploitation techniques.
By combining these strategies, organizations can significantly reduce their exposure to insecure deserialization vulnerabilities and enhance the overall security posture of their applications.
6. Conclusion
Insecure deserialization vulnerabilities represent a significant and persistent threat to the security and integrity of modern software systems. Their ability to facilitate devastating attacks such as remote code execution, denial of service, and privilege escalation underscores their critical importance as a security concern. As demonstrated by the high-impact CVE-2025-23120 incident in Veeam Backup & Replication, even mission-critical enterprise software can be vulnerable, with potentially catastrophic consequences for data and operational continuity.
This paper has provided an exhaustive examination of insecure deserialization, elucidating the fundamental processes of serialization and deserialization, dissecting the technical mechanisms of attack vectors, and detailing the language-specific nuances across Java, Python, PHP, and .NET. The pervasive nature of these vulnerabilities stems from the inherent trust placed in reconstructed data and the powerful capabilities of object instantiation and method invocation during deserialization. Attackers expertly leverage ‘gadget chains’ – sequences of legitimate but exploitable methods within an application’s classpath – to achieve their malicious objectives.
Effective mitigation of insecure deserialization demands a robust, multi-layered, and proactive security strategy. The cornerstone of defense lies in the absolute avoidance of deserializing untrusted data. When deserialization is unavoidable, it must be coupled with rigorous integrity checks (e.g., digital signatures, HMACs), strict input validation (whitelisting allowed types and values), and the meticulous selection and secure configuration of serialization libraries. Furthermore, applying the principle of least privilege, isolating deserialization processes in sandboxed environments, and implementing comprehensive monitoring and logging are indispensable for both prevention and early detection of attacks.
Finally, continuous security auditing, regular penetration testing, and diligent patch management are vital to stay ahead of evolving threats and ensure that applications remain resilient against emerging deserialization exploits. By adopting these comprehensive secure deserialization practices, developers and security professionals can significantly enhance the resilience of their applications, safeguarding critical data and maintaining trust in a progressively interconnected and data-driven world.
References
- OWASP Foundation. (n.d.). Insecure Deserialization. Retrieved from https://owasp.org/www-community/vulnerabilities/Insecure_Deserialization
- Veeam. (2025, March 20). CVE-2025-23120. Retrieved from https://www.veeam.com/kb4724
- BleepingComputer. (2025, March 20). Veeam RCE bug lets domain users hack backup servers, patch now. Retrieved from https://www.bleepingcomputer.com/news/security/veeam-rce-bug-lets-domain-users-hack-backup-servers-patch-now/
- Acunetix. (n.d.). What is Insecure Deserialization? Retrieved from https://www.acunetix.com/blog/articles/what-is-insecure-deserialization/
- Invicti. (n.d.). Insecure Deserialization in Web Applications. Retrieved from https://www.invicti.com/blog/web-security/insecure-deserialization-in-web-applications/
- MCSI Library. (2022, July). Insecure Deserialization Attacks. Retrieved from https://mcsi-library.readthedocs.io/articles/2022/07/insecure-deserialization-attacks/insecure-deserialization-attacks.html
- PortSwigger. (n.d.). Insecure deserialization. Retrieved from https://portswigger.net/web-security/deserialization
- Bright Security. (n.d.). Deserialization: How it Works and Protecting Your Apps. Retrieved from https://www.brightsec.com/blog/deserialization/
- Cyserch Solutions. (2024, December 15). Insecure Deserialization in 2025: The Hidden Vulnerability That Could Cripple Your Systems. Retrieved from https://www.cyserch.com/blog/Insecure-Deserialization-in-2024
- LinkedIn. (n.d.). How to Detect and Mitigate Insecure Deserialization Attacks. Retrieved from https://www.linkedin.com/advice/1/how-do-you-detect-mitigate-insecure-deserialization
- CISecurity. (n.d.). Data Deserialization. Retrieved from https://www.cisecurity.org/insights/blog/data-deserialization
- NVD. (2025, March 20). CVE-2025-23120 Detail. Retrieved from https://nvd.nist.gov/vuln/detail/CVE-2025-23120
- Python Software Foundation. (n.d.). pickle: Python object serialization. Retrieved from https://docs.python.org/3/library/pickle.html
- Microsoft. (n.d.). BinaryFormatter security guide. Retrieved from https://learn.microsoft.com/en-us/dotnet/standard/serialization/binaryformatter-security-guide