
Mastering Apex ZIP Compression: DEFLATED vs. STORED Methods

By Bassem Marji

The art of optimization is understanding your tools and data, a principle that holds true in Apex as well. The Compression class supports ZIP archive creation through two primary compression strategies: DEFLATED and STORED. Choosing between these approaches significantly impacts application performance, storage utilization, and processing efficiency.

Beyond selecting between DEFLATED and STORED compression methods, Apex allows you to tailor the compression strategy for each file within a ZIP archive. This flexibility lets you optimize performance and storage based on the specific characteristics of each item. Before we explore how to implement this granular control, it’s important to understand their use cases, technical trade-offs, and best practices.

Method 1: STORED “The Fast Tracker”

When Raw Speed Outweighs Space Efficiency

Think of it as digitally throwing everything into a box without folding it first. This method allows you to add files to a ZIP archive without applying any compression. While this might seem counterintuitive for an archiving process, it offers distinct advantages in certain scenarios, especially when speed and minimal CPU usage are more important than reducing file size. This method is especially useful when dealing with files that are already compressed or when performance constraints make traditional compression impractical.

Key Characteristics:

  • No Compression: Files are stored in the ZIP archive exactly as they are, without any size reduction.
  • Faster Processing: Avoids the overhead of compression algorithms, making the archiving process significantly quicker.
  • Larger Archive Size: Since no compression occurs, the resulting ZIP file will be larger than if you had used the DEFLATED method.

When STORED Wins the Race:

  • Pre-Compressed Files: Already optimized formats (e.g., JPEG, MP4, existing ZIPs) where further compression is ineffective.
  • Speed over size: When milliseconds matter more than megabytes (e.g., real-time operations or batch processes under strict time limits).
  • Governor limit constraints: When CPU time budgets are tight, such as in triggers or high-volume batch jobs.
  • Temporary archives: For archives that will be unzipped immediately after creation, with no long-term storage or transfer.
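
Assuming the zipWriter.addEntry signature covered later in this article, a minimal sketch of a STORED entry might look like this (the entry name and the imageData variable are illustrative):

```apex
// Sketch: archive an already-compressed PNG without re-compressing it.
// imageData is assumed to hold the raw bytes of the image.
Compression.ZipWriter writer = new Compression.ZipWriter();
writer.addEntry('images/photo.png', 'stored as-is', Datetime.now(),
                Compression.Method.STORED, imageData);
Blob archive = writer.getArchive();
// The archive is only slightly larger than the input (ZIP headers, no compression).
```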

Method 2: DEFLATED “The Space Saver”

When Every Byte Counts

Think of it as painstakingly folding each item while placing it neatly into a box, maximizing space but necessitating additional resources. It leverages the Deflate algorithm, which is a combination of two algorithms: Lempel-Ziv 77 (LZ77) and Huffman coding, working together to identify and eliminate redundancy within files.

This method prioritizes storage efficiency through lossless compression, significantly reducing file sizes for data with high redundancy (e.g., text, unstructured binaries). While it introduces computational overhead, its ability to shrink archives by up to 80% makes it vital for scenarios where storage costs or transfer bandwidth outweigh processing time.

Key Characteristics:

  • Lossless Compression: Applies the Deflate algorithm to minimize file size without data loss.
  • Increased CPU Utilization: Introduces computational overhead during compression.
  • Significant Size Reduction: Achieves 50-80% compression for text-based or uncompressed binary data.

When DEFLATED Excels:

  • Redundant File Formats: Files containing lots of repetitive or redundant information – such as logs, CSV, or JSON files – benefit significantly from compression, resulting in much smaller file sizes. Below is a compression performance overview:
| File Type | Expected Compression |
| --- | --- |
| Text/CSV/JSON | 70-85% reduction |
| XML/HTML | 60-80% reduction |
| Log Files | 80-90% reduction |
| Binary Data | 10-30% reduction |
| Images/Videos | 0-5% reduction |
  • Cost-Efficient Storage: Long-term archives where a smaller size lowers storage costs (e.g., backups, audit logs).
  • Bandwidth-Sensitive Transfers: Reducing payload size for HTTP callouts, platform events, or bulk data transfers.
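
As a rough sketch (the CSV content here is synthetic, built only to demonstrate redundancy), DEFLATED pays off on repetitive text like this:

```apex
// Build a deliberately repetitive CSV payload, then deflate it.
List<String> rows = new List<String>{'Id,Name,Status'};
for (Integer i = 0; i < 1000; i++) {
    rows.add(i + ',Sample Record,Active');
}
Blob csvData = Blob.valueOf(String.join(rows, '\n'));

Compression.ZipWriter writer = new Compression.ZipWriter();
writer.addEntry('export/data.csv', 'deflated', Datetime.now(),
                Compression.Method.DEFLATED, csvData);
Blob archive = writer.getArchive();
// Expect the zipped size to be a small fraction of csvData.size().
System.debug('Original: ' + csvData.size() + ' bytes, zipped: ' + archive.size() + ' bytes');
```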

Key Differences at a Glance

| Factor | STORED | DEFLATED |
| --- | --- | --- |
| Compression | None | Deflate algorithm (lossless) |
| Speed | Faster | Slower (CPU-intensive) |
| Archive Size | Larger | Smaller |
| CPU Usage | Low | High |
| Heap Memory * | Lower (no compression buffers) | Higher (compression buffers) |
| Ideal For | Pre-compressed files, speed-critical tasks | Text/data reduction, storage/transfer |

* Heap memory consumption is influenced by the number, size, and nature of the items undergoing compression, but it is typically lower for STORED and higher for DEFLATED methods.

Determining Compression Method

Before diving in, let’s highlight the core method we’ll rely on: zipWriter.addEntry. Note its method parameter, which determines whether an item is stored or deflated. Here are the details:

Method Signature:

zipWriter.addEntry(
    String entryName,
    String comment,
    Datetime lastModified,
    Compression.Method method,
    Blob data
)

Parameter Breakdown:

| Name | Type | Description |
| --- | --- | --- |
| entryName | String | Name (and optional path) for the file within the ZIP archive. |
| comment | String | Optional comment for this ZIP entry. |
| lastModified | Datetime | Last modified date/time for this ZIP entry. |
| method | Compression.Method | Compression method to apply. |
| data | Blob | Binary file content (typically the VersionData from ContentVersion records). |
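
Putting the parameters together, a single call might look like this (the entry name, comment, and the cv record are illustrative):

```apex
// cv is assumed to be a ContentVersion queried with its VersionData field.
Compression.ZipWriter zipWriter = new Compression.ZipWriter();
zipWriter.addEntry(
    'reports/summary.csv',           // entryName: name/path inside the archive
    'Quarterly export',              // comment: optional per-entry comment
    Datetime.now(),                  // lastModified: timestamp recorded in the archive
    Compression.Method.DEFLATED,     // method: STORED or DEFLATED
    cv.VersionData                   // data: the file's binary content
);
Blob archive = zipWriter.getArchive();
```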

Real-World Scenario:

Roll up your sleeves and let’s get hands-on with these compression techniques. Imagine we’ve uploaded three files to Salesforce’s ContentVersion object:

| ContentVersion ID | File Name | File Type | File Size (Bytes) |
| --- | --- | --- | --- |
| 068Qy00000BuoevIAB | test_csv.csv | CSV | 656,926 |
| 068Qy00000Bup1VIAR | test_image.png | PNG | 179,708 |
| 068Qy00000Bup9ZIAR | test_pdf.pdf | PDF | 179,068 |

To illustrate compression techniques, we will create a class named ZipTester. Below is a summary of this Apex class:

  • Retrieves file content from Salesforce’s ContentVersion object.
  • Compresses these files into a ZIP archive using Salesforce’s native Compression namespace and its built-in ZipWriter class.
  • Monitors resource consumption such as heap size, CPU time, and output size during the compression process to ensure efficient performance.

public class ZipTester {
    /**
     * Fetches ContentVersion records by Id and returns them as a list.
     */
    public static List<ContentVersion> getContentVersionRecords(List<Id> contentVersionIds) {
        List<ContentVersion> cvs = new List<ContentVersion>();
        try {
            cvs = [SELECT Title, PathOnClient, VersionData, FileType
                   FROM ContentVersion
                   WHERE Id IN :contentVersionIds
                   WITH SECURITY_ENFORCED];
        } catch (Exception e) {
            System.debug('Unexpected error in getContentVersionRecords: ' + e.getMessage());
        }
        return cvs;
    }

    /**
     * Compresses files using the specified compression method, with resource tracking.
     */
    public static Blob compressFiles(List<ContentVersion> files, Compression.Method method) {
        Integer cpuStart = Limits.getCpuTime();   // actual CPU time, not wall-clock time
        Integer heapStart = Limits.getHeapSize();
        Blob zipBlob;
        try {
            Compression.ZipWriter zipWriter = new Compression.ZipWriter();
            for (ContentVersion fe : files) {
                zipWriter.addEntry(fe.PathOnClient,
                                   String.valueOf(method),
                                   Datetime.now(),
                                   method,
                                   fe.VersionData);
            }
            zipBlob = zipWriter.getArchive();

            System.debug('>>> Compression Method: ' + method
                + ' - Heap used: ' + (Limits.getHeapSize() - heapStart) + ' bytes'
                + ' - CPU time: ' + (Limits.getCpuTime() - cpuStart) + ' ms'
                + ' - Zipped Size: ' + zipBlob.size() + ' bytes');
        } catch (Exception e) {
            System.debug('Unexpected error in compressFiles: ' + e.getMessage());
        }
        return zipBlob;
    }
}

Now, let’s apply both compression methods to the files and evaluate the results:

| Method | Heap Used (bytes) | CPU Time (ms) | Zipped Size (bytes) |
| --- | --- | --- | --- |
| STORED | 1,022,892 | 15 | 1,016,106 |
| DEFLATED | 1,016,399 | 14 | 58,040 |

Interpretation:

  • The DEFLATED method achieves a substantial reduction in file size, compressing data from approximately 1 MB down to about 58 KB, demonstrating the effectiveness of the DEFLATE compression algorithm.
  • CPU time is essentially unchanged (1 ms lower for DEFLATED in this run), indicating that the compression step does not introduce significant overhead compared to the STORED method at these file sizes.

Overall, DEFLATED offers a highly efficient compression strategy with minimal impact on processing resources.

Let’s enhance our class with a new overloaded version of the compressFiles method, enabling smart compression of files using a method selected per file type.

/**
 * Helper method to select the compression method based on file type.
 */
private static Compression.Method selectCompressionMethod(String fileType) {
    String ext = fileType.toLowerCase();
    // Already-compressed formats gain little from further compression
    Set<String> compressedFormats =
        new Set<String>{'zip', 'jpg', 'jpeg', 'png', 'gif', 'mp3', 'mp4', 'pdf'};
    return compressedFormats.contains(ext)
        ? Compression.Method.STORED
        : Compression.Method.DEFLATED;
}

/**
 * Compresses files while selecting the compression method per file based on file type.
 */
public static Blob compressFiles(List<ContentVersion> files) {
    Integer cpuStart = Limits.getCpuTime();
    Integer heapStart = Limits.getHeapSize();
    Blob zipBlob;
    try {
        Compression.ZipWriter zipWriter = new Compression.ZipWriter();
        for (ContentVersion fe : files) {
            Compression.Method method = selectCompressionMethod(fe.FileType);
            zipWriter.addEntry(fe.PathOnClient,
                               String.valueOf(method),
                               Datetime.now(),
                               method,
                               fe.VersionData);
        }
        zipBlob = zipWriter.getArchive();
        System.debug('>>> Heap used: ' + (Limits.getHeapSize() - heapStart) + ' bytes'
            + ' - CPU time: ' + (Limits.getCpuTime() - cpuStart) + ' ms'
            + ' - Zipped Size: ' + zipBlob.size() + ' bytes');
    } catch (Exception e) {
        System.debug('Error in compressFiles: ' + e.getMessage());
    }
    return zipBlob;
}

| Method | Heap Used (bytes) | CPU Time (ms) | Zipped Size (bytes) |
| --- | --- | --- | --- |
| Smart | 1,016,368 | 14 | 431,220 |

Analysis:

The smart method leverages the strengths of both compression strategies and maximizes efficiency by:

  • Reducing storage and transfer costs where possible.
  • Avoiding unnecessary CPU overhead on files that won’t benefit from further compression.
  • Adhering to Salesforce governor limits by optimizing both CPU and heap usage.

Best Practices

To ensure optimal performance, resource efficiency, and maintainability when implementing ZIP compression in Salesforce, it’s important to follow proven strategies. The following best practices will help you select the right compression method, monitor resource usage, and build scalable solutions:

  • Adopt a Hybrid Strategy where possible: Always use STORED for already-compressed formats (JPEG, PNG, MP4, ZIP, PDF) and DEFLATED for text or uncompressed binary files.
  • Benchmark Regularly: Test both methods on your actual data to validate assumptions, as real-world results can vary based on file contents.
  • Monitor Resource Usage: Track heap, CPU, and output size to ensure you stay within Salesforce governor limits.
  • Automate File Type Detection: Use file signatures or metadata, not just extensions, to accurately classify files.
  • Graceful Degradation: If compression fails or a file type is unknown, default to STORED to ensure data is always archived.
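
The graceful-degradation practice can be sketched as a defensive selector (the helper name and the format set here are assumptions, not part of the ZipTester class above):

```apex
// Hypothetical helper: classify known text formats as DEFLATED and fall back
// to STORED for anything unknown, so archiving never fails on classification.
private static Compression.Method safeSelectMethod(String fileType) {
    if (String.isBlank(fileType)) {
        return Compression.Method.STORED;  // unknown type: archive as-is
    }
    Set<String> textLike = new Set<String>{'csv', 'txt', 'json', 'xml', 'html', 'log'};
    return textLike.contains(fileType.toLowerCase())
        ? Compression.Method.DEFLATED      // redundant text: worth compressing
        : Compression.Method.STORED;       // everything else: play it safe
}
```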

Final Thoughts

In the Salesforce ecosystem, where governor limits reign supreme, performance is paramount, and every byte and millisecond counts, understanding compression is a survival skill.

As Salesforce continues to evolve, developers who master these fundamental optimization techniques will build more scalable, cost-effective solutions. 

The Author

Bassem Marji

Bassem is a certified Salesforce Administrator, a Project Implementation Manager at BLOM Bank Lebanon, and a Technical Author at Educative. He is a big fan of the Salesforce ecosystem.
