Working with JSON files is common in modern PHP applications. Whether you're importing product catalogs, syncing organization records, processing API exports, or migrating data between systems, JSON is often the preferred format.
For small files, most developers use a straightforward approach:
$json = file_get_contents('data.json');
$data = json_decode($json, true);
While this works perfectly for small datasets, it can become a serious problem once files grow to hundreds of megabytes or even several gigabytes.
We've encountered this issue in production environments where importing large datasets caused memory exhaustion, slow execution times, and application crashes.
Here's what we've learned about handling large JSON files efficiently.
The Problem with file_get_contents()
The biggest issue is that file_get_contents() reads the entire file into memory as a single string before any processing can begin.
Let's assume:
- JSON file size: 500 MB
- PHP memory limit: 256 MB
The import will fail before it even starts processing the data.
Even if your memory limit is increased, decoding a large JSON structure often consumes significantly more memory than the original file size.
A 500 MB JSON file can easily require over 1 GB of memory after decoding.
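To get a feel for this overhead on your own data, a small measurement script can help. The following is a minimal sketch, assuming a synthetic payload with a made-up record shape (`id`, `name`, `active`); the actual ratio depends on your PHP version and the structure of your data, so treat the printed numbers as indicative only.

```php
<?php
// Build a synthetic JSON payload (hypothetical record shape) to compare
// the size of the raw string with the memory used after decoding.
$records = [];
for ($i = 0; $i < 50000; $i++) {
    $records[] = ['id' => $i, 'name' => "Organization {$i}", 'active' => true];
}
$json = json_encode($records);
unset($records); // free the source array so it does not skew the measurement

$before = memory_get_usage();
$data = json_decode($json, true);
$after = memory_get_usage();

$jsonBytes = strlen($json);
$decodedBytes = $after - $before;

printf("Raw JSON:      %.1f MB\n", $jsonBytes / 1048576);
printf("Decoded array: %.1f MB\n", $decodedBytes / 1048576);
printf("Ratio:         %.1fx\n", $decodedBytes / $jsonBytes);
```

Running the same measurement against a representative sample of a real export gives a rough multiplier for estimating how much memory a full decode would need.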
Common errors include:
Allowed memory size exhausted
or
Fatal error: Out of memory
Why Increasing Memory Isn't the Best Solution
A common reaction is to increase the memory limit:
ini_set('memory_limit', '2048M');
While this may temporarily solve the issue, it doesn't scale.
As data grows:
- Memory usage continues increasing
- Imports become slower
- Server stability decreases
- Multiple concurrent imports become risky
The goal should be reducing memory usage, not continuously increasing limits.
Process Data in Chunks
If the source allows it, divide data into smaller files.
Instead of:
organizations.json
consider:
organizations_1.json
organizations_2.json
organizations_3.json
Processing smaller datasets offers several benefits:
- Lower memory consumption
- Faster recovery when failures occur
- Easier monitoring
- Better queue management
In Laravel projects, chunked processing often integrates well with queues.
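As an illustration, here is a minimal sketch of processing pre-split files one at a time. It assumes each chunk file contains a plain JSON array of records; the function name `importChunkFiles` and the `$handleRecord` callback are hypothetical names chosen for this example.

```php
<?php
/**
 * Process pre-split chunk files one at a time so that only one chunk
 * is decoded in memory at any moment. Returns the number of records handled.
 */
function importChunkFiles(string $pattern, callable $handleRecord): int
{
    $files = glob($pattern) ?: [];
    natsort($files); // natural order: organizations_2.json before organizations_10.json

    $processed = 0;
    foreach ($files as $file) {
        $records = json_decode(file_get_contents($file), true, 512, JSON_THROW_ON_ERROR);
        foreach ($records as $record) {
            $handleRecord($record);
            $processed++;
        }
        unset($records); // release this chunk before loading the next one
    }

    return $processed;
}

// Demo with two small temporary chunk files.
$dir = sys_get_temp_dir() . '/chunks_' . uniqid();
mkdir($dir);
file_put_contents("$dir/organizations_1.json", json_encode([['id' => 1], ['id' => 2]]));
file_put_contents("$dir/organizations_2.json", json_encode([['id' => 3]]));

$seenIds = [];
$total = importChunkFiles("$dir/organizations_*.json", function (array $record) use (&$seenIds) {
    $seenIds[] = $record['id'];
});
```

In a queue-based setup, each file could instead be handed to its own job, so a failure only requires retrying that one chunk.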
Stream Data Instead of Loading Everything
Streaming is one of the most effective solutions.
Rather than loading the entire file into memory, records are processed as they are read.
This approach dramatically reduces memory consumption.
A streaming parser only keeps a small portion of the file in memory at any given time.
Benefits include:
- Constant memory usage
- Better scalability
- Ability to process gigabyte-sized files
- Improved server stability
For large imports, streaming should usually be the preferred approach.
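The simplest case to stream in plain PHP is newline-delimited JSON, where every line is one complete record. The sketch below assumes that format, which is our assumption and not necessarily how your export is structured; the function name `streamNdjson` is hypothetical. A file consisting of one large top-level JSON array cannot be read this way and needs an incremental parser library (halaxa/json-machine is one option).

```php
<?php
/**
 * Yield one decoded record at a time from a newline-delimited JSON file.
 * Only the current line is held in memory, not the whole file.
 */
function streamNdjson(string $path): Generator
{
    $handle = fopen($path, 'r');
    if ($handle === false) {
        throw new RuntimeException("Unable to open {$path}");
    }

    try {
        while (($line = fgets($handle)) !== false) {
            $line = trim($line);
            if ($line === '') {
                continue; // skip blank lines
            }
            yield json_decode($line, true, 512, JSON_THROW_ON_ERROR);
        }
    } finally {
        fclose($handle); // runs even if the consumer stops iterating early
    }
}

// Demo with a small temporary file (note the blank line, which is skipped).
$path = tempnam(sys_get_temp_dir(), 'ndjson');
file_put_contents($path, "{\"id\":1}\n{\"id\":2}\n\n{\"id\":3}\n");

$streamedIds = [];
foreach (streamNdjson($path) as $record) {
    $streamedIds[] = $record['id'];
}
unlink($path);
```

Because the function is a generator, memory usage depends on the size of the largest single record rather than on the size of the file.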
Move Heavy Processing to Queues
Large imports should rarely run during a web request.
Imagine a user uploads a large file and waits for processing to finish.
Problems may include:
- Request timeouts
- Server resource spikes
- Poor user experience
Instead, queue the work:
ProcessOrganizationImport::dispatch($filePath);
The user receives immediate feedback while the queue worker handles processing in the background.
This approach is more reliable and scales significantly better.
Batch Database Operations
Another common performance issue occurs during database inserts.
Many developers start with:
foreach ($records as $record) {
    Organization::create($record);
}
This generates a database query for every record.
When importing hundreds of thousands of records, performance suffers dramatically.
A better approach is batch insertion:
DB::table('organizations')->insert($batch);
Benefits include:
- Fewer database queries
- Faster imports
- Reduced database load
- Better scalability
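Building the batches themselves can be done with a small generator. This is a minimal sketch in plain PHP; `batchRecords` is a hypothetical helper name, and the batch size of 2 is only for demonstration.

```php
<?php
/**
 * Group any iterable of records into arrays of at most $size items.
 */
function batchRecords(iterable $records, int $size): Generator
{
    if ($size < 1) {
        throw new InvalidArgumentException('Batch size must be at least 1.');
    }

    $batch = [];
    foreach ($records as $record) {
        $batch[] = $record;
        if (count($batch) === $size) {
            yield $batch;
            $batch = [];
        }
    }

    if ($batch !== []) {
        yield $batch; // flush the final, partially filled batch
    }
}

// Demo: 5 records in batches of 2.
$records = [['id' => 1], ['id' => 2], ['id' => 3], ['id' => 4], ['id' => 5]];
$batchSizes = [];
foreach (batchRecords($records, 2) as $batch) {
    // In a Laravel app, DB::table('organizations')->insert($batch) would go here.
    $batchSizes[] = count($batch);
}
```

Keep in mind that the query builder's insert() bypasses Eloquent model events and does not set timestamps automatically, so add created_at and updated_at to each row yourself if you rely on them. Databases also limit the number of bound parameters per query, so a batch size in the hundreds or low thousands is usually a safer starting point than one enormous insert.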
Monitor Memory Usage
When optimizing imports, it's useful to monitor memory usage during execution.
PHP provides simple functions:
echo memory_get_usage(true);
and
echo memory_get_peak_usage(true);
These metrics help identify bottlenecks and confirm whether optimizations are working.
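Raw byte counts are hard to read in logs, so it can help to wrap these calls in a small reporting helper. The following is a minimal sketch; `formatBytes` and `memoryReport` are hypothetical helper names, and the 5 MB allocation only simulates work.

```php
<?php
/**
 * Format a byte count as a human-readable string, e.g. 1536 becomes "1.50 KB".
 */
function formatBytes(int $bytes): string
{
    $units = ['B', 'KB', 'MB', 'GB'];
    $i = 0;
    $value = (float) $bytes;
    while ($value >= 1024 && $i < count($units) - 1) {
        $value /= 1024;
        $i++;
    }

    return sprintf('%.2f %s', $value, $units[$i]);
}

/**
 * Build a log line with the current and peak memory allocated to PHP.
 */
function memoryReport(string $label): string
{
    return sprintf(
        '[%s] current=%s peak=%s',
        $label,
        formatBytes(memory_get_usage(true)),
        formatBytes(memory_get_peak_usage(true))
    );
}

echo memoryReport('start'), PHP_EOL;
$buffer = str_repeat('x', 5 * 1024 * 1024); // simulate work
echo memoryReport('after allocating 5 MB'), PHP_EOL;
```

Logging a line like this every few thousand records makes it easy to see whether memory stays flat during an import or keeps climbing.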
Use Database Indexes Carefully
Large imports often involve duplicate detection.
For example:
Organization::where('symphony_id', $id)->first();
Without proper indexing, lookup performance degrades rapidly as the table grows.
Important columns used for searching should generally be indexed.
Examples:
- symphony_id
- external_id
- kvk_number
- slug
Proper indexing can reduce lookup times from seconds to milliseconds.
Consider Incremental Imports
Many systems repeatedly import the same data.
Instead of reprocessing everything:
Import 1,000,000 records daily
consider:
Import only changed records
Benefits include:
- Reduced processing time
- Lower server load
- Faster synchronization
- Better overall performance
Incremental imports become increasingly valuable as datasets grow.
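One way to detect changed records is to compare a content hash against the hash stored during the previous import. This is a sketch of that idea, not the only approach: it assumes every record has an `id` field, and the `$knownHashes` array stands in for wherever you would persist hashes (a database column, for example). If the source system exposes an updated-at timestamp or a delta export, that is usually simpler.

```php
<?php
/**
 * Return only the records whose content differs from the last import.
 * $knownHashes maps record ID => hash and is updated in place.
 */
function filterChangedRecords(iterable $records, array &$knownHashes): array
{
    $changed = [];
    foreach ($records as $record) {
        $id = $record['id'];

        $normalized = $record;
        ksort($normalized); // make the hash independent of top-level key order
        $hash = hash('sha256', json_encode($normalized));

        if (($knownHashes[$id] ?? null) !== $hash) {
            $changed[] = $record;
            $knownHashes[$id] = $hash;
        }
    }

    return $changed;
}

// Demo: the first run sees everything as new, the second only the modified record.
$hashes = [];
$first = filterChangedRecords(
    [['id' => 1, 'name' => 'Acme'], ['id' => 2, 'name' => 'Globex']],
    $hashes
);
$second = filterChangedRecords(
    [['id' => 1, 'name' => 'Acme'], ['id' => 2, 'name' => 'Globex BV']],
    $hashes
);
```

The file still has to be read in full with this approach, but the expensive part (database writes) is limited to records that actually changed.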
Production Checklist for Large JSON Imports
Before processing large JSON files, we typically verify:
- Streaming or chunked processing is used
- Queue workers are running
- Database indexes are present
- Batch inserts are implemented
- Memory usage is monitored
- Failed imports can resume safely
- Logs capture processing statistics
Following these practices helps prevent unexpected failures during production imports.
Final Thoughts
Large JSON imports often work perfectly during development but become problematic in production once datasets increase in size.
Rather than relying on higher memory limits, focus on techniques that scale:
- Stream data instead of loading everything
- Process records in chunks
- Use queues for background processing
- Batch database operations
- Monitor memory consumption
- Optimize database indexes
These approaches not only prevent memory exhaustion but also create import systems that remain reliable as data volumes continue to grow.
If your application regularly handles large datasets, investing time in proper import architecture today can save countless hours of troubleshooting later.
