How to process large CSV files with Laravel

Dealing with large files of data is often an essential part of businesses. Laravel provides an easy way to process large CSV files. You can install the Simple Excel package by Spatie. Use the command below to install it.

composer require spatie/simple-excel

We can use the SimpleExcelReader to load a large CSV file. By default, it returns a ‘LazyCollection’ which handles application data without exhausting the server’s memory. It processes the large CSV file bit by bit, which avoids load on the application. 

Creating Laravel Jobs for Large CSV processing eases the process. 

Laravel Jobs for CSV Processing

Pass the command below to create a new Laravel Job. 

php artisan make:job ImportCsv

The command creates a Job file as shown below. 

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;

use Illuminate\Contracts\Queue\ShouldQueue;

use Illuminate\Foundation\Bus\Dispatchable;

use Illuminate\Queue\InteractsWithQueue;

use Illuminate\Queue\SerializesModels;

use Spatie\SimpleExcel\SimpleExcelReader;

class ImportCsv implements ShouldQueue

{

   use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

    /**

    * Create a new job instance.

    */

   public function __construct()

   {

       //

   }

   /**

    * Execute the job.

    */

   public function handle(): void

   {

       SimpleExcelReader::create(storage_path('app/public/products.csv'))

           ->useDelimiter(',')

           ->useHeaders(['ID', 'title', 'description'])

           ->getRows()

           ->chunk(5000)

           ->each(

               // Here we have a chunk of 5000 products

           );

   }

}

The Job does the following tasks.

Chunk the CSV

Break the file into small manageable pieces that can be easily dealt with. The ‘LazyCollection’ slices large CSV files into small units. 

Create multiple Jobs for each chunk of data to process them easily. For example, Product files in the CSV can be processed in the ExportProducts Job as shown below.

php artisan make:job ExportProducts

The Job has the code shown below. To process Product data and save in the database. 

<?php

namespace App\Jobs;

use App\Models\Product;

use Illuminate\Bus\Queueable;

use Illuminate\Contracts\Queue\ShouldBeUnique;

use Illuminate\Contracts\Queue\ShouldQueue;

use Illuminate\Database\Eloquent\Model;

use Illuminate\Foundation\Bus\Dispatchable;

use Illuminate\Queue\InteractsWithQueue;

use Illuminate\Queue\SerializesModels;

use Illuminate\Support\Str;

class ImportProductChunk implements ShouldBeUnique, ShouldQueue

{

   use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

   public $uniqueFor = 3600;

   /**

    * Create a new job instance.

    */

   public function __construct(

       public $chunk

   ) {

       //

   }

   /**

    * Execute the job.

    */

   public function handle(): void

   {

       $this->chunk->each(function (array $row) {

           Model::withoutTimestamps(fn () => Product::updateOrCreate([

               'product_id' => $row['ID'],

               'title' => $row['title'],

               'description' => $row['description'],

          ]));

       });

   }

   public function uniqueId(): string

   {

       return Str::uuid()->toString();

   }

}

Add a $uniquefor and $uniqueID variable to the Job to distinguish it from other concurrent Jobs. 

Dispatch Chunks

Each chunk of data is sent as a Job. This retains app performance, takes in less server memory and avoids load on the system.

The code shown in the Job below will dispatch ‘each’ chunk to the database. Then ‘each’ chunk will help do this. 

<?php

namespace App\Jobs;

use Illuminate\Bus\Queueable;

use Illuminate\Contracts\Queue\ShouldQueue;

use Illuminate\Foundation\Bus\Dispatchable;

use Illuminate\Queue\InteractsWithQueue;

use Illuminate\Queue\SerializesModels;

use Spatie\SimpleExcel\SimpleExcelReader;

class ImportCsv implements ShouldQueue

{

   use Dispatchable, InteractsWithQueue, Queueable, SerializesModels;

   /**

    * Create a new job instance.

    */

   public function __construct()

   {

       //

   }

   /**

    * Execute the job.

    */

   public function handle(): void

   {

       SimpleExcelReader::create(storage_path('app/public/products.csv'))

           ->useDelimiter(',')

           ->useHeaders(['ID', 'title', 'description'])

           ->getRows()

           ->chunk(5000)

           ->each(

               fn ($chunk) => ImportProductChunk::dispatch($chunk)

           );

   }

}

The code above processes ‘each’ chunk separately without creating a memory load to the application. 

Insert Chunks to Database

The chunks passed along with Jobs are then inserted into the database in an easy and efficient way. 

Laravel tools make it easy to process large files of data by creating multiple Jobs for large CSV entries. So the more Jobs you create, the more efficiently your large files will process. 

Leave a Comment