Handling large Excel files in Node.js can be challenging, especially when you need to split them into smaller, more manageable parts. In this guide, we’ll walk you through the process of reading an Excel file, extracting specific columns, and splitting the file into multiple smaller files, each containing 1,000 rows, while keeping the header row intact. We will keep it simple and beginner friendly.
Prerequisites
Before we start, make sure you have the following:
- Node.js installed on your machine. You can download it here.
- A basic understanding of JavaScript and Node.js.
- An Excel file you’d like to split (e.g.,
input_file.xlsx
).
Step 1: Setting up your project
Create a new directory for your project:
mkdir split-excel
cd split-excel
Initialize a new Node.js project:
npm init -y
Install the necessary packages:
We’ll be using the xlsx
package to handle Excel files.
npm install xlsx
Step 2: Writing the Code
Let’s write the code to split the Excel file into smaller files
Create a new JavaScript file in your project directory, called split_excel.js
touch split_excel.js
Import the necessary modules
const XLSX = require('xlsx');
const path = require('path');
Create the function to split the Excel file:
Here’s a simple function that reads the Excel file and splits the data into smaller files, each containing 1,000 rows.
function splitExcelFile(filePath, linesPerFile = 1000) {
// Read the Excel file
const workbook = XLSX.readFile(filePath);
const sheetName = workbook.SheetNames[0];
const worksheet = workbook.Sheets[sheetName];
// Convert the sheet to JSON format (array of arrays)
const jsonData = XLSX.utils.sheet_to_json(worksheet, { header: 1 });
// Extract the header row and the data rows
const header = jsonData[0];
const data = jsonData.slice(1); // All rows except the header
// Calculate the number of files needed
const totalFiles = Math.ceil(data.length / linesPerFile);
// Loop through and create new Excel files
for (let i = 0; i < totalFiles; i++) {
// Slice the data for the current file
const start = i * linesPerFile;
const end = start + linesPerFile;
const currentData = data.slice(start, end);
// Combine the header with the current data
const newSheetData = [header, ...currentData];
// Create a new worksheet and workbook
const newWorksheet = XLSX.utils.aoa_to_sheet(newSheetData);
const newWorkbook = XLSX.utils.book_new();
XLSX.utils.book_append_sheet(newWorkbook, newWorksheet, sheetName);
// Write the new workbook to a file
const newFileName = path.join(__dirname, `output_file_${i + 1}.xlsx`);
XLSX.writeFile(newWorkbook, newFileName);
console.log(`Created file: ${newFileName}`);
}
}
Test the function:
To see the function in action, call it at the bottom of your script:
const inputFilePath = path.join(__dirname, 'input_file.xlsx');
splitExcelFile(inputFilePath, 1000);
Run the script:
Make sure your input_file.xlsx is in the same directory as your script, then run the following in your terminal.
node split_excel.js
You should see output files (output_file_1.xlsx, output_file_2.xlsx, etc.) created in your directory, each containing up to 1,000 rows and the original header.
Conclusion
By following these steps, you’ve successfully split a large Excel file into smaller, more manageable files while retaining the header row. This approach can be particularly useful when working with large datasets, making it easier to process, analyze, or share the data in smaller chunks.
Feel free to customize this script to fit your specific needs. Whether you need to adjust the row limit or process the data further, this guide provides a solid foundation for working with Excel files in Node.js.
You can view the complete code on Github – https://github.com/boffincoders/excel-splitter
Additional Tips:
- Error Handling: Consider adding error handling to manage cases where the file doesn’t exist or the data is missing.
- Optimization: For extremely large files, consider streaming the data to reduce memory usage.
Need help with managing large datasets or developing custom solutions in Node.js? Our expert team at Boffin Coders is here to assist. Whether it’s optimizing your workflows or building robust applications, we’ve got you covered. Contact us today to discuss how we can help your business thrive!