Importing CSV Data Into MySQL
Reading a CSV file and importing it into a MySQL Database With NodeJS

We’re going to be looking at how to read a CSV file and how to dump that data into a MySQL database.
Setting Everything Up
Let's go ahead and make sure we have these dependencies installed and imported:
const fs = require('fs');
const mysql = require('mysql');
const csv = require('fast-csv');
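The mysql and fast-csv packages come from npm (fs is built into Node), so install them first if you haven't already:
npm install mysql fast-csv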
Loading The CSV File
Next, let's go ahead and create a read stream for the file.
let stream = fs.createReadStream("my_file.csv");
Now that we have our stream, let's use the CSV module to parse it. Note that we listen for two events (data and end). The data event fires when a new line of the CSV has been parsed, and the end event fires once the file has been completely read.
let myArray = [];

let csvStream = csv
    .parse()
    .on("data", (data) => {
        myArray.push(data);
    })
    .on("end", () => {
    });

stream.pipe(csvStream);
For those of you who have already looked at a CSV file in a text editor, you may already be aware that the first line of the file is the column headers. We want to get rid of those, so let's just shift the array, which removes the first element.
.on("end", function () {
myArray.shift();
}
Inserting Multiple Records
I have already covered how to insert multiple records easily in my post Insert Multiple Records Into MySQL. It may be worth reading that before getting started on inserting the data into a database.
In the end event handler, we're going to open up a new database connection and bulk insert the records into the database using MySQL's native bulk insert functionality.
.on("end", function () {
// remove the first element in the array
myArray.shift();
// create a new connection to the database
const connection = mysql.createConnection({
host: 'localhost',
user: 'username',
password: 'super_secure_password',
database: 'database_name'
});
// open the connection
connection.connect((error) => {
if (error) {
console.error(error);
} else {
let query: string = 'INSERT INTO table (field1, field2, field3, field4) VALUES ?';
connection.query(query, [myArray], (error, response) => {
if(error) {
console.log(error);
} else {
console.log(response); // x records inserted
});
}
});
}
Known Bugs
Packet Too Large
There are a few issues I've had when implementing this. As discussed in Insert Multiple Records Into MySQL, a large CSV file can cause the database to throw an error because the packet sent to it exceeds the 'max_allowed_packet' variable. You can see an easy fix for that below.
I was trying to bulk insert around 180k records from a CSV file, and I kept getting an error. When I traced the error back to the MySQL output in the console, I could see that my MySQL server was running with a fairly small 'max_allowed_packet'.
To check what packet size is currently set, run the first command below; the second one updates the variable.
show variables like 'max_allowed_packet';
set global max_allowed_packet=33554432;
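Note that SET GLOBAL only lasts until the server restarts. If you happen to be on MySQL 8.0 or later, you can persist the value across restarts instead (on older versions you would set it in the server's configuration file):
set persist max_allowed_packet=33554432;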
Incorrect DateTime For MySQL
Something else I had to deal with is that the date field in the CSV did not match the DateTime field in the MySQL database. To solve this, I just looped through the array and replaced the value at the relevant index with a converted value. Something along the lines of this:
.on("end", function () {
myArray.shift();
for (let i = 0, len = myArray.length; i < len; i++) {
myArray[i][3] = convertToMysql(myArray[i][3]);
}
// ...
The 'convertToMysql' call above is a custom function I wrote. You will have to write your own, as your input may differ from mine.
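As a rough illustration only (assuming, purely for the sake of the example, that the CSV stores dates like 13/02/2019 14:30 and the column needs to become the MySQL DATETIME format YYYY-MM-DD HH:MM:SS), such a function might look something like this:
function convertToMysql(input) {
    // hypothetical example - split "DD/MM/YYYY HH:MM" into its date and time parts
    const [datePart, timePart] = input.split(' ');
    const [day, month, year] = datePart.split('/');
    // rebuild it as "YYYY-MM-DD HH:MM:SS"
    return `${year}-${month}-${day} ${timePart || '00:00'}:00`;
}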
Side Note: If you don't want to shift the array to remove the header row, you can pass an options object to the CSV stream, see below.
csv
    .fromStream(stream, { headers: true })
Errors In The CSV
This one is kind of situational, but there are two events you can listen for that may help you here: the 'error' event and the 'data-invalid' event. The error event is triggered when an error occurs, such as the file not being found or not being valid CSV. The data-invalid event is emitted when a row contains invalid data.
.on('error', (error) => {
    // error event handler
})
.on('data-invalid', (data) => {
    // data was invalid
});
Side Note: “Any rows consisting of nothing but empty strings and/or commas will be skipped, without emitting a ‘data’ or ‘error’ event.”
Whole Thing
If you want to double-check your code after reading through, it might look something like this:
const fs = require('fs');
const mysql = require('mysql');
const csv = require('fast-csv');

let stream = fs.createReadStream("path_to_my_file.csv");
let myData = [];

let csvStream = csv
    .parse()
    .on("data", function (data) {
        myData.push(data);
    })
    .on("end", function () {
        myData.shift();

        // create a new connection to the database
        const connection = mysql.createConnection({
            host: 'localhost',
            user: 'username',
            password: 'super_secure_password',
            database: 'database_name'
        });

        // open the connection
        connection.connect((error) => {
            if (error) {
                console.error(error);
            } else {
                let query = 'INSERT INTO table_name (field1, field2, field3, fieldn) VALUES ?';
                connection.query(query, [myData], (error, response) => {
                    console.log(error || response);
                });
            }
        });
    });

stream.pipe(csvStream);
And that just about does it for this. I hope you guys found it useful! If you have any issues or suggestions, please don’t hesitate to comment!
Thank you for this.
No problem, glad you found this useful.
Thanks this was awesome.
Only issue I had was “con.query” instead of “connection.query”
Ah good catch! Will change that now, thanks 😀
If there are millions of records then myData will increase Node's memory usage.
Yes, but only while the CSV data is held in memory. Once that process has completed and exited, or you have disposed of the data, the CSV will no longer take up memory. You can also chunk the logic if you want to minimise memory use (say, 100k rows at a time), as sketched below.
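A minimal sketch of what that chunking might look like, reusing the placeholder connection and table/field names from the post:
// insert the parsed rows in batches rather than in one huge query,
// so each packet sent to MySQL (and the rows queued per query) stays bounded
const chunkSize = 100000; // e.g. 100k rows per INSERT
for (let i = 0; i < myData.length; i += chunkSize) {
    const chunk = myData.slice(i, i + chunkSize);
    connection.query('INSERT INTO table_name (field1, field2, field3, fieldn) VALUES ?', [chunk], (error, response) => {
        console.log(error || response);
    });
    // (in real code you would probably wait for each batch to finish before sending the next)
}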
Have you encountered this error:
[367116:000001A6A4B1B6D0] 27189 ms: Mark-sweep 1402.6 (1429.3) -> 1402.2 (1431.8) MB, 1575.4 / 0.0 ms (average mu = 0.091, current mu = 0.003)
allocation failure scavenge might not succeed
[367116:000001A6A4B1B6D0] 29322 ms: Mark-sweep 1404.7 (1431.8) -> 1404.4 (1433.8) MB, 2130.0 / 0.0 ms (average mu = 0.041, current mu = 0.002)
allocation failure scavenge might not succeed
==== JS stack trace =========================================
0: ExitFrame [pc: 000002AFB135C5C1]
1: StubFrame [pc: 000002AFB135D9BF]
Security context: 0x0290a8a1e6e1
2: arrayToList [000002712D755719] [C:\Users\shaunpow\Documents\BitBucketLocalRepo\Testing-space\node_modules\sqlstring\lib\SqlString.js:~60] [pc=000002AFB160F9E8](this=0x02288173ff71 ,array=0x01c18c340909 ,timeZone=0x02712d756161 )
3: arrayToList [0…
FATAL ERROR: Ineffective mark-compacts near heap limit Allocation failed – JavaScript heap out of memory
1: 00007FF7BEAE0EFA v8::internal::GCIdleTimeHandler::GCIdleTimeHandler+4810
2: 00007FF7BEABA296 node::MakeCallback+4518
Yeah, that’s because the memory allocation has run out on your process.
You can pass an optional parameter to Node when running your TS / JS file: --max-old-space-size=2048
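For example (index.js here is just a placeholder for whatever script you run):
node --max-old-space-size=2048 index.js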
See more here: How to solve process out of memory node-js
Hi,
I tried this code, but I found this error: "UnhandledPromiseRejectionWarning: RequestError: Incorrect syntax near '?'".
How do I solve this issue?
const express = require('express');
const router = express.Router({
    mergeParams: true
});
const multer = require('multer');
const fs = require('fs');
const mysql = require('mysql');
const csv = require('fast-csv');

// SET STORAGE
var storage = multer.diskStorage({
    destination: function (req, file, cb) {
        cb(null, 'uploads')
    },
    filename: function (req, file, cb) {
        cb(null, file.fieldname + '-' + Date.now())
    }
})

var upload = multer({ storage: storage })

router.post('/uploadfile', upload.single('myFile'), (req, res, next) => {
    const file = req.file
    if (!file) {
        const error = new Error('Please upload a file')
        error.httpStatusCode = 400
        return next(error)
    }
    let stream = fs.createReadStream(file.originalname);
    let myData = [];
    let csvStream = csv
        .parse()
        .on("data", function (data) {
            myData.push(data);
        })
        .on("end", function () {
            myData.shift();
            // create a new connection to the database
            const connection = mysql.createConnection({
                host: 'localhost',
                user: 'root',
                password: '',
                database: 'dbname'
            });
            // open the connection
            connection.connect((error) => {
                if (error) {
                    console.error(error);
                } else {
                    let query = 'INSERT INTO tbl_bank_statement (id,transaction_id,dr_amount,cr_amount,final_balance,created_date,updated_date) VALUES ?';
                    connection.query(query, [myData], (error, response) => {
                        console.log(error || response);
                    });
                }
            });
        });
    stream.pipe(csvStream);
})

module.exports = router;
Error:
events.js:288
    throw er; // Unhandled 'error' event
    ^
Error: ENOENT: no such file or directory, open 'bank.csv'
Emitted 'error' event on ReadStream instance at:
    at internal/fs/streams.js:132:12
    at FSReqCallback.oncomplete (fs.js:154:23) {
  errno: -2,
  code: 'ENOENT',
  syscall: 'open',
  path: 'bank.csv'
}
The file is properly uploaded into the folder, but the data isn't inserted into the database… please help.
Hey, it looks like you’re trying to load a file that doesn’t exist. Can you try using __dirname or saving to a static directory? If that doesn’t fix it, then it may be an async issue where the IO is taking N time to save the file to disk, but the csv parser tries to read a file that hasn’t finished saving yet.
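If it helps, one thing worth trying (purely a suggestion, since multer's disk storage saves the upload under its own generated name inside the uploads directory) is reading from the path multer reports rather than the original filename:
let stream = fs.createReadStream(file.path); // multer puts the saved location on req.file.path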