JDBC API in Java allows the program to batch insert and update data into the database, which
tends to provide better performance by simple virtue of fact that it reduces a lot of database round-trip which eventually improves overall performance. In
fact, it’s one of JDBC
best practices to insert and update data in batches. For those who don’t
know what is batch insert and update,
Java provides several ways to execute SQL queries, one of them is JDBC batch insert
and update, on which instead of executing SQL query one by one using either Statement or PreparedSatement,
you execute a query in batch and send a batch of the query to the database for execution
instead of a single query.
Since multiple queries are combined into batches and one
batch is sent to the database instead of individual queries, it reduces database round trip by a factor of batch size.
The batch size can be anything but needs to be decided carefully. JDBC specification supports up to 100 but individual databases e.g. Oracle, MySQL, Sybase, or SQL Server has their own limit on maximum batch size,, normal jdbc batch size ranges from 50 to 100.
JDBC API provides the addBatch() method to add queries into a batch and then later execute them using the executeBatch() method. Both Statement and PreparedStatement can be used to execute batch queries in Java.
.
By the way batch insert and update also provide performance boost to Data access Object or DAO layer, as discussed in our last post 4 ways to improve Performance of JDBC applications.
The batch size can be anything but needs to be decided carefully. JDBC specification supports up to 100 but individual databases e.g. Oracle, MySQL, Sybase, or SQL Server has their own limit on maximum batch size,, normal jdbc batch size ranges from 50 to 100.
JDBC API provides the addBatch() method to add queries into a batch and then later execute them using the executeBatch() method. Both Statement and PreparedStatement can be used to execute batch queries in Java.
.
By the way batch insert and update also provide performance boost to Data access Object or DAO layer, as discussed in our last post 4 ways to improve Performance of JDBC applications.
How to run batch insert and update in JDBC
There are multiple ways you can run batch queries in Java application, You
have a choice of using plain old JDBC or you can leverage Spring's JdbcTemplate Utility
class.
Though both Statement and PreparedStatment can
execute batch queries, It’s better to use PreparedStatement because of
several benefits it provides including improved performance and prevention from
SQL injection as suggested on Why
you should use PreparedStatement in Java.
In the next section, we will compare
performance of same INSERT SQL query when running as without batch and running
as batch insert query. In both cases we will use PreparedStatement to make
testing similar.
SQL query without JDBC batch update using PreparedStatement
Here is an example of running SQL query without using JDBC batch update.
Performance of this example can be used to compare how JDBC Batch update
perform.
//query for inserting batch data
String query = "insert into employee values (?,?,NULL)";
PreparedStatement pStatement = conn.prepareStatement(query);
long startTime = System.currentTimeMillis();
for(int count = 0; count < 1000; count++ ){
pStatement.setString(1, Integer.toString(count));
pStatement.setString(2, "Employee"+count);
pStatement.executeUpdate();
}
long endTime = System.currentTimeMillis();
long elapsedTime = (endTime - startTime)/1000; //in seconds
System.out.println("Total time required to execute 1000 SQL INSERT queries using PreparedStatement without JDBC batch update is :" + elapsedTime);
Output:
Total time required to execute 1000 queries using Statement without JDBC batch update is :38
String query = "insert into employee values (?,?,NULL)";
PreparedStatement pStatement = conn.prepareStatement(query);
long startTime = System.currentTimeMillis();
for(int count = 0; count < 1000; count++ ){
pStatement.setString(1, Integer.toString(count));
pStatement.setString(2, "Employee"+count);
pStatement.executeUpdate();
}
long endTime = System.currentTimeMillis();
long elapsedTime = (endTime - startTime)/1000; //in seconds
System.out.println("Total time required to execute 1000 SQL INSERT queries using PreparedStatement without JDBC batch update is :" + elapsedTime);
Output:
Total time required to execute 1000 queries using Statement without JDBC batch update is :38
So it took 38 seconds to insert 1000 records on employee table on MySQL
database running on localhost.
Yes, indeed its quite high but don't bother
about absolute number yet, what is important here is to find out whether JDBC
batch insert or update gives better performance or not.
By the way above
example uses PreparedStatement and bind variables to ensure standard
JDBC practices are followed.
JDBC Batch INSERT example using PreparedStatement
Now, let’s run same set of SQL query as JDBC batch INSERT. In this
example, instead of running every SQL INSERT query as executeUpdate() , we are
adding them in a batch using addBatch() method and
once we reaches batch size, which is 100 here, we send them to database using executeBatch() method of
JDBC API.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
/**
* Java program to demonstrate JDBC Batch Insert example. Inserting data in batch
* seems to improve performance a lot. executeBatch() method of PreparedStatement is
* used to run batch queries in Java JDBC.
*/
public class MySQLJdbcExample {
public static void main(String args[]) throws SQLException {
//creating JDBC Connection to mysql database
String url="jdbc:mysql://localhost:3306/test";
Connection conn = DriverManager.getConnection(url, "root", "root");
// conn.setAutoCommit(false); keep auto commit false for better performance
//query for inserting batch data
String query = "insert into employee values (?,?,NULL)";
PreparedStatement pStatement = conn.prepareStatement(query);
int batchSize = 100;
long startTime = System.currentTimeMillis();
for (int count = 0; count < 1000; count++) {
pStatement.setString(1, Integer.toString(count));
pStatement.setString(2, "Employee" + count);
pStatement.addBatch();
if (count % batchSize == 0) {
pStatement.executeBatch();
}
}
pStatement.executeBatch() ; //for remaining batch queries if total record is odd no.
// conn.commit();
pStatement.close();
conn.close();
long endTime = System.currentTimeMillis();
long elapsedTime = (endTime - startTime)/1000; //in seconds
System.out.println("Total time required to execute 1000 queries using PreparedStatement with JDBC batch insert is :" + elapsedTime);
}
}
Output:
Total time required to execute 1000 queries using PreparedStatement with JDBC batch insert is :28
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.SQLException;
/**
* Java program to demonstrate JDBC Batch Insert example. Inserting data in batch
* seems to improve performance a lot. executeBatch() method of PreparedStatement is
* used to run batch queries in Java JDBC.
*/
public class MySQLJdbcExample {
public static void main(String args[]) throws SQLException {
//creating JDBC Connection to mysql database
String url="jdbc:mysql://localhost:3306/test";
Connection conn = DriverManager.getConnection(url, "root", "root");
// conn.setAutoCommit(false); keep auto commit false for better performance
//query for inserting batch data
String query = "insert into employee values (?,?,NULL)";
PreparedStatement pStatement = conn.prepareStatement(query);
int batchSize = 100;
long startTime = System.currentTimeMillis();
for (int count = 0; count < 1000; count++) {
pStatement.setString(1, Integer.toString(count));
pStatement.setString(2, "Employee" + count);
pStatement.addBatch();
if (count % batchSize == 0) {
pStatement.executeBatch();
}
}
pStatement.executeBatch() ; //for remaining batch queries if total record is odd no.
// conn.commit();
pStatement.close();
conn.close();
long endTime = System.currentTimeMillis();
long elapsedTime = (endTime - startTime)/1000; //in seconds
System.out.println("Total time required to execute 1000 queries using PreparedStatement with JDBC batch insert is :" + elapsedTime);
}
}
Output:
Total time required to execute 1000 queries using PreparedStatement with JDBC batch insert is :28
So JDBC batch insert and update do give us better performance over
queries running without batches. One of the important thing which I have not
used here is, I have not disabled auto-commit mode.
You should always run SQL
query with auto-commit mode disabled even with JDBC Batch insert and update example and
do commit() explicitly.
That will further boost the performance of your JDBC code. Try
running the above code with auto-commit mode disabled and it won't take even a
second to execute.
Benefits of using JDBC batch update
Significant improvement in performance can be achieved by using the JDBC
batch update and insert. Since in the case of batch queries, You effectively reduce
database round-trip, You save a lot of
time spent on network latency, which results in better performance of Java
applications.
Always combine JDBC batch insert or update with PreparedStatement to get
best of both world and also follow these Top
10JDBC best practices while writing JDBC code in Java. e.g. running SQL
query with auto-commit mode disabled.
That's all on how to run JDBC Batch insert and update in Java. We have
seen how using JDBC batch insert can improve performance and how we can execute PreparedStatement queries in
batch.
Choose batch size based on what suits your application and run query
with auto-commit mode disabled for better performance.
Other JDBC tutorials for Java programmers
is there any benefit using core JDBC API for running batch update queries, I am fond of using Spring framework and almost always use JdbcTemplate to group SQL queries in batch and then run. Also executeBatch() method returns an integer array to reflect number of rows affected by this batch update queries.
ReplyDeleteI think using Spring framworks and JdbcTemplate or SimpleJdbcTemplate for batch queries are more simpler example. You just need to call batchInsert() or batchUpdate() method that's it.
ReplyDeleteMay be you need to replace this:
ReplyDeleteif (count % batchSize == 0) {
pStatement.executeBatch();
}
for this:
if (count % batchSize == 0) {
pStatement.executeBatch();
// conn.commit();
}
In case if you set autocomit=false.
Do you know if conn.rollback(); works with MySQL ?
ReplyDeleteI've tested version 5.6 and latest Connector-J 5.1.33 and unfortunately the records that succeeded
are still in the database even pStatement.executeBatch() throws an exception
I was expecting that conn.rollback() will remove the records that succeeded.
Couple of things
ReplyDelete1) You can get better performance by increasing batch size. In order to store 1000 records, you should have a batch size of 100, instead of 2, which you have currently by calling executeBatch() for every even records.
2) You should execute batch statements inside a transaction to keep your database consistent. It's possible that one or more statements in the batch fail to execute. In that case do you want to keep the successful insert or you want to roll it back so that either all records are inserted or none of them. This is another reason why you should call setAutoCommit(false).
3) It's difficult to find which record failed to execute in the batch of 100 or 1000. In case of java.sql.BatchUpdateException, you may need to insert data one by one .
Mr Javin saves the day yet again.I have survived many and interview and many a coding issue just having this one blog bookmarked. God bless.
ReplyDeleteThank you, you made my day. thanks a lot for your kind comment.
ReplyDeleteif I want to add different sql statements itself to batch and execut it all in once can we achieve using prepared statement?
ReplyDelete