Use pv as your MySQL import progress indicator

Waiting for a task to complete can be boring. Photo: Kome 8 / Flickr https://www.flickr.com/photos/kome8/

There is only one relief for the pain of a long-running command: a progress indicator. Many commands do not show their progress, though, and this is by design. Each command is good at one thing, and through regular pipes the data can pass through a lot of steps. Processing a file may take time, but how much? Here is how you can add a progress indicator to MySQL data imports.

In my case, I had 11 gigabytes of gzipped MySQL database dump that needed to be restored. The creates and inserts by themselves do not take up a lot of time, but the amount of data was huge and there was no way of telling how complex the indexes were (and they would add to the time needed for the MySQL restore). This could take 1 hour, 3 hours or 3 days; I had no idea. To restore, I would simply do:

gunzip -c databasebackup-file.sql.gz | mysql -uxxx -p databasename

In my search for answers, quite a few of the search results mentioned pv. You may have encountered it already, as some services use a similar output to display progress, like:

1.43GiB 1:43:13 [ 392KiB/s] [=========> ] 13% ETA 10:56:22

The restore above has processed 1.43 GiB of data so far, which took 1 hour, 43 minutes and 13 seconds. The current throughput is 392 KiB per second. We are 13% complete, and based on the processing rate so far, the restore is estimated to need almost 11 more hours (ETA) to complete.

As the file is restored, the numbers are updated once per second. Some statements run for longer than others, so the ETA initially jumps up and down quite a bit, but as time passes the bumps even out and the ETA becomes more accurate. In this particular case the restore ended up taking just over 4 hours, but the percent indicator alone gives a great deal of usable insight.

So how is this done?

It is quite simple actually – instead of running

gunzip -c databasebackup-file.sql.gz | mysql -uxxx -p databasename

you would run

pv databasebackup-file.sql.gz | gunzip -c | mysql -uxxx -p databasename

The indicator starts running right away, even though you have to enter the database user's password before the restore can begin. The password prompt may mess up the terminal output, but after that you are good to go. Useful switches include:

-I, --fineta Display the estimated local time of completion instead of the time remaining

-a, --average-rate Show the average data throughput instead of the current rate
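The two switches above can be combined with the restore pipeline, roughly as in the sketch below. The function name import_with_progress is made up for illustration, and the sketch assumes a pv version that supports -I. Once any display switch is given explicitly, pv shows only the displays that were asked for, so the progress bar (-p), timer (-t) and byte counter (-b) are requested as well. The fallback to cat is my own addition so that the restore still runs on a machine without pv.

```shell
# Hypothetical helper: import_with_progress DUMPFILE COMMAND [ARGS...]
# Streams a gzipped dump through pv into the given command.
import_with_progress() {
    dumpfile=$1
    shift
    if command -v pv >/dev/null 2>&1; then
        # Bar, elapsed time, bytes so far, average rate, estimated finish time
        pv -p -t -b -a -I "$dumpfile"
    else
        # pv is not installed: restore without a progress display
        cat "$dumpfile"
    fi | gunzip -c | "$@"
}

# Example call, using the placeholders from the commands above:
# import_with_progress databasebackup-file.sql.gz mysql -uxxx -p databasename
```

Because pv reads the compressed file, the percentage refers to how much of the .gz file has been consumed, not to the uncompressed SQL.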

If you use pv for a database export, the size of the export is unknown until the export is complete, so pv cannot calculate an ETA. If you know the expected size of the export, you can tell pv about it with this argument:

-s SIZE, --size SIZE Assume the total amount of data is SIZE, which lets pv calculate the ETA and percentage complete

Read the man page for further tips on using pv.
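For the export case, a minimal sketch could look like the following. The function name export_with_progress and the 40G figure in the example call are hypothetical; the size has to be your own estimate of the uncompressed dump, since pv sits before gzip in this pipeline. As before, the fallback to cat is only there so the export still works without pv.

```shell
# Hypothetical helper: export_with_progress SIZE OUTFILE COMMAND [ARGS...]
# Runs the dump command, shows progress against the expected SIZE,
# and writes a gzipped copy to OUTFILE.
export_with_progress() {
    expected_size=$1
    outfile=$2
    shift 2
    "$@" |
        if command -v pv >/dev/null 2>&1; then
            # -s gives pv the total it needs for the percentage and ETA
            pv -s "$expected_size"
        else
            # pv is not installed: export without a progress display
            cat
        fi |
        gzip > "$outfile"
}

# Example call; 40G is a made-up estimate of the uncompressed dump size:
# export_with_progress 40G databasebackup-file.sql.gz mysqldump -uxxx -p databasename
```

If the estimate is off, the export itself is unaffected; only the percentage and ETA will be misleading.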

This is just one example of using pv to display the progress of an operation. You can easily use it for any operation that uses a pipe, and even when copying large files or synchronizing with rsync.
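As an illustration of the file copy case, here is a small sketch. The function name copy_with_progress is hypothetical. Since pv is given a regular file, it can look up the size itself and show a percentage and ETA without any extra arguments.

```shell
# Hypothetical helper: copy_with_progress SOURCE DESTINATION
# Copies a single file while pv reports how far along the copy is.
copy_with_progress() {
    src=$1
    dest=$2
    if command -v pv >/dev/null 2>&1; then
        pv "$src" > "$dest"
    else
        # pv is not installed: plain copy without a progress display
        cat "$src" > "$dest"
    fi
}

# Example call with made-up paths:
# copy_with_progress /backups/databasebackup-file.sql.gz /mnt/usb/databasebackup-file.sql.gz
```

Unlike cp, this copies only the file contents through a redirect, so permissions and timestamps are not carried over to the destination.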