Golang : Recombine chunked files example




Problem :

You used the previous tutorial on how to split or chunk a big file into smaller pieces, and now you want to recombine the smaller chunks back into the original file. How to do that?

Solution :

In Linux/Unix, you can use the cat command to recombine the chunks with this example line:

cat bigfile_0 bigfile_1 bigfile_2 bigfile_3 bigfile_4 bigfile_5 > catNEWbigfile.zip

However, this can become tedious if you have many chunks to recombine.

Below is an example of how to split a file and then recombine the chunks back again.

I've tested this solution on a 6 MB bigfile.zip file and was able to unzip the recombined file successfully. Please try it yourself on a larger file, for example one bigger than 4 GB, and see whether this solution still works.
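If you do test it on a bigger file, one simple way to check that the recombined file is identical to the original is to compare their checksums. Below is a minimal sketch using Go's crypto/sha256 package; the bigfile.zip and NEWbigfile.zip names are just placeholders matching the example further down.

 package main

 import (
     "crypto/sha256"
     "fmt"
     "io"
     "os"
 )

 // sha256Of returns the SHA-256 checksum of the named file as a hex string.
 func sha256Of(name string) (string, error) {
     f, err := os.Open(name)
     if err != nil {
         return "", err
     }
     defer f.Close()

     h := sha256.New()
     // stream the file through the hasher instead of loading it all into memory
     if _, err := io.Copy(h, f); err != nil {
         return "", err
     }
     return fmt.Sprintf("%x", h.Sum(nil)), nil
 }

 func main() {
     original, err := sha256Of("./bigfile.zip") // placeholder name
     if err != nil {
         fmt.Println(err)
         os.Exit(1)
     }

     recombined, err := sha256Of("./NEWbigfile.zip") // placeholder name
     if err != nil {
         fmt.Println(err)
         os.Exit(1)
     }

     fmt.Println("original   :", original)
     fmt.Println("recombined :", recombined)
     fmt.Println("identical  :", original == recombined)
 }

If both checksums match, the recombination was byte-for-byte correct.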

Here you go!

 package main

 import (
     "bufio"
     "fmt"
     "io"
     "io/ioutil"
     "math"
     "os"
     "strconv"
 )

 func main() {

     fileToBeChunked := "./bigfile.zip" // change here!

     file, err := os.Open(fileToBeChunked)
     if err != nil {
         fmt.Println(err)
         os.Exit(1)
     }

     defer file.Close()

     fileInfo, _ := file.Stat()

     var fileSize int64 = fileInfo.Size()

     const fileChunk = 1 * (1 << 20) // 1 MB, change this to your requirement

     // calculate total number of parts the file will be chunked into
     totalPartsNum := uint64(math.Ceil(float64(fileSize) / float64(fileChunk)))

     fmt.Printf("Splitting to %d pieces.\n", totalPartsNum)

     for i := uint64(0); i < totalPartsNum; i++ {

         partSize := int(math.Min(fileChunk, float64(fileSize-int64(i*fileChunk))))
         partBuffer := make([]byte, partSize)

         // read the next chunk from the source file
         _, err := io.ReadFull(file, partBuffer)
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         // write/save buffer to disk
         fileName := "bigfile_" + strconv.FormatUint(i, 10)

         err = ioutil.WriteFile(fileName, partBuffer, 0644)
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         fmt.Println("Split to : ", fileName)
     }

     // just for fun, let's recombine the chunked files back into a new file

     newFileName := "NEWbigfile.zip"

     // create the new file and set it to APPEND MODE!!
     file, err = os.OpenFile(newFileName, os.O_CREATE|os.O_APPEND|os.O_WRONLY, 0644)
     if err != nil {
         fmt.Println(err)
         os.Exit(1)
     }

     // we close newFileName explicitly at the end of main instead of deferring,
     // so that all the appended data is flushed before the program exits

     // just information on which position of the new file we are appending at
     var writePosition int64 = 0

     for j := uint64(0); j < totalPartsNum; j++ {

         // read a chunk
         currentChunkFileName := "bigfile_" + strconv.FormatUint(j, 10)

         newFileChunk, err := os.Open(currentChunkFileName)
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         chunkInfo, err := newFileChunk.Stat()
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         // get the size of each chunk from the file itself,
         // instead of relying on the previous data and constant
         var chunkSize int64 = chunkInfo.Size()
         chunkBufferBytes := make([]byte, chunkSize)

         fmt.Println("Appending at position : [", writePosition, "] bytes")
         writePosition = writePosition + chunkSize

         // read the whole chunk into chunkBufferBytes
         reader := bufio.NewReader(newFileChunk)
         _, err = io.ReadFull(reader, chunkBufferBytes)
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         newFileChunk.Close() // done with this chunk

         // DON'T use ioutil.WriteFile here -- it will overwrite the previous bytes!
         // append the buffer to the new file instead
         n, err := file.Write(chunkBufferBytes)
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         file.Sync() // flush to disk

         // free up the buffer for the next cycle.
         // not a problem if the chunk size is small, but
         // can hog resources if the chunk size is huge.
         chunkBufferBytes = nil // reset or empty our buffer

         fmt.Println("Written ", n, " bytes")

         fmt.Println("Recombining part [", j, "] into : ", newFileName)
     }

     // now, we close the newFileName
     file.Close()
 }

Sample output:

Splitting to 6 pieces.

Split to : bigfile_0

Split to : bigfile_1

Split to : bigfile_2

Split to : bigfile_3

Split to : bigfile_4

Split to : bigfile_5

Appending at position : [ 0 ] bytes

Written 1048576 bytes

Recombining part [ 0 ] into : NEWbigfile.zip

Appending at position : [ 1048576 ] bytes

Written 1048576 bytes

Recombining part [ 1 ] into : NEWbigfile.zip

Appending at position : [ 2097152 ] bytes

Written 1048576 bytes

Recombining part [ 2 ] into : NEWbigfile.zip

Appending at position : [ 3145728 ] bytes

Written 1048576 bytes

Recombining part [ 3 ] into : NEWbigfile.zip

Appending at position : [ 4194304 ] bytes

Written 1048576 bytes

Recombining part [ 4 ] into : NEWbigfile.zip

Appending at position : [ 5242880 ] bytes

Written 907617 bytes

Recombining part [ 5 ] into : NEWbigfile.zip
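
Just a side note: if your chunk size is large, reading each entire chunk into memory before appending (as the example above does) can be wasteful. Below is a minimal alternative sketch that streams each chunk straight into the output file with io.Copy; it assumes the same bigfile_N naming scheme and the six-part count from the sample output above.

 package main

 import (
     "fmt"
     "io"
     "os"
     "strconv"
 )

 func main() {

     totalPartsNum := uint64(6) // assumption: 6 chunks, as in the sample output

     out, err := os.Create("NEWbigfile.zip") // truncates any existing file
     if err != nil {
         fmt.Println(err)
         os.Exit(1)
     }
     defer out.Close()

     for j := uint64(0); j < totalPartsNum; j++ {
         chunkName := "bigfile_" + strconv.FormatUint(j, 10)

         chunk, err := os.Open(chunkName)
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         // io.Copy moves the data through a small internal buffer,
         // so the whole chunk never has to sit in memory at once
         n, err := io.Copy(out, chunk)
         chunk.Close()
         if err != nil {
             fmt.Println(err)
             os.Exit(1)
         }

         fmt.Println("Appended", n, "bytes from", chunkName)
     }
 }

Because io.Copy writes sequentially into the already-open output file, memory use stays flat no matter how large each chunk is.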

NOTE: This example iterates over the chunked files using the totalPartsNum value computed earlier in the same program. If you need to write a separate program that scans a directory to find out how many chunk files there are to loop through, start by looking at this tutorial: https://www.socketloop.com/tutorials/golang-increment-string-example. A small sketch of counting the chunks with filepath.Glob is also shown below.
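
Here is that sketch. The bigfile_* pattern is an assumption based on the chunk naming used in the example above.

 package main

 import (
     "fmt"
     "os"
     "path/filepath"
 )

 func main() {
     // find every file matching the assumed chunk naming pattern
     matches, err := filepath.Glob("bigfile_*")
     if err != nil {
         fmt.Println(err)
         os.Exit(1)
     }

     fmt.Println("Found", len(matches), "chunk files to recombine:")
     for _, name := range matches {
         fmt.Println(" -", name)
     }
 }

Note that filepath.Glob returns the names in lexical order, so with ten or more chunks (bigfile_10 sorts before bigfile_2) you would want to sort them numerically before appending.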

Happy coding!

References:

https://www.socketloop.com/tutorials/golang-how-to-split-or-chunking-a-file-to-smaller-pieces

https://www.socketloop.com/tutorials/golang-flush-and-close-file-created-by-os-create-and-bufio-newwriter-example

https://socketloop.com/references/golang-os-file-write-writestring-and-writeat-functions-example

https://www.socketloop.com/tutorials/golang-reset-buffer-example

  See also : Golang : Increment string example





By Adam Ng

If you gained some knowledge or the information here solved your programming problem, please consider donating to the less fortunate or to some charities that you like. Apart from donating, planting trees, volunteering or reducing your carbon footprint will be great too.

