Golang : Find duplicate files with filepath.Walk
Sometimes we downloaded a lot of files in a directory and although the files have different names, they could be duplicates of the same file . This small Golang program will scan a target directory and create a hash map for each files. If any files have similar sha512 hash, then they are ... essentially the same.
package main
import (
"crypto/sha512"
"fmt"
"io/ioutil"
"os"
"path/filepath"
)
var files = make(map[[sha512.Size]byte]string)
func checkDuplicate(path string, info os.FileInfo, err error) error {
if err != nil {
fmt.Println(err)
return nil
}
if info.IsDir() { // skip directory
return nil
}
data, err := ioutil.ReadFile(path)
if err != nil {
fmt.Println(err)
return nil
}
hash := sha512.Sum512(data) // get the file sha512 hash
if v, ok := files[hash]; ok {
fmt.Printf("%q is a duplicate of %q\n", path, v)
} else {
files[hash] = path // store in map for comparison
}
return nil
}
func main() {
if len(os.Args) != 2 {
fmt.Printf("USAGE : %s <target_directory> \n", os.Args[0])
os.Exit(0)
}
dir := os.Args[1] // get the target directory
err := filepath.Walk(dir, checkDuplicate)
if err != nil {
fmt.Println(err)
os.Exit(1)
}
}
Sample output :
"/Users/sweetlogic/Applications/.localized" is a duplicate of "/Users/.localized"
"/Users/sweetlogic/Desktop/.localized" is a duplicate of "/Users/.localized"
"/Users/sweetlogic/Desktop/01.jpg" is a duplicate of "/Users/sweetlogic/01.jpg"
"/Users/sweetlogic/Desktop/02.jpg" is a duplicate of "/Users/sweetlogic/02.jpg"
"/Users/sweetlogic/Desktop/03.jpg" is a duplicate of "/Users/sweetlogic/03.jpg"
See also : Generate checksum for a file in Go
By Adam Ng
IF you gain some knowledge or the information here solved your programming problem. Please consider donating to the less fortunate or some charities that you like. Apart from donation, planting trees, volunteering or reducing your carbon footprint will be great too.
Advertisement
Tutorials
+1k Golang : Converting individual Jawi alphabet to Rumi(Romanized) alphabet example
+5.2k Use systeminfo to find out installed Windows Hotfix(s) or updates
+7.7k Golang : Convert(cast) bytes.Buffer or bytes.NewBuffer type to io.Reader
+12.4k Golang : Archive directory with tar and gzip
+4.7k Golang : Convert a rune to unicode style string \u
+5.6k Golang : flag provided but not defined error
+2.6k Golang : Test input string for unicode example
+12k Golang : Linked list example
+25.8k Golang : Get hardware information such as disk, memory and CPU usage
+9.9k Swift : Convert (cast) Int to String ?
+6k Golang : List running EC2 instances and descriptions
+2.9k Unix/Linux : How to fix CentOS yum duplicate glibc or device-mapper-libs dependency error?