Golang : Check if user agent is a robot or crawler example
Problem:
You need to determine whether the user agent visiting your web server is a bot, robot, or crawler. You have tried a hash map lookup, but found that it breaks easily whenever the robot's version string changes. How do you create a generic function that can detect whether a user agent is a robot?
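To make the failure mode concrete, here is a minimal sketch of why an exact-match hash map is fragile. The names `knownBots`, `exactMatch` and `substringMatch` are made up for this illustration, and the "2.2" version string is a hypothetical version bump, not a real Googlebot release.

```go
package main

import (
	"fmt"
	"strings"
)

// knownBots is a hypothetical exact-match table keyed by the full user agent string.
var knownBots = map[string]bool{
	"Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)": true,
}

// exactMatch reports whether the full user agent string is a key in knownBots.
func exactMatch(ua string) bool {
	return knownBots[ua]
}

// substringMatch reports whether the user agent contains the bot name anywhere.
func substringMatch(ua string) bool {
	return strings.Contains(ua, "Googlebot")
}

func main() {
	oldUA := "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
	// Hypothetical version bump, used only to show the lookup failing.
	newUA := "Mozilla/5.0 (compatible; Googlebot/2.2; +http://www.google.com/bot.html)"

	fmt.Println(exactMatch(oldUA), exactMatch(newUA))         // true false
	fmt.Println(substringMatch(oldUA), substringMatch(newUA)) // true true
}
```

The map lookup misses the bot as soon as a single character of the string changes, while the substring check keeps working.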
Solution:
I ported this solution from CodeIgniter for my own use. Feel free to adapt it for yours.
Here you go!
package main

import (
	"fmt"
	"log"
	"net/http"
	"strings"
)

// isRobot reports whether the user agent string contains the name of a known bot.
func isRobot(useragent string) bool {
	// There are hundreds of bots, but these are among the most common.
	// You can see other bot lists at
	// http://www.botsvsbrowsers.com/category/1/index.html
	// The list below is taken from
	// https://github.com/bcit-ci/CodeIgniter/blob/develop/system/libraries/User_agent.php
	// A hash map lookup requires an exact match of the whole user agent string and breaks
	// as soon as the version number changes. Therefore, it is better to search the user
	// agent for each entry in a slice of bot names.
	// Google's crawlers send hyphenated tokens (for example "AdsBot-Google"), so those
	// entries are written with hyphens.
	robots := []string{"Googlebot", "Google Page Speed Insights", "MSNBot", "Baiduspider", "Bing", "DuckDuckBot", "Inktomi Slurp", "Yahoo", "Ask Jeeves", "FastCrawler", "YandexBot", "Mediapartners-Google", "Crazy Webcrawler", "AdsBot-Google", "FeedFetcher-Google", "Curious George", "facebookexternalhit"}

	// Compare in lower case so that, for example, "Bing" also matches "bingbot".
	ua := strings.ToLower(useragent)
	for _, bot := range robots {
		if strings.Contains(ua, strings.ToLower(bot)) {
			return true
		}
	}
	return false
}

// checkIfUserAgentIsRobot logs the user agent and the detection result to the
// console and echoes both back to the client.
func checkIfUserAgentIsRobot(w http.ResponseWriter, r *http.Request) {
	ua := r.Header.Get("User-Agent")
	robot := isRobot(ua) // check once and reuse the result

	fmt.Printf("user agent is: %s \n", ua)
	fmt.Printf("user agent is a robot: %v \n", robot)

	result := "no"
	if robot {
		result = "yes"
	}
	w.Write([]byte("user agent is " + ua + "\n"))
	w.Write([]byte("user agent is a robot: " + result + "\n"))
}

func main() {
	http.HandleFunc("/", checkIfUserAgentIsRobot)
	// ListenAndServe only returns when it fails, so report the error and exit.
	log.Fatal(http.ListenAndServe(":8080", nil))
}
Output:
Browse page with Chrome browser
user agent is: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
user agent is a robot: false
user agent is: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36
user agent is a robot: false
Browse page with Google Page Speed Insights bot
user agent is: Mozilla/5.0 (iPhone; CPU iPhone OS 8_3 like Mac OS X) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Version/8.0 Mobile/12F70 Safari/600.1.4
user agent is a robot: true
user agent is: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/27.0.1453 Safari/537.36
user agent is a robot: true
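If you would rather not open a browser for every check, the same kind of request can be simulated in memory with the standard `net/http/httptest` package. This is only a sketch: `looksLikeRobot`, `handler` and `probe` are hypothetical names, and the bot list is shortened to two entries to keep the example small.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// looksLikeRobot is a hypothetical, shortened stand-in for the detection function.
func looksLikeRobot(ua string) bool {
	for _, bot := range []string{"Googlebot", "Google Page Speed Insights"} {
		if strings.Contains(ua, bot) {
			return true
		}
	}
	return false
}

// handler writes "yes" or "no" depending on the request's User-Agent header.
func handler(w http.ResponseWriter, r *http.Request) {
	if looksLikeRobot(r.Header.Get("User-Agent")) {
		fmt.Fprint(w, "yes")
		return
	}
	fmt.Fprint(w, "no")
}

// probe sends one in-memory request with the given user agent and returns the response body.
func probe(ua string) string {
	req := httptest.NewRequest(http.MethodGet, "/", nil)
	req.Header.Set("User-Agent", ua)
	rec := httptest.NewRecorder()
	handler(rec, req)
	return rec.Body.String()
}

func main() {
	// Prints "yes": the user agent names the Page Speed Insights bot.
	fmt.Println(probe("Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko; Google Page Speed Insights) Chrome/27.0.1453 Safari/537.36"))
	// Prints "no": an ordinary Chrome user agent.
	fmt.Println(probe("Mozilla/5.0 (Macintosh) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/48.0.2564.116 Safari/537.36"))
}
```

No port is opened here. The recorder captures what the handler would have sent, which makes this approach convenient for unit tests.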
See also : Golang : How to determine if request or crawl is from Google robots
By Adam Ng