Golang : Generate random Chinese, Japanese, Korean and other runes




Alright, got a rare request here from a friend on how to generate a string of random Chinese, Japanese and Korean (CJK) characters. I won't get into the details of his ultimate goal. Basically, he wants to use it in his software to fight bots, kind of like reCAPTCHA but for CJK markets.

Below is an enhancement of my previous tutorial on how to generate random runes. Instead of defining the characters that we want to randomize, we will use Unicode ranges, say the one for Chinese characters, and randomly select code points that fall inside the range. The code example below shows how.

Here you go!

randomCJK.go

 package main

 import (
  "fmt"
  "math/rand"
  "time"
 )

 // Each slice holds the first and last code point (decimal, inclusive) of a range.
 var (
  thai = []int64{3585, 3654}
  armenian = []int64{1328, 1423}
  chinese = []int64{19968, 40869}
  japaneseKatakana = []int64{12449, 12531}
  japaneseHiragana = []int64{12353, 12435}
  koreanHangul = []int64{12593, 12686}
  cyrillianRussian = []int64{1025, 1169}
  greek = []int64{884, 974}
 )

 // RandInt returns a random number between start and end, both included.
 // Int63n(n) returns a value in [0, n), hence the +1 to reach end.
 func RandInt(start, end int64) int64 {
  return start + rand.Int63n(end-start+1)
 }

 // generateRandomRune builds a string of size random runes from the given range.
 func generateRandomRune(size int, start, end int64) string {
  randRune := make([]rune, size)
  for i := range randRune {
   randRune[i] = rune(RandInt(start, end))
  }
  return string(randRune)
 }

 func main() {
  // Seed once at startup instead of on every RandInt call. Re-seeding per call
  // can repeat values when two calls land on the same clock reading.
  // From Go 1.20 the global generator is seeded automatically and rand.Seed
  // is deprecated, so this line can be dropped there.
  rand.Seed(time.Now().UnixNano())

  fmt.Println("Random Chinese :\n ", generateRandomRune(10, chinese[0], chinese[1]))
  fmt.Println("Random Thai :\n ", generateRandomRune(10, thai[0], thai[1]))
  fmt.Println("Random Japanese Katakana :\n ", generateRandomRune(10, japaneseKatakana[0], japaneseKatakana[1]))
  fmt.Println("Random Japanese Hiragana :\n ", generateRandomRune(10, japaneseHiragana[0], japaneseHiragana[1]))
  fmt.Println("Random Korean :\n ", generateRandomRune(10, koreanHangul[0], koreanHangul[1]))
  fmt.Println("Random Russian :\n ", generateRandomRune(10, cyrillianRussian[0], cyrillianRussian[1]))
  fmt.Println("Random Armenian :\n ", generateRandomRune(10, armenian[0], armenian[1]))
 }

output:

$ ./randomCJK

Random Chinese :

襤池鏖臶唹泿聱倂庂楓

Random Thai :

ขอพฮศโำฐเฦ

Random Japanese Katakana :

ペツワケザユプルヂザ

Random Japanese Hiragana :

ゑふぱうばぇさぇぷゎ

Random Korean :

ㅴㆀㅽㅮㆇㅗㅓㅜㅰㅩ

Random Russian :

жѳМѭњЂЯёЧВ

Random Armenian :

ՏՈԽհշԲՍ՜ԳՍ

For the Unicode ranges (listed as decimal code points), please see https://wjsn.home.xs4all.nl/htmlcodes.htm

Latin Extended-B 384-535

Greek 884-974

Cyrillic (Russian) 1025-1169

Armenian 1328-1423

Hebrew 1488-1514

Arabic 1575-1610

Thaana (Dhivehi) 1920-1969

Devanagari (Hindi) 2304-2431

Bangla, Bengali 2433-2554

Tamil 2944-3071

Sinhala 3456-3583

Thai 3585-3654

Lao 3712-3839

Tibetan 3840-4031

Burmese 4096-4255

Georgian 4256-4351

Ethiopian 4608-4991

Khmer 6016-6143

Mongolian, Uyghur 6144-6319

Classical Greek 7936-8190

Japanese - Hiragana 12353-12435

Japanese - Katakana 12449-12531

Korean, Hangul 12593-12686

Chinese 19968-40869

References:

https://www.socketloop.com/tutorials/golang-random-rune-generator

https://golang.org/pkg/math/rand/#Int63n

https://wjsn.home.xs4all.nl/htmlcodes.htm






By Adam Ng


