python random module and python implementation method of weighted random algorithm

  • 2020-05-19 05:10:39
  • OfStack

random is used to generate random Numbers. We can use it to randomly generate Numbers or select strings.

The & # 8226; random.seed (x) changes the seed seed of the random number generator.

1 you don't have to set seed, Python will automatically select seed.

The & # 8226; random.random () is used to generate a random floating point number, n,0 < = n < 1

The & # 8226; random.uniform (a,b) is used to generate a random floating point number in a specified range, generating a random integer a < =n < =b;

The & # 8226; random.randint (a,b) is used to generate an integer in a specified range. a is the lower limit, b is the upper limit, and a is the random integer generated < =n < = b; If a=b, n=a; If a > b, an error

The & # 8226; random.randrange ([start], stop [,step]) gets a random number from the collection with the specified cardinality increment within the specified range [start,stop]. The cardinality default is 1

The & # 8226; random.choice (sequence) gets a random element from a sequence. The parameter sequence represents an ordered type, not a specific type, but list, tuple, string, etc

The & # 8226; random.shuffle (x[,random]) is used to shuffle (shuffle) elements in a list, which changes the original list

The & # 8226; random.sample (sequence,k) gets k elements from the specified sequence at random and returns them as 1 fragment without changing the original sequence

So now that we have the basic knowledge, let's implement a weighted random algorithm:

Weighted random algorithm 1 as used in the following scenario: one set S, such as inside A, B, C, D this 4 items. In this case, we want to pick one term at random, but the probability is different. For example, we want to pick A with a 50% chance, B and C with a 20% chance, and D with a 10% chance. 1 in general, we can attach a weight to each term, and the probability of extraction is proportional to this weight. Then the above set becomes:

{A:5, B:2, C:2, D:1}

Method 1:

The easiest way to do this is to:

Expanded sequence according to weight value into: lists = [A A, A, A, A, B, B, C, C, D], and random choice (lists) randomly selected one. Although the time complexity of this selection is O(1), the space consumption is too large due to the large amount of data.


# coding:utf-8
import random


def weight_choice(list, weight):
  """
  :param list:  To select sequence 
  :param weight: list The corresponding weight sequence 
  :return: The value of the selected 
  """
  new_list = []
  for i, val in enumerate(list):
    new_list.extend(val * weight[i])
  return random.choice(new_list)


if __name__ == "__main__":
  print(weight_choice(['A', 'B', 'C', 'D'], [5, 2, 2, 1]))

Method 2:

The more common method is as follows:

Calculate the sum of weights sum, and then randomly select a number R between 1 and sum. Then iterate through the whole set and calculate the sum of weights of the items traversed. If it is greater than or equal to R, stop traversing and select the items encountered.

As an example, sum is equal to 10, and if it is random to 1-5, it will exit the traversal at the first digit. It fits the chosen probability.

The time required to select is O (n).


# coding:utf-8
import random

list = ['A', 'B', 'C', 'D']


def weight_choice(weight):
  """
  :param weight: list The corresponding weight sequence 
  :return: The index of the selected value in the original list 
  """
  t = random.randint(0, sum(weight) - 1)
  for i, val in enumerate(weight):
    t -= val
    if t < 0:
      return i


if __name__ == "__main__":
  print(list[weight_choice([5, 2, 2, 1])])

Method 3:

You can sort the original sequence by weight first. So when you're walking through, you're going to be able to run into things that have a high probability, and you're going to be able to reduce the number of things that you're walking through. (because rnd decreases the fastest (subtract the largest number first))

Compare {A:5, B:2, C:2, D:1} with {B:2, C:2, A:5, D:1}

The expectation of the former steps traversal is 5/10 * 2/10 * * 2 + 3 + 1 + 2/10 1/10 * * 1 + 4 = 19/10 while the latter is 2/10 2/10 5/10 * * 2 + 3 * 4 + 1/10 = 25/10.

This improves the average selection speed, but the sorting of the original sequence also takes time.

After generating a random number t, you can use the two-point method to find the prefix and the sequence. Then, the time complexity of the selection is O(logn).


# coding:utf-8
import random
import bisect

list = ['A', 'B', 'C', 'D']


def weight_choice(weight):
  """
  :param weight: list The corresponding weight sequence 
  :return: The index of the selected value in the original list 
  """
  weight_sum = []
  sum = 0
  for a in weight:
    sum += a
    weight_sum.append(sum)
  t = random.randint(0, sum - 1)
  return bisect.bisect_right(weight_sum, t)


if __name__ == "__main__":
  print(list[weight_choice([5, 2, 2, 1])])

Related articles: