How To Generate Random Pairs Of Numbers In Python, Including Pairs With One Entry Being The Same And Excluding Pairs With Both Entries Being The Same?
Solution 1:
Generate random unique coordinates:
from random import randint

def gencoordinates(m, n):
    seen = set()
    x, y = randint(m, n), randint(m, n)
    while True:
        seen.add((x, y))
        yield (x, y)
        x, y = randint(m, n), randint(m, n)
        while (x, y) in seen:
            x, y = randint(m, n), randint(m, n)
Output:
>>> g = gencoordinates(1, 100)
>>> next(g)
(42, 98)
>>> next(g)
(9, 5)
>>> next(g)
(89, 29)
>>> next(g)
(67, 56)
>>> next(g)
(63, 65)
>>> next(g)
(92, 66)
>>> next(g)
(11, 46)
>>> next(g)
(68, 21)
>>> next(g)
(85, 6)
>>> next(g)
(95, 97)
>>> next(g)
(20, 6)
>>> next(g)
(20, 86)
As you can see, an x coordinate (20) happened to be repeated in the last two pairs. That is allowed: only the full (x, y) pair has to be unique.
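To collect a fixed number of pairs from the generator, something like itertools.islice can be used. The sketch below restates the generator so that it runs on its own; take_pairs is a hypothetical helper name, not part of the original answer.

```python
from itertools import islice
from random import randint


def gencoordinates(m, n):
    """Yield unique (x, y) pairs with m <= x, y <= n (same logic as above)."""
    seen = set()
    x, y = randint(m, n), randint(m, n)
    while True:
        seen.add((x, y))
        yield (x, y)
        x, y = randint(m, n), randint(m, n)
        # Redraw until the pair has not been produced before
        while (x, y) in seen:
            x, y = randint(m, n), randint(m, n)


def take_pairs(m, n, count):
    """Hypothetical helper: collect `count` unique pairs into a list."""
    return list(islice(gencoordinates(m, n), count))


pairs = take_pairs(1, 100, 12)
print(pairs)
```

Asking this version for more than (n - m + 1)**2 pairs never finishes, because the inner loop keeps redrawing once every pair has been used.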
Solution 2:
Let's say that your x and y coordinates are all integers between 0 and n. For small n, a simple method might be to generate the set of all possible xy coordinates using np.mgrid, reshape it to a (nx * ny, 2) array, then sample random rows from this:
import numpy as np

nx, ny = 100, 200
xy = np.mgrid[:nx, :ny].reshape(2, -1).T
sample = xy.take(np.random.choice(xy.shape[0], 100, replace=False), axis=0)
Creating the array of all possible coordinates can become expensive if nx and/or ny is very large, in which case it might be better to use a generator object and keep track of previously used coordinates, as in James' answer.
Following @morningsun's suggestion, an alternative method is to sample from the set of nx*ny indices into the flattened array then convert these directly to x, y coordinates, which avoids constructing the whole nx*ny array of possible x, y permutations.
For comparison, here's a version of my original approach generalized for N-dimensional arrays, plus a version that uses the new approach:
def sample_comb1(dims, nsamp):
    perm = np.indices(dims).reshape(len(dims), -1).T
    idx = np.random.choice(perm.shape[0], nsamp, replace=False)
    return perm.take(idx, axis=0)

def sample_comb2(dims, nsamp):
    idx = np.random.choice(np.prod(dims), nsamp, replace=False)
    return np.vstack(np.unravel_index(idx, dims)).T
There's not a huge difference in practice, but the benefits of the second method become a bit more apparent for larger arrays:
In [1]: %timeit sample_comb1((100, 200), 100)
100 loops, best of 3: 2.59 ms per loop

In [2]: %timeit sample_comb2((100, 200), 100)
100 loops, best of 3: 2.4 ms per loop

In [3]: %timeit sample_comb1((1000, 2000), 100)
1 loops, best of 3: 341 ms per loop

In [4]: %timeit sample_comb2((1000, 2000), 100)
1 loops, best of 3: 319 ms per loop
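Without NumPy, the same index-sampling idea can be sketched with the standard library alone. This is an illustration for the two-dimensional case, not part of the original answer; sample_pairs is a hypothetical name. It relies on random.sample accepting a range object, so the full population is never built in memory.

```python
import random


def sample_pairs(nx, ny, nsamp):
    """Return nsamp unique (x, y) pairs with 0 <= x < nx and 0 <= y < ny."""
    # random.sample on a range draws distinct integers without building a list;
    # it raises ValueError if nsamp exceeds nx * ny
    flat = random.sample(range(nx * ny), nsamp)
    # divmod(i, ny) maps a flat row-major index back to (x, y)
    return [divmod(i, ny) for i in flat]


print(sample_pairs(100, 200, 5))
```

The divmod step plays the role of np.unravel_index for a 2-D shape in row-major order.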
If you have scikit-learn installed, sklearn.utils.random.sample_without_replacement offers a much faster method for generating random indices without replacement using Floyd's algorithm:
from sklearn.utils.random import sample_without_replacement

def sample_comb3(dims, nsamp):
    idx = sample_without_replacement(np.prod(dims), nsamp)
    return np.vstack(np.unravel_index(idx, dims)).T
In [5]: %timeit sample_comb3((1000, 2000), 100)
The slowest run took 4.49 times longer than the fastest. This could mean that an
intermediate result is being cached
10000 loops, best of 3: 53.2 µs per loop
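For readers curious about the algorithm named above, here is a minimal pure-Python sketch of Floyd's sampling algorithm as it is commonly described. It is not scikit-learn's implementation, only an illustration; floyd_sample and sample_comb_floyd are hypothetical names, and the second function assumes a 2-D grid.

```python
import random


def floyd_sample(n, k):
    """Return a set of k distinct integers drawn uniformly from range(n)."""
    if not 0 <= k <= n:
        raise ValueError("need 0 <= k <= n")
    chosen = set()
    for j in range(n - k, n):
        t = random.randint(0, j)  # inclusive on both ends
        # If t is already taken, j cannot be taken yet, so add j instead
        chosen.add(j if t in chosen else t)
    return chosen


def sample_comb_floyd(nx, ny, nsamp):
    """Return nsamp unique (x, y) pairs with 0 <= x < nx and 0 <= y < ny."""
    idx = floyd_sample(nx * ny, nsamp)
    # Convert each flat row-major index back to an (x, y) pair
    return [divmod(i, ny) for i in idx]


print(sample_comb_floyd(1000, 2000, 5))
```

The loop runs exactly k times and needs only O(k) memory, which is why this kind of approach scales well when nsamp is much smaller than the grid. Because the result comes from a set, the order of the pairs is not random; shuffle the list if order matters.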
Solution 3:
@James Miles' answer is great, but to avoid an endless loop when accidentally asking for more coordinates than exist, I suggest the following (it also removes some repetition):
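from random import randint

def gencoordinates(m, n):
    seen = set()
    x, y = randint(m, n), randint(m, n)
    # Stop once every one of the (n + 1 - m)**2 possible pairs has been yielded
    while len(seen) < (n + 1 - m)**2:
        while (x, y) in seen:
            x, y = randint(m, n), randint(m, n)
        seen.add((x, y))
        yield (x, y)
    return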
Note that an invalid range (for example m > n) is not checked here and is still passed straight through to randint.
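One way to catch a bad range early is to validate it before the generator is created. The following is a sketch under that assumption; gencoordinates_checked is a hypothetical name, and the inner loop is restructured slightly compared with the generator above while keeping the same termination condition.

```python
from random import randint


def gencoordinates_checked(m, n):
    """Hypothetical wrapper: validate the range, then return the generator."""
    # This outer function is not a generator, so the check runs at call time
    if m > n:
        raise ValueError(f"empty range: m={m} is greater than n={n}")

    def _generate():
        seen = set()
        total = (n + 1 - m) ** 2  # number of distinct pairs available
        while len(seen) < total:
            x, y = randint(m, n), randint(m, n)
            if (x, y) not in seen:
                seen.add((x, y))
                yield (x, y)

    return _generate()


print(list(gencoordinates_checked(1, 2)))
```

Because the generator is finite, list() on it terminates after all pairs have been produced, and a call such as gencoordinates_checked(5, 1) fails immediately instead of on the first next().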