Initializing NumPy Array From np.empty
Solution 1:
numpy.empty isn't clearing the sign bits manually or anything. The sign bits are just whatever garbage happens to be left in those bits of the malloc return value. The effect you're seeing is due to a numpy.absolute call somewhere else.
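As a minimal sketch of what "garbage" means here (the variable names are illustrative, not from the question): numpy.empty fixes the shape and dtype but leaves the values undefined, while numpy.zeros and numpy.full write every element.

```python
import numpy as np

# np.empty only allocates; whatever bytes were already in that memory show through.
scratch = np.empty((3, 3))
print(scratch.shape, scratch.dtype)  # shape and dtype are defined, the values are not

# When the contents matter, ask for initialized memory instead.
zeros = np.zeros((3, 3))
sevens = np.full((3, 3), 7.0)

# Or fill the uninitialized buffer yourself before reading from it.
scratch.fill(0.0)
print(np.array_equal(scratch, zeros))
```

Reading from an empty array before writing to it is what exposes leftover bytes from earlier allocations.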
The thing is, numpy.empty isn't reusing the randn return value's buffer. After all, the randn return value is still alive when empty creates its array, because the interactive interpreter keeps the last displayed result in the _ variable.
numpy.empty is reusing the buffer of an array created in the process of stringifying the first array. I believe it's this one:
def fillFormat(self, data):
    # only the finite values are used to compute the number of digits
    finite_vals = data[isfinite(data)]
    # choose exponential mode based on the non-zero finite values:
    abs_non_zero = absolute(finite_vals[finite_vals != 0])
    ...
See that absolute call? That's the one.
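The mechanism can be probed directly with a rough sketch like the one below. It assumes the allocator may hand a just-freed block of the same size back to the next request, which is typical but not guaranteed, so the result can differ between runs, platforms, and NumPy versions. The helper data_address is a hypothetical name introduced for illustration.

```python
import numpy as np

def data_address(arr):
    """Address of the array's data buffer (not of the Python object)."""
    return arr.__array_interface__["data"][0]

a = np.random.randn(3, 3)
expected = np.absolute(a).ravel()  # values we hope to find again later

# Mimic the temporary created while formatting: a 9-element float64 result.
tmp = np.absolute(a[a != 0])
tmp_address = data_address(tmp)
del tmp                            # release the temporary's buffer

e = np.empty((3, 3))               # same size request: may or may not get the freed block
reused = data_address(e) == tmp_address
print("buffer reused:", reused)
if reused:
    # Only meaningful when the allocator handed the same block back.
    print("contents match |a|:", np.array_equal(e.ravel(), expected))
```

If the buffer is reused, the empty array shows the absolute values of a, which is the sign-stripping effect described in the question.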
Here's some additional testing that supports that conclusion:
>>> a = numpy.random.randn(3, 3)
>>> b = numpy.arange(-5, 4, dtype=float)
>>> c = numpy.arange(-5, 13, 2, dtype=float)
>>> a
array([[-0.96810932, 0.86091026, -0.32675013],
[-1.23458136, 0.56151178, -0.37409982],
[-1.71348979, 0.64170792, -0.20679512]])
>>> numpy.empty((3, 3))
array([[ 0.96810932, 0.86091026, 0.32675013],
[ 1.23458136, 0.56151178, 0.37409982],
[ 1.71348979, 0.64170792, 0.20679512]])
>>> b
array([-5., -4., -3., -2., -1., 0., 1., 2., 3.])
>>> numpy.empty((3, 3))
array([[ 0.96810932, 0.86091026, 0.32675013],
[ 1.23458136, 0.56151178, 0.37409982],
[ 1.71348979, 0.64170792, 0.20679512]])
>>> c
array([ -5., -3., -1., 1., 3., 5., 7., 9., 11.])
>>> numpy.empty((3, 3))
array([[ 5., 3., 1.],
[ 1., 3., 5.],
[ 7., 9., 11.]])
>>> numpy.array([1.0, 0, 2, 3, 4, 5, 6, 7, 8, 9])
array([ 1., 0., 2., 3., 4., 5., 6., 7., 8., 9.])
>>> numpy.empty((3, 3))
array([[ 1., 2., 3.],
[ 4., 5., 6.],
[ 7., 8., 9.]])
The numpy.empty results are affected by printing a and c, rather than by the process of creating those arrays. b has no effect, because it has only 8 nonzero elements, so the absolute temporary created while printing it holds 8 values rather than the 9 that empty((3, 3)) asks for. The final array([1.0, 0, 2, ...]) has an effect because, even though it has 10 elements, exactly 9 of them are nonzero.
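The element counts in that argument can be checked with a short sketch that repeats the two filtering steps from the fillFormat excerpt. The function name abs_temp_size is hypothetical and is not part of NumPy.

```python
import numpy as np

def abs_temp_size(data):
    """Number of elements in the absolute() temporary from the fillFormat excerpt."""
    data = np.asarray(data, dtype=float)
    finite_vals = data[np.isfinite(data)]
    return np.absolute(finite_vals[finite_vals != 0]).size

b = np.arange(-5, 4, dtype=float)
c = np.arange(-5, 13, 2, dtype=float)
d = np.array([1.0, 0, 2, 3, 4, 5, 6, 7, 8, 9])

target = np.empty((3, 3)).size  # 9 float64 values
for name, arr in [("b", b), ("c", c), ("d", d)]:
    n = abs_temp_size(arr)
    note = "same size as empty((3, 3))" if n == target else "different size"
    print(name, n, note)
```

Only c and d produce a 9-element temporary, which matches the cases where the later numpy.empty((3, 3)) output changed.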
Solution 2:
Keeping in mind that NumPy is written largely in C (with some Fortran and C++), so the answer may have little to do with Python itself, I'll try to use a few examples to elucidate what's happening. The multi-language aspect makes this quite tricky, so you may need to inspect the implementation. A Python-level starting point is the numpy.matlib wrapper around empty here: https://github.com/numpy/numpy/blob/master/numpy/matlib.py#L13 (np.empty itself is implemented in C).
Did you try:
import numpy as np
print(np.random.randn(3,3))
print(np.empty((3,3)))
I get this output (signs are preserved):
[[-1.13898052 0.99079467 -0.07773854]
[ 1.18519122 1.30324795 -0.38748375]
[-1.46435162 0.53163777 0.22004651]]
[[-1.13898052 0.99079467 -0.07773854]
[ 1.18519122 1.30324795 -0.38748375]
[-1.46435162 0.53163777 0.22004651]]
You'll notice the behavior changes based on two things:
- whether you call print() or let the interactive interpreter echo the value
- how many empty arrays you create
For example, try running these two examples:
# Run this over and over and you'll always get different results!
a = np.random.randn(3,3)
b = np.empty((3,3))
c = np.empty((3,3))
print(a, id(a)) # id identifies the array object (its address in CPython), not its data buffer
print(b, id(b))
print(c, id(c))
with output:
[[ 0.25754195 1.13184341 -0.46048928]
[-0.80635852 0.92340661 2.08962923]
[ 0.09552521 0.14940356 0.5644782 ]] 139865678073408
[[-1.63665076 -0.41916461 0.9251386 ]
[ 2.72595838 0.10575355 -0.03555088]
[ 0.71242678 0.09749262 0.24742165]] 139865678071568
[[-0.41824453 0.66565604 1.52995102]
[ 0.8365397 0.32796832 -0.07150151]
[-0.08558753 0.96326938 -0.56601338]] 139865678072688
versus
# Run this 2 or more times and b and c will always be the same!
>>> a = np.random.randn(3,3)
>>> b = np.empty((3,3))
>>> c = np.empty((3,3))
>>> a, id(a) # output without using print
(array([[-0.04230878, 0.18081425, 0.36880091],
[ 0.4426956 , -1.31697583, 1.53143212],
[ 0.58197615, 0.42028897, 0.27644022]]), 139865678070528)
>>> b, id(b)
(array([[-0.41824453, 0.66565604, 1.52995102],
[ 0.8365397 , 0.32796832, -0.07150151],
[-0.08558753, 0.96326938, -0.56601338]]), 139865678048912)
>>> c, id(c) # c will have the same values as b!
(array([[-0.41824453, 0.66565604, 1.52995102],
[ 0.8365397 , 0.32796832, -0.07150151],
[-0.08558753, 0.96326938, -0.56601338]]), 139865678069888)
Try running each block multiple times in a row to give the memory a chance to fall into a pattern. You'll also get different behavior depending on the order in which you run those two blocks.
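When comparing runs like these, it may help to look at the data buffers rather than at id(), which only identifies the Python wrapper object. A small sketch, using the documented ndarray.ctypes.data attribute and np.shares_memory:

```python
import numpy as np

b = np.empty((3, 3))
c = np.empty((3, 3))

# id() names the Python array object; ctypes.data is the address of the element buffer.
print(id(b), b.ctypes.data)
print(id(c), c.ctypes.data)

# While both arrays are alive they own separate buffers.
print(np.shares_memory(b, c))
print(b.ctypes.data == c.ctypes.data)
```

Two live arrays that print identical values therefore still occupy different memory; equal contents do not imply a shared buffer.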
Noting how the 'empty' arrays b and c behave when we print them versus when we let the interpreter echo them, I'd guess that some sort of "lazy evaluation" is happening with echoed output. Because the memory remains 'free' (which would be why c gets the same values as b in the last example), Python may have no obligation to show exact values for an array whose memory hasn't actually been allocated (malloc'd) yet: unsigned representations, or really anything, are fair game until you 'use' the array. In my examples, I 'use' the array by printing it, which may explain why the signs are preserved in my first example.