So basically I want 5 > np.nan to return np.nan (NaN) instead of False.
With a pandas Series, here's the code:
import pandas as pd
import numpy as np
a = pd.DataFrame({"x":[1,2,3,4],"y":[1,np.nan,5,1]})
a["x"]>a["y"]
which returns:
0 False
1 False
2 False
3 True
dtype: bool
My current approach to preserve the NaN information is:
value_comparison = a["x"]>a["y"]
nan_comparison = a["x"].isna() | a["y"].isna()
value_comparison.where(~nan_comparison,np.nan)
which returns:
0 0.0
1 NaN
2 0.0
3 1.0
dtype: float64
I took a similar approach for the NumPy comparison too.
The result is correct, but I don't think my solution is elegant. Is there a better way to do this in pandas and NumPy, one that follows the Zen of Python (more readable, more straightforward)?
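For reference, a minimal sketch of what a similar NumPy approach might look like. The array names x and y and the use of np.where are assumptions made for illustration; the actual NumPy code is not shown in the question.

```python
import numpy as np

# Hypothetical arrays mirroring the DataFrame columns above.
x = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([1.0, np.nan, 5.0, 1.0])

# True wherever either operand is NaN.
nan_mask = np.isnan(x) | np.isnan(y)

# Keep the comparison result where both values are present, NaN elsewhere.
# Mixing booleans with np.nan makes the result a float array.
result = np.where(nan_mask, np.nan, x > y)
print(result)
```

Because NumPy has no missing-value marker for bool arrays, the result has to be float (or object) to hold NaN.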
Best Answer
Your solution, only slightly changed:
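A minimal sketch of one such variation, assuming the idea is to build a single "both values present" mask with notna() and to rely on where() filling with NaN by default. The name both_present is hypothetical.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, np.nan, 5, 1]})

value_comparison = a["x"] > a["y"]

# True only for rows where neither x nor y is missing.
both_present = a[["x", "y"]].notna().all(axis=1)

# where() keeps values where the mask is True and fills NaN elsewhere,
# so no explicit np.nan argument is needed.
m = value_comparison.where(both_present)
print(m)
```

Checking both columns in one expression avoids repeating isna() per column and scales to more columns by extending the list.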
Finally, it is possible to convert the result to the nullable boolean dtype:
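A minimal sketch of that conversion, assuming the masked result from where() is cast with astype("boolean"), pandas' nullable boolean extension dtype. The name both_present is hypothetical.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({"x": [1, 2, 3, 4], "y": [1, np.nan, 5, 1]})

value_comparison = a["x"] > a["y"]
both_present = a[["x", "y"]].notna().all(axis=1)

# Mask out rows with a missing operand, then cast to the nullable
# boolean dtype so missing entries become pd.NA instead of NaN.
result = value_comparison.where(both_present).astype("boolean")
print(result)
```

With the nullable dtype the values stay True/False rather than becoming 1.0/0.0, and the missing entry is shown as <NA>.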