Python Pandas – Drop Row if Two Columns Are NaN

dataframepandaspython

This is an extension to this question, where OP wanted to know how to drop rows where the values in a single column are NaN.

I'm wondering how I can drop rows where the values in 2 (or more) columns are both NaN. Using the second answer's created Data Frame:

In [1]: df = pd.DataFrame(np.random.randn(10,3))

In [2]: df.ix[::2,0] = np.nan; df.ix[::4,1] = np.nan; df.ix[::3,2] = np.nan;

In [3]: df
Out[3]:
          0         1         2
0       NaN       NaN       NaN
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
3  0.672201  0.964789       NaN
4       NaN       NaN  0.050742
5 -1.250970  0.030561 -2.678622
6       NaN  1.036043       NaN
7  0.049896 -0.308003  0.823295
8       NaN       NaN  0.637482
9 -0.310130  0.078891       NaN

If I use the drop.na() command, specifically the drop.na(subset=[1,2]), then it completes an "or" type drop and leaves:

In[4]: df.dropna(subset=[1,2])
Out[4]: 
          0         1         2
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
5 -1.250970  0.030561 -2.678622
7  0.049896 -0.308003  0.823295

What I want is an "and" type drop, where it drops rows where there is an NaN in column index 1 and 2. This would leave:

          0         1         2
1  2.677677 -1.466923 -0.750366
2       NaN  0.798002 -0.906038
3  0.672201  0.964789       NaN
4       NaN       NaN  0.050742
5 -1.250970  0.030561 -2.678622
6       NaN  1.036043       NaN
7  0.049896 -0.308003  0.823295
8       NaN       NaN  0.637482
9 -0.310130  0.078891       NaN

where only the first row is dropped.

Any ideas?

EDIT: changed data frame values for consistency

Best Answer

Any one of the following two:

df.dropna(subset=[1, 2], how='all')

or

df.dropna(subset=[1, 2], thresh=1)
Related Question