How can I select rows from a DataFrame based on values in some column in Pandas?
In SQL, I would use:
SELECT *
FROM table
WHERE column_name = some_value
Best Answer
To select rows whose column value equals a scalar, some_value, use the == operator.
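As a minimal sketch of the == selection: the DataFrame contents and the value 1 are made up for illustration, and column_name and some_value simply mirror the placeholders used in the question.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({"column_name": [1, 2, 1, 3], "other": ["a", "b", "c", "d"]})
some_value = 1

# The comparison builds a boolean mask; .loc keeps the rows where it is True.
equal_rows = df.loc[df["column_name"] == some_value]
print(equal_rows)
```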
To select rows whose column value is in an iterable, some_values, use the isin method.
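A small sketch of the isin selection, again on invented data (the list [2, 3] is just an assumed stand-in for some_values):

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({"column_name": [1, 2, 1, 3], "other": ["a", "b", "c", "d"]})
some_values = [2, 3]

# isin returns True for every row whose value appears in some_values.
in_rows = df.loc[df["column_name"].isin(some_values)]
print(in_rows)
```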
Combine multiple conditions with the & operator.
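A range filter is one way this might look. The bounds A and B and the data are assumptions chosen for the sketch; each comparison is wrapped in its own parentheses.

```python
import pandas as pd

# Hypothetical data and bounds, invented for illustration only.
df = pd.DataFrame({"column_name": [1, 2, 3, 4, 5]})
A, B = 2, 4

# Each comparison is parenthesized before the two masks are combined with &.
between_rows = df.loc[(df["column_name"] >= A) & (df["column_name"] <= B)]
print(between_rows)
```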
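Wrap each condition in parentheses. Due to Python's operator precedence rules, & binds more tightly than <= and >=, so the parentheses are necessary. Without them, an expression such as df['column_name'] >= A & df['column_name'] <= B (where A and B stand for arbitrary bounds) is parsed as df['column_name'] >= (A & df['column_name']) <= B, which results in a "Truth value of a Series is ambiguous" error.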
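To see the precedence problem in action, the following sketch deliberately omits the parentheses and catches the resulting exception. The integer data and the bounds A and B are assumptions for the demo; with a non-integer column the failure may surface differently.

```python
import pandas as pd

# Hypothetical integer data and bounds, invented for illustration only.
df = pd.DataFrame({"column_name": [1, 2, 3, 4, 5]})
A, B = 2, 4

error_message = None
try:
    # Parsed as: df["column_name"] >= (A & df["column_name"]) <= B
    # The chained comparison then needs the truth value of a whole Series.
    mask = df["column_name"] >= A & df["column_name"] <= B
except ValueError as exc:
    error_message = str(exc)
    print(error_message)
```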
To select rows whose column value does not equal some_value, use the != operator.
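A sketch of the != selection on the same kind of invented data (the value 1 is an assumed stand-in for some_value):

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({"column_name": [1, 2, 1, 3], "other": ["a", "b", "c", "d"]})
some_value = 1

# Keep every row whose value differs from some_value.
not_equal_rows = df.loc[df["column_name"] != some_value]
print(not_equal_rows)
```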
The isin method returns a boolean Series, so to select rows whose value is not in some_values, negate the boolean Series using the ~ operator.
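A sketch of the negated isin selection, with made-up data and an assumed some_values list:

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
df = pd.DataFrame({"column_name": [1, 2, 1, 3], "other": ["a", "b", "c", "d"]})
some_values = [2, 3]

# ~ flips the boolean mask, so rows NOT in some_values are kept.
# .loc returns a new DataFrame; df itself is left unchanged.
not_in_rows = df.loc[~df["column_name"].isin(some_values)]
print(not_in_rows)
```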
For example, == can be used to keep only the rows that hold one particular value.
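Here is a concrete sketch of that case. The sample DataFrame (columns A through D) is invented for illustration, since the data used in the answer is not shown here.

```python
import numpy as np
import pandas as pd

# Sample data invented for illustration.
df = pd.DataFrame({
    "A": "foo bar foo bar foo bar foo foo".split(),
    "B": "one one two three two two one three".split(),
    "C": np.arange(8),
    "D": np.arange(8) * 2,
})

foo_rows = df.loc[df["A"] == "foo"]
print(foo_rows)
# Keeps the rows labelled 0, 2, 4, 6 and 7 (the ones where A is "foo").
```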
If you have multiple values you want to include, put them in a list (or more generally, any iterable) and use isin.
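A sketch of the multi-value case, using an invented sample DataFrame and an assumed pair of values to match in column B:

```python
import numpy as np
import pandas as pd

# Sample data invented for illustration.
df = pd.DataFrame({
    "A": "foo bar foo bar foo bar foo foo".split(),
    "B": "one one two three two two one three".split(),
    "C": np.arange(8),
    "D": np.arange(8) * 2,
})

one_or_three = df.loc[df["B"].isin(["one", "three"])]
print(one_or_three)
# Keeps the rows labelled 0, 1, 3, 6 and 7.
```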
Note, however, that if you wish to do this many times, it is more efficient to make an index first, and then use df.loc.
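One way the index-based lookup might look, assuming column B is the one being filtered repeatedly (the data and the name indexed are made up for this sketch):

```python
import numpy as np
import pandas as pd

# Sample data invented for illustration.
df = pd.DataFrame({
    "A": "foo bar foo bar foo bar foo foo".split(),
    "B": "one one two three two two one three".split(),
    "C": np.arange(8),
    "D": np.arange(8) * 2,
})

# Build the index once, then reuse it for repeated label lookups.
indexed = df.set_index("B")
one_rows = indexed.loc["one"]
print(one_rows)
# Three rows come back, all with index label "one".
```

Because the index labels are not unique here, indexed.loc["one"] returns a DataFrame with every matching row rather than a single Series.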
Or, to include multiple values from the index, use df.index.isin.
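A sketch of filtering on several index labels at once. The data and the chosen labels are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Sample data invented for illustration.
df = pd.DataFrame({
    "A": "foo bar foo bar foo bar foo foo".split(),
    "B": "one one two three two two one three".split(),
    "C": np.arange(8),
    "D": np.arange(8) * 2,
})

indexed = df.set_index("B")

# index.isin gives a boolean array over the index labels.
one_or_two = indexed.loc[indexed.index.isin(["one", "two"])]
print(one_or_two)
```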