Warning: the following is an oversimplification; I'm ignoring __new__() and a bunch of other special class methods, and handwaving a lot of details. But this explanation will get you pretty far in Python.
When you create an instance of a class in Python, like calling ABC() in your example:
abc = ABC()
Python creates a new empty object and sets its class to ABC. Then it calls the __init__() method if there is one. Finally it returns the object.
When you ask for an attribute of an object, Python first looks in the instance. If it doesn't find it, it looks in the instance's class. Then in the base class(es) and so on. If it never finds anybody with the attribute defined, it raises an AttributeError exception.
When you assign to an attribute of an object, it creates that attribute if the object doesn't already have one. Then it sets the attribute to that value. If the object already had an attribute with that name, it drops the reference to the old value and takes a reference to the new one.
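The lookup and assignment rules above can be sketched in a few lines. The class names Base and Child and their attributes are made up for illustration; this is a simplified demonstration, not a complete picture of attribute access.

```python
class Base:
    colour = "red"      # lives on the base class


class Child(Base):
    size = 10           # lives on the subclass


obj = Child()

# Lookup order: the instance, then Child, then Base.
found_on_class = obj.size      # not on the instance, found on Child
found_on_base = obj.colour     # not on the instance or Child, found on Base

# Assignment always targets the object you assign through.
obj.size = 99
instance_attrs = dict(vars(obj))   # the instance now has its own size
class_value = Child.size           # the class attribute is untouched

# A name that nobody defines raises AttributeError.
try:
    obj.weight
    missing_raised = False
except AttributeError:
    missing_raised = True

print(found_on_class, found_on_base, instance_attrs, class_value, missing_raised)
# prints: 10 red {'size': 99} 10 True
```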
These rules make the behavior you observe easy to predict. After this line:
abc = ABC()
only the ABC object (the class) has an attribute named x. The abc instance doesn't have its own x yet, so if you ask for one you're going to get the value of ABC.x. But then you reassign the attribute x on both the class and the object. And when you subsequently examine those attributes you observe the values you put there are still there.
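To see this concretely, here is a minimal sketch. It assumes a class ABC with x = 6 and made-up replacement values 10 and 20; the exact values in the question may differ.

```python
class ABC:
    x = 6


abc = ABC()
own_before = dict(vars(abc))   # the instance has no attributes of its own yet
seen_before = abc.x            # so this reads through to ABC.x

ABC.x = 10                     # rebind the attribute on the class
abc.x = 20                     # create an instance attribute that shadows it

own_after = dict(vars(abc))
print(own_before, seen_before, ABC.x, abc.x, own_after)
# prints: {} 6 10 20 {'x': 20}
```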
Now you should be able to predict what this code does:
class ABC:
    x = 6

a = ABC()
ABC.xyz = 5
print(ABC.xyz, a.xyz)
Yes: it prints two fives. You might have expected it to throw an AttributeError exception. But Python finds the attribute in the class--even though it was added after the instance was created.
This behavior can really get you into trouble. One classic beginner mistake in Python:
class ABC:
    x = []

a = ABC()
a.x.append(1)
b = ABC()
print(b.x)
That will print [1]. All instances of ABC share the same list, because x lives on the class and a.x.append(1) mutates that one object in place. What you probably wanted was this:
class ABC:
    def __init__(self):
        self.x = []

a = ABC()
a.x.append(1)
b = ABC()
print(b.x)
That will print an empty list as you expect.
To answer your exact questions:
My question is, what is the purpose of defining a variable in the class definition in Python if it seems like anyone can set any variable at any time without doing so?
I assume this means "why should I assign members inside the class, instead of inside the __init__ method?"
As a practical matter, this means the instances don't have their own copy of the attribute (or at least not yet). This means the instances are smaller; it also means accessing the attribute is slower. It also means the instances all share the same value for that attribute, which in the case of mutable objects may or may not be what you want. Finally, assignments here mean that the value is an attribute of the class, and that's the most straightforward way to set attributes on the class.
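A small sketch of that trade-off, using a hypothetical Connection class whose names and values are invented for illustration: the class attribute acts as a shared default, and an instance only gets its own copy when you assign one.

```python
class Connection:
    timeout = 30              # class attribute: one value, shared as a default

    def __init__(self, host):
        self.host = host      # instance attribute: one per object


a = Connection("a.example")
b = Connection("b.example")
b.timeout = 5                 # only b gets its own timeout

print(vars(a))                # {'host': 'a.example'}
print(vars(b))                # {'host': 'b.example', 'timeout': 5}
print(a.timeout, b.timeout, Connection.timeout)   # 30 5 30
```

Because 30 is immutable, sharing it is harmless here; the shared-list problem shown earlier only bites when the shared value is mutated in place.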
As a purely stylistic matter it's shorter code, as you don't have all those occurrences of "self." all over. Beyond that it doesn't make much difference. However, assigning attributes in the __init__ method ensures they are unambiguously instance variables.
I'm not terribly consistent myself. The only thing I'm sure to do is assign all the mutable objects that I don't want shared in the __init__ method.
Also, does python know the difference between instance and static variables? From what I saw, I'd say so.
Python classes don't have class static variables like C++ does. There are only attributes: attributes of the class object, and attributes of the instance object. And if you ask for an attribute, and the instance doesn't have it, you'll get the attribute from the class.
The closest approximation of a class static variable in Python would be a hidden module attribute, like so:
_x = 3

class ABC:
    def method(self):
        global _x
        # ...
It's not part of the class per se. But this is a common Python idiom.
Assuming what you want is "a variable that is initialised only once on first function call", there's no such thing in Python syntax. But there are ways to get a similar result:
1 - Use a global. Note that in Python, 'global' really means 'global to the module', not 'global to the process':
_number_of_times = 0

def yourfunc(x, y):
    global _number_of_times
    for i in range(x):
        for j in range(y):
            _number_of_times += 1
2 - Wrap your code in a class and use a class attribute (i.e. an attribute that is shared by all instances):
class Foo(object):
    _number_of_times = 0

    @classmethod
    def yourfunc(cls, x, y):
        for i in range(x):
            for j in range(y):
                cls._number_of_times += 1
Note that I used a classmethod since this code snippet doesn't need anything from an instance.
3 - Wrap your code in a class, use an instance attribute and provide a shortcut for the method:
class Foo(object):
    def __init__(self):
        self._number_of_times = 0

    def yourfunc(self, x, y):
        for i in range(x):
            for j in range(y):
                self._number_of_times += 1

yourfunc = Foo().yourfunc
4 - Write a callable class and provide a shortcut:
class Foo(object):
    def __init__(self):
        self._number_of_times = 0

    def __call__(self, x, y):
        for i in range(x):
            for j in range(y):
                self._number_of_times += 1

yourfunc = Foo()
4 bis - use a class attribute and a metaclass
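class Callable(type):
    # Calling the class itself forwards to its _call classmethod.
    def __call__(self, *args, **kw):
        return self._call(*args, **kw)

# Python 3 syntax; in Python 2, put __metaclass__ = Callable in the class body instead.
class yourfunc(object, metaclass=Callable):
    _number_of_times = 0

    @classmethod
    def _call(cls, x, y):
        for i in range(x):
            for j in range(y):
                cls._number_of_times += 1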
5 - Make "creative" use of the fact that a function's default arguments are evaluated only once, when the def statement runs (normally at module import):
def yourfunc(x, y, _hack=[0]):
    for i in range(x):
        for j in range(y):
            _hack[0] += 1
There are still some other possible solutions / hacks, but I think you get the big picture now.
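One of those other options, sketched here as an assumption rather than something from the list above, is to store the state as an attribute on the function object itself. The attribute name number_of_times is hypothetical.

```python
def yourfunc(x, y):
    for i in range(x):
        for j in range(y):
            # Functions are objects, so they can carry their own attributes.
            yourfunc.number_of_times += 1


yourfunc.number_of_times = 0   # initialised once, right after the definition

yourfunc(2, 3)
yourfunc(1, 4)
print(yourfunc.number_of_times)   # 10
```

Like the module-level global, this depends on the name yourfunc still referring to the same function when it runs.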
EDIT: given the OP's clarifications, i.e. "Lets say you have a recursive function with default parameter but if someone actually tries to give one more argument to your function it could be catastrophic", it looks like what the OP really wants is something like:
# private recursive function using a default param the caller shouldn't set
def _walk(tree, callback, level=0):
    callback(tree, level)
    for child in tree.children:
        _walk(child, callback, level + 1)

# public wrapper without the default param
def walk(tree, callback):
    _walk(tree, callback)
Which, BTW, proves we really had Yet Another XY Problem...
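A quick usage sketch of the private-function-plus-wrapper pattern. The Node class is hypothetical: the only thing assumed about a tree is that it exposes a children attribute.

```python
class Node:
    """Hypothetical tree node: a name plus a list of child nodes."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)


def _walk(tree, callback, level=0):
    callback(tree, level)
    for child in tree.children:
        _walk(child, callback, level + 1)


def walk(tree, callback):
    _walk(tree, callback)


visited = []
tree = Node("root", [Node("a", [Node("a1")]), Node("b")])
walk(tree, lambda node, level: visited.append((node.name, level)))
print(visited)   # [('root', 0), ('a', 1), ('a1', 2), ('b', 1)]

# The public wrapper rejects a caller-supplied level outright.
rejected = False
try:
    walk(tree, print, 5)
except TypeError:
    rejected = True
print(rejected)  # True
```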
Best Answer
The idea behind this omission is that static variables are only useful in two situations: when you really should be using a class and when you really should be using a generator.
If you want to attach stateful information to a function, what you need is a class. A trivially simple class, perhaps, but a class nonetheless:
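A minimal sketch of such a class; Counter and its calls attribute are hypothetical names chosen for illustration.

```python
class Counter:
    """Hypothetical example: remembers how many times it has been called."""

    def __init__(self):
        self.calls = 0

    def __call__(self):
        self.calls += 1
        return self.calls


count = Counter()
count()
count()
print(count.calls)   # 2
```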
If you want your function's behavior to change each time it's called, what you need is a generator:
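A minimal sketch, using a made-up countdown generator: its local variable survives between calls to next(), which is the job a static variable would otherwise do.

```python
def countdown(start):
    """Hypothetical generator: each next() resumes where the last one stopped."""
    current = start
    while current > 0:
        yield current
        current -= 1


ticks = countdown(3)
first = next(ticks)
second = next(ticks)
rest = list(ticks)
print(first, second, rest)   # 3 2 [1]
```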
Of course, static variables are useful for quick-and-dirty scripts where you don't want to deal with the hassle of big structures for little tasks. But there, you don't really need anything more than global. It may seem a bit kludgy, but that's okay for small, one-off scripts: