Side Effect or Feature?

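Our Python application needs a global variable to hold some shared values, but we don't want other people to add attributes to it at will and turn the object into a mess. So we defined it like this: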

In [1]: class A:
   ...:     __slots__ = ('foo', 'bar')
   ...:

In [2]: a = A()

In [3]: a.foo = 'hello'

In [4]: a.bar = 'world'

In [5]: a.new_attr = 100
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-5-d2dacf7c681b> in <module>()
----> 1 a.new_attr = 100

AttributeError: 'A' object has no attribute 'new_attr'

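When people hear "restrict the attributes of a class", many naturally think of __slots__. I don't think that is appropriate. __slots__ was designed to reduce the memory used by objects: if our app may hold millions of instances of some class, it is time to consider defining that class with __slots__. Refusing attribute names that are not declared in __slots__ is only a side effect of this memory-saving feature. It exists to optimize programs, not to constrain programmers.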

Written this way, it can have the following drawbacks:

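The restriction imposed by __slots__ is not inherited. If a subclass of a class with __slots__ does not declare __slots__ itself, its instances get a __dict__ again and accept arbitrary attributes, so you have to remember to declare it in every subclass.
Once __slots__ is defined, instances of the class can no longer be the target of weak references (see how weak references work for the exact reason) unless you add __weakref__ to __slots__. But doing so pollutes __slots__: other people reading it have to work out which names are the app's own variables and which are there for the language.
As said above, __slots__ exists to optimize storage. If Python one day finds a way to let objects accept attributes dynamically while still saving memory, our program could break. In other words, Python does not promise to keep this side effect of "saving memory" in the future.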
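
The first two drawbacks can be seen in a short sketch. The class names Base, Child and WeakBase are made up for illustration, and the behavior shown is that of CPython 3:

```python
import weakref


class Base:
    __slots__ = ('foo', 'bar')


class Child(Base):
    # No __slots__ declared here, so Child instances get a __dict__ again.
    pass


class WeakBase:
    # '__weakref__' has to sit next to the app's own attribute names.
    __slots__ = ('foo', 'bar', '__weakref__')


child = Child()
child.new_attr = 100  # accepted: the restriction is gone in the subclass

try:
    weakref.ref(Base())
    weakref_ok = True
except TypeError:
    # Slotted instances without '__weakref__' end up here.
    weakref_ok = False

target = WeakBase()
ref = weakref.ref(target)  # works once '__weakref__' is listed in __slots__
```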

I think the correct implementation here is the magic method __setattr__, which every attribute assignment goes through:

In [6]: class B:
   ...:     const_attr = ('foo', 'bar')
   ...:     def __setattr__(self, key, value):
   ...:         if key in self.const_attr:
   ...:             super().__setattr__(key, value)
   ...:         else:
   ...:             raise AttributeError("{} can't set new attribute '{}'".format(self, key))
   ...:

In [7]: b = B()

In [8]: b.foo = 'a'

In [9]: b.bar = 'b'

In [10]: b.hooo = 'c'
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-20179c345e92> in <module>()
----> 1 b.hooo = 'c'

<ipython-input-6-298c6584c3ff> in __setattr__(self, key, value)
      5             super().__setattr__(key, value)
      6         else:
----> 7             raise AttributeError("{} can't set new attribute '{}'".format(self, key))
      8

AttributeError: <__main__.B object at 0x1084d98d0> can't set new attribute 'hooo'

Isn't that more Pythonic?
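
Unlike the restriction from __slots__, this check is inherited by subclasses, because __setattr__ is an ordinary method. A minimal sketch, in which the subclass C and its extra attribute name baz are hypothetical:

```python
class B:
    const_attr = ('foo', 'bar')

    def __setattr__(self, key, value):
        if key in self.const_attr:
            super().__setattr__(key, value)
        else:
            raise AttributeError(
                "{} can't set new attribute '{}'".format(self, key))


class C(B):
    # Hypothetical subclass: extends the allowed names, inherits the check.
    const_attr = B.const_attr + ('baz',)


c = C()
c.foo = 'a'
c.baz = 'z'

try:
    c.hooo = 'c'
    rejected = False
except AttributeError:
    rejected = True  # names outside const_attr are still refused
```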

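Another example is Python's dict. The Python 3.6 release notes mention that the built-in dict preserves insertion order when iterated, but stress that this is a side effect of making dict use less memory, not part of the language specification, and should not be relied upon: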

The dict type has been reimplemented to use a more compact representation based on a proposal by Raymond Hettinger and similar to the PyPy dict implementation. This resulted in dictionaries using 20% to 25% less memory when compared to Python 3.5.

The order-preserving aspect of this new implementation is considered an implementation detail and should not be relied upon.

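There is a good discussion about whether the insertion order of dict in Python 3.6 can be relied upon. I also think that depending on this "side effect" is a serious mistake: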

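It is not part of the Python language specification, so it is not compatible with other interpreters.
Python has no idea whether you depend on the order of a dict; if something goes wrong, the interpreter will not report an error.
Since it is not a language standard, someone who happens to know this "implementation detail" may well exploit the "side effect" while others do not know about it (you can be a competent Python programmer without knowing implementation details). Someone changing the code later may not notice that this iteration depends on insertion order, which leaves a hidden hazard.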
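
If code that must run on 3.6 or on another interpreter really needs insertion order, one option is to state that dependency explicitly with collections.OrderedDict from the standard library, which guarantees the order. A small sketch with made-up keys:

```python
from collections import OrderedDict

# OrderedDict documents insertion order as part of its contract.
settings = OrderedDict()
settings['first'] = 1
settings['second'] = 2
settings['third'] = 3

keys_in_order = list(settings)
print(keys_in_order)  # ['first', 'second', 'third']
```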

In Python 3.7 this became a language feature, so on 3.7+ we can rely on it with confidence!

Make it so. “Dict keeps insertion order” is the ruling. Thanks!

Guido van Rossum

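Today I saw an interesting question on a forum: are if foobar != None and if foobar is not None completely equivalent?

It is quite interesting, and it got me thinking again: why do people use is when comparing with None, but == when comparing strings (strings can be interned too)? My own take: None is a global object documented to be unique, whereas string interning is a side effect of an interpreter optimization. It cannot be relied on, because the interpreter may decide at some point to stop caching a given string. This happens to fit the topic of this post, so here is my answer: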

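@gwki above has explained it very clearly!

There is one difference for None, though. Read a lot of Python code and you will find that most of the time we write if foo is None, because None is a globally unique object in Python. The official documentation says: Since None is a singleton, testing for object identity (using == in C) is sufficient. So the officially recommended way is to check by identity.

In other words: there is only one None, and there is no object whose value is None but whose id differs from id(None).

Writing if foo != None is a bit un-Pythonic (I, at least, have never seen it written that way, haha).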
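
The two spellings are not fully equivalent either, because != goes through the object's comparison methods while is not only compares identity. A contrived sketch (the class AlwaysEqual is made up for illustration):

```python
class AlwaysEqual:
    """Contrived class whose __eq__ claims equality with everything."""

    def __eq__(self, other):
        return True


obj = AlwaysEqual()

by_equality = obj != None      # noqa: E711 (calls __eq__ and inverts it)
by_identity = obj is not None  # compares identity only

print(by_equality)  # False
print(by_identity)  # True
```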

Question 2:

foo = 0
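if foo evaluates to false,
if foo is not None evaluates to true. So is checks whether the ids are the same (for None, checking identity and checking value make little difference, since there is only one anyway).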

So the two are not the same. Besides None, the documentation ( https://docs.python.org/3.6/library/stdtypes.html#truth-value-testing ) lists the following as also evaluating to false:

– constants defined to be false: None and False.
– zero of any numeric type: 0, 0.0, 0j, Decimal(0), Fraction(0, 1)
– empty sequences and collections: '', (), [], {}, set(), range(0)
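
A quick check of that list: every value below is falsy, yet only one of them is None, which is why if foo and if foo is not None cannot be swapped freely.

```python
from decimal import Decimal
from fractions import Fraction

falsy_values = [
    None, False,
    0, 0.0, 0j, Decimal(0), Fraction(0, 1),
    '', (), [], {}, set(), range(0),
]

# Every entry fails a plain truth test.
all_falsy = all(not value for value in falsy_values)

# Only the None entry fails the identity test.
not_none = [value for value in falsy_values if value is not None]

print(all_falsy)      # True
print(len(not_none))  # 12
```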

One more thing: to avoid creating duplicates, Python interns some immutable objects. For example:

>>> s1 = "ABC"
>>> s2 = "ABC"
>>> s1 is s2
True

But when we actually compare the two, we should use s1 == s2, because interning is a CPython implementation detail. Side effects should not be relied upon.
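
A sketch of why identity is the wrong test for strings: two equal strings built in different ways need not be the same object, and whether they are depends on the interpreter. sys.intern from the standard library is the explicit way to request interning.

```python
import sys

s1 = "ABC"
s2 = "".join(["A", "B", "C"])  # equal value, built at runtime

same_value = s1 == s2   # compares contents
same_object = s1 is s2  # implementation detail: rely on neither result

# Explicit interning returns one canonical object per value.
interned_same = sys.intern(s1) is sys.intern(s2)

print(same_value)     # True
print(interned_same)  # True
```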
