*There are some things that I kind of feel torn about, like operator overloading. I left out operator overloading as a fairly personal choice because I had seen too many people abuse it in C++.[1] — James Gosling Creator of Java*
Operator overloading allows user-defined objects to interoperate with infix operators such as+and|or unary operators like-and~. More generally, function invocation (()), attribute access (.), and item access/slicing ([]) are also operators in Python, but this chapter covers unary and infix operators.
运算符重载
In “Emulating Numeric Types” on page 9 (Chapter 1) we saw some trivial implemen‐ tations of operators in a bare bones Vector class. The add and mul methods in Example 1-2 were written to show how special methods support operator overload‐ ing, but there are subtle problems in their implementations that we overlooked. Also, in Example 9-2, we noted that the Vector2d.eq method considers this to be True: Vector(3, 4) == [3, 4]—which may or not make sense. We will address those matters in this chapter.
In the following sections, we will cover:
接下来的小节中,我们涉及的内容有:
-
How Python supports infix operators with operands of different types
-
Using duck typing or explicit type checks to deal with operands of various types
-
How an infix operator method should signal it cannot handle an operand
-
The special behavior of the rich comparison operators (e.g., ==, >, <=, etc.)
-
Python如何使用不同类型的运算对象来支持中缀运算符
-
使用鸭子类型或者明确的类型检查来出处理多种类型的运算对象
[1. Source: “The C Family of Languages: Interview with Dennis Ritchie, Bjarne Stroustrup, and James Gosling”. 371]
- The default handling of augmented assignment operators, like +=, and how to over‐ load them
Operator overloading has a bad name in some circles. It is a language feature that can be (and has been) abused, resulting in programmer confusion, bugs, and unexpected performance bottlenecks. But if well used, it leads to pleasurable APIs and readable code. Python strikes a good balance between flexibility, usability, and safety by imposing some limitations:
在某些圈子里运算符重载声名狼藉。
-
We cannot overload operators for the built-in types.
-
We cannot create new operators, only overload existing ones.
-
A few operators can’t be overloaded: is, and, or, not (but the bitwise &, |, ~, can).
-
我们不能够重载内建类型运算符
-
我们不能够创建新的运算符,只能重载已有的。
-
部分运算符是不能够被重载的:is和or,not(但是比特位运算符却可以,&, |, ~ 。)
In Chapter 10, we already had one infix operator in Vector: ==, supported by the eq method. In this chapter, we’ll improve the implementation of eq to better handle operands of types other than Vector. However, the rich comparison operators (==, !=, >, <, >=, <=) are special cases in operator overloading, so we’ll start by overloading four arithmetic operators in Vector: the unary - and +, followed by the infix + and *.
在第十章,我们在Vector中已经拥有一个中缀运算符:==,由__eq__
方法提供支持。在本章,我们
Let’s start with the easiest topic: unary operators.
In The Python Language Reference, “6.5. Unary arithmetic and bitwise operations” lists three unary operators, shown here with their associated special methods:
在Python语言手册中,
- (__neg__)
Arithmetic unary negation. If x is -2 then -x == 2.
+ (__pos__)
Arithmetic unary plus. Usually x == +x, but there are a few cases when that’s not true. See “When x and +x Are Not Equal” on page 373 if you’re curious.
~ (__invert__)
Bitwise inverse of an integer, defined as ~x == -(x+1). If x is 2 then ~x == -3.
The Data Model” chapter of The Python Language Reference also lists the abs(...) built- in function as a unary operator. The associated special method is abs, as we’ve seen before, starting with “Emulating Numeric Types” on page 9.
It’s easy to support the unary operators. Simply implement the appropriate special method, which will receive just one argument: self. Use whatever logic makes sense in your class, but stick to the fundamental rule of operators: always return a new object. In other words, do not modify self, but create and return a new instance of a suitable type.
支持一元运算符很简单。简单地实现只接受一个参数的特殊方法即可:self。在类中使用任意能理解的逻辑,但是要
In the case of - and +, the result will probably be an instance of the same class as self; for +, returning a copy of self is the best approach most of the time. For abs(...), the result should be a scalar number. As for ~, it’s difficult to say what would be a sensible result if you’re not dealing with bits in an integer, but in an ORM it could make sense to return the negation of an SQL WHERE clause, for example.
在 - 和 + 的情景中,结果可能是同一个类的实例
As promised before, we’ll implement several new operators on the Vector class from Chapter 10. Example 13-1 shows the abs method we already had in Example 10-16, and the newly added neg and pos unary operator method.
就像之前承诺过的那样,我们会基于第十章的Vector类智商实现多个新的运算。例子13-1展示了我们在例子10-16中已有的__abs__
方法,并新增了一元运算符方法__neg__
和__pos__
。
Example 13-1. vector_v6.py: unary operators - and + added to Example 10-16
例子13-1。
def __abs__(self):
return math.sqrt(sum(x * x for x in self))
def __neg__(self):
return Vector(-x for x in self) # 1
def __pos__(self): # 2
return Vector(self)
-
To compute -v, build a new Vector with every component of self negated.
-
To compute +v, build a new Vector with every component of self.
-
为了计算-v,
Recall that Vector instances are iterable, and the Vector.init takes an iterable argument, so the implementations of neg and pos are short and sweet.
回一下,Vector的实例是可变的,而且Vector.__init__接受一个可变参数,所以
We’ll not implement invert, so if the user tries ~v on a Vector instance, Python will raise TypeError with a clear message: “bad operand type for unary ~: 'Vector'.” The following sidebar covers a curiosity that may help you win a bet about unary + someday. The next important topic is “Overloading + for Vector Addition” on page 375.
我们不会实现__invert__,所以对Vector的实例操作~v,Python会抛出包含明确消息的TypeError:“bad operand type for unary ~: 'Vector'.”。
Everybody expects that x == +x, and that is true almost all the time in Python, but I found two cases in the standard library where x != +x. 每个人都期待x == +x,并且大多数时候在Python中都是true,但是我在标准库中发现有两种情况, x != +x。 The first case involves the decimal.Decimal class. You can have x != +x if x is a Deci mal instance created in an arithmetic context and +x is then evaluated in a context with different settings. For example, x is calculated in a context with a certain precision, but the precision of the context is changed and then +x is evaluated. See Example 13-2 for a demonstration. 第一中情况涉及decimal.Decimal类。你可以
Example 13-2. A change in the arithmetic context precision may cause x to differ from +x 例子13-2. 数字上下文精度的改变可能引起x与+x的不同
>> import decimal >> ctx = decimal.getcontext() >> ctx.prec = 40 >> one_third = decimal.Decimal('1') / decimal.Decimal('3') >> one_third Decimal('0.3333333333333333333333333333333333333333') >> one_third == +one_third
True
ctx.prec = 28 one_third == +one_third False +one_third Decimal('0.3333333333333333333333333333')
1. Get a reference to the current global arithmetic context.
2. Set the precision of the arithmetic context to 40.
3. Compute 1/3 using the current precision.
4. Inspect the result; there are 40 digits after the decimal point.
5. one_third == +one_third is True.
5. Lower precision to 28—the default for Decimal arithmetic in Python 3.4. Now one_third == +one_third is False.
6. Inspect +one_third; there are 28 digits after the '.' here.
1.
>The fact is that each occurrence of the expression +one_third produces a new Deci mal instance from the value of one_third, but using the precision of the current arith‐ metic context.
>The second case where x != +x you can find in the collections.Counter documen‐ tation. The Counter class implements several arithmetic operators, including infix + to add the tallies from two Counter instances. However, for practical reasons, Counter addition discards from the result any item with a negative or zero count. And the prefix + is a shortcut for adding an empty Counter, therefore it produces a new Counter preserving only the tallies that are greater than zero. See Example 13-3.
Example 13-3. Unary + produces a new Counter without zeroed or negative tallies
```shell
>>> ct = Counter('abracadabra')
>>> ct
Counter({'a': 5, 'r': 2, 'b': 2, 'd': 1, 'c': 1}) >>> ct['r'] = -3
>>> ct['d'] = 0
>>> ct
Counter({'a': 5, 'b': 2, 'c': 1, 'd': 0, 'r': -3}) >>> +ct Counter({'a': 5, 'b': 2, 'c': 1}) Now, back to our regularly scheduled programming.
The Vector class is a sequence type, and the section “3.3.6. Em‐ ulating container types” in the “Data Model” chapter says sequen‐ ces should support the + operator for concatenation and * for repetition. However, here we will implement + and * as mathe‐ matical vector operations, which are a bit harder but more mean‐ ingful for a Vector type.
Vector类是一个序列类型,而且在“数据模型”章中的小节“3.3.6 模拟容器类型”也说过,要支持合并与重复序列应该支持 + 和 * 运算符。不过这里我们会把 + 和 * 实现为数学矢量运算符,
Adding two Euclidean vectors results in a new vector in which the components are the pairwise additions of the components of the addends. To illustrate:
在新的vector中添加两个欧几里德矢量结果
>>> v1 = Vector([3, 4, 5])
>>> v2 = Vector([6, 7, 8])
>>>v1+v2
Vector([9.0, 11.0, 13.0])
>>> v1 + v2 == Vector([3+6, 4+7, 5+8])
True
What happens if we try to add two Vector instances of different lengths? We could raise an error, but considering practical applications (such as information retrieval), it’s better to fill out the shortest Vector with zeros. This is the result we want:
如果我们尝试对两个不同长度的Vector实例合并会发生什么?我们
>>> v1 = Vector([3, 4, 5, 6])
>>> v3 = Vector([1, 2])
>>>v1+v3
Vector([4.0, 6.0, 5.0, 6.0])
Given these basic requirements, the implementation of add is short and sweet, as shown in Example 13-4.
Example 13-4. Vector.add method, take #1
# inside the Vector class
def __add__(self, other):
pairs = itertools.zip_longest(self, other, fillvalue=0.0) # return Vector(a + b for a, b in pairs) #
- pairs is a generator that will produce tuples (a, b) where a is from self, and b is from other. If self and other have different lengths, fillvalue is used to supply the missing values for the shortest iterable.
- A new Vector is built from a generator expression producing one sum for each item in pairs.
Note how add returns a new Vector instance, and does not affect self or other.
Special methods implementing unary or infix operators should never change their operands. Expressions with such operators are expected to produce results by creating new objects. Only aug‐ mented assignment operators may change the first operand (self), as discussed in “Augmented Assignment Operators” on page 388.
Example 13-4 allows adding Vector to a Vector2d, and Vector to a tuple or to any iterable that produces numbers, as Example 13-5 proves.
Example 13-5. Vector.add take #1 supports non-Vector objects, too
>>> v1 = Vector([3, 4, 5])
>>> v1 + (10, 20, 30)
Vector([13.0, 24.0, 35.0])
>>> from vector2d_v3 import Vector2d
>>> v2d = Vector2d(1, 2)
>>>v1+v2d
Vector([4.0, 6.0, 5.0])
Both additions in Example 13-5 work because add uses zip_longest(...), which can consume any iterable, and the generator expression to build the new Vector merely performs a + b with the pairs produced by zip_longest(...), so an iterable producing any number items will do.
However, if we swap the operands (Example 13-6), the mixed-type additions fail.
Example 13-6. Vector.add take #1 fails with non-Vector left operands
>>> v1 = Vector([3, 4, 5])
>>> (10, 20, 30) + v1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: can only concatenate tuple (not "Vector") to tuple >>> from vector2d_v3 import Vector2d
>>> v2d = Vector2d(1, 2)
>>>v2d+v1
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: unsupported operand type(s) for +: 'Vector2d' and 'Vector'
To support operations involving objects of different types, Python implements a special dispatching mechanism for the infix operator special methods. Given an expression a + b, the interpreter will perform these steps (also see Figure 13-1):
- If a has add, call a.add(b) and return result unless it’s NotImplemented.
- If a doesn’t have add, or calling it returns NotImplemented, check if b has radd, then call b.radd(a) and return result unless it’s NotImplemented.
- If b doesn’t have radd, or calling it returns NotImplemented, raise TypeError with an unsupported operand types message.
img:
Figure 13-1. Flowchart for computing a + b with add and radd
The radd method is called the “reflected” or “reversed” version of add. I prefer to call them “reversed” special methods.2 Three of this book’s technical reviewers—Alex, Anna, and Leo—told me they like to think of them as the “right” special methods, because they are called on the righthand operand. Whatever “r”-word you prefer, that’s what the “r” prefix stands for in radd, rsub, and the like.
Therefore, to make the mixed-type additions in Example 13-6 work, we need to imple‐ ment the Vector.radd method, which Python will invoke as a fall back if the left operand does not implement add or if it does but returns NotImplemented to signal that it doesn’t know how to handle the right operand.
Do not confuse NotImplemented with NotImplementedError. The first, NotImplemented, is a special singleton value that an infix operator special method should return to tell the interpreter it cannot handle a given operand. In contrast, NotImplementedEr ror is an exception that stub methods in abstract classes raise to warn that they must be overwritten by subclasses.
The simplest possible radd that works is shown in Example 13-7.
Example 13-7. Vector.add and radd methods
# inside the Vector class
def __add__(self, other): #
pairs = itertools.zip_longest(self, other, fillvalue=0.0)
return Vector(a + b for a, b in pairs)
def __radd__(self, other): # return self + other
No changes to add from Example 13-4; listed here because radd uses it. radd just delegates to add.
Often, radd can be as simple as that: just invoke the proper operator, therefore delegating to add in this case. This applies to any commutative operator; + is com‐ mutative when dealing with numbers or our vectors, but it’s not commutative when concatenating sequences in Python.
[注释2. The Python documentation uses both terms. The “Data Model” chapter uses “reflected,” but “9.1.2.2. Imple‐ menting the arithmetic operations” in the numbers module docs mention “forward” and “reverse” methods, and I find this terminology better, because “forward” and “reversed” clearly name each of the directions, while “reflected” doesn’t have an obvious opposite. ]