forked from scrapy/scrapy
-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathsep-003.rst
166 lines (124 loc) · 4.9 KB
/
sep-003.rst
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
==============================
SEP 3
Title Nested items API (ItemField)
Author Pablo Hoffman
Created 2009-07-19
Status Obsolete by :ref:`sep-008`
==============================
=======================================
SEP-003 - Nested items API (ItemField)
=======================================
This page presents different usage scenarios for the new nested items field API
called !ItemField.
Prerequisites
=============
This API proposal relies on the following API:
1. instantiating a item with an item instance as its first argument (i.e.
``item2 = MyItem(item1)``) must return a **copy** of the first item
instance)
2. items can be instantiated using this syntax: ``item = Item(attr1=value1,
attr2=value2)``
Proposed Implementation of ItemField
====================================
::
#!python
from scrapy.item.fields import BaseField
class ItemField(BaseField):
def __init__(self, item_type, default=None):
self._item_type = item_type
super(ItemField, self).__init__(default)
def to_python(self, value):
return self._item_type(value) if not isinstance(value, self._item_type) else value
def get_default(self):
# WARNING: returns default item instead of a copy - this must be
# well documented, as Items are mutable objects and may lead to
# unexpected behaviors # always returning a copy may not be desirable
# either (see Supplier item, for example). this method can be
# overridden to change this behaviour
return self._default
Usage Scenarios
===============
Defining an item containing ItemField's
---------------------------------------
::
#!python
from scrapy.item.models import Item
from scrapy.item.fields import ListField, ItemField, TextField, UrlField, DecimalField
class Supplier(Item):
name = TextField(default="anonymous supplier")
url = UrlField()
class Variant(Item):
name = TextField(required=True)
url = UrlField()
price = DecimalField()
class Product(Variant):
supplier = ItemField(Supplier, default=Supplier(name="default supplier")
variants = ListField(ItemField(Variant))
# these ones are used for documenting default value examples
supplier2 = ItemField(Supplier)
variants2 = ListField(ItemField(Variant), default=[])
It's important to note here that the (perhaps most intuitive) way of defining a
Product-Variant relationship (i.e. defining a recursive !ItemField) doesn't
work. For example, this fails to compile:
::
#!python
class Product(Item):
variants = ItemField(Product) # Fails to compile
Assigning an item field
-----------------------
::
#!python
supplier = Supplier(name="Supplier 1", url="http://example.com")
p = Product()
# standard assignment
p['supplier'] = supplier
# this also works as it tries to instantiate a Supplier with the given dict
p['supplier'] = {'name': 'Supplier 1' url='http://example.com'}
# this fails because it can't instantiate a Supplier
p['supplier'] = 'Supplier 1'
# this fails because url doesn't have the valid type
p['supplier'] = {'name': 'Supplier 1' url=123}
v1 = Variant()
v1['name'] = "lala"
v1['price'] = Decimal("100")
v2 = Variant()
v2['name'] = "lolo"
v2['price'] = Decimal("150")
# standard assignment
p['variants'] = [v1, v2] # OK
# can also instantiate at assignment time
p['variants'] = [v1, Variant(name="lolo", price=Decimal("150")]
# this also works as it tries to instantiate a Variant with the given dict
p['variants'] = [v1, {'name': 'lolo', 'price': Decimal("150")]
# this fails because it can't instantiate a Variant
p['variants'] = [v1, 'test']
# this fails because 'coco' is not a valid value for price
p['variants'] = [v1, {'name': 'lolo', 'price': 'coco']
Default values
--------------
::
#!python
p = Product()
p['supplier'] # returns: Supplier(name='default supplier')
p['supplier2'] # raises KeyError
p['supplier2'] = Supplier()
p['supplier2'] # returns: Supplier(name='anonymous supplier')
p['variants'] # raises KeyError
p['variants2'] # returns []
p['categories'] # raises KeyError
p.get('categories') # returns None
p['numbers'] # returns []
Accessing and changing nested item values
----------------------------------------
::
#!python
p = Product(supplier=Supplier(name="some name", url="http://example.com"))
p['supplier']['url'] # returns 'http://example.com'
p['supplier']['url'] = "http://www.other.com" # works as expected
p['supplier']['url'] = 123 # fails: wrong type for supplier url
p['variants'] = [v1, v2]
p['variants'][0]['name'] # returns v1 name
p['variants'][1]['name'] # returns v2 name
# XXX: decide what to do about these cases:
p['variants'].append(v3) # works but doesn't check type of v3
p['variants'].append(1) # works but shouldn't?