- convertd sep 8
AphonicChaos committed Mar 7, 2014
1 parent f43c99f commit f1e0faa
Showing 2 changed files with 111 additions and 102 deletions.
111 changes: 111 additions & 0 deletions sep/sep-008.rst
@@ -0,0 +1,111 @@
=========  ==============================================================
SEP        8
Title      Item Parsers
Author     Pablo Hoffman
Created    2009-08-11
Status     Final (implemented with variations)
Obsoletes  :doc:`sep-001`, :doc:`sep-002`, :doc:`sep-003`, :doc:`sep-005`
=========  ==============================================================

======================
SEP-008 - Item Loaders
======================

Item Parser is the final API proposed to implement the Item Builders/Loaders
introduced in :doc:`sep-001`.

.. note:: This is the API that was finally implemented, under the name "Item
   Loaders" instead of "Item Parsers", along with some other minor fine
   tuning of the API methods and semantics.

Dataflow
========

1. ``ItemParser.add_value()``

   1. **input_parser**
   2. store

2. ``ItemParser.add_xpath()`` *(only available in XPathItemParser)*

   1. ``selector.extract()``
   2. **input_parser**
   3. store

3. ``ItemParser.populate_item()`` *(ex. get_item)*

   1. **output_parser**
   2. assign field
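
The dataflow above can be illustrated with a small, self-contained sketch. It is not the proposed implementation: ``SketchItemParser``, ``strip_all`` and ``take_first`` are hypothetical names, and the sketch assumes that parsers are plain callables taking a list of values and returning the processed result. ``add_xpath()`` is left out because it needs a selector; it would only add an ``extract()`` call before the same input step.

```python
from collections import defaultdict


class SketchItemParser:
    """Toy stand-in for the proposed ItemParser (hypothetical, not Scrapy code)."""

    def __init__(self, input_parser, output_parser):
        self.input_parser = input_parser    # applied when a value is added
        self.output_parser = output_parser  # applied when the item is populated
        self._values = defaultdict(list)    # per-field store of collected values

    def add_value(self, field, value):
        # Dataflow step 1: run the input parser, then store the result.
        self._values[field].extend(self.input_parser([value]))

    def populate_item(self):
        # Dataflow step 3: run the output parser, then assign each field.
        return {field: self.output_parser(values)
                for field, values in self._values.items()}


def strip_all(values):
    """Example input parser: strip whitespace from every value."""
    return [v.strip() for v in values]


def take_first(values):
    """Example output parser: keep only the first collected value."""
    return values[0] if values else None


parser = SketchItemParser(input_parser=strip_all, output_parser=take_first)
parser.add_value("name", "  Widget ")
parser.add_value("name", "Gadget")
item = parser.populate_item()
print(item)
```

Input parsing happens once per added value, while output parsing happens once per field when the item is built, so the store always holds already-cleaned values.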

Modules and classes
===================

- ``scrapy.contrib.itemparser.ItemParser``
- ``scrapy.contrib.itemparser.XPathItemParser``
- ``scrapy.contrib.itemparser.parsers.MapConcat`` (ex. ``TreeExpander``)
- ``scrapy.contrib.itemparser.parsers.TakeFirst``
- ``scrapy.contrib.itemparser.parsers.Join``
- ``scrapy.contrib.itemparser.parsers.Identity``
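
This SEP does not spell out what each parser does, so the following is only a guess at their behaviour, inferred from the class names. It is written in plain Python rather than against ``scrapy.contrib.itemparser``. In particular, the rule that ``MapConcat`` flattens list results and drops ``None`` is an assumption made for this sketch.

```python
class Identity:
    """Return the collected values unchanged."""

    def __call__(self, values):
        return values


class TakeFirst:
    """Return the first value that is neither None nor an empty string."""

    def __call__(self, values):
        for value in values:
            if value is not None and value != "":
                return value
        return None


class Join:
    """Join the collected values with a separator (a space is assumed here)."""

    def __init__(self, separator=" "):
        self.separator = separator

    def __call__(self, values):
        return self.separator.join(values)


class MapConcat:
    """Apply each function to every value, concatenating the results."""

    def __init__(self, *functions):
        self.functions = functions

    def __call__(self, values):
        for function in self.functions:
            next_values = []
            for value in values:
                result = function(value)
                if result is None:
                    continue  # assumption: None drops the value
                if isinstance(result, (list, tuple)):
                    next_values.extend(result)  # assumption: lists are flattened
                else:
                    next_values.append(result)
            values = next_values
        return values


words = MapConcat(str.strip, str.split)(["  a b ", "c"])
print(words)
print(TakeFirst()(["", None, "x", "y"]))
print(Join()(["a", "b"]))
```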

Public API
==========

- ``ItemParser.add_value()``
- ``ItemParser.replace_value()``
- ``ItemParser.populate_item()`` *(returns the populated item)*

- ``ItemParser.get_collected_values()`` *(note the 's' in values)*
- ``ItemParser.parse_field()``

- ``ItemParser.get_input_parser()``
- ``ItemParser.get_output_parser()``

- ``ItemParser.context``

- ``ItemParser.default_item_class``
- ``ItemParser.default_input_parser``
- ``ItemParser.default_output_parser``
- ``ItemParser.*field*_in``
- ``ItemParser.*field*_out``
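
One way ``get_input_parser()`` could choose among the attributes listed above is sketched here. The precedence order (the ``<field>_in`` attribute first, then the parser declared on the field, then ``default_input_parser``) is an assumption, since this SEP does not state it. ``SketchItemParser`` and the ``field_meta`` dictionary are hypothetical stand-ins for the real class and for ``Field`` metadata.

```python
def identity(values):
    return values


def upper_all(values):
    return [v.upper() for v in values]


def strip_all(values):
    return [v.strip() for v in values]


class SketchItemParser:
    """Hypothetical base class showing one possible lookup order."""

    default_input_parser = staticmethod(identity)

    # Stand-in for Field(input_parser=...) declarations on the item class.
    field_meta = {}

    def get_input_parser(self, field):
        # 1. A ``<field>_in`` attribute on the parser class wins.
        parser = getattr(self, f"{field}_in", None)
        if parser is not None:
            return parser
        # 2. Otherwise use the parser declared on the field, if any.
        parser = self.field_meta.get(field, {}).get("input_parser")
        if parser is not None:
            return parser
        # 3. Fall back to the class-wide default.
        return self.default_input_parser


class ProductParser(SketchItemParser):
    field_meta = {"description": {"input_parser": strip_all}}
    name_in = staticmethod(upper_all)


product_parser = ProductParser()
print(product_parser.get_input_parser("name").__name__)
print(product_parser.get_input_parser("description").__name__)
print(product_parser.get_input_parser("price").__name__)
```

``get_output_parser()`` would presumably follow the same pattern with ``<field>_out`` and ``default_output_parser``.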

Alternative Public API Proposal
===============================

- ``ItemLoader.add_value()``
- ``ItemLoader.replace_value()``
- ``ItemLoader.load_item()`` *(returns loaded item)*

- ``ItemLoader.get_stored_values()`` or ``ItemLoader.get_values()`` *(returns the ItemLoader values)*
- ``ItemLoader.get_output_value()``

- ``ItemLoader.get_input_processor()`` or ``ItemLoader.get_in_processor()`` *(short version)*
- ``ItemLoader.get_output_processor()`` or ``ItemLoader.get_out_processor()`` *(short version)*

- ``ItemLoader.context``

- ``ItemLoader.default_item_class``
- ``ItemLoader.default_input_processor`` or ``ItemLoader.default_in_processor`` *(short version)*
- ``ItemLoader.default_output_processor`` or ``ItemLoader.default_out_processor`` *(short version)*
- ``ItemLoader.*field*_in``
- ``ItemLoader.*field*_out``

Usage example: declaring Item Parsers
=====================================

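::

    #!python
    from scrapy.contrib.itemparser import XPathItemParser, parsers

    # removetags and filterx are helper functions assumed to be defined elsewhere.
    class ProductParser(XPathItemParser):
        # Input parsers, applied to each value as it is added.
        name_in = parsers.MapConcat(removetags, filterx)
        price_in = parsers.MapConcat(...)

        # Output parser, applied when the item is populated.
        price_out = parsers.TakeFirst()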

Usage example: declaring parsers in Fields
==========================================

::

    #!python
    class Product(Item):
        name = Field(output_parser=parsers.Join(), ...)
        price = Field(output_parser=parsers.TakeFirst(), ...)

        description = Field(input_parser=parsers.MapConcat(removetags))
102 changes: 0 additions & 102 deletions sep/sep-008.trac

This file was deleted.
