Creating a dynamic pipeline in Python Scrapy

Enable the pipeline in settings.py:

ITEM_PIPELINES = {
    'project_folder.pipelines.MyPipeline': 100 
}
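
The answer never shows what MyPipeline contains, so here is a minimal sketch of what it could look like. The body is an assumption, not the original author's code: `seen_fields` is a hypothetical attribute that records every field name the spider has produced so far, which is handy when items have no fixed schema. `open_spider` and `process_item` are the standard Scrapy pipeline hooks.

```python
# project_folder/pipelines.py (hypothetical body; the original does not show it)
class MyPipeline:
    """Tracks which field names have appeared across all items."""

    def open_spider(self, spider):
        # Called once when the spider starts; keep names in first-seen order.
        self.seen_fields = []

    def process_item(self, item, spider):
        # Items behave like mappings, so keys() lists whatever fields were set.
        for name in item.keys():
            if name not in self.seen_fields:
                self.seen_fields.append(name)
        # Return the item so any later pipeline still receives it.
        return item
```

Because the pipeline only relies on the mapping interface, it works the same whether the spider yields a plain dict or an Item subclass.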

Then put this code in items.py:

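# -*- coding: utf-8 -*-
from scrapy import Item, Field


class DynamicItem(Item):
    """Item that accepts any field name instead of a fixed, declared set."""

    def __setitem__(self, key, value):
        # Skip Item's "unknown field" check: store the value and register
        # the field on the fly (fields is shared by the whole class).
        self._values[key] = value
        self.fields[key] = Field()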

Then, in your project_folder/spiders/spider_file.py:

from collections import OrderedDict

import scrapy

from project_folder.items import DynamicItem


class MySpider(scrapy.Spider):  # placeholder class; add your own name and start_urls
    def parse(self, response):
        # create an ordered dictionary
        data = OrderedDict()
        data['first'] = ...
        data['second'] = ...
        data['third'] = ...
        # ... add as many keys as you need

        # now unpack the dictionary into the item
        yield DynamicItem(**data)

        # the line above is equivalent to:
        # yield DynamicItem(first=data['first'], second=data['second'], third=data['third'])

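What's the benefit of this code?

You don't have to declare every field one by one in items.py; the item accepts whatever keys the spider puts in the dictionary.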
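
This matters most when the field names are only known at scrape time. As an illustration, here is a small sketch that builds the ordered dictionary from column headers found on the page. `build_record` and the sample headers are made up for this example and are not part of Scrapy or the original answer.

```python
from collections import OrderedDict


def build_record(headers, cells):
    """Pair column names discovered at runtime with their values, in column order."""
    record = OrderedDict()
    for name, value in zip(headers, cells):
        # Turn a header such as "Unit Price" into a key such as "unit_price".
        key = name.strip().lower().replace(" ", "_")
        record[key] = value.strip()
    return record


record = build_record(["Name", "Unit Price"], [" Widget ", "9.99"])
print(list(record.items()))  # [('name', 'Widget'), ('unit_price', '9.99')]
```

Inside parse you would then yield DynamicItem(**record), and the item picks up whichever columns that particular page happened to have.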