When I was new to Python and learning to write crawlers with Scrapy, I wanted to define a custom Item and reference it from a spider. The custom Item is shown below:
import scrapy


class MyItem(scrapy.Item):
    # Each scraped field is declared with scrapy.Field()
    title = scrapy.Field()
In the crawler file, the code is as follows:
# -*- coding: utf-8 -*-
import scrapy

# The path names the module file (MyItem.py), then the class inside it
from tutorial.items.MyItem import MyItem


class MySpider(scrapy.Spider):
    name = 'myitem.demo'
    allowed_domains = ['toscrape.com']

    def start_requests(self):
        yield scrapy.Request('http://toscrape.com/tag/humor/', self.parse)

    def parse(self, response):
        # Yield one item per <h1> element; MyItem defines no print_item()
        # method, so the item itself is yielded
        for h1 in response.xpath('//h1').getall():
            yield MyItem(title=h1)
        # Follow every link on the page and parse it the same way
        for href in response.xpath('//a/@href').getall():
            yield scrapy.Request(response.urljoin(href), self.parse)
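Note that the import path tutorial.items.MyItem must include the name of the file (module) that defines the class. The last part of the path is the file MyItem.py, and the name after import is the class MyItem defined in that file.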
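To see why the file name has to appear in the import path, here is a minimal sketch that builds a throwaway tutorial/items/MyItem.py layout in a temporary directory and imports from it. The directory layout is an assumption inferred from the import statement above, and the class is a plain dict subclass standing in for the scrapy.Item subclass so the sketch runs without Scrapy installed.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Assumed layout (inferred from the import path): tutorial/items/MyItem.py
root = Path(tempfile.mkdtemp())
items_dir = root / "tutorial" / "items"
items_dir.mkdir(parents=True)
(root / "tutorial" / "__init__.py").write_text("")
(items_dir / "__init__.py").write_text("")

# Hypothetical stand-in for the scrapy.Item subclass, to avoid needing Scrapy.
(items_dir / "MyItem.py").write_text(
    "class MyItem(dict):\n"
    "    pass\n"
)

# Make the temporary project importable.
sys.path.insert(0, str(root))
importlib.invalidate_caches()

# Stopping at the package gives the module MyItem.py, not the class.
from tutorial.items import MyItem as my_item_module

# Naming the file in the path reaches the class defined inside it.
from tutorial.items.MyItem import MyItem

# A module object cannot be called, so it could never build an item.
module_is_callable = callable(my_item_module)

item = MyItem(title="Quotes to Scrape")
print(type(my_item_module).__name__, module_is_callable)
print(item["title"])
```

With this layout, writing from tutorial.items import MyItem binds the name MyItem to the module, so calling MyItem(title=...) would fail with a "module object is not callable" error. Spelling out the file name in the path avoids that.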