When I was new to Python and learning to write crawlers with Scrapy, I wanted to define a custom Item and reference it from a spider. The custom Item is shown below:
import scrapy


class MyItem(scrapy.Item):
    # Single field that holds the extracted heading
    title = scrapy.Field()
In the crawler file, the code is as follows:
# -*- coding: utf-8 -*-
import scrapy
# Import the class MyItem from the module tutorial/items/MyItem.py
from tutorial.items.MyItem import MyItem


class MySpider(scrapy.Spider):
    name = 'myitem.demo'
    allowed_domains = ['toscrape.com']

    def start_requests(self):
        yield scrapy.Request('http://toscrape.com/tag/humor/', self.parse)

    def parse(self, response):
        # Emit one item per <h1> element found on the page
        for h1 in response.xpath('//h1').getall():
            yield MyItem(title=h1)
        # Follow every link on the page and parse it the same way
        for href in response.xpath('//a/@href').getall():
            yield scrapy.Request(response.urljoin(href), self.parse)
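Here, the import path tutorial.items.MyItem must include the name of the file (module) that defines the class: the module MyItem lives in the tutorial.items package, and the class MyItem is imported from that module.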
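To see why the file name matters, here is a minimal sketch that does not need Scrapy. It assumes a stand-in module object named MyItem, built with types.ModuleType in place of a real MyItem.py file, and a hypothetical helper try_call. Calling the module itself fails, while calling the class inside it works:

```python
import types

# Hypothetical stand-in for a file MyItem.py that defines a class MyItem.
module = types.ModuleType("MyItem")


class MyItem:
    def __init__(self, title):
        self.title = title


# Attach the class to the module, as if it had been defined in that file.
module.MyItem = MyItem


def try_call(obj, title):
    """Call obj(title=...) and return the result, or the TypeError message."""
    try:
        return obj(title=title)
    except TypeError as exc:
        return str(exc)


# Calling the module object itself raises TypeError.
error_message = try_call(module, "demo")

# Calling the class stored inside the module creates an instance.
item = try_call(module.MyItem, "demo")

print(error_message)
print(item.title)
```

In a real project the same situation tends to arise when an import binds the module rather than the class, for example when the import statement stops one level short of the class name.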