# py

**Repository Path**: lockerF/py

## Basic Information

- **Project Name**: py
- **Description**: Small Python programs
- **Primary Language**: Python
- **License**: Apache-2.0
- **Default Branch**: master
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2017-01-02
- **Last Updated**: 2020-12-18

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# py

1. spider

   A web crawler. The entry point is [spider_main.py](https://git.oschina.net/beborn/py/blob/master/spider/spider_main.py?dir=0&filepath=spider%2Fspider_main.py&oid=89b7bcd6e4344a8bf846ba33029404b47efb07f0&sha=f4cb7b04e98da8e7b2fa86192901b532968e665b). The crawler consists of a [URL manager](https://git.oschina.net/beborn/py/blob/master/spider/url_manager.py?dir=0&filepath=spider%2Furl_manager.py&oid=0e239ecd44148be32f53c9bad49a44d29ea1796d&sha=f4cb7b04e98da8e7b2fa86192901b532968e665b), a [downloader](https://git.oschina.net/beborn/py/blob/master/spider/html_downloader.py?dir=0&filepath=spider%2Fhtml_downloader.py&oid=cc5f7dc1de3a24b1df94ad0c912c95e74ca1b0fe&sha=f4cb7b04e98da8e7b2fa86192901b532968e665b), a [parser](https://git.oschina.net/beborn/py/blob/master/spider/html_parser.py?dir=0&filepath=spider%2Fhtml_parser.py&oid=518173ff3d000c419f19116c14ce5ef2a1de4580&sha=f4cb7b04e98da8e7b2fa86192901b532968e665b), and an [outputer](https://git.oschina.net/beborn/py/blob/master/spider/html_outputer.py?dir=0&filepath=spider%2Fhtml_outputer.py&oid=e56f27c005160a5eb985fb1f706b17389dafe440&sha=f4cb7b04e98da8e7b2fa86192901b532968e665b).

   The `root_url` in the entry point is the root path, that is, the URL where the crawler starts. The parsing logic is defined in the parser. Output defaults to HTML and can be changed in the outputer.
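
The four components described above can be sketched roughly as follows. This is a minimal illustration using only the Python 3 standard library, not the repository's actual code: every name here (`UrlManager`, `download`, `LinkTitleParser`, `parse`, `render_html`, `crawl`, `max_pages`) is an assumption made for the example, and the real modules may be organised differently.

```python
from html import escape
from html.parser import HTMLParser
from urllib.parse import urljoin
from urllib.request import urlopen


class UrlManager:
    """URL manager: tracks URLs still to crawl and URLs already crawled."""

    def __init__(self):
        self.new_urls = set()
        self.old_urls = set()

    def add_new_url(self, url):
        # Ignore empty URLs and anything already queued or visited.
        if url and url not in self.new_urls and url not in self.old_urls:
            self.new_urls.add(url)

    def has_new_url(self):
        return bool(self.new_urls)

    def get_new_url(self):
        url = self.new_urls.pop()
        self.old_urls.add(url)
        return url


def download(url):
    """Downloader: fetch a page over HTTP and return its text."""
    with urlopen(url, timeout=10) as response:
        return response.read().decode("utf-8", errors="replace")


class LinkTitleParser(HTMLParser):
    """Parser: collects absolute link URLs and the page <title>."""

    def __init__(self, page_url):
        super().__init__()
        self.page_url = page_url
        self.links = set()
        self.title = ""
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links against the current page URL.
                self.links.add(urljoin(self.page_url, href))

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def parse(page_url, html_text):
    """Return (links found on the page, one record describing the page)."""
    parser = LinkTitleParser(page_url)
    parser.feed(html_text)
    return parser.links, {"url": page_url, "title": parser.title.strip()}


def render_html(records):
    """Outputer: render the collected records as a simple HTML table."""
    rows = "".join(
        "<tr><td>{}</td><td>{}</td></tr>".format(escape(r["url"]), escape(r["title"]))
        for r in records
    )
    return "<html><body><table>{}</table></body></html>".format(rows)


def crawl(root_url, downloader=download, max_pages=10):
    """Entry point: start from root_url and crawl at most max_pages pages."""
    urls = UrlManager()
    urls.add_new_url(root_url)
    records = []
    while urls.has_new_url() and len(records) < max_pages:
        url = urls.get_new_url()
        try:
            html_text = downloader(url)
        except Exception:
            continue  # skip pages that fail to download
        new_links, record = parse(url, html_text)
        for link in new_links:
            urls.add_new_url(link)
        records.append(record)
    return records
```

Calling `crawl(root_url)` with the default downloader performs real HTTP requests. Passing a different callable as `downloader`, for example the lookup method of a dictionary that maps URLs to HTML strings, lets the same loop run offline. Switching the output format would then only mean replacing `render_html`, which mirrors the README's note that output is configured in the outputer.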