Scraping on the fly of external web resources, driven by HTML page markup
E. L. Kitaev, R. Yu. Skornyakova
Abstract:
The paper presents an approach to displaying data from cross-origin resources on web pages using a REST API, and describes a tool based on this approach. The tool extracts metadata from HTML documents, PDF files, and Word documents posted on the Internet, as well as microdata and data in JSON-LD format, and displays it on the web page. It consists of a REST API hosted on an IIS web server and a set of JavaScript scripts. Examples of using the tool are given. The REST API enables cross-origin resource sharing (CORS), so it can be requested from web pages of any origin.
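To make the idea concrete, here is a minimal sketch of one step the abstract mentions: extracting JSON-LD data from an HTML document and returning it with a CORS header so that pages of any origin may read the response. It is an illustration only, not the authors' implementation (which runs on IIS); the names `JsonLdExtractor`, `extract_jsonld`, and `build_response` are hypothetical, and only the Python standard library is used.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        # An attribute without a value is reported as None, hence the "or".
        script_type = (dict(attrs).get("type") or "").lower()
        if tag == "script" and script_type == "application/ld+json":
            self._in_jsonld = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_jsonld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self._in_jsonld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # skip malformed JSON-LD blocks


def extract_jsonld(html):
    """Return a list of parsed JSON-LD objects found in an HTML string."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.items


def build_response(html):
    """Hypothetical helper: pair the extracted data with a permissive CORS header."""
    headers = {
        "Content-Type": "application/json",
        # "*" lets web pages of any origin read the response.
        "Access-Control-Allow-Origin": "*",
    }
    return headers, json.dumps(extract_jsonld(html))
```

A server-side handler would fetch the external resource, pass its HTML to `build_response`, and send the returned headers and body to the requesting page. Fetching, caching, and the handling of PDF or Word metadata are left out of this sketch.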
Keywords:
web scraping, semantic markup, microdata, JSON-LD, REST API, CORS.
Citation:
E. L. Kitaev, R. Yu. Skornyakova, “Scraping on the fly of external web resources, driven by HTML page markup”, Keldysh Institute preprints, 2019, 020, 31 pp.
Linking options:
https://www.mathnet.ru/eng/ipmp2658 https://www.mathnet.ru/eng/ipmp/y2019/p20