libwget-robots(3)            Library Functions Manual            libwget-robots(3)
NAME
libwget-robots - Robots Exclusion file parser

SYNOPSIS
Data Structures
struct wget_robots_st

Macros
#define parse_record_field(d, f)

Functions
int wget_robots_parse (wget_robots **_robots, const char *data, const char *client)
void wget_robots_free (wget_robots **robots)
int wget_robots_get_path_count (wget_robots *robots)
wget_string * wget_robots_get_path (wget_robots *robots, int index)
int wget_robots_get_sitemap_count (wget_robots *robots)
const char * wget_robots_get_sitemap (wget_robots *robots, int index)
Detailed Description
The purpose of this set of functions is to parse a Robots Exclusion Standard file into a data structure for easy access.
Macro Definition Documentation
#define parse_record_field(d, f)
Value:
parse_record_field(d, f, sizeof(f) - 1)
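The value shadows a function of the same name: function-like macros are not expanded recursively, so the expansion calls the underlying function while supplying sizeof(f) - 1, the compile-time length of the string literal f, as the third argument. A minimal sketch of the idiom, with a hypothetical helper standing in for the internal one:

    #include <stddef.h>
    #include <stdio.h>
    #include <strings.h>

    /* Hypothetical stand-in for the internal helper, which takes the
       field name's length as an explicit third argument. */
    static int parse_record_field(const char *d, const char *f, size_t len)
    {
            return strncasecmp(d, f, len) == 0;
    }

    /* The macro shadows the function name. Macros do not expand
       recursively, so this calls the function above; sizeof(f) - 1
       is only correct when f is a string literal. */
    #define parse_record_field(d, f) parse_record_field(d, f, sizeof(f) - 1)

    int main(void)
    {
            const char *line = "Disallow: /private/";

            if (parse_record_field(line, "Disallow:")) /* no length argument needed */
                    puts("matched a Disallow record");
            return 0;
    }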
Function Documentation
int wget_robots_parse (wget_robots **_robots, const char *data, const char *client)
Parameters
_robots Pointer used to return the parsed wget_robots structure
data Buffer holding the robots.txt content
client Name of the client / user-agent
Returns
WGET_E_SUCCESS on success, else a WGET_E_* error code
The function parses the robots.txt data in accordance with https://www.robotstxt.org/orig.html#format and returns a ROBOTS structure that includes a list of the disallowed paths and a list of the sitemap files.
The ROBOTS structure has to be freed by calling wget_robots_free().
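A minimal lifecycle sketch, assuming the wget.h header from libwget, a robots.txt buffer already in memory, and an arbitrary client name "mybot":

    #include <stdio.h>
    #include <wget.h>

    int main(void)
    {
            const char *data =
                    "User-agent: *\n"
                    "Disallow: /private/\n"
                    "Sitemap: https://example.com/sitemap.xml\n";
            wget_robots *robots = NULL;

            /* Parse on behalf of the client/user-agent "mybot". */
            if (wget_robots_parse(&robots, data, "mybot") != WGET_E_SUCCESS) {
                    fprintf(stderr, "failed to parse robots.txt\n");
                    return 1;
            }

            printf("%d disallowed paths, %d sitemaps\n",
                   wget_robots_get_path_count(robots),
                   wget_robots_get_sitemap_count(robots));

            /* The structure has to be freed by the caller. */
            wget_robots_free(&robots);
            return 0;
    }

Build with the libwget headers and library installed, e.g. cc example.c -lwget.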
void wget_robots_free (wget_robots **robots)
Parameters
robots Pointer to the wget_robots structure returned by wget_robots_parse()
wget_robots_free() frees the previously allocated wget_robots structure.
int wget_robots_get_path_count (wget_robots *robots)
Parameters
robots Pointer to a parsed wget_robots structure
Returns
The number of disallowed paths held in robots
wget_string * wget_robots_get_path (wget_robots *robots, int index)
Parameters
robots Pointer to a parsed wget_robots structure
index Index of the wanted path
Returns
The disallowed path at index, or NULL if no such entry exists
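A sketch of walking the disallowed paths; it assumes, as elsewhere in libwget, that wget_string carries a pointer member p and a length member len (the data is not necessarily zero-terminated), and that robots came from a successful wget_robots_parse():

    #include <stdio.h>
    #include <wget.h>

    /* Print every disallowed path of a parsed robots.txt. */
    static void print_disallowed(wget_robots *robots)
    {
            for (int it = 0; it < wget_robots_get_path_count(robots); it++) {
                    wget_string *path = wget_robots_get_path(robots, it);

                    if (path) /* %.*s, since path->p may lack a 0 byte */
                            printf("Disallow: %.*s\n", (int) path->len, path->p);
            }
    }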
int wget_robots_get_sitemap_count (wget_robots *robots)
Parameters
robots Pointer to a parsed wget_robots structure
Returns
The number of sitemap URLs held in robots
const char * wget_robots_get_sitemap (wget_robots *robots, int index)
Parameters
robots Pointer to a parsed wget_robots structure
index Index of the wanted sitemap URL
Returns
The sitemap URL at index, or NULL if no such entry exists
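Sitemap URLs are returned as plain C strings, so iteration is simpler; a sketch under the same assumptions as above:

    #include <stdio.h>
    #include <wget.h>

    /* List every Sitemap URL found in a parsed robots.txt. */
    static void print_sitemaps(wget_robots *robots)
    {
            for (int it = 0; it < wget_robots_get_sitemap_count(robots); it++) {
                    const char *url = wget_robots_get_sitemap(robots, it);

                    if (url)
                            printf("Sitemap: %s\n", url);
            }
    }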
Author
Generated automatically by Doxygen for wget2 from the source code.
Version 2.2.1                          wget2                          libwget-robots(3)