.\" Generated by kramdown-man 1.0.1
.\" https://github.com/postmodern/kramdown-man#readme
.TH ronin-web-spider 1 "2022-01-01" "Ronin Web" "User Manuals"
.SH NAME
.PP
ronin\-web\-spider \- Spiders a website
.SH SYNOPSIS
.PP
\fBronin\-web spider\fR \[lB]\fIoptions\fP\[rB] \[lC]\fB\-\-host\fR \fIHOST\fP \[or] \fB\-\-domain\fR \fIDOMAIN\fP \[or] \fB\-\-site\fR \fIURL\fP\[rC]
.SH DESCRIPTION
.PP
Spiders a website, starting from a specific host, a whole domain, or a site URL\.
.SH OPTIONS
.TP
\fB\-\-host\fR \fIHOST\fP
Spiders the specific \fIHOST\fP\.
.TP
\fB\-\-domain\fR \fIDOMAIN\fP
Spiders the whole \fIDOMAIN\fP\.
.TP
\fB\-\-site\fR \fIURL\fP
Spiders the website, starting at the \fIURL\fP\.
.TP
\fB\-\-open\-timeout\fR \fISECS\fP
Sets the connection open timeout\.
.TP
\fB\-\-read\-timeout\fR \fISECS\fP
Sets the read timeout\.
.TP
\fB\-\-ssl\-timeout\fR \fISECS\fP
Sets the SSL connection timeout\.
.TP
\fB\-\-continue\-timeout\fR \fISECS\fP
Sets the continue timeout\.
.TP
\fB\-\-keep\-alive\-timeout\fR \fISECS\fP
Sets the connection keep alive timeout\.
.TP
\fB\-P\fR, \fB\-\-proxy\fR \fIPROXY\fP
Sets the proxy to use\.
.TP
\fB\-H\fR, \fB\-\-header\fR \[lq]\fINAME\fP: \fIVALUE\fP\[rq]
Sets a default header\.
.TP
\fB\-\-host\-header\fR \fINAME\fP\[eq]\fIVALUE\fP
Sets a default header\.
.TP
\fB\-u\fR, \fB\-\-user\-agent\fR chrome\-linux\[or]chrome\-macos\[or]chrome\-windows\[or]chrome\-iphone\[or]chrome\-ipad\[or]chrome\-android\[or]firefox\-linux\[or]firefox\-macos\[or]firefox\-windows\[or]firefox\-iphone\[or]firefox\-ipad\[or]firefox\-android\[or]safari\-macos\[or]safari\-iphone\[or]safari\-ipad\[or]edge
The \fBUser\-Agent\fR to use\.
.TP
\fB\-U\fR, \fB\-\-user\-agent\-string\fR \fISTRING\fP
The raw \fBUser\-Agent\fR string to use\.
.TP
\fB\-R\fR, \fB\-\-referer\fR \fIURL\fP
Sets the \fBReferer\fR URL\.
.TP
\fB\-\-delay\fR \fISECS\fP
Sets the delay in seconds between each request\.
.TP
\fB\-l\fR, \fB\-\-limit\fR \fICOUNT\fP
Only spiders up to \fICOUNT\fP pages\.
.TP
\fB\-d\fR, \fB\-\-max\-depth\fR \fIDEPTH\fP
Only spiders links up to a maximum depth of \fIDEPTH\fP\.
.TP
\fB\-\-enqueue\fR \fIURL\fP
Adds the \fIURL\fP to the queue\.
.TP
\fB\-\-visited\fR \fIURL\fP
Marks the \fIURL\fP as previously visited\.
.TP
\fB\-\-strip\-fragments\fR
Enables\[sl]disables stripping the fragment component of every URL\.
.TP
\fB\-\-strip\-query\fR
Enables\[sl]disables stripping the query component of every URL\.
.TP
\fB\-\-visit\-scheme\fR \fISCHEME\fP
Visit URLs with the URI scheme\.
.TP
\fB\-\-visit\-schemes\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Visit URLs with URI schemes that match the \fIREGEX\fP\.
.TP
\fB\-\-ignore\-scheme\fR \fISCHEME\fP
Ignore URLs with the URI scheme\.
.TP
\fB\-\-ignore\-schemes\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Ignore URLs with URI schemes that match the \fIREGEX\fP\.
.TP
\fB\-\-visit\-host\fR \fIHOST\fP
Visit URLs with the matching host name\.
.TP
\fB\-\-visit\-hosts\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Visit URLs with host names that match the \fIREGEX\fP\.
.TP
\fB\-\-ignore\-host\fR \fIHOST\fP
Ignore URLs with the matching host name\.
.TP
\fB\-\-ignore\-hosts\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Ignore URLs with host names that match the \fIREGEX\fP\.
.TP
\fB\-\-visit\-port\fR \fIPORT\fP
Visit URLs with the matching port number\.
.TP
\fB\-\-visit\-ports\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Visit URLs with port numbers that match the \fIREGEX\fP\.
.TP
\fB\-\-ignore\-port\fR \fIPORT\fP
Ignore URLs with the matching port number\.
.TP
\fB\-\-ignore\-ports\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Ignore URLs with port numbers that match the \fIREGEX\fP\.
.TP
\fB\-\-visit\-link\fR \fIURL\fP
Visit the \fIURL\fP\.
.TP
\fB\-\-visit\-links\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Visit URLs that match the \fIREGEX\fP\.
.TP
\fB\-\-ignore\-link\fR \fIURL\fP
Ignore the \fIURL\fP\.
.TP
\fB\-\-ignore\-links\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Ignore URLs that match the \fIREGEX\fP\.
.TP
\fB\-\-visit\-ext\fR \fIFILE\[ru]EXT\fP
Visit URLs with the matching file extension\.
.TP
\fB\-\-visit\-exts\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Visit URLs with file extensions that match the \fIREGEX\fP\.
.TP
\fB\-\-ignore\-ext\fR \fIFILE\[ru]EXT\fP
Ignore URLs with the matching file extension\.
.TP
\fB\-\-ignore\-exts\-like\fR \fB\[sl]\fR\fIREGEX\fP\fB\[sl]\fR
Ignore URLs with file extensions that match the \fIREGEX\fP\.
.TP
\fB\-r\fR, \fB\-\-robots\fR
Specifies whether to honor \fBrobots\.txt\fR\.
.TP
\fB\-\-print\-status\fR
Print the status code for each URL\.
.TP
\fB\-\-print\-headers\fR
Print the response headers for each URL\.
.TP
\fB\-\-print\-header\fR \fINAME\fP
Print a specific response header for each URL\.
.TP
\fB\-\-history\fR \fIFILE\fP
Sets the history file to write every visited URL to\.
.TP
\fB\-\-archive\fR \fIDIR\fP
Archive every visited page to the \fIDIR\fP\.
.TP
\fB\-\-git\-archive\fR \fIDIR\fP
Archive every visited page to the git repository in \fIDIR\fP\.
.TP
\fB\-X\fR, \fB\-\-xpath\fR \fIXPATH\fP
Evaluates the XPath on each HTML page\.
.TP
\fB\-C\fR, \fB\-\-css\-path\fR \fICSSPATH\fP
Evaluates the CSS\-path on each HTML page\.
.TP
\fB\-\-print\-hosts\fR
Print all discovered host names\.
.TP
\fB\-\-print\-certs\fR
Print all encountered SSL\[sl]TLS certificates\.
.TP
\fB\-\-save\-certs\fR
Saves all encountered SSL\[sl]TLS certificates\.
.TP
\fB\-\-print\-js\-strings\fR
Print all JavaScript strings\.
.TP
\fB\-\-print\-js\-url\-strings\fR
Print URL strings found in JavaScript\.
.TP
\fB\-\-print\-js\-path\-strings\fR
Print path strings found in JavaScript\.
.TP
\fB\-\-print\-js\-absolute\-path\-strings\fR
Only print absolute path strings found in JavaScript\.
.TP
\fB\-\-print\-js\-relative\-path\-strings\fR
Only print relative path strings found in JavaScript\.
.TP
\fB\-\-print\-html\-comments\fR
Print HTML comments\.
.TP
\fB\-\-print\-js\-comments\fR
Print JavaScript comments\.
.TP
\fB\-\-print\-comments\fR
Print all HTML and JavaScript comments\.
.TP
\fB\-v\fR, \fB\-\-verbose\fR
Enables verbose output\.
.TP
\fB\-h\fR, \fB\-\-help\fR
Print help information\.
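.SH EXAMPLES
.PP
The invocations below are illustrative sketches that combine only the options documented in \fBOPTIONS\fR\. The host name \fBexample\.com\fR, the file name \fBvisited\.txt\fR, and the patterns are placeholders chosen for illustration, and the regular expressions assume that the \fIREGEX\fP is written between slashes as the option synopsis shows\.
.PP
Spider a single host and print the status code of each URL:
.PP
.RS 4
.EX
# example\.com is a placeholder host
ronin\-web spider \-\-host example\.com \-\-print\-status
.EE
.RE
.PP
Spider a whole domain while limiting the page count and link depth and pausing between requests:
.PP
.RS 4
.EX
# the limit, depth, and delay values are arbitrary illustrations
ronin\-web spider \-\-domain example\.com \-\-limit 100 \-\-max\-depth 3 \-\-delay 1
.EE
.RE
.PP
Spider a site from a starting URL, skip links that match a pattern, and record every visited URL in a history file:
.PP
.RS 4
.EX
# the pattern and the history file name are hypothetical
ronin\-web spider \-\-site https:\[sl]\[sl]example\.com\[sl] \e
    \-\-ignore\-links\-like \(aq\[sl]logout\[sl]\(aq \e
    \-\-history visited\.txt
.EE
.RE
.PP
Evaluate an XPath expression on each HTML page of a host:
.PP
.RS 4
.EX
# \(aq\[sl]\[sl]form\(aq is an example expression that selects form elements
ronin\-web spider \-\-host example\.com \-\-xpath \(aq\[sl]\[sl]form\(aq
.EE
.RE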
.SH ENVIRONMENT
.TP
\fIHTTP\[ru]PROXY\fP
Sets the global HTTP proxy\.
.TP
\fIRONIN\[ru]HTTP\[ru]PROXY\fP
Sets the HTTP proxy for Ronin\.
.SH AUTHOR
.PP
Postmodern
.MT postmodern\.mod3\[at]gmail\.com
.ME
.SH SEE ALSO
.PP
.BR ronin\-web\-server (1),
.BR ronin\-web\-proxy (1),
.BR ronin\-web\-diff (1),
.BR ronin\-web\-new\-spider (1)