JDownloader2 crawler single-rules

Validates a single JDownloader2 crawler rule.
| Type | object |
|---|---|
| File match | `*.jd2cr`, `*.jd2cr.json` |
| Schema URL | https://catalog.lintel.tools/schemas/schemastore/jdownloader2-crawler-single-rules/latest.json |
| Source | https://raw.githubusercontent.com/sergxerj/jdownloader2-crawler-rule-json-schema/main/jd2cr.schema.json |
Validate with Lintel

```shell
npx @lintel/lintel check
```
Type:
object
A crawler rule.
- Anything prefixed "HTML RegEx" performs Java-flavored pattern matching on the page's raw HTML (unmodified by JavaScript).
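For orientation, a minimal rule file might look like the sketch below. The URL, name, and property values are illustrative assumptions, not taken from the schema source; the rule type `DEEPDECRYPT` is one of the types mentioned in the property descriptions that follow:

```json
{
  "enabled": true,
  "name": "Example deep-decrypt rule",
  "pattern": "https?://example\\.com/gallery/.+",
  "rule": "DEEPDECRYPT",
  "maxDecryptDepth": 1,
  "deepPattern": "(https?://example\\.com/images/[^\"]+\\.jpg)"
}
```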
Properties
enabled
boolean
required
- Type: Boolean
- Applies to rule: ALL
- Purpose: Enables or disables this rule.
name
string
required
- Type: String
- Applies to rule: ALL
- Purpose: Name of the rule.
pattern
string
required
- Type: RegEx
- Applies to rule: ALL
- Purpose: Defines which URLs this rule applies to by matching them against the pattern.
format=regex
rule
string
required
cookies
array[]
- Type: Array of length-2 Arrays
- Applies to rule: DIRECTHTTP, DEEPDECRYPT, SUBMITFORM or FOLLOWREDIRECT
- Purpose: A list of length-2 arrays in the form [["cookieName", "cookieValue"],[..., ...]...]. Put your personal cookies here, e.g. login cookies for websites that JD otherwise fails to parse. If "updateCookies" is enabled, JD will also update these with any cookies it receives from the website(s) matching the "pattern" property.
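As a sketch of the length-2-array form described above, a cookies value for a hypothetical logged-in session might look like this (the cookie names and values are invented for illustration):

```json
"cookies": [
  ["session_id", "abc123"],
  ["remember_me", "1"]
]
```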
updateCookies
boolean
- Type: Boolean
- Applies to rule: DIRECTHTTP, DEEPDECRYPT, SUBMITFORM or FOLLOWREDIRECT
- Purpose: If the target website returns new cookies, save them inside this rule and update the rule.
logging
boolean
- Type: Boolean
- Applies to rule: ALL
- Purpose: Enable this for support purposes. Logs of your LinkCrawler rules can be found in your JD install dir under /logs/: LinkCrawlerRule.*.log.0 and LinkCrawlerDeep.*
maxDecryptDepth
integer
- Type: Integer
- Applies to rule: ALL
- Purpose: How many layers deep your rule should crawl (e.g. if the rule returns URLs that match the same rule again recursively, how often is this chain allowed to repeat?).
id
integer
- Type: Integer
- Applies to rule: ALL
- Purpose: Auto-generated ID of the rule. Normally leave this blank and JD2 will insert it automatically.
packageNamePattern
string
- Type: HTML RegEx
- Applies to rule: DEEPDECRYPT
- Purpose: All URLs crawled by this rule will be grouped into a package named by the HTML RegEx's first capture group.
format=regex
passwordPattern
string | null
- Type: HTML RegEx or null
- Applies to rule: DEEPDECRYPT
- Purpose: Matches an archive extraction password that may appear as text in the page's HTML code (unmodified by JavaScript). The first returned capture must be the password.
format=regex
formPattern
string
- Type: HTML RegEx
- Applies to rule: DEEPDECRYPT
- Purpose:
format=regex
deepPattern
string | null
- Type: HTML RegEx or null
- Applies to rule: DEEPDECRYPT
- Purpose: Which URLs this rule should look for inside the page's HTML code (unmodified by JavaScript). null (or blank) = auto-scan and return all supported URLs found in the HTML. Keep in mind that if the URLs found in the HTML are relative (e.g. starting with a slash / instead of a protocol such as http or a domain name), you WILL have to enclose the entire expression in quotes (outside of the capturing parentheses), like "(...)", AND enclose the part matching the missing portion of the URL (i.e. from the protocol up to the slash-or-other-character it begins with) in an OPTIONAL, NON-CAPTURING group (?:...)?. This yields the following pattern for pretty much every case: "((?:missing part of full url)?rest of url that is in the html)"
format=regex
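To illustrate the relative-URL advice above: for a hypothetical site whose HTML contains links such as /download/file123.zip, a deepPattern could be written along these lines (the domain and path are assumptions for this sketch; note the escaped quotes around the capture and the optional non-capturing group for the missing scheme and host):

```json
"deepPattern": "\"((?:https?://example\\.com)?/download/[a-zA-Z0-9]+\\.zip)\""
```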
rewriteReplaceWith
string
- Type: String
- Applies to rule: REWRITE
- Purpose: Pattern for the new URL; may reference capture groups from "pattern" as $1, $2, etc.
format=regex
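A hedged sketch of how "pattern" and "rewriteReplaceWith" combine in a REWRITE rule (the domains and path are invented for illustration; $1 refers to the first capture group of "pattern"):

```json
{
  "enabled": true,
  "name": "Example rewrite rule",
  "pattern": "https?://old\\.example\\.com/file/(\\d+)",
  "rule": "REWRITE",
  "rewriteReplaceWith": "https://new.example.com/file/$1"
}
```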
All of:
1. variant
2. variant
3. variant