JDownloader2 crawler single-rules

Validates a single JDownloader2 crawler rule.
| Type | object |
|---|---|
| File match | `*.jd2cr`, `*.jd2cr.json` |
| Schema URL | https://catalog.lintel.tools/schemas/schemastore/jdownloader2-crawler-single-rules/latest.json |
| Source | https://raw.githubusercontent.com/sergxerj/jdownloader2-crawler-rule-json-schema/main/jd2cr.schema.json |
Validate with Lintel

```shell
npx @lintel/lintel check
```
Type:
object
A crawler rule.
- Anything prefixed "HTML RegEx" performs Java-flavored pattern matching on the page's raw HTML (unmodified by JavaScript).
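For orientation, a minimal rule file might look like the sketch below. The URL, name, and property values are illustrative assumptions, not taken from the schema source; the rule type `DEEPDECRYPT` is one of the types mentioned in the property descriptions that follow:

```json
{
  "enabled": true,
  "name": "Example deep-decrypt rule",
  "pattern": "https?://example\\.com/gallery/.+",
  "rule": "DEEPDECRYPT",
  "maxDecryptDepth": 1,
  "deepPattern": "(https?://example\\.com/images/[^\"]+\\.jpg)"
}
```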
Properties
enabled
boolean
required
- Type: Boolean
- Applies to rule: ALL
- Purpose: Enables or disables this rule.
name
string
required
- Type: String
- Applies to rule: ALL
- Purpose: Name of the rule.
pattern
string
required
- Type: RegEx
- Applies to rule: ALL
- Purpose: Defines which URLs this rule applies to by matching them against the pattern.
format=regex
rule
string
required
cookies
array[]
- Type: Array of length-2 Arrays
- Applies to rule: DIRECTHTTP, DEEPDECRYPT, SUBMITFORM or FOLLOWREDIRECT
- Purpose: A list of length-2 arrays in the form [["cookieName", "cookieValue"],[..., ...]...]. Put your personal cookies here, e.g. login cookies for websites that JD otherwise fails to parse. If "updateCookies" is enabled, JD will also update these with any cookies it receives from the website(s) matching the "pattern" property.
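As a sketch of the length-2-array form described above, a cookies value for a hypothetical logged-in session might look like this (the cookie names and values are invented for illustration):

```json
"cookies": [
  ["session_id", "abc123"],
  ["remember_me", "1"]
]
```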
updateCookies
boolean
- Type: Boolean
- Applies to rule: DIRECTHTTP, DEEPDECRYPT, SUBMITFORM or FOLLOWREDIRECT
- Purpose: If the target website returns new cookies, save them inside this rule and update the rule.
logging
boolean
- Type: Boolean
- Applies to rule: ALL
- Purpose: Enable this for support purposes. Logs of your LinkCrawler rules can be found in your JD install dir under /logs/: LinkCrawlerRule.*.log.0 and LinkCrawlerDeep.*
maxDecryptDepth
integer
- Type: Integer
- Applies to rule: ALL
- Purpose: How many layers deep your rule should crawl (e.g. if the rule returns URLs that match the same rule again recursively, how often is this chain allowed to repeat?).
id
integer
- Type: Integer
- Applies to rule: ALL
- Purpose: Auto-generated ID of the rule. Normally leave this blank and JD2 will insert it automatically.
packageNamePattern
string
- Type: HTML RegEx
- Applies to rule: DEEPDECRYPT
- Purpose: All URLs crawled by this rule will be grouped into a package named by the HTML RegEx's first capture group.
format=regex
passwordPattern
string | null
- Type: HTML RegEx or null
- Applies to rule: DEEPDECRYPT
- Purpose: Matches an archive extraction password that may appear as text in the page's HTML code (unmodified by JavaScript). The first returned capture must be the password.
format=regex
formPattern
string
- Type: HTML RegEx
- Applies to rule: DEEPDECRYPT
- Purpose:
format=regex
deepPattern
string | null
- Type: HTML RegEx or null
- Applies to rule: DEEPDECRYPT
- Purpose: Which URLs this rule should look for inside the page's HTML code (unmodified by JavaScript). null (or blank) = auto-scan and return all supported URLs found in the HTML. Keep in mind that if the URLs found in the HTML are relative (e.g. starting with a slash / instead of a protocol such as http or a domain name), you WILL have to enclose the entire expression in quotes (outside of the capturing parentheses), like "(...)", AND enclose the part matching the missing portion of the URL (i.e. from the protocol up to the slash-or-other-character it begins with) in an OPTIONAL, NON-CAPTURING group (?:...)?. This yields the following pattern for pretty much every case: "((?:missing part of full url)?rest of url that is in the html)"
format=regex
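To illustrate the relative-URL advice above: for a hypothetical site whose HTML contains links such as /download/file123.zip, a deepPattern could be written along these lines (the domain and path are assumptions for this sketch; note the escaped quotes around the capture and the optional non-capturing group for the missing scheme and host):

```json
"deepPattern": "\"((?:https?://example\\.com)?/download/[a-zA-Z0-9]+\\.zip)\""
```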
rewriteReplaceWith
string
- Type: String
- Applies to rule: REWRITE
- Purpose: Pattern for the new URL; may reference capture groups from "pattern" as $1, $2, etc.
format=regex
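A hedged sketch of how "pattern" and "rewriteReplaceWith" combine in a REWRITE rule (the domains and path are invented for illustration; $1 refers to the first capture group of "pattern"):

```json
{
  "enabled": true,
  "name": "Example rewrite rule",
  "pattern": "https?://old\\.example\\.com/file/(\\d+)",
  "rule": "REWRITE",
  "rewriteReplaceWith": "https://new.example.com/file/$1"
}
```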
All of:
1. variant
2. variant
3. variant