path: root/Plugin/URL_Title.pm
Commit message (Author, Age)
* URL_Title: Add URL blacklist regex ability (David Phillips, 2020-06-27)
|     This patch adds a new configuration option to the URL_Title module so
|     that the bot configuration may declare a list of regular expressions to
|     match on a URL in order to determine if it is blacklisted.
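
A minimal sketch of how such a blacklist check might look, assuming the patterns arrive as a list of compiled regexes. The names `@url_blacklist` and `is_blacklisted` and the example patterns are hypothetical and are not taken from URL_Title.pm.

```perl
use strict;
use warnings;

# Hypothetical patterns; in the plugin these would come from the bot configuration.
my @url_blacklist = (
    qr{^https?://(?:www\.)?example\.com/}i,    # a whole site
    qr{\.(?:png|jpe?g|gif)(?:\?|\z)}i,         # direct image links
);

# Return 1 if the URL matches any blacklist pattern, 0 otherwise.
sub is_blacklisted {
    my ($url, @patterns) = @_;
    for my $re (@patterns) {
        return 1 if $url =~ $re;
    }
    return 0;
}

for my $url ('https://example.com/page',
             'https://example.org/cat.png',
             'https://example.org/article') {
    my $verdict = is_blacklisted($url, @url_blacklist) ? 'skipped' : 'titled';
    print "$url => $verdict\n";
}
```

Matching against a list of regexes keeps the check in configuration, so new exclusions would not require a code change.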
* URL_Title: Fix typo in header (David Phillips, 2020-06-14)
|
* URL_Title: Set Mozilla 5.0 compatible UA string (David Phillips, 2020-06-12)
|     This should hopefully address the blank/simple/default-looking pages
|     that YouTube intermittently returns.
* URL_Title: Accept gzip content encoding (David Phillips, 2020-03-22)
|
* URL_Title: Add option to dump page content to log (David Phillips, 2020-03-22)
|
* URL_Title: clear globals before parsing [HEAD, master] (David Phillips, 2019-09-29)
|     This patch prevents accidental leakage of content-type header and
|     charset between calls to get_title. Without clearing these, it is
|     possible for a URL title to be decoded from the wrong charset if a URL
|     was previously titled with a differing charset to the current one.
|
|     This patch clears these stale values to guarantee accurate charset
|     decoding per URL.
* URL_Title: use Content-Type header or its http-equiv (David Phillips, 2019-09-14)
|     This patch adds the ability for URL_Title to fall back on the
|     Content-Type meta http-equiv tag, or failing that, the Content-Type HTTP
|     header itself.
|
|     This should improve correctness when dealing with HTML documents other
|     than HTML5.
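
The fallback order described above could be sketched roughly as follows. This is an illustration under assumptions, not the module's code: the function names, the hash keys `meta_charset`, `http_equiv` and `http_header`, and the final default of `utf-8` are all hypothetical.

```perl
use strict;
use warnings;

# Pull the charset parameter out of a Content-Type value such as
# "text/html; charset=ISO-8859-1". Returns undef when none is present.
sub charset_from_content_type {
    my ($value) = @_;
    return undef unless defined $value;
    return $value =~ /charset\s*=\s*["']?([\w.:-]+)/i ? lc $1 : undef;
}

# Try the sources in order: an explicit <meta charset>, then the
# <meta http-equiv="Content-Type"> value, then the HTTP header.
# The trailing 'utf-8' default is an assumption made for this sketch.
sub pick_charset {
    my (%src) = @_;
    return lc $src{meta_charset} if defined $src{meta_charset};
    return charset_from_content_type($src{http_equiv})
        // charset_from_content_type($src{http_header})
        // 'utf-8';
}

print pick_charset(
    http_equiv  => 'text/html; charset=ISO-8859-1',
    http_header => 'text/html; charset=UTF-8',
), "\n";
```

Lower-casing the result means `UTF-8` and `utf-8` are treated alike, which is the concern of the "capitalised `UTF-8`" commit further down the log.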
* URL_Title: Allow entities and wchars mixed in titles (David Phillips, 2019-09-14)
|     This patch moves the HTML entity decoding until after the raw bytes from
|     the HTML document are translated through charsets.
|
|     Previously, entities were used as decoded by the HTML parser into UTF-8,
|     which meant that non-UTF-8-encoded strings from documents could become
|     mixed with UTF-8 characters, making the subsequent character encoding
|     transformation impossible to perform correctly.
* URL_Title: Allow capitalised `UTF-8` charset (David Phillips, 2019-09-14)
|
* URL_Title: Workaround for UTF-8 decoder crash (David Phillips, 2019-06-22)
|
* URL_Title: Extract charset from HTML tag if present (David Phillips, 2019-06-19)
|
* URL_Title: Don't log senseless warnings in on_message (David Phillips, 2019-01-06)
|
* Implement no-reentry request on modules (David Phillips, 2019-01-05)
|     This fixes duplicate URL titles from a `title of` command, and will
|     likely find use in future.
* URL_Title: Fix return code text (David Phillips, 2019-01-05)
|
* URL_Title: add direct command (David Phillips, 2019-01-05)
|
* URL_Title: Don't die, just log (David Phillips, 2019-01-03)
|
* Large refactor - modularise logging, rejoin and join-on-invitation (David Phillips, 2018-11-24)
|
* URL_Title: Don't try and use non-existent header (David Phillips, 2018-10-02)
|
* Validate configuration parameter presence and type (David Phillips, 2018-09-21)
|
* Overhaul config parsing (David Phillips, 2018-09-17)
|     * makes plugin config more private:
|       The config file now uses sections denoted with [Plugin::Foo] where
|       plugin-private config can be stored. Plugins are now passed the usual,
|       as well as a hashref for their own config section. They are also passed
|       the config section of the core, i.e. those config options not appearing
|       in an explicit section. Generally, these are used for bot-global
|       options, so should be accessible to all plugins, but plugin-specific
|       config shall be hidden.
|
|     * tries to improve parsing of hash-like strings and arrays:
|       The previous mechanism of using regex to pull out possible tokens was
|       only ever meant to be temporary, and caused problems with escaping or
|       encapsulation inside strings. I have made steps on hash parsing to allow
|       tokens inside strings. Both array and hash parsing still need to provide
|       an escape character to escape the item separator (,).
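
The section scheme described in this commit might be sketched as below. Only the `[Plugin::Foo]` section syntax comes from the commit message; the `key = value` line format, `#` comments, the function names `parse_config` and `config_for`, and the option names are assumptions made for illustration.

```perl
use strict;
use warnings;

# Split config text into a core section (options outside any explicit
# section) and one hash per [Plugin::Name] section.
sub parse_config {
    my ($text) = @_;
    my %config  = (core => {});
    my $section = 'core';
    for my $line (split /\n/, $text) {
        next if $line =~ /^\s*(?:#.*)?$/;            # skip blanks and comments
        if ($line =~ /^\s*\[([\w:]+)\]\s*$/) {       # section header
            $section = $1;
            $config{$section} //= {};
            next;
        }
        if ($line =~ /^\s*(\w+)\s*=\s*(.*?)\s*$/) {  # assumed key = value form
            $config{$section}{$1} = $2;
        }
    }
    return \%config;
}

# Hand a plugin the core options plus its own section, and nothing else.
sub config_for {
    my ($config, $plugin) = @_;
    return ($config->{core}, $config->{$plugin} // {});
}

my $config = parse_config(<<'END');
nick = examplebot
[Plugin::URL_Title]
url_len = 60
[Plugin::Jinx]
timeout = 10
END

my ($core, $own) = config_for($config, 'Plugin::URL_Title');
print "nick=$core->{nick} url_len=$own->{url_len}\n";
```

Passing each plugin only two hashrefs keeps another plugin's options (here `timeout`) out of its reach, which is the privacy property the commit describes.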
* Rename handlers to on_* (David Phillips, 2018-09-10)
|
* Reinstate action handling in modules that need it (David Phillips, 2018-09-10)
|     Also remove debug logging statements from Jinx.pm
* Configurable modules (David Phillips, 2018-09-03)
|
* Remove unnecessary shebangs from modules (David Phillips, 2018-08-12)
|
* URL_Title.pm: allow for percent encoding (David Phillips, 2018-08-10)
|
* No need for [A-z] when case-insensitive flag used (David Phillips, 2018-06-20)
|
* Change URL parsing from space to RFC 3986 (David Phillips, 2018-06-20)
|
* URL_Title: Allow SVG titling (David Phillips, 2018-06-19)
|
* Truncate URLs based on shorturl length, not full URL (David Phillips, 2018-05-07)
|
* Fold URL title whitespace into same line (David Phillips, 2018-05-07)
|
* Replace HTML::HeadParser with HTML::Parser (David Phillips, 2018-05-07)
|     Weird bugs with HeadParser, cannot debug and patch for upstream as yet
* Decode HTML body before passing to head parser (David Phillips, 2018-04-10)
|     From the HTML::HeadParser docs:
|
|     > Note that the HTML::HeadParser might get confused if raw undecoded UTF-8 is
|     > passed to the parse() method. Make sure the strings are properly decoded
|     > before passing them on.
|
|     This explains some hard-to-trace bugs with character mangling
* Correct capitalisation on module names (David Phillips, 2018-04-10)