Commit Graph

9 Commits

Author SHA1 Message Date
ugla
52109776f6
[bugfix] Fix unicode-unaware word boundary check in hashtags (#1049)
* [bugfix] Fix unicode-unaware word boundary check in hashtag regex

Go `\b` does not care for Unicode, and without lookahead, the workarounds got
very ugly. So I replaced the regex with a parser.

The parser runs in O(n) time and performance should not be affected.

* [bugfix] Add back hashtag max length and add tests for it
2022-11-15 16:05:34 +01:00
tobi
664713ddd4
[bugfix] Make hashtag regex work with non-ascii characters (#682) 2022-07-03 11:03:03 +02:00
tobi
dfdc473cef
[chore] Webfinger rework (#627)
* move finger to dereferencer

* totally break GetRemoteAccount

* start reworking finger func a bit

* start reworking getRemoteAccount a bit

* move mention parts to namestring

* rework webfingerget

* use util function to extract webfinger parts

* use accountDomain

* rework finger again, final form

* just a real nasty commit, the worst

* remove refresh from account

* use new ASRepToAccount signature

* fix incorrect debug call

* fix for new getRemoteAccount

* rework GetRemoteAccount

* start updating tests to remove repetition

* break a lot of tests
Move shared test logic into the testrig,
rather than having it scattered all over
the place. This allows us to just mock
the transport controller once, and have
all tests use it (unless they need not to
for some other reason).

* fix up tests to use main mock httpclient

* webfinger only if necessary

* cheeky linting with the lads

* update mentionName regex
recognize instance accounts

* don't finger instance accounts

* test webfinger part extraction

* increase default worker count to 4 per cpu

* don't repeat regex parsing

* final search for discovered accountDomain

* be more permissive in namestring lookup

* add more extraction tests

* simplify GetParseMentionFunc

* skip long search if local account

* fix broken test
2022-06-11 11:01:34 +02:00
kim
26b74aefaf
[bugfix] Fix existing bio text showing as HTML (#531)
* fix existing bio text showing as HTML

- updated replaced mentions to include instance
- strips HTML from account source note in Verify handler
- update text formatter to use buffers for string writes

Signed-off-by: kim <grufwub@gmail.com>

* go away linter

Signed-off-by: kim <grufwub@gmail.com>

* change buf reset location, change html mention tags

Signed-off-by: kim <grufwub@gmail.com>

* reduce FindLinks code complexity

Signed-off-by: kim <grufwub@gmail.com>

* fix HTML to text conversion

Signed-off-by: kim <grufwub@gmail.com>

* Update internal/regexes/regexes.go

Co-authored-by: Mina Galić <mina.galic@puppet.com>

* use improved html2text lib with more options

Signed-off-by: kim <grufwub@gmail.com>

* fix to produce actual plaintext from html

Signed-off-by: kim <grufwub@gmail.com>

* fix span tags instead written as space

Signed-off-by: kim <grufwub@gmail.com>

* performance improvements to regex replacements, fix link replace logic for un-html-ing in the future

Signed-off-by: kim <grufwub@gmail.com>

* fix tag/mention replacements to use input string, fix link replace to not include scheme

Signed-off-by: kim <grufwub@gmail.com>

* use matched input string for link replace href text

Signed-off-by: kim <grufwub@gmail.com>

* remove unused code (to appease linter :sobs:)

Signed-off-by: kim <grufwub@gmail.com>

* improve hashtagFinger regex to be more compliant

Signed-off-by: kim <grufwub@gmail.com>

* update breakReplacer to include both unix and windows line endings

Signed-off-by: kim <grufwub@gmail.com>

* add NoteRaw field to Account to store plaintext account bio, add migration for this, set for sensitive accounts

Signed-off-by: kim <grufwub@gmail.com>

* drop unnecessary code

Signed-off-by: kim <grufwub@gmail.com>

* update text package tests to fix logic changes

Signed-off-by: kim <grufwub@gmail.com>

* add raw note content testing to account update and account verify

Signed-off-by: kim <grufwub@gmail.com>

* remove unused modules

Signed-off-by: kim <grufwub@gmail.com>

* fix emoji regex

Signed-off-by: kim <grufwub@gmail.com>

* fix replacement of hashtags

Signed-off-by: kim <grufwub@gmail.com>

* update code comment

Signed-off-by: kim <grufwub@gmail.com>

Co-authored-by: Mina Galić <mina.galic@puppet.com>
2022-05-07 17:55:27 +02:00
tobi
ef5a9256a8
Extend license notices to 2022 (#354) 2021-12-20 18:42:19 +01:00
kim
7e4c3fa5c7
fix mention extracting when no domain exists (usually intra-instance mentions) (#272)
* fix mention extracting when no domain exists (usually when intra-instance mentions)

Signed-off-by: kim <grufwub@gmail.com>

* fix search logic to match new mention matching logic

Signed-off-by: kim <grufwub@gmail.com>

* appease the linter :p

Signed-off-by: kim <grufwub@gmail.com>
2021-10-17 14:19:49 +02:00
tobi
28b6ce59d6
don't catch mentions within links (#257) 2021-09-30 18:11:57 +02:00
tsmethurst
5d5327614d lint 2021-09-02 12:24:18 +02:00
tsmethurst
4696e1a7b3 moving stuff around 2021-09-01 18:29:25 +02:00