AnchorCheck plugin seems not working…well... #557

Open
jmbeuken opened this issue Oct 21, 2014 · 5 comments

Comments

@jmbeuken

Hi,

Config: LinkChecker 9.3 / Python 2.7.8 / CentOS 6.5

$ cat test.html
<html><head></head>
<body>
<hr>
<a href="#broken">broken link</a> <br>
<a href="#working">working link</a> <br>
<H2 id="working">working…</H2>
</body></html>

I start LinkChecker with the AnchorCheck plugin enabled:

$ linkchecker  -t 1 -v test.html
…
LinkChecker 9.3              Copyright (C) 2000-2014 Bastian Kleineidam
...

Start checking at 2014-10-21 09:44:25-004

URL        `file:///home/buildbot/WorkSpace/test.html'
Name       `test.html'
Real URL   file:///home/buildbot/WorkSpace/test.html
Result     Valid

URL        `#working'
Name       `working link'
Parent URL file:///home/buildbot/WorkSpace/test.html, line 5, col 1
Real URL   file:///home/buildbot/WorkSpace/test.html
Result     Valid

URL        `#broken'
Name       `broken link'
Parent URL file:///home/buildbot/WorkSpace/test.html, line 4, col 1
Real URL   file:///home/buildbot/WorkSpace/test.html
Result     Valid

That's it. 3 links in 1 URL checked. 0 warnings found. 0 errors found.
Stopped checking at 2014-10-21 09:44:25-004 (0.02 seconds)

With version 8.1, the broken anchor is reported:

$ linkchecker -a test.html
…
LinkChecker 8.1              Copyright (C) 2000-2012 Bastian Kleineidam
...
Start checking at 2014-10-21 15:43:44+002

URL        `#broken'
Name       `broken link'
Parent URL file:///home/buildbot/WorkSpace/test.html, line 4, col 1
Real URL   file:///home/buildbot/WorkSpace/test.html
D/L time   0.000 seconds
Size       184B
Info       2 URLs parsed.
Warning    [url-anchor-not-found] Anchor `broken' not found.
           Available anchors: `working'.
Result     Valid

Statistics:
Robots.txt cache: 0 hits, 0 misses
Content types: 0 image, 3 text, 0 video, 0 audio, 0 application, 0 mail and 0 other.
URL lengths: min=41, max=41, avg=41.

That's it. 3 links checked. 1 warning found. 0 errors found.
Stopped checking at 2014-10-21 15:43:44+002 (0.02 seconds)

Am I doing something wrong?

Regards,

jmb
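
For reference, the result expected for test.html can be reproduced without LinkChecker. The following is only a minimal sketch using Python's standard library (Python 3, not the 2.7.8 from the report); `AnchorCollector` and `missing_anchors` are hypothetical names made up for this illustration and are not part of LinkChecker or its AnchorCheck plugin. It collects every `id` and `<a name>` value, then lists same-page `#fragment` links that have no matching target.

```python
from html.parser import HTMLParser


class AnchorCollector(HTMLParser):
    """Collect anchor targets (id/name attributes) and same-page fragment links."""

    def __init__(self):
        super().__init__()
        self.anchors = set()   # values of id="..." and <a name="...">
        self.fragments = []    # fragments referenced via href="#..."

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <H2 id=...> works too.
        attrs = dict(attrs)
        if attrs.get("id"):
            self.anchors.add(attrs["id"])
        if tag == "a" and attrs.get("name"):
            self.anchors.add(attrs["name"])
        href = attrs.get("href")
        if tag == "a" and href and href.startswith("#") and len(href) > 1:
            self.fragments.append(href[1:])


def missing_anchors(html_text):
    """Return same-page fragments with no matching id/name, in document order."""
    collector = AnchorCollector()
    collector.feed(html_text)
    collector.close()
    return [f for f in collector.fragments if f not in collector.anchors]


# The test page from the report above.
TEST_HTML = """<html><head></head>
<body>
<hr>
<a href="#broken">broken link</a> <br>
<a href="#working">working link</a> <br>
<H2 id="working">working…</H2>
</body></html>
"""

if __name__ == "__main__":
    print(missing_anchors(TEST_HTML))  # prints ['broken']
```

This matches what version 8.1 reports: `broken' is missing and `working' is the only available anchor.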

@lemzwerg

I get the same result: incorrect anchors are always marked as 'Valid'.

@remko

remko commented Apr 3, 2016

I noticed that --anchors was deprecated in favor of plugins. However, even with the plugin enabled, the anchors aren't checked. None of the URLs with anchors reach the plugin, so the problem seems to be in the core.

@RainerKlute

Any chance to see this issue fixed anytime soon? Thanks!

@RainerKlute

Issue #513 might provide some important insight into the problem.

@yarikoptic

Oh, that is a nice one. Here is one example of the oddity: the initial run finds the error, but the later ones (a second loop that goes through 2 files) do not. I kept poking around, and even with -t -1 (no threading?) the order of the logged debug output varies. Some dict, set, or similar seems to yield items in random order, and I guess some decisions are made based on previously visited URLs, so in some cases anchored URLs never reach the check (my wild guess).

The first run finds the error, the second one does not:
(git)hopa:~/proj/bids/bids-specification[bf-links]git
$> for f in /home/yoh/proj/bids/bids-specification/site/01*html; do echo $f; linkchecker $f; done 
/home/yoh/proj/bids/bids-specification/site/01-introduction.html
INFO linkcheck.cmdline 2018-10-30 23:31:49,021 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
LinkChecker 9.4.0              Copyright (C) 2000-2014 Bastian Kleineidam
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it
under certain conditions. Look at the file `LICENSE' within this
distribution.
Get the newest version at http://wummel.github.io/linkchecker/
Write comments and bugs to https://github.com/wummel/linkchecker/issues
Support this project at http://wummel.github.io/linkchecker/donations.html

Start checking at 2018-10-30 23:31:49-004

URL        `03-modality-agnostic-files.html#YYY'
Name       `\n      Modality agnostic files\n    '
Parent URL file:///home/yoh/proj/bids/bids-specification/site/01-introduction.html, line 268, col 5
Real URL   file:///home/yoh/proj/bids/bids-specification/site/03-modality-agnostic-files.html
Check time 0.449 seconds
D/L time   0.000 seconds
Size       24.20KB
Modified   2018-10-31 02:46:44.554920Z
Warning    [None] Anchor `YYY' not found. Available anchors:
           `__drawer', `__search', `__toc', `changes', `code',
           `dataset-description', `dataset_descriptionjson',
           `modality-agnostic-files', `nav-1', `nav-1-4',
           `participants-file', `readme', `scans-file'.
Result     Valid
 3 threads active,     0 links queued,  159 links in 162 URLs checked, runtime 1 seconds

Statistics:
Downloaded: 582.65KB.
Content types: 5 image, 23 text, 0 video, 0 audio, 43 application, 0 mail and 115 other.
URL lengths: min=8, max=130, avg=65.

That's it. 186 links in 186 URLs checked. 1 warning found. 0 errors found.
Stopped checking at 2018-10-30 23:31:50-004 (1 seconds)
1 15528 ->1.....................................:Tue 30 Oct 2018 11:31:51 PM EDT:.
(git)hopa:~/proj/bids/bids-specification[bf-links]git
$> for f in /home/yoh/proj/bids/bids-specification/site/0[12]*html; do echo $f; linkchecker $f; done 
/home/yoh/proj/bids/bids-specification/site/01-introduction.html
INFO linkcheck.cmdline 2018-10-30 23:32:00,725 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
LinkChecker 9.4.0              Copyright (C) 2000-2014 Bastian Kleineidam
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it
under certain conditions. Look at the file `LICENSE' within this
distribution.
Get the newest version at http://wummel.github.io/linkchecker/
Write comments and bugs to https://github.com/wummel/linkchecker/issues
Support this project at http://wummel.github.io/linkchecker/donations.html

Start checking at 2018-10-30 23:32:00-004

Statistics:
Downloaded: 582.65KB.
Content types: 5 image, 23 text, 0 video, 0 audio, 43 application, 0 mail and 115 other.
URL lengths: min=8, max=130, avg=65.

That's it. 186 links in 186 URLs checked. 0 warnings found. 0 errors found.
Stopped checking at 2018-10-30 23:32:01-004 (0.97 seconds)
/home/yoh/proj/bids/bids-specification/site/02-common-principles.html
INFO linkcheck.cmdline 2018-10-30 23:32:02,878 MainThread Checking intern URLs only; use --check-extern to check extern URLs.
LinkChecker 9.4.0              Copyright (C) 2000-2014 Bastian Kleineidam
LinkChecker comes with ABSOLUTELY NO WARRANTY!
This is free software, and you are welcome to redistribute it
under certain conditions. Look at the file `LICENSE' within this
distribution.
Get the newest version at http://wummel.github.io/linkchecker/
Write comments and bugs to https://github.com/wummel/linkchecker/issues
Support this project at http://wummel.github.io/linkchecker/donations.html

Start checking at 2018-10-30 23:32:02-004

Statistics:
Downloaded: 582.65KB.
Content types: 5 image, 23 text, 0 video, 0 audio, 43 application, 0 mail and 115 other.
URL lengths: min=8, max=130, avg=65.

That's it. 186 links in 186 URLs checked. 0 warnings found. 0 errors found.
Stopped checking at 2018-10-30 23:32:03-004 (0.97 seconds)
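
The guess above (the outcome depends on the order in which URLs are visited) can be illustrated with a toy model. This is purely a sketch of the hypothesized failure mode, not LinkChecker's actual code: `check_links`, `anchors_by_page`, and `key_includes_fragment` are invented for this example, and whether LinkChecker really keys its "already visited" cache without the fragment is an unconfirmed assumption.

```python
from urllib.parse import urldefrag


def check_links(links, anchors_by_page, key_includes_fragment):
    """Return missing-anchor warnings, skipping links whose cache key was seen.

    Hypothetical model: if the cache key drops the fragment, page.html and
    page.html#YYY collide, and whichever comes second is never examined.
    """
    seen = set()
    warnings = []
    for link in links:
        page, fragment = urldefrag(link)
        key = link if key_includes_fragment else page
        if key in seen:
            continue  # treated as already checked; the anchor is never looked at
        seen.add(key)
        if fragment and fragment not in anchors_by_page.get(page, set()):
            warnings.append(f"Anchor `{fragment}' not found in {page}")
    return warnings


# Page name and a few anchors taken from the log above; the data set is illustrative.
ANCHORS = {"03-modality-agnostic-files.html": {"changes", "code", "readme"}}
PLAIN = "03-modality-agnostic-files.html"
ANCHORED = "03-modality-agnostic-files.html#YYY"

if __name__ == "__main__":
    # Anchored link visited first: the broken anchor is reported.
    print(check_links([ANCHORED, PLAIN], ANCHORS, key_includes_fragment=False))
    # Plain link visited first: the anchored link is skipped, no warning.
    print(check_links([PLAIN, ANCHORED], ANCHORS, key_includes_fragment=False))
    # Fragment kept in the cache key: the warning no longer depends on order.
    print(check_links([PLAIN, ANCHORED], ANCHORS, key_includes_fragment=True))
```

In this model, an unordered container feeding `links` would be enough to make the warning appear in one run and vanish in the next, which is consistent with the two runs shown above but does not prove that this is the cause.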
