with_toc_data with Japanese #538

Closed
NathanHazout opened this issue Jan 5, 2016 · 18 comments · Fixed by #591

@NathanHazout

I have a markdown file in Japanese.

A header might look like:

## アプリケーション・ストリングの翻訳

The generated HTML looks like:
<h2 id=">アプリケーション・ストリングの翻訳</h2>

As you can see, it's not just that it leaves it empty, it generates invalid HTML (missing closing quote), breaking the entire page.

PS. This is happening inside a Jekyll 3 project

@eliot-akira

I was just now experiencing the same issue using Jekyll, and came here to investigate.

### ヒューマンリーダブル

...becomes...

<h3 id=">ヒューマンリーダブル</h3>

Trial and error showed that even one Japanese character will produce this issue.

Possibly related to this commit and the regular expression used to strip inline markup from headers: 8ec8275

@KitaitiMakoto

I also encountered the same problem. Minimal code to reproduce the problem is here:

# coding: utf-8
require 'redcarpet'

puts Redcarpet::VERSION

renderer = Redcarpet::Render::HTML.new(with_toc_data: true)
markdown = Redcarpet::Markdown.new(renderer, {})
puts markdown.render('# 見出し')

$ ruby broken-html.rb
3.3.4
<h1 id=">見出し</h1>

As an aside, "見出し" means heading(s).
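
A small check along these lines can help confirm whether a given setup is affected. It is only a sketch: `well_formed_heading?` and `HEADING_WITH_ID` are hypothetical names, not part of Redcarpet, and the check looks at a single rendered heading string such as the output shown above.

```ruby
# Hypothetical helper (not part of Redcarpet): returns true when a rendered
# heading has a properly quoted id attribute and a matching closing tag.
HEADING_WITH_ID = /\A<h([1-6]) id="[^"<>]*">.*<\/h\1>\s*\z/m

def well_formed_heading?(html)
  HEADING_WITH_ID.match?(html)
end

# The first string is the output reported above for 3.3.4;
# the second is a heading with a correctly closed id attribute.
puts well_formed_heading?('<h1 id=">見出し</h1>')
puts well_formed_heading?('<h1 id="heading">Heading</h1>')
```

In a real project the argument would be the result of `markdown.render(...)` for one heading.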

@NathanHazout
Author

According to the last comment, this is purely a Redcarpet problem and not Jekyll-related.
@robin850 any ideas?

@IdanAdar

Any idea if this issue is being tracked by the redcarpet team?

@scvthedefect

The heading id is generated from the heading's content. Content that does not match /^\w$/ triggers this bug: the id comes out incomplete and its closing quote is never written.

I write heading tags explicitly (<h1></h1>) to avoid this, but it doesn't feel like a good solution.

@NathanHazout
Author

@robin850 @vmg or anyone at redcarpet - any news with this?
Can someone review the attached pull request to see if it's good?
We want to go live soon with our project and this is a problem...

@snkrheadz

@nasht00 hi, this is my solution for now.

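class MyRender < Redcarpet::Render::HTML
  # Build the id from the header text directly: lowercase it and turn spaces into hyphens.
  # Note: html_safe comes from ActiveSupport (Rails); drop it outside Rails.
  def header(text, header_level)
    %Q{<h#{header_level} id="#{text.downcase.gsub(" ", "-")}">#{text}</h#{header_level}>}.html_safe
  end
end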

@NathanHazout
Author

@akinrt I tried your fix.
It works in most cases; however, it breaks if the header contains a link.

For example:

### [Setting up Your Development Environment](../setting-up-your-development-environment/)

Generates:

<h3 id="<a-href="../setting-up-your-development-environment/">setting-up-your-development-environment</a>"><a href="../setting-up-your-development-environment/">Setting up Your Development Environment</a></h3>

Which is obviously no good. It used to work before including this fix.
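
One way the workaround above might be adjusted for headers that contain links is to strip inline tags before building the id. The following is a minimal sketch under that assumption, not the fix that went into Redcarpet: `heading_id` and `render_header` are hypothetical helpers, and the tag-stripping regex is deliberately naive.

```ruby
require 'cgi'

# Hypothetical helper: derive an id from already-rendered header HTML.
def heading_id(html)
  plain = html.gsub(/<[^>]*>/, '')               # drop inline tags such as <a href="...">
  plain = CGI.unescapeHTML(plain)                # turn entities like &amp; back into characters
  slug  = plain.strip.downcase.gsub(/\s+/, '-')  # same idea as the workaround: lowercase, hyphens
  CGI.escapeHTML(slug)                           # keep the value safe inside a quoted attribute
end

# Hypothetical helper: wrap the header HTML in a heading tag with that id.
def render_header(html, level)
  %(<h#{level} id="#{heading_id(html)}">#{html}</h#{level}>)
end

puts render_header('<a href="../setting-up-your-development-environment/">Setting up Your Development Environment</a>', 3)
puts render_header('見出し', 1)
```

Inside a `Redcarpet::Render::HTML` subclass, `header(text, header_level)` could delegate to `render_header(text, header_level)`. Whether `text` always arrives as rendered HTML is worth verifying against the Redcarpet version in use.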

@robin850
Collaborator

Hi guys!

We have a number of gaps in our support for non-ASCII languages; sorry about that, and sorry for the late answer here. I will try to have a look as soon as possible. Thanks for reporting!

@ryush00

ryush00 commented Mar 27, 2016

I also have the same issue.

### 한글 테스트

=>

<h3 id=">한글 테스트</h3>

Any updates?

@IdanAdar

@robin850 Could this issue please be treated with the appropriate severity, given the many reports?

Wonicon added a commit to Wonicon/nju-oslab-lecture that referenced this issue Mar 30, 2016
When a heading has non-ASCII contents, the id attribute for that
heading will not close, making following contents not display.

See vmg/redcarpet#538
@ryush00

ryush00 commented May 4, 2016

There are some alternatives.

https://www.ruby-toolbox.com/categories/markup_processors

@ermaker

ermaker commented Jun 22, 2016

Any update?

@robin850 robin850 added this to the 3.4.0 milestone Jun 22, 2016
@lengerfulluse

I'm facing the same problem with the TOC extension.
I think the root cause is https://github.com/vmg/redcarpet/blob/master/ext/redcarpet/html.c#L291. The isascii() check is not appropriate, since it produces an empty header id for non-ASCII characters (CJK, etc.). Two possible solutions:

  • URL-encode non-ASCII characters.
  • Use an alternative check instead, such as isutf8() or something similar.
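
The first option can be illustrated in Ruby. This is a sketch of the idea only: it is not Redcarpet's C code and not necessarily what the eventual fix does, and `ascii_safe_anchor` is a hypothetical name. ASCII letters and digits are lowercased, other ASCII characters collapse into a single hyphen, and every non-ASCII byte is percent-encoded instead of being dropped.

```ruby
# Hypothetical illustration: build an anchor id byte by byte,
# percent-encoding non-ASCII bytes rather than discarding them.
def ascii_safe_anchor(title)
  out = +''
  title.each_byte do |b|
    ch = b.chr
    if b >= 0x80
      out << format('%%%02X', b)            # non-ASCII byte, e.g. 0xE8 becomes "%E8"
    elsif ch =~ /[A-Za-z0-9]/
      out << ch.downcase                    # lowercasing is applied to ASCII only
    elsif !out.empty? && !out.end_with?('-')
      out << '-'                            # collapse runs of other ASCII into one hyphen
    end
  end
  out.chomp('-')
end

puts ascii_safe_anchor('Hello, World!')
puts ascii_safe_anchor('見出し')
```

Lowercasing only ASCII characters sidesteps the problem described in the referenced commit below, where calling tolower on non-ASCII bytes produced invalid UTF-8.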

lengerfulluse referenced this issue Jul 13, 2016
Do not call tolower on non-ASCII chars because it would otherwise
insert invalid UTF-8 bytes into the HTML output. (tolower is not
locale-aware)

Invalid UTF-8 bytes will cause various errors, e.g. "ArgumentError
(invalid byte sequence in UTF-8)", when rendering the generated HTML in
Rails.

Signed-off-by: Clemens Gruber <clemensgru@gmail.com>
@gnujoow

gnujoow commented Nov 9, 2016

Facing the same problem with Redcarpet 3.3.4 in Korean.

@robin850
Collaborator

Hi everyone!

#591 should solve this problem. I'm so sorry for the delay, and thank you very much for your patience! This will be part of Redcarpet 3.4.0, which should be released to RubyGems very soon! Merry Christmas and happy new year! ❤️ 🎄 🎅 ❄️ ⭐️

@robin850
Collaborator

Redcarpet 3.4.0 is available on RubyGems! Enjoy! ❤️
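
For projects that want to confirm they are running a release with the fix, a version comparison along these lines may help. It is a sketch: `toc_fix_included?` and `FIXED_IN` are hypothetical names, and the 3.4.0 threshold is taken from the comments above.

```ruby
# Hypothetical check: is the given Redcarpet version at or past the
# release that this thread says contains the fix (3.4.0)?
FIXED_IN = Gem::Version.new('3.4.0')

def toc_fix_included?(version_string)
  Gem::Version.new(version_string) >= FIXED_IN
end

puts toc_fix_included?('3.3.4')
puts toc_fix_included?('3.4.0')
```

With the gem loaded, the argument would be `Redcarpet::VERSION`, the same constant printed by the reproduction script earlier in the thread.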

@KitaitiMakoto

Great!

halfcrazy added a commit to halfcrazy/slate that referenced this issue Jan 3, 2017
halfcrazy added a commit to halfcrazy/slate that referenced this issue Jan 3, 2017
chriserwin added a commit to elemenohq/docs that referenced this issue Feb 24, 2017
* Add deploy configuration with a port set option

Addressing slatedocs#463

* Note the need for an upgraded Ruby

* Add JavaScript examples

* Update rouge languages link in README

* Add PR template

* Add issue template

* Update readme

* Remove CONTRIBUTING

* Bump CHANGELOG

* Word missing on Readme.md (#592)

* make scss variables changeable

all variables should only provide a default

this would allow us to include the screen.scss and simply set the variables we want to change before that.

* Add Ruby 2.2.0 to Travis testing matrix

* fix -margin

* Add some company links to readme

* Remove unused gem middleman-gh-pages

* Update middleman-syntax

* Upgrade sprockets

* Fix another bug where disabling language tabs didn't properly hide HTML

* Update ruby version requirements in Travis and README

* Fix bug where -margin wasn't properly respected even if search was off

* Fix build, update middleman

* Exec middleman server fails with invalid flags --force-polling and -l. Removed the flags to remedy.

* Update language list link and count in README

* Add back in middleman flags to Vagrant with new flag syntax

* Fix incorrect documentation in deploy.sh

* Update pull requset template

* Cut version 1.4.0

* Fix Woocommerce link in readme

* Adding Scale to the list of companies. (#694)

* Add Ruby 2.3.3 to tested rubies

* Add multiple-tabs-per-language test

* Latest middleman - fixing startup arguments (#653)

As per middleman/middleman#1866 (comment)
Tested. It works.

* Update redcarpet gem to 3.4.0 which will solve the unicode error with (#660)

h1/h2.
See vmg/redcarpet#538

* Updated Mozilla localForage link (#665)

Old link was broken

* Add Ruby 2.4.0 to .travis.yml

* Update middleman and middleman-sprokets, run bundle update

* Update code highlighting theme from Base16::Monokai to just Monokai

* Switch theme from Monokai to the less neon MonokaiSublime

* Allow Ruby 2.4.0 to fail for now

* Update to middleman 4.2.1 for ruby 2.4 fix

* Typo Fix (#693)

* Fix Woocommerce link in readme

* Fixed _errors.md typo

* Remove multiple language example from readme, users should just check wiki for instructions

* Add version 1.5.0 changelog notes
markable-user pushed a commit to markable-dev/markable-api-docs that referenced this issue Mar 1, 2017
kyrylo pushed a commit to airbrake/slate that referenced this issue Apr 21, 2017
jacques pushed a commit to jacques/diamondcash-public-api-docs that referenced this issue Apr 13, 2019
Aminocd added a commit to Aminocd/slate that referenced this issue Sep 28, 2020
jacques pushed a commit to jacques/plutus-public-api-docs that referenced this issue Sep 4, 2021
MikeLiamCode pushed a commit to MikeLiamCode/slate that referenced this issue Jun 14, 2024
alimoeeny pushed a commit to suprafrontal/hemato.ai.docs that referenced this issue Sep 18, 2024