format-all make org export html coding error #197

Open
Jousimies opened this issue Jun 29, 2022 · 8 comments

Comments

@Jousimies

Start Emacs with emacs -q and the minimal configuration below:

(add-to-list 'load-path "~/.emacs.d/packages/format-all/")
(add-to-list 'load-path "~/.emacs.d/packages/inheritenv")
(add-to-list 'load-path "~/.emacs.d/packages/language-id")
(require 'format-all)
(add-hook 'prog-mode-hook 'format-all-mode)
(add-hook 'format-all-mode-hook 'format-all-ensure-formatter)

Open a file such as foo.org and type some Chinese characters, for example:

测试

Then export the Org file to HTML with C-c C-e h o and open the HTML file in a browser. It is displayed as below:
image

With plain emacs -q and none of the format-all configuration above, exporting foo.org to foo.html works and foo.html is displayed correctly:
image

@chuxubank

I can also reproduce this bug.
Currently, if you install tidy (the default formatter for HTML), the Chinese characters display normally, but the style is broken (compared to the original org-export output).
Can we ignore the org-exported files by default?

@lassik
Owner

lassik commented Jun 29, 2022

Is HTML Tidy messing up the file?

Can you try M-x customize-variable format-all-debug and turn on debug mode? Then format-all will write information into the *Messages* buffer every time it runs a formatter.
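The same option can also be turned on from an init file instead of through the Customize interface. A minimal sketch, assuming format-all is already loaded as in the configuration from the first comment:

```elisp
;; Turn on format-all's debug output. format-all-debug is the option
;; named above. Setting it with setq (rather than through Customize)
;; is an assumption that holds for simple boolean options.
(setq format-all-debug t)
```

After this, saving a buffer with format-all-mode enabled should log the formatter command line to the *Messages* buffer.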

I don't know why the org-mode exporter switches its output buffer to html-mode and runs the on-save hook. Seems oddly designed; these should be reserved for interactive use.

@Jousimies
Author

With format-all-debug turned on, the *Messages* buffer contains the following:

Using default formatter prettier [2 times]
Using default formatter html-tidy [2 times]
Saving file /Users/jousimies/123.html...
Format-All: Formatting 123.html as HTML using html-tidy
Format-All: Running: /usr/bin/tidy -q --tidy-mark no -indent
Format-All: Directory: /Users/jousimies/
Reformatted!
Wrote /Users/jousimies/123.html
Running open /Users/jousimies/123.html...done

@chuxubank

The macOS built-in tidy messes up the Chinese characters.
The tidy installed with brew install tidy-html5 messes up the style.

@lassik
Owner

lassik commented Jul 1, 2022

I guess org-mode isn't writing a charset declaration into the HTML file, and Tidy assumes the wrong charset.
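One way to test this guess is to look for a charset declaration in the exported file before Tidy touches it. The helper below is a rough sketch: my/html-file-declares-charset-p is a hypothetical name, not part of org-mode or format-all, and the regexp only looks for the text "charset=" anywhere in the file.

```elisp
(defun my/html-file-declares-charset-p (file)
  "Return non-nil if FILE contains the text \"charset=\" somewhere.
Hypothetical helper for checking the guess above.  It does not
parse the HTML, it only searches the raw text."
  (with-temp-buffer
    (insert-file-contents file)
    (let ((case-fold-search t))          ; match CHARSET= as well
      (re-search-forward "charset[ \t]*=" nil t))))
```

For example, (my/html-file-declares-charset-p "~/123.html") could be run on a file exported without format-all enabled.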

I don't know what could cause the style to be corrupted.

Anyway if I've understood the design of Emacs correctly, an internal process like org-mode's HTML writer should not write files using the user interface commands that run hooks and such. Instead, it should use a simple library function like write-region. Currently org-publish-org-to is happily using UI commands such as find-file.
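To illustrate the difference, here is a minimal sketch of writing HTML text to disk with write-region. The function name my/write-html-quietly is made up for this example and is not org-mode code. Because the file is never visited, no major mode is turned on for it and before-save-hook does not run.

```elisp
(defun my/write-html-quietly (html file)
  "Write the string HTML to FILE as UTF-8 without visiting FILE.
Hypothetical example, not taken from org-mode."
  (let ((coding-system-for-write 'utf-8))
    ;; When START is a string, write-region writes that string and
    ;; ignores END.  A VISIT value other than t, nil or a string
    ;; suppresses the \"Wrote file\" message.
    (write-region html nil file nil 'silent)))
```

Usage would look like (my/write-html-quietly "<p>测试</p>" "~/foo.html").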

The quick fix is to write a custom function for prog-mode-hook that only enables format-all-mode when there is no existing .org file beside the file being edited. Something like this (not tested):

(add-hook
 'prog-mode-hook
 (lambda ()
   (unless (and (buffer-file-name)
                (file-exists-p
                 (concat (file-name-sans-extension (buffer-file-name))
                         ".org")))
     (format-all-mode))))

The long-term fix should be made to org-mode. Either rewrite the exporter not to use find-file and such, or add an official function like org-export-buffer-p that other packages can check.

@APIPLM

APIPLM commented Jul 2, 2022

I ran an export of an Org file containing one Chinese character in Emacs 27.2 inside a Docker container that has the format-all package but no tidy installed. The breakage shows up in the temp file created in the working folder: when I open that temp file, the content section is displayed as <p>\345\273\226 </p>. It displays normally in some cases but not in others. For example, in Emacs on the VM that hosts the Docker container it displays normally, but in Emacs inside the container it is displayed as ??. The temp HTML file declares the utf8 character set in its meta section.

I checked the Org export reference. It sounds like the export functions in org-mode run as a background job, so checking org-export-buffer-p would be kind of hard for them. My idea is the opposite: a function org-export-buffer-p on the user side is enough.

@lassik
Owner

lassik commented Jul 4, 2022

It sounds like the export functions in org-mode run as a background job, so checking org-export-buffer-p would be kind of hard for them.

It would not be hard for org-mode to do. Something like this (pseudo code):

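;; Pseudo code: a buffer-local flag that other packages could check.
(defvar-local org-export-buffer-p nil)

(defun org-publish-org-to (...)
  (with-current-buffer (get-or-create-export-buffer ...)
    ;; Mark this buffer as an export buffer.
    (setq-local org-export-buffer-p t)
    ...))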
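On the user side, a check against such a flag could look roughly like the sketch below. Note that org-export-buffer-p is hypothetical: org-mode does not define it today, and my/format-all-unless-org-export is a made-up name. The sketch also assumes the flag would already be set by the time the mode hook runs, which the pseudo code above does not guarantee.

```elisp
(defun my/format-all-unless-org-export ()
  "Enable `format-all-mode' unless this is an Org export buffer.
Relies on the hypothetical variable `org-export-buffer-p'."
  ;; bound-and-true-p is safe even while the variable is undefined.
  (unless (bound-and-true-p org-export-buffer-p)
    (format-all-mode 1)))

(add-hook 'prog-mode-hook #'my/format-all-unless-org-export)
```

This would replace the plain (add-hook 'prog-mode-hook 'format-all-mode) line from the original configuration.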

@APIPLM

APIPLM commented Jul 5, 2022

A back-end job mostly means that everything runs in --batch mode. Also, the function is used for all kinds of formats (such as .md, .html, .tex). If org-export-buffer-p is set to f as the default value for one format such as .html, it will also block all the other formats that the user is exporting to from the current buffer. I mean that if there is such an option, it should be an alist that the user can customize. Maybe we can add this kind of custom alist to ext-plist in the function org-export-to-file, as input for the rest of the export process. I am also wondering whether format-all is a local variable.
