Decode unicode filenames from URL #2131

Conversation

fedorkk
Contributor

@fedorkk fedorkk commented Mar 3, 2017

No description provided.

@fedorkk
Contributor Author

fedorkk commented Mar 3, 2017

The problem is with downloading a file that has Unicode characters in its URL through remote_image_url. The download itself succeeds, but a file name like 'юникод.jpg' is sanitized to '_D1_8E_D0_BD_D0_B8_D0_BA_D0_BE_D0_B4.jpg'. The inflated name then triggers "file name too long" errors, like the ones in #539.
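To make the report above concrete, here is a minimal sketch of how a percent-encoded name grows when it is sanitized before being decoded. The `SANITIZE` pattern and the `sanitize` helper are assumptions made for illustration, not CarrierWave's actual code, and `example.com` is a placeholder host.

```ruby
require 'uri'

# Hypothetical stand-in for a filename sanitizer: anything that is not a
# word character, dot, dash, or plus becomes an underscore.
SANITIZE = /[^[:word:]\.\-\+]/

def sanitize(name)
  name.gsub(SANITIZE, '_')
end

url = 'http://example.com/%D1%8E%D0%BD%D0%B8%D0%BA%D0%BE%D0%B4.jpg'

# The raw basename still carries the percent-escapes from the URL.
encoded = File.basename(URI.parse(url).path)

# Decoding first restores the original UTF-8 characters.
# Note: decode_www_form_component also turns '+' into a space.
decoded = URI.decode_www_form_component(encoded)

puts sanitize(encoded) # => _D1_8E_D0_BD_D0_B8_D0_BA_D0_BE_D0_B4.jpg
puts sanitize(decoded) # => юникод.jpg
```

Each non-ASCII character here costs six characters when sanitized undecoded, which is how an ordinary name can exceed the filesystem limit.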

@joemsak

joemsak commented May 2, 2017

I gave this PR a thumbs-up, but I want to comment too for emphasis. I am now using this branch in my production app because of this issue, and it did fix it. I really need this merged so I can take advantage of upcoming updates for Rails 5.1 compatibility.

@fedorkk
Contributor Author

fedorkk commented May 2, 2017

As a workaround in production, I am using a modified version of this monkey patch with the carrierwave master branch.

# frozen_string_literal: true
# Monkey patch for long filenames
# @see https://github.com/carrierwaveuploader/carrierwave/pull/539/files
module CWRemoteFix
  # 255 characters is the max size of a filename in modern filesystems
  # and 100 characters are allocated for versions
  MAX_FILENAME_LENGTH = 255 - 100

  def original_filename
    filename = filename_from_header || filename_from_uri
    mime_type = MIME::Types[file.content_type].first
    unless File.extname(filename).present? || mime_type.blank?
      filename = "#{filename}.#{mime_type.extensions.first}"
    end

    if filename.size > MAX_FILENAME_LENGTH
      extension = (filename =~ /\./) ? filename.split(/\./).last : false
      # 32 for MD5 and 2 for the __ separator
      split_position = MAX_FILENAME_LENGTH - 32 - 2
      # +1 for the . in the extension
      split_position -= (extension.size + 1) if extension
      # Generate a hash from the tail of the original filename that gets cut off
      hex = Digest::MD5.hexdigest(filename[split_position, filename.size])
      # Create a new name within given limits
      filename = filename[0, split_position] + '__' + hex
      filename << '.' + extension if extension
    end
    # Return original or patched filename
    filename
  end

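  def filename_from_uri
    # URI.decode was removed in Ruby 3.0; the RFC 2396 parser's unescape
    # performs the same percent-decoding
    URI::RFC2396_Parser.new.unescape(File.basename(file.base_uri.path))
  end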
end

# Monkeypatch downloader class using prepend
CarrierWave::Uploader::Download::RemoteFile.prepend CWRemoteFix
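
For anyone who wants to try the truncation step outside of CarrierWave, the following is a rough standalone sketch of the same idea. `shorten_filename` is a hypothetical helper, not part of the patch or the library, and it uses `File.extname` instead of splitting on dots, so edge cases such as dotfiles may behave differently from the patch above.

```ruby
require 'digest/md5'

# Same budget as the patch: 255 minus 100 reserved for version prefixes.
MAX_FILENAME_LENGTH = 255 - 100

# Hypothetical helper: returns the name unchanged when it fits, otherwise
# keeps a prefix and replaces the rest with "__" plus an MD5 of the cut tail.
def shorten_filename(filename, max = MAX_FILENAME_LENGTH)
  return filename if filename.size <= max

  extension = File.extname(filename) # ".jpg", or "" when there is none
  # Leave room for the "__" separator, 32 hex digits, and the extension.
  keep = max - 2 - 32 - extension.size
  digest = Digest::MD5.hexdigest(filename[keep..-1])
  "#{filename[0, keep]}__#{digest}#{extension}"
end

long_name = "#{'a' * 300}.jpg"
short = shorten_filename(long_name)
puts short.size # => 155
```

Hashing the discarded tail keeps two long names with a shared prefix from colliding. Note that `size` counts characters, while many filesystems limit names in bytes, so multi-byte names may need a tighter budget.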

@thiagofm
Member

thiagofm commented Jul 7, 2017

👍 thanks! Problem is well-described and tested.

@thiagofm thiagofm merged commit 61a961a into carrierwaveuploader:master Jul 7, 2017