From c7af89d557df6ff2d10599c73de0789792cc3173 Mon Sep 17 00:00:00 2001
From: Herbert Valerio Riedel
Date: Sat, 15 Dec 2018 20:18:38 +0100
Subject: [PATCH] Fix corrupted config file header for non-ASCII package names

The config-state header is a human readable line prepended to the
binary serialisation which looks like

    Saved package config for pkgname-1.2.3 written by Cabal-2.5.0.0 using ghc-8.6

However, the functions generating and parsing this header didn't take
into account that package names are not limited to the ASCII subset and
blindly used the ByteString `pack` function which truncates away the
high bits of the `Char` code point resulting in a corrupted header with
a non-sensical package-name.

The fix is simply to serialise the package-name with the UTF-8 encoding
which works nicely with the rest of the UTF-8 unaware string handling
functions. Hence the fix is a lot shorter than this commit message.

Fixes #2557
---
 Cabal/ChangeLog.md                     | 2 ++
 Cabal/Distribution/Simple/Configure.hs | 4 ++--
 2 files changed, 4 insertions(+), 2 deletions(-)

diff --git a/Cabal/ChangeLog.md b/Cabal/ChangeLog.md
index 9c041e3be65..304dea6d88a 100644
--- a/Cabal/ChangeLog.md
+++ b/Cabal/ChangeLog.md
@@ -18,6 +18,8 @@
     supports fully static linking;
     [`glibc` has some issues](https://sourceware.org/glibc/wiki/FAQ#Even_statically_linked_programs_need_some_shared_libraries_which_is_not_acceptable_for_me.__What_can_I_do.3F)
     with fully static linking.
+  * Fix corrupted config file header for non-ASCII package names
+    ([2557](https://github.com/haskell/cabal/issues/2557)).

 ----

diff --git a/Cabal/Distribution/Simple/Configure.hs b/Cabal/Distribution/Simple/Configure.hs
index 328dc2ae095..ba860f24041 100644
--- a/Cabal/Distribution/Simple/Configure.hs
+++ b/Cabal/Distribution/Simple/Configure.hs
@@ -275,7 +275,7 @@
 parseHeader header = case BLC8.words header of
   ["Saved", "package", "config", "for", pkgId, "written", "by", cabalId, "using", compId] ->
       fromMaybe (throw ConfigStateFileBadHeader) $ do
-          _ <- simpleParsec (BLC8.unpack pkgId) :: Maybe PackageIdentifier
+          _ <- simpleParsec (fromUTF8LBS pkgId) :: Maybe PackageIdentifier
           cabalId' <- simpleParsec (BLC8.unpack cabalId)
           compId' <- simpleParsec (BLC8.unpack compId)
           return (cabalId', compId')
@@ -286,7 +286,7 @@ showHeader :: PackageIdentifier -- ^ The processed package.
            -> ByteString
 showHeader pkgId = BLC8.unwords
     [ "Saved", "package", "config", "for"
-    , BLC8.pack $ prettyShow pkgId
+    , toUTF8LBS $ prettyShow pkgId
     , "written", "by"
     , BLC8.pack $ prettyShow currentCabalId
     , "using"
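
Note for reviewers: the truncation described in the commit message can be reproduced outside of Cabal. The following is a minimal sketch, not the code from this patch. It assumes only the `bytestring` package and uses `Data.ByteString.Builder.stringUtf8` as a stand-in for Cabal's `toUTF8LBS`; the package identifier `λ-calc-1.0` is hypothetical and chosen purely for illustration.

```haskell
import qualified Data.ByteString.Builder as B
import qualified Data.ByteString.Lazy as BL
import qualified Data.ByteString.Lazy.Char8 as BLC8

-- Hypothetical package identifier "λ-calc-1.0" (λ is U+03BB, written
-- here as the escape \955 so the source file stays ASCII-only).
pkgName :: String
pkgName = "\955-calc-1.0"

-- Old behaviour: Char8 'pack' keeps only the low 8 bits of each Char,
-- so U+03BB collapses to the single byte 0xBB.
truncated :: BL.ByteString
truncated = BLC8.pack pkgName

-- New behaviour (sketched): encode the String as UTF-8 first.
-- 'stringUtf8' is an assumed stand-in for Cabal's 'toUTF8LBS'.
utf8 :: BL.ByteString
utf8 = B.toLazyByteString (B.stringUtf8 pkgName)

main :: IO ()
main = do
  print (BL.unpack truncated)               -- first byte is 187 (0xBB)
  print (BL.unpack utf8)                    -- first two bytes are 206, 187 (0xCE 0xBB)
  print (BLC8.unpack truncated == pkgName)  -- False: the round trip loses the name
```

The truncated form cannot be mapped back to the original name, which is why the header has to be both written and parsed with a UTF-8 aware pair of functions.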