Rajesh edited this page Jun 16, 2024

better-files is a dependency-free, pragmatic, thin Scala wrapper around Java NIO.

Consult the changelog if you are upgrading from an earlier version of the library.


Imagine you have to write the following method:

  1. List all .csv files in a directory in increasing order of file size
  2. Drop the first line of each file and concatenate the rest into a single output file
  3. Split the above output file into n smaller files without breaking up the lines in the input files
  4. gzip each of the smaller output files

Note: Your program should work when the files are much larger than the memory available to your JVM, and it must close all open resources correctly.

The above task is not that easy to write in Java, shell, or Python without a certain amount of Googling. Using better-files, the above problem can be solved in a fairly straightforward way:

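import better.files._
import java.util.concurrent.atomic.AtomicInteger

def run(inputDir: File, outputDir: File, n: Int) = {
  val count = new AtomicInteger()
  // One gzip-compressed output file per part
  val outputs = Vector.tabulate(n)(i => outputDir / s"part-$i.csv.gz")
  for {
    // Open a gzip PrintWriter per output; autoClosed closes them all when the loop finishes
    writers <- outputs.map(_.newGzipOutputStream().printWriter()).autoClosed
    // Visit the .csv inputs in increasing order of file size
    inputFile <- inputDir.list(_.extension == Some(".csv")).toSeq.sorted(File.Order.bySize)
    // Stream lines lazily, skipping the first (header) line of each file
    line <- inputFile.lineIterator.drop(1)
  } writers(count.incrementAndGet() % n).println(line) // distribute lines round-robin
}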
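
The expression `count.incrementAndGet() % n` is what spreads lines across the n outputs. The following is a minimal sketch of that round-robin step in isolation, using in-memory collections instead of files so it runs with only the standard library. The `RoundRobin` object and its `distribute` helper are hypothetical names for illustration and are not part of better-files.

```scala
import java.util.concurrent.atomic.AtomicInteger

// Hypothetical helper (not part of better-files): mimics how lines are
// assigned to n writers, with in-memory buckets standing in for files.
object RoundRobin {
  def distribute(lines: Seq[String], n: Int): Vector[Vector[String]] = {
    val count = new AtomicInteger()
    // One builder per bucket; Vector.fill evaluates its argument n times
    val buckets = Vector.fill(n)(Vector.newBuilder[String])
    // Same index expression as in the file-based version
    lines.foreach(line => buckets(count.incrementAndGet() % n) += line)
    buckets.map(_.result())
  }
}
```

Because `incrementAndGet()` returns 1 on its first call, the first line lands in bucket `1 % n` rather than bucket 0. Each line is written whole to exactly one bucket, so no line is ever split across outputs.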



ScalaDays NYC 2016: Introduction to better-files

