Rajesh edited this page Jun 16, 2024 · 7 revisions

better-files

better-files is a dependency-free, pragmatic, thin Scala wrapper around Java NIO.

Consult the changelog if you are upgrading your library.

Motivation

Imagine you have to write the following method:

  1. List all .csv files in a directory by increasing order of file size
  2. Drop the first line of each file and concat the rest into a single output file
  3. Split the above output file into n smaller files without breaking up the lines in the input files
  4. gzip each of the smaller output files

Note: Your program should work when the files are much bigger than the memory available to your JVM, and it must close all open resources correctly.

The above task is not easy to write in Java, shell, or Python without a certain amount of Googling. Using better-files, it can be solved in a fairly straightforward way:

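import java.util.concurrent.atomic.AtomicInteger

import better.files._

def run(inputDir: File, outputDir: File, n: Int): Unit = {
  // Counter used to spread lines across the n outputs in round-robin order
  val count = new AtomicInteger()
  val outputs = Vector.tabulate(n)(i => outputDir / s"part-$i.csv.gz")
  for {
    // All gzip writers are closed once the loop finishes, even on failure
    writers <- outputs.map(_.newGzipOutputStream().printWriter()).autoClosed
    // .csv files in increasing order of size
    inputFile <- inputDir.list(_.extension == Some(".csv")).toSeq.sorted(File.Order.bySize)
    // Stream each file lazily, skipping its first line
    line <- inputFile.lineIterator.drop(1)
  } writers(count.incrementAndGet() % n).println(line)
}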
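
For comparison, here is a rough sketch of the same task written directly against the JDK (java.nio.file and java.util.zip) from Scala 2.13+, without better-files. The names `NioSketch` and `runWithNio` are hypothetical and are not part of the library. The sketch assumes UTF-8 input and is meant only to illustrate the kind of boilerplate a thin wrapper can remove.

```scala
import java.io.{OutputStreamWriter, PrintWriter}
import java.nio.charset.StandardCharsets.UTF_8
import java.nio.file.{Files, Path}
import java.util.zip.GZIPOutputStream
import scala.jdk.CollectionConverters._
import scala.util.Using

// Hypothetical comparison helper; not part of better-files.
object NioSketch {
  def runWithNio(inputDir: Path, outputDir: Path, n: Int): Unit = {
    // List the .csv files, smallest first; the directory stream is closed afterwards.
    val inputs = Using.resource(Files.list(inputDir)) { stream =>
      stream.iterator().asScala
        .filter(_.getFileName.toString.endsWith(".csv"))
        .toVector
        .sortBy(p => Files.size(p))
    }

    // Using.Manager closes every registered resource, in reverse order, on exit.
    Using.Manager { use =>
      val writers = Vector.tabulate(n) { i =>
        val raw = use(Files.newOutputStream(outputDir.resolve(s"part-$i.csv.gz")))
        use(new PrintWriter(new OutputStreamWriter(new GZIPOutputStream(raw), UTF_8)))
      }
      var count = 0
      for (input <- inputs) {
        // Read one input at a time, line by line, so memory use stays bounded.
        Using.resource(Files.newBufferedReader(input, UTF_8)) { reader =>
          reader.readLine() // drop the first line of each file
          Iterator.continually(reader.readLine()).takeWhile(_ != null).foreach { line =>
            writers(count % n).println(line)
            count += 1
          }
        }
      }
    }.get // rethrow any failure captured by the manager
  }
}
```

Most of this sketch is resource handling and stream wrapping; the better-files version above expresses the same steps in a single for-comprehension.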

Talks

ScalaDays NYC 2016: Introduction to better-files

Questions?

Ask in our Gitter channel or file an issue with the question tag.


YourKit

YourKit supports better-files with its full-featured Java Profiler. YourKit, LLC is the creator of YourKit Java Profiler and YourKit .NET Profiler, innovative and intelligent tools for profiling Java and .NET applications.
