Hadoop uncompress / process gzip

A Stack Overflow post kindly provides the answer (I used different keywords initially, so it took several hops until I found it):

TextInputFormat and its descendants should automatically handle .gz compressed files. You can also implement your own InputFormat (which splits the input file into chunks for processing) and RecordReader (which extracts one record at a time from a chunk).
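To make the "one record at a time" idea concrete, here is a minimal sketch in plain Java. It is not Hadoop's code and does not use the Hadoop API. It only uses the JDK's GZIPInputStream to show roughly what a line-based record reader has to do with a .gz input: decompress the stream and hand back one line per record. The class name GzipLineRecords and the helpers gzip and readRecords are made up for this illustration, and the compressed "file" is just an in-memory byte array.

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Illustration only: plain JDK, no Hadoop classes involved.
class GzipLineRecords {

    // Hypothetical helper: gzip a string in memory, standing in for a .gz input file.
    static byte[] gzip(String text) {
        try {
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (GZIPOutputStream out = new GZIPOutputStream(buffer)) {
                out.write(text.getBytes(StandardCharsets.UTF_8));
            }
            return buffer.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Hypothetical helper: decompress the stream and return one record per line,
    // which is the job a line-oriented record reader does for each chunk.
    static List<String> readRecords(InputStream compressed) {
        List<String> records = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(new GZIPInputStream(compressed), StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                records.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return records;
    }

    public static void main(String[] args) {
        byte[] data = gzip("first record\nsecond record\n");
        for (String record : readRecords(new ByteArrayInputStream(data))) {
            System.out.println(record);
        }
    }
}
```

In a real job none of this should be needed for plain text input, since, as the answer says, TextInputFormat is expected to take care of the .gz decompression itself.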



