public final class CSVParser extends Object implements Iterable<CSVRecord>, Closeable
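Parses CSV files according to the specified format. Because CSV appears in many different dialects, the parser supports many formats by allowing the specification of a CSVFormat.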
The parser works record-wise: once a record has been parsed from the input stream, it is not possible to go back.
There are several static factory methods that can be used to create instances for various types of resources:
parse(java.io.File, Charset, CSVFormat)
parse(String, CSVFormat)
parse(java.net.URL, java.nio.charset.Charset, CSVFormat)
Alternatively, parsers can be created by passing a Reader directly to one of the constructors.
For those who like fluent APIs, parsers can be created using CSVFormat.parse(java.io.Reader) as a shortcut:
for (CSVRecord record : CSVFormat.EXCEL.parse(in)) { ... }
To parse CSV input from a file, you write:
File csvData = new File("/path/to/csv");
// The charset is an assumption here; pass the file's actual encoding.
CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.RFC4180);
for (CSVRecord csvRecord : parser) { ... }
This will read and parse the contents of the file using the RFC 4180 format.
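As a fuller sketch of the file case, the snippet below wraps the parser in try-with-resources so that it is closed even if iteration stops early. The class name FileParseExample, both helper methods, and the UTF-8 charset are assumptions made for illustration. It uses the parse(Path, Charset, CSVFormat) overload listed in the method summary.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

final class FileParseExample {

    /** Counts the records in a CSV file; the parser is closed on exit. */
    static int countRecords(Path csvFile) {
        // UTF-8 is an assumption; use the encoding the file was written with.
        try (CSVParser parser = CSVParser.parse(csvFile, StandardCharsets.UTF_8, CSVFormat.RFC4180)) {
            int count = 0;
            for (CSVRecord record : parser) {
                count++;
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Hypothetical helper: writes sample content to a temporary file. */
    static Path writeTemp(String content) {
        try {
            Path tmp = Files.createTempFile("csv-example", ".csv");
            tmp.toFile().deleteOnExit();
            Files.write(tmp, content.getBytes(StandardCharsets.UTF_8));
            return tmp;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        Path file = writeTemp("a,b\r\nc,d\r\n");
        System.out.println("records: " + countRecords(file));
    }
}
```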
To parse CSV input in a format like Excel, you write:
CSVParser parser = CSVParser.parse(csvData, StandardCharsets.UTF_8, CSVFormat.EXCEL);
for (CSVRecord csvRecord : parser) { ... }
If the predefined formats don't match the format at hand, custom formats can be defined. More information about customising CSVFormats is available in the CSVFormat Javadoc.
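A custom format is typically derived from a predefined one by overriding only the settings that differ. The following is a minimal sketch, assuming the with... methods of CSVFormat (withDelimiter, withQuote) as available in this release line. The class name, the constant, and the sample input are illustrative.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

final class CustomFormatExample {

    // Illustrative custom format: semicolon-delimited, single-quoted values.
    static final CSVFormat SEMICOLON_FORMAT = CSVFormat.DEFAULT
            .withDelimiter(';')
            .withQuote('\'');

    /** Returns the values of the first record parsed with the custom format. */
    static List<String> firstRecord(String input) {
        try (CSVParser parser = CSVParser.parse(input, SEMICOLON_FORMAT)) {
            List<String> values = new ArrayList<>();
            for (CSVRecord record : parser) {
                for (String value : record) {
                    values.add(value);
                }
                break; // only the first record is needed
            }
            return values;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The quoted value keeps its embedded delimiter.
        System.out.println(firstRecord("a;'b;c'\n"));
    }
}
```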
If parsing record-wise is not desired, the contents of the input can be read completely into memory.
Reader in = new StringReader("a;b\nc;d");
// The sample data is semicolon-separated, so the delimiter is set explicitly.
CSVParser parser = new CSVParser(in, CSVFormat.EXCEL.withDelimiter(';'));
List<CSVRecord> list = parser.getRecords();
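There are two constraints that have to be kept in mind:
Parsing into memory starts at the current position of the parser. If you have already parsed records from the input, those records will not end up in the in-memory representation of your CSV data.
Parsing into memory may consume a lot of system resources depending on the input. For example, if you are parsing a 150MB file of CSV data, the contents will be read completely into memory.
Notes:
Internal parser state is completely covered by the format and the reader-state.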
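Because getRecords() returns content starting at the current parse position, records already consumed through the iterator are not part of the returned list. The sketch below illustrates this under assumed sample input; the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

final class ReadIntoMemoryExample {

    /** Consumes one record via the iterator, then reads the rest into memory. */
    static int remainingAfterFirst(String input) {
        try (CSVParser parser = CSVParser.parse(input, CSVFormat.DEFAULT)) {
            parser.iterator().next();                   // first record is consumed here
            List<CSVRecord> rest = parser.getRecords(); // starts at the current position
            return rest.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Three records in the input; one is consumed before getRecords() is called.
        System.out.println(remainingAfterFirst("a,b\nc,d\ne,f\n"));
    }
}
```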
Constructor and Description |
---|
CSVParser(Reader reader, CSVFormat format) Customized CSV parser using the given CSVFormat. |
CSVParser(Reader reader, CSVFormat format, long characterOffset, long recordNumber) Customized CSV parser using the given CSVFormat. |
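The four-argument constructor is useful when parsing resumes partway through a source, for example when a large input is processed in chunks. The following sketch assumes a chunk that starts on a record boundary and a format without a header. The class name, method name, and the offset values are hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

final class ResumeParsingExample {

    /** Parses a chunk and returns the record number assigned to its first record. */
    static long firstRecordNumber(String chunk, long characterOffset, long nextRecordNumber) {
        try (CSVParser parser = new CSVParser(
                new StringReader(chunk), CSVFormat.DEFAULT, characterOffset, nextRecordNumber)) {
            CSVRecord first = parser.iterator().next();
            return first.getRecordNumber();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed scenario: 41 records and 1000 characters were handled by an earlier parser.
        System.out.println(firstRecordNumber("x,y\n", 1000L, 42L));
    }
}
```

The recordNumber argument is the next record number to assign, so numbering continues where the earlier parser stopped.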
Modifier and Type | Method and Description |
---|---|
void | close() Closes resources. |
long | getCurrentLineNumber() Returns the current line number in the input stream. |
String | getFirstEndOfLine() Gets the first end-of-line string encountered. |
Map<String,Integer> | getHeaderMap() Returns a copy of the header map. |
List<String> | getHeaderNames() Returns a read-only list of header names that iterates in column order. |
long | getRecordNumber() Returns the current record number in the input stream. |
List<CSVRecord> | getRecords() Parses the CSV input according to the given format and returns the content as a list of CSVRecords. |
boolean | isClosed() Gets whether this parser is closed. |
Iterator<CSVRecord> | iterator() Returns an iterator on the records. |
static CSVParser | parse(File file, Charset charset, CSVFormat format) Creates a parser for the given File. |
static CSVParser | parse(InputStream inputStream, Charset charset, CSVFormat format) Creates a CSV parser using the given CSVFormat. |
static CSVParser | parse(Path path, Charset charset, CSVFormat format) Creates a parser for the given Path. |
static CSVParser | parse(Reader reader, CSVFormat format) Creates a CSV parser using the given CSVFormat. |
static CSVParser | parse(String string, CSVFormat format) Creates a parser for the given String. |
static CSVParser | parse(URL url, Charset charset, CSVFormat format) Creates a parser for the given URL. |
Methods inherited from class Object: clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
Methods inherited from interface Iterable: forEach, spliterator
public CSVParser(Reader reader, CSVFormat format) throws IOException
Customized CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
Parameters:
reader - a Reader containing CSV-formatted input. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
IOException - If there is a problem reading the header or skipping the first record
public CSVParser(Reader reader, CSVFormat format, long characterOffset, long recordNumber) throws IOException
Customized CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
Parameters:
reader - a Reader containing CSV-formatted input. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
characterOffset - Lexer offset when the parser does not start parsing at the beginning of the source.
recordNumber - The next record number to assign
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
IOException - If there is a problem reading the header or skipping the first record
public static CSVParser parse(File file, Charset charset, CSVFormat format) throws IOException
Creates a parser for the given File.
Parameters:
file - a CSV file. Must not be null.
charset - A Charset.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either file or format is null.
IOException - If an I/O error occurs
public static CSVParser parse(InputStream inputStream, Charset charset, CSVFormat format) throws IOException
Creates a CSV parser using the given CSVFormat.
If you do not read all records from the given input stream, you should call close() on the parser, unless you close the input stream.
Parameters:
inputStream - an InputStream containing CSV-formatted input. Must not be null.
charset - a Charset.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either inputStream or format is null.
IOException - If there is a problem reading the header or skipping the first record
public static CSVParser parse(Path path, Charset charset, CSVFormat format) throws IOException
Creates a parser for the given Path.
Parameters:
path - a CSV file. Must not be null.
charset - A Charset.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either path or format is null.
IOException - If an I/O error occurs
public static CSVParser parse(Reader reader, CSVFormat format) throws IOException
Creates a CSV parser using the given CSVFormat.
If you do not read all records from the given reader, you should call close() on the parser, unless you close the reader.
Parameters:
reader - a Reader containing CSV-formatted input. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either reader or format is null.
IOException - If there is a problem reading the header or skipping the first record
public static CSVParser parse(String string, CSVFormat format) throws IOException
Creates a parser for the given String.
Parameters:
string - a CSV string. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if either string or format is null.
IOException - If an I/O error occurs
public static CSVParser parse(URL url, Charset charset, CSVFormat format) throws IOException
Creates a parser for the given URL.
If you do not read all records from the given url, you should call close() on the parser, unless you close the url.
Parameters:
url - a URL. Must not be null.
charset - the charset for the resource. Must not be null.
format - the CSVFormat used for CSV parsing. Must not be null.
Throws:
IllegalArgumentException - If the parameters of the format are inconsistent or if any of url, charset or format is null.
IOException - If an I/O error occurs
public void close() throws IOException
Closes resources.
Specified by:
close in interface Closeable
Specified by:
close in interface AutoCloseable
Throws:
IOException - If an I/O error occurs
public long getCurrentLineNumber()
Returns the current line number in the input stream.
ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the record number.
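public String getFirstEndOfLine()
Gets the first end-of-line string encountered.
public Map<String,Integer> getHeaderMap()
Returns a copy of the header map.
The map keys are column names. The map values are 0-based indices.
public List<String> getHeaderNames()
Returns a read-only list of header names that iterates in column order.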
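The header accessors can be tried out with a format that treats the first record as the header. This is a minimal sketch, assuming CSVFormat.withFirstRecordAsHeader() and illustrative sample data. The class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.List;
import java.util.Map;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

final class HeaderExample {

    // Assumption: the first record of the input holds the column names.
    static final CSVFormat WITH_HEADER = CSVFormat.DEFAULT.withFirstRecordAsHeader();

    /** Returns the 0-based index of the named column, or -1 if it is absent. */
    static int columnIndex(String input, String column) {
        try (CSVParser parser = CSVParser.parse(input, WITH_HEADER)) {
            Map<String, Integer> headerMap = parser.getHeaderMap();
            Integer index = headerMap.get(column);
            return index == null ? -1 : index;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns the header names in column order. */
    static List<String> headerNames(String input) {
        try (CSVParser parser = CSVParser.parse(input, WITH_HEADER)) {
            return parser.getHeaderNames();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String input = "id,name\n1,Ada\n";
        System.out.println(headerNames(input) + " name at index " + columnIndex(input, "name"));
    }
}
```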
public long getRecordNumber()
Returns the current record number in the input stream.
ATTENTION: If your CSV input has multi-line values, the returned number does not correspond to the line number.
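The difference between line numbers and record numbers shows up as soon as a quoted value spans several lines. The sketch below reads one such record and reports both counters. The sample input, class name, and method name are assumptions made for illustration.

```java
import java.io.IOException;
import java.io.UncheckedIOException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;

final class LineVersusRecordExample {

    /** Returns {line number, record number} after the first record has been read. */
    static long[] numbersAfterFirstRecord(String input) {
        try (CSVParser parser = CSVParser.parse(input, CSVFormat.DEFAULT)) {
            parser.iterator().next(); // read exactly one record
            return new long[] { parser.getCurrentLineNumber(), parser.getRecordNumber() };
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // The second value of the first record contains a line break inside quotes.
        String input = "a,\"first line\nsecond line\"\nc,d\n";
        long[] numbers = numbersAfterFirstRecord(input);
        System.out.println("line=" + numbers[0] + " record=" + numbers[1]);
    }
}
```

After one record, the record number is 1 while the line number is already larger, because the quoted value itself spans two lines.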
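public List<CSVRecord> getRecords() throws IOException
Parses the CSV input according to the given format and returns the content as a list of CSVRecords.
The returned content starts at the current parse-position in the stream.
Returns:
list of CSVRecords, may be empty
Throws:
IOException - on parse error or input read-failure
public boolean isClosed()
Gets whether this parser is closed.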
public Iterator<CSVRecord> iterator()
Returns an iterator on the records.
An IOException caught during the iteration is re-thrown as an IllegalStateException.
If the parser is closed, a call to Iterator.next() will throw a NoSuchElementException.
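The two failure modes of the iterator can be handled as in the sketch below. It assumes the behaviour described for this version: an IllegalStateException for read failures during iteration and a NoSuchElementException from next() once the parser is closed. The class and method names are hypothetical, and converting the failure to an UncheckedIOException is just one possible policy.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Iterator;
import java.util.NoSuchElementException;

import org.apache.commons.csv.CSVFormat;
import org.apache.commons.csv.CSVParser;
import org.apache.commons.csv.CSVRecord;

final class IteratorBehaviorExample {

    /** Counts records, translating iteration failures into UncheckedIOException. */
    static int countRecords(String input) {
        try (CSVParser parser = CSVParser.parse(input, CSVFormat.DEFAULT)) {
            int count = 0;
            try {
                for (CSVRecord record : parser) {
                    count++;
                }
            } catch (IllegalStateException e) {
                // A read failure during iteration surfaces here rather than as an IOException.
                throw new UncheckedIOException(new IOException("CSV read failed", e));
            }
            return count;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Returns true if next() throws NoSuchElementException once the parser is closed. */
    static boolean nextFailsAfterClose(String input) {
        try {
            CSVParser parser = CSVParser.parse(input, CSVFormat.DEFAULT);
            Iterator<CSVRecord> records = parser.iterator();
            parser.close();
            try {
                records.next();
                return false;
            } catch (NoSuchElementException expected) {
                return true;
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(countRecords("a,b\nc,d\n"));
        System.out.println(nextFailsAfterClose("a,b\n"));
    }
}
```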
Copyright © 2019 The Apache Software Foundation. All rights reserved.