Converting Excel to CSV
There are several open source and commercial Java libraries that are capable of reading Excel workbooks. Among those are Apache POI , JExcelAPI, JCom, ExtenXLS7 to name a few. We will use JExcelAPI to recognize and process Excel files.
We assume that users are loading either Excel workbooks or ASCII files, such as XML documents or CSV files. This makes streamToString method looks like this.
protected String streamToString(String mimeType, InputStream inputStream) throws IOException
{
String result;
// check if this is an Excel spreadsheet
if (”application/vnd.ms-excel”.equalsIgnoreCase(mimeType))
{
try {
result = xlsToString(inputStream);
} catch (jxl.read.biff.BiffException ex)
{ // if not, then assume this is an ASCII stream
inputStream.reset();
result = streamToString(inputStream);
}
} else
{ // otherwise an ASCII stream
result = streamToString(inputStream);
}
return result;
}
This method interprets the mimeType of the inbound file, and either reads it as an Excel workbook using method xlsToString(InputStream) or converts it into a string using method streamToString(InputStream). We will omit details of streamToStream(InputStream) method, as it reads characters from the input stream and appends them to a string, and concentrate on xlsToString(InputStream).
The body for this method was borrowed from here. The method interprets the input stream as an Excel workbook and converts each sheet into a comma separated format.
private final static String CSV_SEPARATOR = “,”;
private String xlsToString(InputStream stream) throws jxl.read.biff.BiffException
{
StringWriter stringWriter = new StringWriter();
BufferedWriter bufferedWriter = new BufferedWriter(stringWriter);
try {
WorkbookSettings ws = new WorkbookSettings();
ws.setLocale(new Locale(”en”, “EN”));
Workbook w = Workbook.getWorkbook(stream, ws);// Gets the sheets from workbook
for (int sheet = 0; sheet < w.getNumberOfSheets(); sheet++)
{
Sheet s = w.getSheet(sheet);
Cell[] row = null;// Gets the cells from sheet
for (int i = 0 ; i < s.getRows() ; i++)
{
row = s.getRow(i);
if (row.length > 0)
{
bufferedWriter.write(formatExcelCell(row[0]));
for (int j = 1; j < row.length; j++)
{
bufferedWriter.write(CSV_SEPARATOR);
bufferedWriter.write(formatExcelCell(row[j]));
}
}
bufferedWriter.newLine();
}
}
bufferedWriter.flush();
} catch(jxl.read.biff.BiffException ex)
{
throw ex;
}
catch (Exception ex)
{
throw OAException.wrapperException(ex);
}return stringWriter.toString();
}
Method formatExcelCell is responsible for converting the contents of a cell into the one compatible with CSV file format. Namely
- double each double quote
- surround contents with double quotes if contents contains a double quote or comma
- convert date into DD-MMM-YYYY format
The formatExcelCell method body is
private final static String QUOTE_STRING = “”";
private SimpleDateFormat dateFormatter = new SimpleDateFormat(”d-MMM-yyyy”);private String formatExcelCell(Cell cell)
{
if (cell == null) return null;
String rowCell = cell.getContents();
// format the date
if ( cell.getType().equals(cell.getType().DATE) )
{
rowCell = dateFormatter.format(((DateCell)cell).getDate());
}
// double each quote
String result = rowCell.replaceAll(QUOTE_STRING, QUOTE_STRING+QUOTE_STRING);
// surround with quotes if comma or quote is present
if (result.indexOf(CSV_SEPARATOR) >0 || result.indexOf(QUOTE_STRING) >0)
{
result = QUOTE_STRING + result + QUOTE_STRING;
}
return result;
}
Here we reached the point when the incoming stream is tested to be an Excel workbook, and if so, converted into a comma separate stream. Otherwise, it is assumed that the incoming file is an ASCII stream.
Our next task is to store the contents in the database. Stay tuned to learn how to write CLOBs using BC4J.
0 comments ↓
There are no comments yet...Kick things off by filling out the form below.
Leave a Comment