Wednesday 22 June 2011

Reading data from a .doc file using the Apache POI API

This program shows how to read the text of an MS Word file (.doc) using Apache POI. The text is returned one paragraph at a time, which for a simple document is effectively line by line.
I have already explained what Apache POI is and why it is needed in a previous post.
To run this program, download the Apache POI API and add its jar files to the classpath (the HWPF classes used here are in the poi-scratchpad jar).

Example
import java.io.File;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.IOException;
import org.apache.poi.hwpf.HWPFDocument;
import org.apache.poi.hwpf.extractor.WordExtractor;

public class NewDocReader {
    public static void main(String[] args) throws FileNotFoundException, IOException {
        // The .doc file to read
        File docFile = new File("c:\\multi\\multi.doc");
        // Open an input stream on the file (throws FileNotFoundException if it is missing)
        FileInputStream finStream = new FileInputStream(docFile.getAbsolutePath());
        // Parse the stream as a Word document (throws IOException on a read error)
        HWPFDocument doc = new HWPFDocument(finStream);
        // WordExtractor pulls the plain text out of the parsed document
        WordExtractor wordExtract = new WordExtractor(doc);
        // dataArray holds one entry per paragraph of the document
        String[] dataArray = wordExtract.getParagraphText();
        for (int i = 0; i < dataArray.length; i++) {
            // Print each paragraph, prefixed with "--"
            System.out.println("\n--" + dataArray[i]);
        }
        // Close the input stream
        finStream.close();
    }
}
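
The strings returned by getParagraphText() usually still carry their trailing line break, and empty paragraphs come back as entries of their own, so the output of the program above can contain stray blank lines. Below is a minimal sketch of one way to tidy the array before printing. It uses only the standard library, and the class name ParagraphCleaner and the method clean are hypothetical names chosen for this illustration, not part of Apache POI.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper (not part of Apache POI): tidies an array of
// paragraph strings such as the one returned by getParagraphText().
class ParagraphCleaner {

    // Trims leading/trailing whitespace (including "\r\n") from each
    // paragraph and drops paragraphs that are empty after trimming.
    static List<String> clean(String[] paragraphs) {
        List<String> result = new ArrayList<>();
        for (String paragraph : paragraphs) {
            if (paragraph == null) {
                continue; // defensive: skip missing entries
            }
            String trimmed = paragraph.trim();
            if (!trimmed.isEmpty()) {
                result.add(trimmed);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Sample data standing in for the array read from a real document
        String[] dataArray = { "First paragraph\r\n", "\r\n", "  Second paragraph \r\n" };
        for (String line : clean(dataArray)) {
            System.out.println("-- " + line);
        }
    }
}
```

In the reader program, the same idea would be applied by passing dataArray to clean(...) and looping over the returned list instead of the raw array.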
