Common Ways to Convert DOCX to PDF in Java

Posted by damoo 8 years ago | 24K views | Java, Java development

DOCX2PDF

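Converting DOCX documents to PDF is a common project requirement. The mainstream approaches fall into two groups. The first drives an office application to do the conversion, for example Microsoft Office, WPS, or LibreOffice. The second reads the document through a language-level API for Office files (for example Apache POI) and then builds the PDF with a dedicated PDF generation library such as iText. In general, office applications preserve styling better but are comparatively slow. Microsoft Office raises licensing issues and should not be used casually (the author's company was caught out on this). WPS is currently widely used but has a hyperlink truncation problem: any hyperlink longer than 256 characters is cut off. LibreOffice's layout fidelity is comparatively loose. Reading and generating through the POI API performs better and suits cases where the formatting requirements are not strict. There are also packaged online or command-line tools, such as docx2pdf and OfficeToPDF.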
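Since each approach trades fidelity against speed and availability, one possible way to combine them is a fallback chain that tries converters in order of preference. The sketch below is only an illustration: the interface name `DocxToPdfConverter` and the class `FallbackConverter` are hypothetical, and the `convert(String word, String pdf)` signature is assumed from the method shown in the WPS section.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical interface; the signature mirrors the convert method in the WPS section. */
interface DocxToPdfConverter {
    boolean convert(String word, String pdf);
}

/** Tries each delegate in order and stops at the first one that reports success. */
final class FallbackConverter implements DocxToPdfConverter {

    private final List<DocxToPdfConverter> delegates;

    FallbackConverter(DocxToPdfConverter... delegates) {
        this.delegates = Arrays.asList(delegates);
    }

    @Override
    public boolean convert(String word, String pdf) {
        for (DocxToPdfConverter delegate : delegates) {
            try {
                if (delegate.convert(word, pdf)) {
                    return true;
                }
            } catch (RuntimeException e) {
                // A converter that throws is treated as failed; move on to the next one.
            }
        }
        return false;
    }
}
```

A caller might order the delegates by preferred fidelity, for example an office-based converter first and a POI-based one last.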

Microsoft Office

The core code for this section is shown below (the complete code is linked from the original post):

import com.jacob.activeX.ActiveXComponent;
import com.jacob.com.Dispatch;
import com.jacob.com.Variant;

public class OfficeConverter {

    private ActiveXComponent oleComponent = null;
    private Dispatch activeDoc = null;
    private final static String APP_ID = "Word.Application";

    // Constants that map onto Word's WdSaveOptions enumeration and that
    // may be passed to the closeDoc(int) method
    public static final int DO_NOT_SAVE_CHANGES = 0;
    public static final int PROMPT_TO_SAVE_CHANGES = -2;
    public static final int SAVE_CHANGES = -1;

    // These constant values determine whether or not the application
    // instance will be displayed on the user's screen.
    public static final boolean VISIBLE = true;
    public static final boolean HIDDEN = false;

    /**
     * Create a new instance of the OfficeConverter class using the following
     * parameters.
     *
     * @param visibility A primitive boolean whose value will determine whether
     *                   or not the Word application will be visible to the user.
     *                   Pass true to display Word, false otherwise.
     */
    public OfficeConverter(boolean visibility) {
        this.oleComponent = new ActiveXComponent(OfficeConverter.APP_ID);
        this.oleComponent.setProperty("Visible", new Variant(visibility));
    }

    /**
     * Open an existing Word document.
     *
     * @param docName An instance of the String class that encapsulates the
     *                path to and name of a valid Word file. Note that there are
     *                a few limitations applying to the format of this String; it
     *                must specify the absolute path to the file and it must not
     *                use the single forward slash to specify the path separator.
     */
    public void openDoc(String docName) {
        Dispatch disp = null;
        Variant var = null;
        // First get a Dispatch object referencing the Documents collection - for
        // collections, think of ArrayLists of objects.
        var = Dispatch.get(this.oleComponent, "Documents");
        disp = var.getDispatch();
        // Now call the Open method on the Documents collection Dispatch object
        // to both open the file and add it to the collection. It would be possible
        // to open a series of files and access each from the Documents collection
        // but for this example, it is simpler to store a reference to the
        // active document in a private instance variable.
        var = Dispatch.call(disp, "Open", docName);
        this.activeDoc = var.getDispatch();
    }

    /**
     * There is more than one way to convert the document into PDF format, you
     * can either explicitly use a FileConvertor object or call the
     * ExportAsFixedFormat method on the active document. This method opts for
     * the latter and calls the ExportAsFixedFormat method passing the name
     * of the file along with the integer value of 17. This value maps onto one
     * of Word's constants called wdExportFormatPDF and causes the application
     * to convert the file into PDF format. If you wanted to do so, for testing
     * purposes, you could add another value to the args array, a Boolean value
     * of true. This would open the newly converted document automatically.
     *
     * @param filename The path of the PDF file to write.
     */
    public void publishAsPDF(String filename) {
        // The code to export as a PDF is 17
        // Object[] args = new Object[] {filename, Integer.valueOf(17), Boolean.TRUE};
        Object[] args = new Object[] {filename, Integer.valueOf(17)};
        Dispatch.call(this.activeDoc, "ExportAsFixedFormat", args);
    }

    /**
     * Called to close the active document. Note that this method simply
     * calls the overloaded closeDoc(int) method passing the value 0 which
     * instructs Word to close the document and discard any changes that may
     * have been made since the document was opened or edited.
     */
    public void closeDoc() {
        this.closeDoc(OfficeConverter.DO_NOT_SAVE_CHANGES);
    }

    /**
     * Called to close the active document. It is possible with this overloaded
     * version of the closeDoc() method to specify what should happen if the user
     * has made changes to the document that have not been saved. There are three
     * possible values defined by the following manifest constants:
     * DO_NOT_SAVE_CHANGES - Close the document and discard any changes
     * the user may have made.
     * PROMPT_TO_SAVE_CHANGES - Display a prompt to the user asking them
     * how to proceed.
     * SAVE_CHANGES - Save the changes the user has made to the document.
     *
     * @param saveOption A primitive integer whose value indicates how the close
     *                   operation should proceed if the user has made changes to
     *                   the active document. Note that no checks are made on the
     *                   value passed to this argument.
     */
    public void closeDoc(int saveOption) {
        Object[] args = {Integer.valueOf(saveOption)};
        Dispatch.call(this.activeDoc, "Close", args);
    }

    /**
     * Called once processing has completed in order to close down the instance
     * of Word.
     */
    public void quit() {
        Dispatch.call(this.oleComponent, "Quit");
    }
}
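The Javadoc on `openDoc` notes that Word expects an absolute path that does not use forward slashes. A small helper along the following lines could normalize a path before it is handed to `openDoc` or `publishAsPDF`. This is a minimal sketch: the class `WordPaths` and the method `toWordPath` are hypothetical names that are not part of the original code, and only drive-letter and UNC paths are treated as absolute.

```java
/** Hypothetical helper, not part of the original post. */
final class WordPaths {

    private WordPaths() {
    }

    /**
     * Replaces forward slashes with backslashes and rejects anything that is
     * not an absolute Windows path (drive letter or UNC share).
     */
    static String toWordPath(String path) {
        String normalized = path.replace('/', '\\');
        // Drive-letter form, e.g. C:\docs\report.docx
        boolean driveAbsolute = normalized.matches("[A-Za-z]:\\\\.*");
        // UNC form, e.g. \\server\share\report.docx
        boolean unc = normalized.startsWith("\\\\");
        if (!driveAbsolute && !unc) {
            throw new IllegalArgumentException("Not an absolute Windows path: " + path);
        }
        return normalized;
    }
}
```

The check is purely textual, so it behaves the same on any operating system and does not verify that the file exists.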

WPS

The core code for this section is shown below (the complete code is linked from the original post):

@Override
public boolean convert(String word, String pdf) {
    File pdfFile = new File(pdf);
    File wordFile = new File(word);
    boolean convertSuccessfully = false;

    ActiveXComponent wps = null;
    ActiveXComponent doc = null;
    try {
        wps = new ActiveXComponent("KWPS.Application");

        // Alternative using Dispatch directly:
        // Dispatch docs = wps.getProperty("Documents").toDispatch();
        // Dispatch d = Dispatch.call(docs, "Open", wordFile.getAbsolutePath(), false, true).toDispatch();
        // Dispatch.call(d, "SaveAs", pdfFile.getAbsolutePath(), 17);
        // Dispatch.call(d, "Close", false);

        doc = wps.invokeGetComponent("Documents")
                .invokeGetComponent("Open", new Variant(wordFile.getAbsolutePath()));

        try {
            // 17 selects PDF as the save format. The original hard-coded an
            // output path here; the requested pdf path is used instead.
            doc.invoke("SaveAs",
                    new Variant(pdfFile.getAbsolutePath()),
                    new Variant(17));
            convertSuccessfully = true;
        } catch (Exception e) {
            logger.warning("Failed to generate PDF");
            e.printStackTrace();
        }

        // Debugging step kept from the original: also save a copy as .doc.
        // The path is hard-coded and should be removed or parameterized.
        File saveAsFile = new File("C:\\Users\\lotuc\\Documents\\saveasfile.doc");
        try {
            doc.invoke("SaveAs", saveAsFile.getAbsolutePath());
            logger.info("Saved a copy as " + saveAsFile.getAbsolutePath());
        } catch (Exception e) {
            logger.info("Failed to save a copy as " + saveAsFile.getAbsolutePath());
            e.printStackTrace();
        }
    } finally {
        if (doc == null) {
            logger.info("Failed to open file " + wordFile.getAbsolutePath());
        } else {
            try {
                logger.info("Releasing file " + wordFile.getAbsolutePath());
                doc.invoke("Close");
                doc.safeRelease();
            } catch (Exception e1) {
                logger.info("Failed to release file " + wordFile.getAbsolutePath());
            }
        }

        if (wps == null) {
            logger.info("Failed to load the WPS component");
        } else {
            try {
                logger.info("Releasing the WPS component");
                wps.invoke("Quit");
                wps.safeRelease();
            } catch (Exception e1) {
                logger.info("Failed to release the WPS component");
            }
        }
    }

    return convertSuccessfully;
}
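As noted in the overview, WPS cuts off hyperlinks longer than 256 characters. A pre-check like the one below could flag affected documents before they are sent to WPS. It is a rough sketch under several assumptions: the class `LongHyperlinkCheck` is hypothetical, it only inspects the relationships of the main document part (`word/_rels/document.xml.rels`), and it does not look at hyperlinks in headers, footers, or field codes.

```java
import java.io.File;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper, not part of the original post. */
final class LongHyperlinkCheck {

    // Relationship type that OOXML packages use for hyperlink targets.
    private static final String HYPERLINK_TYPE =
            "http://schemas.openxmlformats.org/officeDocument/2006/relationships/hyperlink";

    /** Returns the hyperlink targets in a .rels XML stream that are longer than maxLength. */
    static List<String> longTargets(InputStream relsXml, int maxLength) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            // Refuse DOCTYPE declarations so untrusted files cannot pull in external entities.
            factory.setFeature("http://apache.org/xml/features/disallow-doctype-decl", true);
            NodeList rels = factory.newDocumentBuilder().parse(relsXml)
                    .getElementsByTagName("Relationship");
            List<String> result = new ArrayList<>();
            for (int i = 0; i < rels.getLength(); i++) {
                Element rel = (Element) rels.item(i);
                String target = rel.getAttribute("Target");
                if (HYPERLINK_TYPE.equals(rel.getAttribute("Type")) && target.length() > maxLength) {
                    result.add(target);
                }
            }
            return result;
        } catch (Exception e) {
            throw new IllegalStateException("Cannot parse relationships XML", e);
        }
    }

    /** Reads the main document part's relationships from a DOCX file (a ZIP archive). */
    static List<String> longTargets(File docx, int maxLength) {
        try (ZipFile zip = new ZipFile(docx)) {
            ZipEntry entry = zip.getEntry("word/_rels/document.xml.rels");
            if (entry == null) {
                return new ArrayList<>();
            }
            try (InputStream in = zip.getInputStream(entry)) {
                return longTargets(in, maxLength);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

If `LongHyperlinkCheck.longTargets(new File(word), 256)` returns a non-empty list, the caller could route the document to a different converter or log a warning.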

LibreOffice

LibreOffice ships with a command-line tool for conversion. Once LibreOffice is installed, run:

/usr/local/bin/soffice --convert-to pdf:writer_pdf_Export /Users/lotuc/Downloads/test.doc

If a LibreOffice instance is already open, pass the -env option so that the conversion uses its own working (user profile) directory:

/usr/local/bin/soffice "-env:UserInstallation=file:///tmp/LibreOffice_Conversion_abc" --convert-to pdf:writer_pdf_Export /Users/lotuc/Downloads/test.doc
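The same command can be launched from Java with `ProcessBuilder`. The following is a minimal sketch rather than a hardened implementation: the class `SofficeCli` and its method names are hypothetical, and the `--headless` and `--outdir` options are additions that do not appear in the commands above.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeUnit;

/** Hypothetical wrapper around the soffice command line shown above. */
final class SofficeCli {

    private SofficeCli() {
    }

    /**
     * Builds the argument list. profileUrl may be null; otherwise it is passed
     * as -env:UserInstallation so a running LibreOffice instance does not interfere.
     */
    static List<String> buildCommand(String soffice, String input, String outDir, String profileUrl) {
        List<String> command = new ArrayList<>();
        command.add(soffice);
        if (profileUrl != null) {
            command.add("-env:UserInstallation=" + profileUrl);
        }
        command.add("--headless");
        command.add("--convert-to");
        command.add("pdf:writer_pdf_Export");
        command.add("--outdir");
        command.add(outDir);
        command.add(input);
        return command;
    }

    /** Runs the command and returns its exit code, or fails if it exceeds the timeout. */
    static int run(List<String> command, long timeoutSeconds) throws IOException, InterruptedException {
        Process process = new ProcessBuilder(command).inheritIO().start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly();
            throw new IOException("soffice did not finish within " + timeoutSeconds + " seconds");
        }
        return process.exitValue();
    }
}
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces. The timeout is a design choice for server use, where a hung conversion should not block a worker thread indefinitely.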

To drive LibreOffice from Java instead, first install LibreOffice and then add the JARs it depends on to the classpath:

Install Libre Office

Create a Java project in your favorite editor and add these to your class path:
[Libre Office Dir]/URE/java/juh.jar
[Libre Office Dir]/URE/java/jurt.jar
[Libre Office Dir]/URE/java/ridl.jar
[Libre Office Dir]/program/classes/unoil.jar

Next, start a LibreOffice process:

import java.util.Date;
import java.io.File;
import com.sun.star.beans.PropertyValue;
import com.sun.star.comp.helper.Bootstrap;
import com.sun.star.frame.XComponentLoader;
import com.sun.star.frame.XDesktop;
import com.sun.star.frame.XStorable;
import com.sun.star.lang.XComponent;
import com.sun.star.lang.XMultiComponentFactory;
import com.sun.star.text.XTextDocument;
import com.sun.star.uno.UnoRuntime;
import com.sun.star.uno.XComponentContext;
import com.sun.star.util.XReplaceDescriptor;
import com.sun.star.util.XReplaceable;

public class MailMergeExample {

    public static void main(String[] args) throws Exception {

        // Initialise
        XComponentContext xContext = Bootstrap.bootstrap();

        XMultiComponentFactory xMCF = xContext.getServiceManager();

        Object oDesktop = xMCF.createInstanceWithContext(
                "com.sun.star.frame.Desktop", xContext);

        XDesktop xDesktop = (XDesktop) UnoRuntime.queryInterface(
                XDesktop.class, oDesktop);

Next, load the target document:

        // Load the Document
        String workingDir = "C:/projects/";
        String myTemplate = "letterTemplate.doc";

        if (!new File(workingDir + myTemplate).canRead()) {
            throw new RuntimeException("Cannot load template:"
                    + new File(workingDir + myTemplate));
        }

        XComponentLoader xCompLoader = (XComponentLoader) UnoRuntime
                .queryInterface(com.sun.star.frame.XComponentLoader.class, xDesktop);

        String sUrl = "file:///" + workingDir + myTemplate;

        // Open the document without showing a window
        PropertyValue[] propertyValues = new PropertyValue[1];
        propertyValues[0] = new PropertyValue();
        propertyValues[0].Name = "Hidden";
        propertyValues[0].Value = Boolean.TRUE;

        XComponent xComp = xCompLoader.loadComponentFromURL(
                sUrl, "_blank", 0, propertyValues);
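The snippet above builds the document URL by concatenating `"file:///"` with a path, which only works while the path contains no spaces or other characters that need escaping. A standalone helper based on `java.nio.file.Path.toUri()` is one way to produce an escaped URL; the class `FileUrls` and the method `toFileUrl` are hypothetical names used for illustration only.

```java
import java.nio.file.Paths;

/** Hypothetical helper, not part of the original example. */
final class FileUrls {

    private FileUrls() {
    }

    /** Returns an absolute, percent-encoded file URL for fileName inside dir. */
    static String toFileUrl(String dir, String fileName) {
        return Paths.get(dir, fileName)
                .toAbsolutePath()
                .normalize()
                .toUri()
                .toString();
    }
}
```

In the example above, `String sUrl = FileUrls.toFileUrl(workingDir, myTemplate);` could replace the manual concatenation.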

The content can then be replaced as follows:

        // Search and replace
        XReplaceDescriptor xReplaceDescr = null;
        XReplaceable xReplaceable = null;

        XTextDocument xTextDocument = (XTextDocument) UnoRuntime
                .queryInterface(XTextDocument.class, xComp);

        xReplaceable = (XReplaceable) UnoRuntime
                .queryInterface(XReplaceable.class, xTextDocument);

        xReplaceDescr = (XReplaceDescriptor) xReplaceable
                .createReplaceDescriptor();

        // mail merge the date
        xReplaceDescr.setSearchString("<date>");
        xReplaceDescr.setReplaceString(new Date().toString());
        xReplaceable.replaceAll(xReplaceDescr);

        // mail merge the addressee
        xReplaceDescr.setSearchString("<addressee>");
        xReplaceDescr.setReplaceString("Best Friend");
        xReplaceable.replaceAll(xReplaceDescr);

        // mail merge the signatory
        xReplaceDescr.setSearchString("<signatory>");
        xReplaceDescr.setReplaceString("Your New Boss");
        xReplaceable.replaceAll(xReplaceDescr);

Finally, export the result to PDF:

        // save as a PDF
        XStorable xStorable = (XStorable) UnoRuntime
                .queryInterface(XStorable.class, xComp);

        propertyValues = new PropertyValue[2];
        propertyValues[0] = new PropertyValue();
        propertyValues[0].Name = "Overwrite";
        propertyValues[0].Value = Boolean.TRUE;
        propertyValues[1] = new PropertyValue();
        propertyValues[1].Name = "FilterName";
        propertyValues[1].Value = "writer_pdf_Export";

        // Append the desired extension to the output document name
        String myResult = workingDir + "letterOutput.pdf";
        xStorable.storeToURL("file:///" + myResult, propertyValues);

        System.out.println("Saved " + myResult);
    }
}

xdocreport

The core code for this section is shown below (the complete code is linked from the original post):

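// Requires XDocReport with its XWPF (Apache POI based) PDF converter,
// Apache POI (XWPFDocument), and Lombok (@SneakyThrows).

/**
 * Generates a PDF file from the input DOCX using Apache POI.
 *
 * @param inputFile input stream of the DOCX file
 * @param outFile   output file object
 */
@SneakyThrows
public static void convertWithPOI(InputStream inputFile, File outFile) {

    // Create the document object from the input stream
    XWPFDocument document = new XWPFDocument(inputFile);

    // Create the PDF options
    PdfOptions pdfOptions = PdfOptions.create(); // .fontEncoding("windows-1250")

    // Create the directory for the output file
    outFile.getParentFile().mkdirs();

    // Run the PDF conversion; the stream is closed when the block exits
    try (FileOutputStream out = new FileOutputStream(outFile)) {
        PdfConverter.getInstance().convert(document, out, pdfOptions);
    }
}

/**
 * Fills the render parameters into the template DOCX file and then generates a PDF.
 *
 * @param inputFile    input stream of the template DOCX file
 * @param outFile      output file object
 * @param renderParams values to merge into the template
 */
@SneakyThrows
public static void convertFromTemplateWithFreemarker(InputStream inputFile, File outFile,
        Map<String, Object> renderParams) {

    // Create the report instance
    IXDocReport report = XDocReportRegistry.getRegistry().loadReport(
            inputFile, TemplateEngineKind.Freemarker);

    // Create the context
    IContext context = report.createContext();

    // Fill in the render parameters
    renderParams.forEach((s, o) -> {
        context.put(s, o);
    });

    // Create the directory for the output file
    outFile.getParentFile().mkdirs();

    // Create the conversion options
    Options options = Options.getTo(ConverterTypeTo.PDF).via(
            ConverterTypeVia.XWPF);

    // Run the conversion; the stream is closed when the block exits
    try (FileOutputStream out = new FileOutputStream(outFile)) {
        report.convert(context, options, out);
    }
}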

     

Source: https://segmentfault.com/a/1190000006789644

     
