[LibreOffice] Converting Office documents to PDF, HTML, JPG, and other formats with LibreOffice


  LibreOffice has a wide range of features and can largely replace Microsoft Office.

 

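1. Converting Word to PDF on Windows

1. Install LibreOffice

Download the installer from the official website and run it. https://donate.libreoffice.org/

The installation directory after setup: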

 

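After installation you will find that it ships with many components. The most important ones are:

soffice.exe --- the main launcher; double-clicking it opens the Start Center, from which documents of many formats can be created.

sweb.exe --- an HTML editor that can edit many kinds of files; it is perhaps closest to Notepad++.

scalc.exe --- the counterpart of Excel, for working with spreadsheets.

simpress.exe --- the counterpart of PowerPoint.

swriter.exe --- the counterpart of Word, for editing documents (it can open docx files).

sbase.exe --- a database front end that can connect to databases through JDBC or ODBC; handy when no other graphical database tool is available.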

 

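2. Configure the environment variable (add the LibreOffice program directory to PATH so that the command can be invoked from any directory)

Run soffice to test the command: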

C:\Users\liqiang>
LibreOffice 6.0.6.2 0c292870b25a325b5ed35f6b45599d2ea4458e77

Usage: soffice [argument...]
       argument - switches, switch parameters and document URIs (filenames).

Using without special arguments:
Opens the start center, if it is used without any arguments.
   {file}              Tries to open the file (files) in the components
                       suitable for them.
   {file} {macro:///Library.Module.MacroName}
                       Opens the file and runs specified macros from
                       the file.

Getting help and information:
   --help | -h | -?    Shows this help and quits.
   --helpwriter        Opens built-in or online Help on Writer.
   --helpcalc          Opens built-in or online Help on Calc.
   --helpdraw          Opens built-in or online Help on Draw.
   --helpimpress       Opens built-in or online Help on Impress.
   --helpbase          Opens built-in or online Help on Base.
   --helpbasic         Opens built-in or online Help on Basic scripting
                       language.
   --helpmath          Opens built-in or online Help on Math.
   --version           Shows the version and quits.
   --nstemporarydirectory
                       (MacOS X sandbox only) Returns path of the temporary
                       directory for the current user and exits. Overrides
                       all other arguments.

General arguments:
   --quickstart[=no]   Activates[Deactivates] the Quickstarter service.
   --nolockcheck       Disables check for remote instances using one
                       installation.
   --infilter={filter} Force an input filter type if possible. For example:
                       --infilter="Calc Office Open XML"
                       --infilter="Text (encoded):UTF8,LF,,,"
   --pidfile={file}    Store soffice.bin pid to {file}.
   --display {display} Sets the DISPLAY environment variable on UNIX-like
                       platforms to the value {display} (only supported by a
                       start script).

User/programmatic interface control:
   --nologo            Disables the splash screen at program start.
   --minimized         Starts minimized. The splash screen is not displayed.
   --nodefault         Starts without displaying anything except the splash
                       screen (do not display initial window).
   --invisible         Starts in invisible mode. Neither the start-up logo nor
                       the initial program window will be visible. Application
                       can be controlled, and documents and dialogs can be
                       controlled and opened via the API. Using the parameter,
                       the process can only be ended using the taskmanager
                       (Windows) or the kill command (UNIX-like systems). It
                       cannot be used in conjunction with --quickstart.
   --headless          Starts in "headless mode" which allows using the
                       application without GUI. This special mode can be used
                       when the application is controlled by external clients
                       via the API.
   --norestore         Disables restart and file recovery after a system crash.
   --safe-mode         Starts in a safe mode, i.e. starts temporarily with a
                       fresh user profile and helps to restore a broken
                       configuration.
   --accept={UNO-URL}  Specifies an UNO-URL connect-string to create an UNO
                       acceptor through which other programs can connect to
                       access the API. UNO-URL is string the such kind
                   uno:connection-type,params;protocol-name,params;ObjectName.
   --unaccept={UNO-URL} Closes an acceptor that was created with --accept. Use
                       --unaccept=all to close all open acceptors.
   --language={lang}   Uses specified language, if language is not selected
                       yet for UI. The lang is a tag of the language in IETF
                       language tag.

Developer arguments:
   --terminate_after_init
                       Exit after initialization complete (no documents loaded).
   --eventtesting      Exit after loading documents.

New document creation arguments:
The arguments create an empty document of specified kind. Only one of them may
be used in one command line. If filenames are specified after an argument,
then it tries to open those files in the specified component.
   --writer            Creates an empty Writer document.
   --calc              Creates an empty Calc document.
   --draw              Creates an empty Draw document.
   --impress           Creates an empty Impress document.
   --base              Creates a new database.
   --global            Creates an empty Writer master (global) document.
   --math              Creates an empty Math document (formula).
   --web               Creates an empty HTML document.

File open arguments:
The arguments define how following filenames are treated. New treatment begins
after the argument and ends at the next argument. The default treatment is to
open documents for editing, and create new documents from document templates.
   -n                  Treats following files as templates for creation of new
                       documents.
   -o                  Opens following files for editing, regardless whether
                       they are templates or not.
   --pt {Printername}  Prints following files to the printer {Printername},
                       after which those files are closed. The splash screen
                       does not appear. If used multiple times, only last
                       {Printername} is effective for all documents of all
                       --pt runs. Also, --printer-name argument of
                       --print-to-file switch interferes with {Printername}.
   -p                  Prints following files to the default printer, after
                       which those files are closed. The splash screen does
                       not appear. If the file name contains spaces, then it
                       must be enclosed in quotation marks.
   --view              Opens following files in viewer mode (read-only).
   --show              Opens and starts the following presentation documents
                       of each immediately. Files are closed after the showing.
                       Files other than Impress documents are opened in
                       default mode , regardless of previous mode.
   --convert-to OutputFileExtension[:OutputFilterName]
     [--outdir output_dir] [--convert-images-to]
                       Batch convert files (implies --headless). If --outdir
                       isn't specified, then current working directory is used
                       as output_dir. If --convert-images-to is given, its
                       parameter is taken as the target MIME format for *all*
                       images written to the output format. If --convert-to is
                       used more than once, the last value of OutputFileExtension
                       [:OutputFilterName] is effective. If --outdir is used more
                       than once, only its last value is effective. For example:
                   --convert-to pdf *.odt
                   --convert-to epub *.doc
                   --convert-to pdf:writer_pdf_Export --outdir /home/user *.doc
                   --convert-to "html:XHTML Writer File:UTF8" *.doc
                   --convert-to "txt:Text (encoded):UTF8" *.doc
   --print-to-file [--printer-name printer_name] [--outdir output_dir]
                       Batch print files to file. If --outdir is not specified,
                       then current working directory is used as output_dir.
                       If --printer-name or --outdir used multiple times, only
                       last value of each is effective. Also, {Printername} of
                       --pt switch interferes with --printer-name.
   --cat               Dump text content of the following files to console
                       (implies --headless). Cannot be used with --convert-to.
   --script-cat        Dump text content of any scripts embedded in the files to console
                       (implies --headless). Cannot be used with --convert-to.
   -env:<VAR>[=<VALUE>] Set a bootstrap variable. For example: to set
                       a non-default user profile path:
                       -env:UserInstallation=file:///tmp/test

Ignored switches:
   -psn                Ignored (MacOS X only).
   -Embedding          Ignored (COM+ related; Windows only).
   --nofirststartwizard Does nothing, accepted only for backward compatibility.
   --protector {arg1} {arg2}
                       Used only in unit tests and should have two arguments.
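
Before calling soffice from a program, it can help to confirm that the PATH change took effect. The following is a minimal sketch, not part of the original article: the class `PathLookup` and its `find` method are hypothetical names, and the candidate file names are assumptions based on the executables mentioned in this post.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.InvalidPathException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper: searches the directories of a PATH-style string
// for the first regular file with one of the given names.
final class PathLookup {

    static Optional<Path> find(String pathValue, String... candidateNames) {
        if (pathValue == null || pathValue.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathValue.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            for (String name : candidateNames) {
                try {
                    Path candidate = Paths.get(dir, name);
                    if (Files.isRegularFile(candidate)) {
                        return Optional.of(candidate);
                    }
                } catch (InvalidPathException ignored) {
                    // Skip PATH entries that are not valid paths on this platform.
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // The candidate names are assumptions taken from the executables named in this post.
        Optional<Path> hit = find(System.getenv("PATH"), "soffice.exe", "soffice", "libreoffice6.0");
        System.out.println(hit.map(p -> "found: " + p).orElse("soffice is not on PATH"));
    }
}
```

If the lookup reports nothing, a Java program that runs a bare `soffice` command will most likely fail to start the process.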

 

 

4. Converting to PDF from the command line

 Convert into the current directory:

liqiang@root MINGW64 ~/Desktop/新建文件夾 (3)
$ soffice --headless --convert-to pdf ./Java開發-太原科技大學-軟件工程-喬利強.docx
convert C:\Users\liqiang\Desktop\新建文件夾 (3)\Java開發-太原科技大學-軟件工程-喬利強.docx -> C:\Users\liqiang\Desktop\新建文件夾 (3)\Java開發-太原科技大學-軟件工程-喬利強.pdf using filter : writer_pdf_Export
func=xmlSecCheckVersionExt:file=..\src\xmlsec.c:line=188:obj=unknown:subj=unknown:error=19:invalid version:mode=abi compatible;expected minor version=2;real minor version=2;expected subminor version=25;real subminor version=26

liqiang@root MINGW64 ~/Desktop/新建文件夾 (3)
$ ls
Java開發-太原科技大學-軟件工程-喬利強.docx
Java開發-太原科技大學-軟件工程-喬利強.pdf

 

 

To write the result to a specific directory, add the --outdir parameter.
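
As a rough illustration of how --outdir affects where the result lands, the sketch below prints a command with --outdir and predicts the output file name. The naming rule (source base name plus the target extension) is inferred from the console output in this post, not taken from LibreOffice documentation; `OutputPaths`, `expectedOutput`, and the `converted` directory are hypothetical.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical helper: predicts the file that a conversion with --outdir should produce.
final class OutputPaths {

    // Assumed rule: output name = source base name + "." + target extension, placed in outDir.
    static Path expectedOutput(Path source, Path outDir, String extension) {
        String name = source.getFileName().toString();
        int dot = name.lastIndexOf('.');
        String base = dot > 0 ? name.substring(0, dot) : name;
        return outDir.resolve(base + "." + extension);
    }

    public static void main(String[] args) {
        Path source = Paths.get("tt.docx");     // sample file name used in this post
        Path outDir = Paths.get("converted");   // hypothetical target directory
        System.out.println("soffice --headless --convert-to pdf --outdir " + outDir + " " + source);
        System.out.println("expected result: " + expectedOutput(source, outDir, "pdf"));
    }
}
```

A caller can check for the predicted file after the conversion finishes instead of relying on the exit status alone.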

 

5. Converting Word to PDF from a Java program (this works by invoking the command above in a child process)

import java.io.IOException;

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public final class Test {
    private static final Logger logger = LoggerFactory.getLogger(Test.class);

    public static void main(String[] args) {
        long start = System.currentTimeMillis();
        String srcPath = "C:/Users/liqiang/Desktop/ww/tt.docx", desPath = "C:/Users/liqiang/Desktop/ww";
        String osName = System.getProperty("os.name");
        if (osName.contains("Windows")) {
            String command = "soffice --headless --convert-to pdf " + srcPath + " --outdir " + desPath;
            exec(command);
        }
        long end = System.currentTimeMillis();
        logger.debug("用時:{} ms", end - start); // "用時" means "elapsed time"
    }

    /** Runs the command in a child process; returns true only if it exits with status 0. */
    public static boolean exec(String command) {
        Process process; // handle used to control the child process and query its state
        try {
            logger.debug("exec cmd : {}", command);
            // exec() asks the JVM to start a child process for the given command
            // and returns the Process object that represents it.
            process = Runtime.getRuntime().exec(command);
        } catch (IOException e) {
            logger.error(" exec {} error", command, e);
            return false;
        }

        int exitStatus;
        try {
            // Block until the child process finishes; 0 means normal termination.
            exitStatus = process.waitFor();
            // A second way to read the exit status once the process has ended.
            int i = process.exitValue();
            logger.debug("i----" + i);
        } catch (InterruptedException e) {
            logger.error("InterruptedException  exec {}", command, e);
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            process.destroy();
            return false;
        }

        if (exitStatus != 0) {
            logger.error("exec cmd exitStatus {}", exitStatus);
        } else {
            logger.debug("exec cmd exitStatus {}", exitStatus);
        }

        process.destroy(); // release the child process

        return exitStatus == 0;
    }

}

 

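An alternative form of the command is  cmd /c soffice ..... .

It is also best to append :writer_pdf_Export after pdf, for example --convert-to pdf:writer_pdf_Export, which names the export filter explicitly; this may let the filter be used to rewrite the output when the plain conversion fails.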
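
A command built by string concatenation breaks when a path contains spaces. The sketch below is one possible variant, not the author's code: it passes each argument separately through `ProcessBuilder`, lets the child write to the parent's console so no pipe buffer can fill up, and gives up after a timeout. The class `PdfConverter`, its method names, and the 120-second timeout are assumptions made for illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.TimeUnit;

// Hypothetical wrapper around the soffice command line used in this post.
final class PdfConverter {

    // One list element per argument, so paths with spaces need no manual quoting.
    static List<String> buildCommand(String executable, Path source, Path outDir) {
        return Arrays.asList(executable, "--headless",
                "--convert-to", "pdf:writer_pdf_Export",
                "--outdir", outDir.toString(),
                source.toString());
    }

    static boolean convert(String executable, Path source, Path outDir, long timeoutSeconds)
            throws IOException, InterruptedException {
        ProcessBuilder builder = new ProcessBuilder(buildCommand(executable, source, outDir));
        builder.inheritIO(); // child output goes straight to this process's console
        Process process = builder.start();
        if (!process.waitFor(timeoutSeconds, TimeUnit.SECONDS)) {
            process.destroyForcibly(); // conversion did not finish in time
            return false;
        }
        return process.exitValue() == 0;
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println("usage: java PdfConverter <source file> <output dir>");
            return;
        }
        boolean ok = convert("soffice", Paths.get(args[0]), Paths.get(args[1]), 120);
        System.out.println(ok ? "converted" : "conversion failed or timed out");
    }
}
```

Example run, assuming soffice is on PATH: `java PdfConverter "C:/Users/liqiang/Desktop/ww/tt.docx" "C:/Users/liqiang/Desktop/ww"`.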

 

 Result:

2018-10-25 21:56:35 [Test]-[DEBUG] exec cmd : soffice --headless --convert-to pdf C:/Users/liqiang/Desktop/ww/tt.docx --outdir C:/Users/liqiang/Desktop/ww
2018-10-25 21:56:45 [Test]-[DEBUG] i----0
2018-10-25 21:56:45 [Test]-[DEBUG] exec cmd exitStatus 0
2018-10-25 21:56:45 [Test]-[DEBUG] 用時:9980 ms

 

 

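 2. Converting Word to PDF on Linux, using CentOS as the example

1. Installing LibreOffice on Linux

1. Download

  The installation uses yum, so first download the RPM packages. Three packages are needed.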

wget http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.6/rpm/x86_64/LibreOffice_6.0.6_Linux_x86-64_rpm.tar.gz
wget http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.6/rpm/x86_64/LibreOffice_6.0.6_Linux_x86-64_rpm_sdk.tar.gz
wget http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.6/rpm/x86_64/LibreOffice_6.0.6_Linux_x86-64_rpm_langpack_zh-CN.tar.gz

 

  

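  The tar.gz packages can also be downloaded by opening the links above in a browser on Windows. For a different version, only the version number in the URL needs to change. For example, to download 6.0.3, use the following URL: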

http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/6.0.3/rpm/x86_64/LibreOffice_6.0.3_Linux_x86-64_rpm.tar.gz
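
The URL pattern above can be written down as a small helper. This is only a sketch derived from the three wget lines in this post; `MirrorUrls` and `rpmArchiveUrl` are hypothetical names, and the code does not check whether the mirror still hosts a given version.

```java
// Hypothetical helper that rebuilds the download URLs used in this post.
final class MirrorUrls {

    private static final String BASE = "http://mirrors.ustc.edu.cn/tdf/libreoffice/stable/";

    // suffix: "" for the main package, "_sdk" for the SDK, "_langpack_zh-CN" for the language pack
    static String rpmArchiveUrl(String version, String suffix) {
        return BASE + version + "/rpm/x86_64/LibreOffice_" + version
                + "_Linux_x86-64_rpm" + suffix + ".tar.gz";
    }

    public static void main(String[] args) {
        String version = args.length > 0 ? args[0] : "6.0.6";
        for (String suffix : new String[] {"", "_sdk", "_langpack_zh-CN"}) {
            System.out.println(rpmArchiveUrl(version, suffix));
        }
    }
}
```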

 

  When wget cannot fetch a file, another option is to download it from the URL on Windows first and then upload it to the Linux server.

2. Upload to the server and extract

Extract each archive with  tar -xvf xxxxxx.tar.gz. The result looks like this:

[root@VM_0_12_centos libreoffice]# ll
total 263748
drwxr-xr-x 4 root root      4096 Jul 28 06:07 LibreOffice_6.0.6.2_Linux_x86-64_rpm
drwxr-xr-x 3 root root      4096 Jul 28 07:32 LibreOffice_6.0.6.2_Linux_x86-64_rpm_langpack_zh-CN
drwxr-xr-x 3 root root      4096 Jul 28 06:27 LibreOffice_6.0.6.2_Linux_x86-64_rpm_sdk
-rw-r--r-- 1 root root    798421 Oct 25 12:13 LibreOffice_6.0.6_Linux_x86-64_rpm_langpack_zh-CN.tar.gz
-rw-r--r-- 1 root root  36919386 Oct 25 12:24 LibreOffice_6.0.6_Linux_x86-64_rpm_sdk.tar.gz
-rw-r--r-- 1 root root 213845646 Oct 25 10:33 LibreOffice_6.0.6_Linux_x86-64_rpm.tar.gz

 

 

3. Install the RPM files with yum localinstall *.rpm

[root@VM_0_12_centos RPMS]# pwd
/opt/libreoffice/LibreOffice_6.0.6.2_Linux_x86-64_rpm/RPMS
[root@VM_0_12_centos RPMS]# yum localinstall *.rpm

 

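  The RPMS directory holds the RPM files to install; change into it and install them with a wildcard. (The RPMS directories of all three extracted archives need to be installed.)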

 

4. Test LibreOffice

[root@VM_0_12_centos RPMS]# libreoffice6.0 -help
Warning: -help is deprecated.  Use --help instead.
LibreOffice 6.0.6.2 0c292870b25a325b5ed35f6b45599d2ea4458e77

[... the remaining help output is identical to the Windows listing shown earlier ...]

 

  After installation the command is libreoffice6.0.

 

5. Create an alias so that the command can be called as libreoffice

[root@VM_0_12_centos ~]# alias libreoffice='libreoffice6.0'
[root@VM_0_12_centos ~]# alias
alias cp='cp -i'
alias egrep='egrep --color=auto'
alias fgrep='fgrep --color=auto'
alias grep='grep --color=auto'
alias l.='ls -d .* --color=auto'
alias libreoffice='libreoffice6.0'
alias ll='ls -l --color=auto'
alias ls='ls --color=auto'
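
Note that a shell alias exists only inside the shell that defined it. A process started from Java with Runtime.exec does not go through that shell, so the Java examples in this post call libreoffice6.0 by its real name. A minimal sketch of picking the executable name by operating system follows; `OfficeExecutable` and `forOs` are hypothetical, and the two names are simply the ones used in this post.

```java
import java.util.Locale;

// Hypothetical helper: chooses the command name used in this post for the current OS.
final class OfficeExecutable {

    static String forOs(String osName) {
        String os = osName == null ? "" : osName.toLowerCase(Locale.ROOT);
        // "soffice" is the Windows command from the first part of this post;
        // "libreoffice6.0" is the command installed by the 6.0 RPM packages.
        return os.contains("windows") ? "soffice" : "libreoffice6.0";
    }

    public static void main(String[] args) {
        System.out.println(forOs(System.getProperty("os.name")));
    }
}
```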

 

 

2. Testing Word-to-PDF conversion on the Linux command line (the parameters are largely the same as on Windows)

[root@VM_0_12_centos tmpFile]# ls
tt.docx
[root@VM_0_12_centos tmpFile]# libreoffice6.0 --convert-to pdf:writer_pdf_Export ./tt.docx
func=xmlSecCheckVersionExt:file=xmlsec.c:line=188:obj=unknown:subj=unknown:error=19:invalid version:mode=abi compatible;expected minor version=2;real minor version=2;expected subminor version=25;real subminor version=26
convert /root/tmpFile/tt.docx -> /root/tmpFile/tt.pdf using filter : writer_pdf_Export
[root@VM_0_12_centos tmpFile]# ls
tt.docx  tt.pdf
[root@VM_0_12_centos tmpFile]#

 

 

  Copying the generated PDF back to Windows and opening it shows that the Chinese text is garbled.

 

3. Fixing garbled Chinese text in Word-to-PDF conversion

1. Check the font directories

[root@VM_0_12_centos tmpFile]# cat /etc/fonts/fonts.conf | grep fon
<!DOCTYPE fontconfig SYSTEM "fonts.dtd">
<!-- /etc/fonts/fonts.conf file to configure system font access -->
<fontconfig>
        problems to the fontconfig bugzilla system located at fontconfig.org
        Note that the normal 'make install' procedure for fontconfig is to
        replace any existing fonts.conf file with the new version. Place
        <dir>/usr/share/fonts</dir>
        <dir>/usr/share/X11/fonts/Type1</dir> <dir>/usr/share/X11/fonts/TTF</dir> <dir>/usr/local/share/fonts</dir>
        <dir prefix="xdg">fonts</dir>
        <dir>~/.fonts</dir>
        <include ignore_missing="yes">/etc/fonts/conf.d</include>
        <cachedir>/var/cache/fontconfig</cachedir>
        <cachedir prefix="xdg">fontconfig</cachedir>
        <cachedir>~/.fontconfig</cachedir>
  in fonts.  All other blank chars are assumed to be broken and
</fontconfig>

 

 

The output shows that fonts are stored under /usr/share/fonts.

 

2. Upload the SimSun font, simsun.ttc, from C:\Windows\Fonts on Windows to the Linux server, copy it into the font directory above, and give it read/write permissions

[root@VM_0_12_centos libreoffice]# ll | grep simsun.ttc
-rw-r--r-- 1 root root  18214472 Oct 25 13:19 simsun.ttc

 

cp simsun.ttc /usr/share/fonts

 

cd /usr/share/fonts

 

Set the permissions (the default permissions usually work; set them manually only if they do not)

chmod 644 simsun.ttc

 

3. Refresh the font cache

fc-cache -fv

 

 

  Converting to PDF again now renders the Chinese text correctly.
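
A program that depends on this font could check for it before converting. The sketch below is an assumption-laden illustration: `FontCheck` and `isInstalled` are hypothetical, the directory list is taken from the fonts.conf output above, and only the top level of each directory is checked because that is where this post copies simsun.ttc.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;

// Hypothetical pre-flight check for the font file installed in this section.
final class FontCheck {

    // Returns true if a readable file with this name sits directly in one of the directories.
    static boolean isInstalled(List<Path> fontDirs, String fontFileName) {
        for (Path dir : fontDirs) {
            if (Files.isReadable(dir.resolve(fontFileName))) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Directories listed in /etc/fonts/fonts.conf in this post.
        List<Path> dirs = Arrays.asList(Paths.get("/usr/share/fonts"), Paths.get("/usr/local/share/fonts"));
        System.out.println(isInstalled(dirs, "simsun.ttc")
                ? "simsun.ttc found"
                : "simsun.ttc missing: Chinese text may be garbled in the PDF");
    }
}
```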

 

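4. Calling LibreOffice from a Java program on Linux to convert to PDF

  The source file path and the output directory are passed in as arguments to main.

 (1) Start with a simple test program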

import java.io.IOException;

public class Test {

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java Test <source file> <output dir>");
            return;
        }
        String filePath = args[0];
        String destDir = args[1];
        String osName = System.getProperty("os.name");
        System.out.println(filePath);
        System.out.println(destDir);
        System.out.println(osName);
        String cmd = "libreoffice6.0 --convert-to pdf:writer_pdf_Export " + filePath + " --outdir " + destDir;
        System.out.println(cmd);
        try {
            // Starts the conversion but does not wait for it to finish.
            Runtime.getRuntime().exec(cmd);
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
    }

}

 

Compile and run it on Linux:

[root@VM_0_12_centos tmpFile]# javac Test.java
[root@VM_0_12_centos tmpFile]# java Test ./tt.docx ./
./tt.docx
./
Linux
libreoffice6.0 --convert-to pdf:writer_pdf_Export ./tt.docx --outdir ./
[root@VM_0_12_centos tmpFile]# ls
Test.class  Test.java  tt.docx  tt.pdf

 

 

 (2) Next, extend the program to measure the conversion time (the thread waits until the conversion completes)

import java.io.IOException;

public class Test {

    public static void main(String[] args) {
        if (args.length < 2) {
            System.err.println("usage: java Test <source file> <output dir>");
            return;
        }
        long start = System.currentTimeMillis();
        String filePath = args[0];
        String destDir = args[1];
        String osName = System.getProperty("os.name");
        System.out.println(filePath);
        System.out.println(destDir);
        System.out.println(osName);
        String cmd = "libreoffice6.0 --convert-to pdf:writer_pdf_Export " + filePath + " --outdir " + destDir;
        System.out.println(cmd);
        try {
            Process process = Runtime.getRuntime().exec(cmd);
            // Wait for the conversion to finish and read its exit status.
            int status = process.waitFor();
            // Release the child process.
            process.destroy();
            System.out.println("status -> " + status);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // keep the interrupt visible to callers
            System.err.println(e.getMessage());
        } catch (IOException e) {
            System.err.println(e.getMessage());
        }
        long end = System.currentTimeMillis();
        System.out.println("用時:" + (end - start) + "ms"); // "用時" means "elapsed time"
    }

}

 

Compile and run it again on Linux:

[root@VM_0_12_centos tmpFile]# java Test ./tt.docx ./
./tt.docx
./
Linux
libreoffice6.0 --convert-to pdf:writer_pdf_Export ./tt.docx --outdir ./
status -> 0
用時:1463ms
[root@VM_0_12_centos tmpFile]# ls
Test.class  Test.java  tt.docx  tt.pdf

 

 

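This completes PDF conversion with LibreOffice on both Windows and Linux; the approach feels fairly stable. Along the way we also saw how to use Runtime to call a local program and wait for it on a single thread.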
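
The examples in this post run one conversion at a time. If several conversions had to run in parallel, a commonly suggested precaution is to give each soffice process its own user profile through the -env:UserInstallation switch listed in the help output. The sketch below only builds such a command; `IsolatedProfile` and its methods are hypothetical, and whether a separate profile is actually required is an assumption that should be verified for your version.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: builds a conversion command that uses a private profile directory.
final class IsolatedProfile {

    // -env:UserInstallation expects a file:// URL, as in the help text example.
    static String userInstallationArg(Path profileDir) {
        return "-env:UserInstallation=" + profileDir.toAbsolutePath().toUri();
    }

    static List<String> command(Path profileDir, String source, String outDir) {
        return Arrays.asList("libreoffice6.0", userInstallationArg(profileDir),
                "--headless", "--convert-to", "pdf:writer_pdf_Export",
                "--outdir", outDir, source);
    }

    public static void main(String[] args) throws IOException {
        Path profile = Files.createTempDirectory("lo-profile-"); // one directory per conversion
        System.out.println(String.join(" ", command(profile, "./tt.docx", "./")));
    }
}
```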

 

  All the tar packages used in this post, together with the simsun.ttc font, can be downloaded from: http://qiaoliqiang.cn/fileDown/linuxlibreoffice.zip

 

Addendum: Word can also be converted to HTML. Testing Word-to-HTML conversion

Word content:

 

 

soffice.exe --headless --convert-to html .\通用功能需求收集20180723.docx

 

Result:

 

 Addendum: Word can be converted to JPG

soffice.exe --headless --convert-to jpg .\通用功能需求收集20180723.docx

 

 The resulting JPG:

 

 Addendum: Word can be converted to TXT

soffice.exe --headless --convert-to txt .\通用功能需求收集20180723.docx
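
The three conversions above differ only in the value given to --convert-to. As a sketch, that value can be looked up in a small table. The filter strings for html and txt are copied from the examples in the soffice help output and may behave differently from the plain `html` and `txt` used above; `ConvertTargets` is a hypothetical class, and `example.docx` is a placeholder file name.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical lookup table from a short format name to a --convert-to value.
final class ConvertTargets {

    static final Map<String, String> TARGETS = new LinkedHashMap<>();
    static {
        TARGETS.put("pdf", "pdf:writer_pdf_Export");          // from the help text examples
        TARGETS.put("html", "html:XHTML Writer File:UTF8");  // from the help text examples
        TARGETS.put("txt", "txt:Text (encoded):UTF8");       // from the help text examples
        TARGETS.put("jpg", "jpg");                            // plain extension, as used in this post
    }

    // Each argument is a separate list element, so values with spaces need no quoting.
    static List<String> command(String format, String source) {
        String target = TARGETS.get(format);
        if (target == null) {
            throw new IllegalArgumentException("unsupported format: " + format);
        }
        return Arrays.asList("soffice", "--headless", "--convert-to", target, source);
    }

    public static void main(String[] args) {
        for (String format : TARGETS.keySet()) {
            System.out.println(command(format, "example.docx")); // placeholder file name
        }
    }
}
```

The resulting list can be handed to `ProcessBuilder` directly.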

 

Result:

 

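Addendum: Excel and PowerPoint files can also be converted to PDF, HTML, and JPG. The Excel conversion is examined below. (Cell borders are dropped; to keep them, borders must be set explicitly in the cell style in Excel. Content that is too wide wraps; the workaround is to narrow the columns or reduce the number of columns.)

Original Excel content:

 

 Conversion: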

soffice.exe --headless --convert-to jpg ./test.xls

soffice.exe --headless --convert-to html ./test.xls

soffice.exe --headless --convert-to pdf ./test.xls

 

 (1) The converted JPG

(2) The converted HTML

(3) The converted PDF

 

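Addendum: a problem when copying the installation directory directly:

  Today, when using a copied installation directory, LibreOffice reported that VCRUNTIME140.dll and MSVCP140.dll were missing. Copying these DLLs from another computer to the affected one fixed the problem. Note that they must come from both C:\Windows\System32 and C:\Windows\SysWOW64: the DLLs in these two folders have the same file names but different sizes, so each one must be copied to its corresponding folder.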

 

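Addendum: Java can also convert documents with jodconverter. I used jodconverter 2.2 (the library depends on an OpenOffice or LibreOffice installation).

The required JAR files are as follows:

 

 The code is as follows: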

import java.io.File;
import java.io.IOException;

import com.artofsolving.jodconverter.DocumentConverter;
import com.artofsolving.jodconverter.openoffice.connection.OpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.connection.SocketOpenOfficeConnection;
import com.artofsolving.jodconverter.openoffice.converter.OpenOfficeDocumentConverter;

public class Office2Pdf {

    // Converts an office document (Word in the original use case) to PDF.
    public static void WordToPDF(String startFile, String overFile) throws IOException {
        // Source file
        File inputFile = new File(startFile);
        if (!inputFile.exists()) {
            System.out.println("Source file does not exist!");
            return;
        }

        // Output file; create its parent directory if it is missing
        File outputFile = new File(overFile);
        if (!outputFile.getParentFile().exists()) {
            outputFile.getParentFile().mkdirs();
        }

        // Start the office service process.
        // The original author installed OpenOffice under C:/Program Files (x86)/;
        // adjust the path below to your own installation.
        String command = "D:/zdc8/lo/program/soffice.exe -headless -accept=\"socket,host=127.0.0.1,port=8300;urp;\"";
        Process p = Runtime.getRuntime().exec(command);

        try {
            // Connect to the office service
            OpenOfficeConnection connection = new SocketOpenOfficeConnection("127.0.0.1", 8300);
            connection.connect();
            try {
                // Convert
                DocumentConverter converter = new OpenOfficeDocumentConverter(connection);
                converter.convert(inputFile, outputFile);
            } finally {
                // Close the connection
                connection.disconnect();
            }
        } finally {
            // Stop the service process
            p.destroy();
        }
    }

    public static void main(String[] args) {
        String start = "C:\\Users\\Administrator\\Desktop\\123.xlsx";
        String over = "C:\\Users\\Administrator\\Desktop\\123.xlsx.pdf";
        try {
            WordToPDF(start, over);
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}
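
The code above connects to port 8300 immediately after starting soffice, which can fail if the service has not begun listening yet. One possible safeguard, sketched here with plain `java.net` classes and independent of jodconverter, is to poll the port before connecting. `PortWaiter`, `waitForPort`, and the retry values are hypothetical choices for illustration.

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

// Hypothetical helper: waits until something accepts TCP connections on host:port.
final class PortWaiter {

    static boolean waitForPort(String host, int port, int attempts, long delayMillis) {
        for (int i = 0; i < attempts; i++) {
            try (Socket socket = new Socket()) {
                socket.connect(new InetSocketAddress(host, port), 1000);
                return true; // the service is listening
            } catch (IOException notListeningYet) {
                try {
                    Thread.sleep(delayMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return false;
                }
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 127.0.0.1:8300 is the address used in the jodconverter example.
        boolean ready = waitForPort("127.0.0.1", 8300, 10, 500);
        System.out.println(ready ? "office service is ready" : "office service did not start in time");
    }
}
```

In the example above, such a check would sit between `Runtime.getRuntime().exec(command)` and `connection.connect()`.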

 

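To remove tracked-change markup from the output, decompile jodconverter-2.2.2.jar, take the class OpenOfficeDocumentConverter.java, and modify the method loadAndExport as follows (the added code is the XPropertySet block that sets RedlineDisplayType):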

    private void loadAndExport(String inputUrl, Map/* <String,Object> */ loadProperties, String outputUrl,
            Map/* <String,Object> */ storeProperties) throws OpenOfficeException {
        XComponent document;
        try {
            document = loadDocument(inputUrl, loadProperties);
        } catch (ErrorCodeIOException errorCodeIOException) {
            throw new OpenOfficeException(
                    "conversion failed: could not load input document; OOo errorCode: " + errorCodeIOException.ErrCode,
                    errorCodeIOException);
        } catch (Exception otherException) {
            throw new OpenOfficeException("conversion failed: could not load input document", otherException);
        }
        if (document == null) {
            throw new OpenOfficeException("conversion failed: could not load input document");
        }

        XPropertySet mxDocProps = (XPropertySet) UnoRuntime.queryInterface(XPropertySet.class, document);
        try {
            mxDocProps.setPropertyValue("RedlineDisplayType", RedlineDisplayType.NONE);
        } catch (Exception e) {
            throw new OpenOfficeException("dispose RedlineDisplay failed", e);
        }

        refreshDocument(document);

        try {
            storeDocument(document, outputUrl, storeProperties);
        } catch (ErrorCodeIOException errorCodeIOException) {
            throw new OpenOfficeException(
                    "conversion failed: could not save output document; OOo errorCode: " + errorCodeIOException.ErrCode,
                    errorCodeIOException);
        } catch (Exception otherException) {
            throw new OpenOfficeException("conversion failed: could not save output document", otherException);
        }
    }

 

Addendum: kkFileView is an online file preview tool built on LibreOffice and jodconverter; it is powerful and simple to use.

Git repository:  https://github.com/kekingcn/kkFileView

Blog:  https://my.oschina.net/keking/blog/3064732

 

