PDF files: inserting HTML content with itextpdf, and a solution for Chinese text


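Overview

There are already plenty of posts online about converting a whole HTML file straight to PDF, but very few show how to insert a fragment of HTML into a PDF as a paragraph, and I have not found one that handles the display of Chinese text well.

So I spent this morning looking into the problem, and I am sharing the solution here.

For basic itextpdf operations, see: http://www.cnblogs.com/mvilplss/p/5640598.html

Thanks to: http://gridmix.blog.51cto.com/4764051/1229585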

Approach

To insert an HTML fragment, we use the static method XMLWorkerHelper.parseToElementList:

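        // An HTML fragment with inline CSS.
        String html = "<div style='color:green;font-size:20px;'>你好世界!hello world !</div>";
        Paragraph context = new Paragraph();
        // Parse the fragment into iText elements; no external CSS is passed (null).
        ElementList elementList = XMLWorkerHelper.parseToElementList(html, null);
        for (Element element : elementList) {
            context.add(element);
        }
        // document is an already opened com.itextpdf.text.Document.
        document.add(context);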

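However, you will find that the Chinese characters are not displayed. Many fixes for this are suggested online, but none of them worked for me.

Looking at the source of XMLWorkerHelper.parseToElementList(html, css), we find that the line

CssAppliers cssAppliers = new CssAppliersImpl(FontFactory.getFontImp()); is where the font can be swapped.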

    public static ElementList parseToElementList(String html, String css) throws IOException {
        // CSS
        CSSResolver cssResolver = new StyleAttrCSSResolver();
        if (css != null) {
            CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream(css.getBytes()));
            cssResolver.addCss(cssFile);
        }

        // HTML
        CssAppliers cssAppliers = new CssAppliersImpl(FontFactory.getFontImp()); // this is where the font can be changed
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.autoBookmark(false);

        // Pipelines
        ElementList elements = new ElementList();
        ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
        HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, end);
        CssResolverPipeline cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

        // XML Worker
        XMLWorker worker = new XMLWorker(cssPipeline, true);
        XMLParser p = new XMLParser(worker);
        p.parse(new ByteArrayInputStream(html.getBytes()));

        return elements;
    }

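So the idea is to override the getFont(...) method of XMLWorkerFontProvider and pass that provider to CssAppliersImpl. For text with no explicitly declared CSS font, the font name arrives undefined (null), and we substitute a default font that can render Chinese.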

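import java.io.ByteArrayInputStream;
import java.io.IOException;

import com.itextpdf.text.Font;
import com.itextpdf.tool.xml.ElementList;
import com.itextpdf.tool.xml.XMLWorker;
import com.itextpdf.tool.xml.XMLWorkerFontProvider;
import com.itextpdf.tool.xml.XMLWorkerHelper;
import com.itextpdf.tool.xml.css.CssFile;
import com.itextpdf.tool.xml.css.StyleAttrCSSResolver;
import com.itextpdf.tool.xml.html.CssAppliers;
import com.itextpdf.tool.xml.html.CssAppliersImpl;
import com.itextpdf.tool.xml.html.Tags;
import com.itextpdf.tool.xml.parser.XMLParser;
import com.itextpdf.tool.xml.pipeline.css.CSSResolver;
import com.itextpdf.tool.xml.pipeline.css.CssResolverPipeline;
import com.itextpdf.tool.xml.pipeline.end.ElementHandlerPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipeline;
import com.itextpdf.tool.xml.pipeline.html.HtmlPipelineContext;

public class MyXMLWorkerHelper {
    // Font provider that falls back to a default font when no font name is given.
    public static class MyFontsProvider extends XMLWorkerFontProvider {
        public MyFontsProvider() {
            // No custom fonts path and no font substitution map.
            super(null, null);
        }

        @Override
        public Font getFont(final String fontname, String encoding, float size, final int style) {

            String fntname = fontname;
            if (fntname == null) {
                // Default font; the name must match a font registered on the system.
                fntname = "宋體";
            }
            return super.getFont(fntname, encoding, size, style);
        }
    }

    public static ElementList parseToElementList(String html, String css) throws IOException {
        // CSS
        CSSResolver cssResolver = new StyleAttrCSSResolver();
        if (css != null) {
            CssFile cssFile = XMLWorkerHelper.getCSS(new ByteArrayInputStream(css.getBytes()));
            cssResolver.addCss(cssFile);
        }

        // HTML: the only change from the library method is the custom font provider.
        MyFontsProvider fontProvider = new MyFontsProvider();
        CssAppliers cssAppliers = new CssAppliersImpl(fontProvider);
        HtmlPipelineContext htmlContext = new HtmlPipelineContext(cssAppliers);
        htmlContext.setTagFactory(Tags.getHtmlTagProcessorFactory());
        htmlContext.autoBookmark(false);

        // Pipelines
        ElementList elements = new ElementList();
        ElementHandlerPipeline end = new ElementHandlerPipeline(elements, null);
        HtmlPipeline htmlPipeline = new HtmlPipeline(htmlContext, end);
        CssResolverPipeline cssPipeline = new CssResolverPipeline(cssResolver, htmlPipeline);

        // XML Worker
        XMLWorker worker = new XMLWorker(cssPipeline, true);
        XMLParser p = new XMLParser(worker);
        // Strip bare single tags, which the parser rejects.
        html = html.replace("<br>", "").replace("<hr>", "").replace("<img>", "").replace("<param>", "")
                .replace("<link>", "");
        p.parse(new ByteArrayInputStream(html.getBytes()));

        return elements;
    }

}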
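
A related pitfall worth ruling out, separate from fonts: both versions of parseToElementList call getBytes() with no charset, so the platform default is used. The sketch below uses only the JDK to show what happens when that default cannot encode Chinese. The class and method names are illustrative, and it assumes the parser decodes the stream with the same default charset.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

final class CharsetRoundTrip {
    // Encode text to bytes and decode it again with the same charset,
    // mimicking getBytes() followed by a reader that uses that charset.
    static String roundTrip(String text, Charset charset) {
        return new String(text.getBytes(charset), charset);
    }

    public static void main(String[] args) {
        // Two Chinese characters ("ni hao"), written as Unicode escapes so the
        // encoding of this source file does not affect the demonstration.
        String text = "\u4f60\u597d";
        // UTF-8 can encode the characters, so the text survives unchanged.
        System.out.println(text.equals(roundTrip(text, StandardCharsets.UTF_8))); // true
        // ISO-8859-1 cannot, so each character is replaced by '?'.
        System.out.println(roundTrip(text, StandardCharsets.ISO_8859_1)); // ??
    }
}
```

If the PDF shows question marks rather than blank space, passing an explicit charset such as StandardCharsets.UTF_8 to getBytes is a reasonable first thing to try. XMLParser also appears to offer a parse overload that takes a Charset, but check this against the xmlworker version you use.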

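XMLWorker does not support HTML's single (unclosed) tags, so they have to be filtered out before parsing, which is what the replace(...) calls above do. Otherwise parsing fails with: Invalid nested tag div found, expected closing tag br. Note that these calls only match the bare forms such as <br>; a tag that carries attributes, such as <img src=...>, is not matched.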
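
As an alternative to deleting the tags, they can be rewritten into self-closed form so that line breaks and images are not lost. This is a different technique from the replace(...) chain in the post. The sketch below uses only the JDK, and the class name, method name, and tag list are illustrative. It assumes the parser accepts the self-closed XHTML form, and it does not handle attribute values that contain a > character.

```java
import java.util.regex.Pattern;

final class VoidTagNormalizer {
    // Matches an opening single tag from the list, with or without attributes,
    // and with or without an existing trailing slash.
    // Group 1 is the tag name, group 2 is the attribute text (possibly empty).
    private static final Pattern VOID_TAG = Pattern.compile(
            "<(br|hr|img|input|link|meta|param)(?=[\\s/>])([^<>]*?)\\s*/?>",
            Pattern.CASE_INSENSITIVE);

    // Rewrites every matched tag as <name attrs/>; all other markup is untouched.
    static String selfClose(String html) {
        return VOID_TAG.matcher(html).replaceAll("<$1$2/>");
    }

    public static void main(String[] args) {
        System.out.println(selfClose("<div>a<br>b<hr></div>")); // <div>a<br/>b<hr/></div>
        System.out.println(selfClose("<img src='a.png'>"));     // <img src='a.png'/>
    }
}
```

To try it, call VoidTagNormalizer.selfClose(html) in place of the replace(...) chain before handing the string to the parser.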

 

