C#: Converting a PDF to a TXT File with iTextSharp

Reposted article. 2019-03-17 11:07. Category: C#

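    using System;
    using System.IO;
    using iTextSharp.text.pdf;
    using iTextSharp.text.pdf.parser;

    class Program
    {
        static void Main()
        {
            // "using" closes the reader and flushes/closes the writer even if extraction throws.
            // (PdfReader is IDisposable in iTextSharp 5.x; on older versions call pdfReader.Close() instead.)
            using (var pdfReader = new PdfReader("xxx.pdf"))
            // Output file name means "processing result". StreamWriter defaults to UTF-8.
            using (var output = new StreamWriter(new FileStream("處理結果.txt", FileMode.Create)))
            {
                // iTextSharp page numbers are 1-based.
                int pageCount = pdfReader.NumberOfPages;
                for (int pg = 1; pg <= pageCount; pg++)
                {
                    // Use a fresh strategy per page so text from earlier pages is not carried over.
                    ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                    string value = PdfTextExtractor.GetTextFromPage(pdfReader, pg, strategy);
                    // Strip every space character from the extracted text.
                    // Note: this also removes legitimate spaces, e.g. between English words.
                    value = value.Replace(" ", "");
                    Console.WriteLine(value);
                    output.Write(value);
                }
            }

            Console.Write("Done");
            Console.ReadLine();
        }
    }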

Chinese characters extracted with this method do not come out garbled.
