A simple example of converting text to HTML with Python


Sample text file test_input.txt:

Welcome to World Wide Spam, Inc.



These are the corporate web pages of *World Wide Spam*, Inc. We hope
you find your stay enjoyable, and that you will sample many of our
products.

A short history of the company

World Wide Spam was started in the summer of 2000. The business
concept was to ride the dot-com wave and to make money both through
bulk email and by selling canned meat online.

After receiving several complaints from customers who weren't
satisfied by their bulk email, World Wide Spam altered their profile,
and focused 100% on canned goods. Today, they rank as the world's
13,892nd online supplier of SPAM.

Destinations

From this page you may visit several of our interesting web pages:

- What is SPAM? (http://wwspam.fu/whatisspam)

- How do they make it? (http://wwspam.fu/howtomakeit)

- Why should I eat it? (http://wwspam.fu/whyeatif)

How to get in touch with us

You can get in touch with us in *many* ways: By phone (555-1234), by
email (wwspam@wwspam.fu) or by visiting our customer feedback page
(http://wwspam.fu/feedback).

 

 

Module util.py, which splits the text file into blocks:

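def lines(file):
    # Yield every line of the file, then one extra blank line so that
    # blocks() also flushes the final block.
    for line in file:
        yield line
    yield '\n'

def blocks(file):
    # Collect consecutive non-blank lines and yield each group as one
    # stripped string; blank lines act as separators.
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []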
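
To see what blocks() produces, the sketch below feeds it a small in-memory string through io.StringIO instead of a real file. The sample string and the variable names are made up for illustration; the two functions are repeated so that the snippet runs on its own under Python 3.

```python
import io

def lines(file):
    # Same idea as util.lines: append one blank line at the end.
    for line in file:
        yield line
    yield '\n'

def blocks(file):
    # Same idea as util.blocks: group non-blank lines into blocks.
    block = []
    for line in lines(file):
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

# Hypothetical sample input: a title, a two-line paragraph, two blank
# lines in a row, and a last line without a trailing newline.
sample = io.StringIO("Title\n\nline one\nline two\n\n\nlast")
result = list(blocks(sample))
print(result)  # ['Title', 'line one\nline two', 'last']
```

Runs of blank lines collapse into a single separator, and the extra '\n' yielded by lines() is what lets the final block come out even when the file does not end with a blank line.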

 

The simple conversion module simple_markup.py:

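import re
import sys

from util import blocks

print('<html><body>')

# The first block becomes the heading; every later block is a paragraph.
title = True
for block in blocks(sys.stdin):
    # Turn *emphasized* text into <em>emphasized</em>.
    block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
    if title:
        print('<h1>')
        print(block)
        print('</h1>')
        title = False
    else:
        print('<p>')
        print(block)
        print('</p>')

print('</body></html>')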
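
The emphasis rule depends on the non-greedy pattern (.+?). As a small illustration (the sample string is made up from phrases in test_input.txt), the sketch below compares it with a greedy (.+) on a line that contains two emphasized spans.

```python
import re

text = "in *many* ways and *World Wide Spam*"

# Non-greedy (.+?): each *...* pair is matched on its own.
lazy = re.sub(r'\*(.+?)\*', r'<em>\1</em>', text)
print(lazy)    # in <em>many</em> ways and <em>World Wide Spam</em>

# Greedy (.+): a single match runs from the first asterisk to the last.
greedy = re.sub(r'\*(.+)\*', r'<em>\1</em>', text)
print(greedy)  # in <em>many* ways and *World Wide Spam</em>
```

With the greedy version the inner asterisks are swallowed into one <em> element, which is why the script uses (.+?).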

Conversion command: python simple_markup.py < test_input.txt > test_output.html

After the command runs, an HTML file named test_output.html appears in the current directory; open it in a browser to see the result.
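
The script writes straight to standard output and does not escape characters such as & or <, which a browser may misread. A possible variant, sketched here under assumptions, wraps the same logic in a function that returns the HTML as a string and escapes the text first. The name to_html and the use of html.escape (Python 3 standard library) are not part of the original code.

```python
import html
import io
import re

def blocks(file):
    # Same grouping as util.blocks, with the trailing blank line inlined.
    block = []
    for line in list(file) + ['\n']:
        if line.strip():
            block.append(line)
        elif block:
            yield ''.join(block).strip()
            block = []

def to_html(text):
    # Hypothetical helper: return the HTML for a plain-text string.
    parts = ['<html><body>']
    for i, block in enumerate(blocks(io.StringIO(text))):
        # Escape &, < and > before adding our own tags.
        block = html.escape(block, quote=False)
        block = re.sub(r'\*(.+?)\*', r'<em>\1</em>', block)
        tag = 'h1' if i == 0 else 'p'  # first block is the title
        parts.append('<%s>\n%s\n</%s>' % (tag, block, tag))
    parts.append('</body></html>')
    return '\n'.join(parts)

print(to_html("Fish & Chips\n\nWe *really* like 1 < 2."))
```

Escaping has to happen before the <em> substitution; otherwise the tags the script inserts would be escaped as well.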

For commentary on the code, see http://1.imablog.sinaapp.com/exam-translate-txt-html/

