Python: removing all Chinese, non-ASCII, or non-English characters, and checking whether a string contains non-ASCII characters

Reposted article. 2017-03-25 21:34, tag: python

Your ''.join() expression is filtering, removing anything non-ASCII; you could use a conditional expression instead:

return ''.join([i if ord(i) < 128 else ' ' for i in text])

This handles characters one by one and would still use one space per character replaced.
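As a minimal sketch, the conditional expression above can be wrapped in a complete function. The name `replace_non_ascii` is a hypothetical helper chosen for illustration, not something from the original question.

```python
def replace_non_ascii(text):
    """Replace every non-ASCII character with a single space.

    `replace_non_ascii` is a hypothetical helper name used for illustration.
    """
    # ord(ch) < 128 is true exactly for the ASCII range 0x00-0x7F.
    return ''.join(ch if ord(ch) < 128 else ' ' for ch in text)


# Two non-ASCII characters in a row become two spaces.
print(repr(replace_non_ascii("ab歷史cd")))
```

Because each character is handled independently, the result always has the same length as the input.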

Your regular expression should just replace consecutive non-ASCII characters with a space:

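re.sub(r'[^\x00-\x7F]+', ' ', text)

Note the + there: it makes the pattern match a whole run of consecutive non-ASCII characters, so each run is replaced by a single space. Without the +, every non-ASCII character is replaced by its own space, just like the character-by-character version:

re.sub(r'[^\x00-\x7F]', ' ', text)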
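To see what the + quantifier changes, here is a small sketch that runs the pattern with and without it on the same input. The function names `collapse_non_ascii` and `blank_non_ascii` are made up for this example.

```python
import re

# With +: one match per run of consecutive non-ASCII characters.
NON_ASCII_RUN = re.compile(r'[^\x00-\x7F]+')
# Without +: one match per individual non-ASCII character.
NON_ASCII_CHAR = re.compile(r'[^\x00-\x7F]')


def collapse_non_ascii(text):
    """Replace each run of non-ASCII characters with one space (hypothetical name)."""
    return NON_ASCII_RUN.sub(' ', text)


def blank_non_ascii(text):
    """Replace each non-ASCII character with its own space (hypothetical name)."""
    return NON_ASCII_CHAR.sub(' ', text)


sample = "ab歷史cd"
print(repr(collapse_non_ascii(sample)))  # the two-character run becomes one space
print(repr(blank_non_ascii(sample)))     # each character becomes its own space
```

Which one is appropriate depends on whether the output should preserve the original string length or read cleanly as text.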

Checking whether a string contains non-ASCII (for example, non-English) characters:

import re

a = "ds  dl,;sd!@)~`09歷史s"
# Matches any single character outside the ASCII range 0x00-0x7F
regexp = re.compile(r'[^\x00-\x7f]')
if regexp.search(a):
    print('matched')
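The same check can be packaged as a reusable predicate. This is only a sketch: `contains_non_ascii` and `contains_non_ascii_builtin` are hypothetical names, and the second variant assumes Python 3.7 or later, where `str.isascii()` is available.

```python
import re

_NON_ASCII = re.compile(r'[^\x00-\x7f]')


def contains_non_ascii(s):
    """Return True if s has at least one character outside 0x00-0x7F (hypothetical name)."""
    return _NON_ASCII.search(s) is not None


def contains_non_ascii_builtin(s):
    """Same check without a regex; assumes Python 3.7+ for str.isascii()."""
    return not s.isascii()


for candidate in ["ds  dl,;sd!@)~`09歷史s", "ds  dl,;sd!@)~`09s"]:
    print(contains_non_ascii(candidate), contains_non_ascii_builtin(candidate))
```

Both variants treat the empty string as containing no non-ASCII characters.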
