Character Encoding in Web Pages (HTML Unicode Entity Encoding)


1. Encoding conversion (to Unicode)

(The code below was collected from the web.)

JavaScript version

<script>
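      // Encode each UTF-16 code unit of the string as a \uXXXX escape.
      var test = "你好abc";
      var str = "";
      for (var i = 0; i < test.length; i++)
      {
       var temp = test.charCodeAt(i).toString(16);
       // Left-pad the hex value with zeros to four digits.
       str += "\\u" + new Array(5 - temp.length).join("0") + temp;
      }
      document.write(str); // writes \u4f60\u597d\u0061\u0062\u0063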
</script>
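
Going the other way is a small step. The following is a minimal sketch and not part of the original scripts: the function names toUnicodeEscapes and fromUnicodeEscapes are made up for illustration. It wraps the loop above in a function and adds a decoder, assuming the input contains only four-digit \uXXXX escapes.

```javascript
// Hypothetical helpers (the names are not from the original article).

// Encode every UTF-16 code unit as a four-digit \uXXXX escape.
function toUnicodeEscapes(text) {
  let out = "";
  for (let i = 0; i < text.length; i++) {
    const hex = text.charCodeAt(i).toString(16);
    out += "\\u" + hex.padStart(4, "0");
  }
  return out;
}

// Decode four-digit \uXXXX escapes back into characters.
function fromUnicodeEscapes(escaped) {
  return escaped.replace(/\\u([0-9a-fA-F]{4})/g, (match, hex) =>
    String.fromCharCode(parseInt(hex, 16))
  );
}

const escaped = toUnicodeEscapes("你好abc");
console.log(escaped);                     // \u4f60\u597d\u0061\u0062\u0063
console.log(fromUnicodeEscapes(escaped)); // 你好abc
```

Because both helpers work on code units, a character outside the Basic Multilingual Plane comes out as two escapes (a surrogate pair) and is reassembled correctly when decoded.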


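VBScript version

Function Unicode(str1)
      Dim str, temp, i
      str = ""
      For i = 1 To Len(str1)
       ' AscW gives the UTF-16 code unit; Hex converts it to hexadecimal.
       temp = Hex(AscW(Mid(str1, i, 1)))
       ' Keep the last four hex digits: this pads short values with zeros.
       temp = Right("0000" & temp, 4)
       str = str & "\u" & temp
      Next
      Unicode = str
End Function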


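Function htmlentities(str)
      Dim i, char, code
      For i = 1 To Len(str)
          char = Mid(str, i, 1)
          code = AscW(char)
          ' AscW returns a signed 16-bit value; map negatives back to 0..65535.
          If code < 0 Then code = code + 65536
          ' Everything outside 7-bit ASCII becomes a numeric reference.
          If code > 127 Then
              htmlentities = htmlentities & "&#" & code & ";"
          Else
              htmlentities = htmlentities & char
          End If
      Next
End Function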
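
The same idea can be written in JavaScript. This is only a sketch, and the helper name toNumericEntities is hypothetical. It walks the string by code point rather than by UTF-16 code unit, so a character outside the Basic Multilingual Plane becomes one reference instead of two surrogate halves.

```javascript
// Hypothetical helper: replace every non-ASCII character with a
// decimal numeric character reference (&#NNNN;).
function toNumericEntities(text) {
  let out = "";
  // for...of iterates by code point, so surrogate pairs stay together.
  for (const ch of text) {
    const code = ch.codePointAt(0);
    out += code > 127 ? "&#" + code + ";" : ch;
  }
  return out;
}

console.log(toNumericEntities("你好abc")); // &#20320;&#22909;abc
```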

 

ColdFusion version

function nochaoscode(str)
{
      var new_str = "";
      var i = 0;
      for(i=1; i lte len(str); i=i+1){
          if(asc(mid(str,i,1)) lt 128){
              new_str = new_str & mid(str,i,1);
          }else{
              // "##" is an escaped "#"; the reference must end with ";"
              new_str = new_str & "&##" & asc(mid(str,i,1)) & ";";
          }
      }
      return new_str;
}

 


 

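Appendix:

In PHP, the mbstring function mb_convert_encoding performs this conversion in both directions (note that mbstring's HTML-ENTITIES encoding is deprecated as of PHP 8.2). For example:

mb_convert_encoding ("你好", "HTML-ENTITIES", "gb2312"); // output: &#20320;&#22909;
mb_convert_encoding ("&#20320;&#22909;", "gb2312", "HTML-ENTITIES"); // output: 你好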
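
Numeric references like the ones in the PHP output can also be decoded in JavaScript with a single replace. A minimal sketch, assuming only numeric references need decoding; decodeNumericEntities is a hypothetical name, not a built-in function.

```javascript
// Hypothetical helper: decode decimal (&#20320;) and hexadecimal (&#x4F60;)
// numeric character references. Named entities are left untouched.
function decodeNumericEntities(text) {
  return text.replace(/&#(?:[xX]([0-9a-fA-F]+)|([0-9]+));/g, (match, hex, dec) => {
    const code = hex !== undefined ? parseInt(hex, 16) : parseInt(dec, 10);
    // Leave out-of-range references as written instead of throwing.
    return code <= 0x10ffff ? String.fromCodePoint(code) : match;
  });
}

console.log(decodeNumericEntities("&#20320;&#22909;")); // 你好
```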

 

To convert an entire page, add these three lines at the top of the PHP file:

mb_internal_encoding("gb2312"); // gb2312 here is your site's original encoding
mb_http_output("HTML-ENTITIES");
ob_start('mb_output_handler');


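If the mbstring extension is not enabled, see these two articles on coolcode.cn:
Displaying web pages correctly under any character set
Displaying web pages correctly under any character set (continued)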


 

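2. HTML entities

HTML 4.01 supports the ISO 8859-1 (Latin-1) character set.

Tip: entity names are case-sensitive.

Note: the same symbol can be referenced either by entity name or by entity number. Entity names are easier to remember, but not every browser is guaranteed to recognize every name. Entity numbers do not have that problem, but they are hard to remember.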
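
To make the two points above concrete, here is a small sketch. The map SAMPLE_ENTITIES and the helper namedToNumeric are hypothetical names, and the map holds only a handful of entries copied from the tables in this article, not the full entity list. It rewrites known names as numbers and shows that a name with the wrong letter case is not recognized.

```javascript
// A small, hand-picked sample of entity names and their code points.
// This is not a complete list.
const SAMPLE_ENTITIES = new Map([
  ["Agrave", 192],  // capital a, grave accent
  ["agrave", 224],  // small a, grave accent
  ["Dagger", 8225], // double dagger
  ["dagger", 8224], // dagger
  ["nbsp", 160],    // non-breaking space
]);

// Hypothetical helper: rewrite known named entities as numeric references.
// The lookup is case-sensitive, like entity names themselves.
function namedToNumeric(text) {
  return text.replace(/&([A-Za-z][A-Za-z0-9]*);/g, (match, name) =>
    SAMPLE_ENTITIES.has(name) ? "&#" + SAMPLE_ENTITIES.get(name) + ";" : match
  );
}

console.log(namedToNumeric("&Agrave; &agrave; &AGRAVE;"));
// &#192; &#224; &AGRAVE;
```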


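Entity names for some ASCII characters

Display  Description     Entity name  Entity number
"        quotation mark  &quot;       &#34;
'        apostrophe      &apos;       &#39;
&        ampersand       &amp;        &#38;
<        less-than       &lt;         &#60;
>        greater-than    &gt;         &#62;

&apos; does not work in older versions of IE (it is not defined in HTML 4.01); use &#39; instead.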
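
These five characters are the ones that usually need escaping before text is placed into HTML. A minimal sketch, with hypothetical names ESCAPES and escapeHtml; it uses the numeric form for the apostrophe because &apos; is not reliable in older browsers.

```javascript
// Hypothetical helper: escape the five ASCII characters listed above.
const ESCAPES = {
  "&": "&amp;",
  "<": "&lt;",
  ">": "&gt;",
  '"': "&quot;",
  "'": "&#39;", // numeric form, because &apos; is not defined in HTML 4.01
};

function escapeHtml(text) {
  return text.replace(/[&<>"']/g, (ch) => ESCAPES[ch]);
}

console.log(escapeHtml('<a href="x">Tom & Jerry\'s</a>'));
// &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&#39;s&lt;/a&gt;
```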

ISO 8859-1 symbol entities

Display  Description                   Entity name  Entity number
         non-breaking space            &nbsp;       &#160;
¡        inverted exclamation mark     &iexcl;      &#161;
¢        cent                          &cent;       &#162;
£        pound                         &pound;      &#163;
¤        currency                      &curren;     &#164;
¥        yen                           &yen;        &#165;
¦        broken vertical bar           &brvbar;     &#166;
§        section                       &sect;       &#167;
¨        spacing diaeresis             &uml;        &#168;
©        copyright                     &copy;       &#169;
ª        feminine ordinal indicator    &ordf;       &#170;
«        angle quotation mark (left)   &laquo;      &#171;
¬        negation                      &not;        &#172;
         soft hyphen                   &shy;        &#173;
®        registered trademark          &reg;        &#174;
™        trademark                     &trade;      &#8482;
¯        spacing macron                &macr;       &#175;
°        degree                        &deg;        &#176;
±        plus-or-minus                 &plusmn;     &#177;
²        superscript 2                 &sup2;       &#178;
³        superscript 3                 &sup3;       &#179;
´        spacing acute                 &acute;      &#180;
µ        micro                         &micro;      &#181;
¶        paragraph                     &para;       &#182;
·        middle dot                    &middot;     &#183;
¸        spacing cedilla               &cedil;      &#184;
¹        superscript 1                 &sup1;       &#185;
º        masculine ordinal indicator   &ordm;       &#186;
»        angle quotation mark (right)  &raquo;      &#187;
¼        fraction 1/4                  &frac14;     &#188;
½        fraction 1/2                  &frac12;     &#189;
¾        fraction 3/4                  &frac34;     &#190;
¿        inverted question mark        &iquest;     &#191;
×        multiplication                &times;      &#215;
÷        division                      &divide;     &#247;

The non-breaking space and the soft hyphen have no visible glyph. &trade; (&#8482;) lies outside ISO 8859-1 and is listed here only because it is commonly needed alongside &reg;.

ISO 8859-1 character entities

Display  Description                   Entity name  Entity number
À        capital a, grave accent       &Agrave;     &#192;
Á        capital a, acute accent       &Aacute;     &#193;
Â        capital a, circumflex accent  &Acirc;      &#194;
Ã        capital a, tilde              &Atilde;     &#195;
Ä        capital a, umlaut mark        &Auml;       &#196;
Å        capital a, ring               &Aring;      &#197;
Æ        capital ae                    &AElig;      &#198;
Ç        capital c, cedilla            &Ccedil;     &#199;
È        capital e, grave accent       &Egrave;     &#200;
É        capital e, acute accent       &Eacute;     &#201;
Ê        capital e, circumflex accent  &Ecirc;      &#202;
Ë        capital e, umlaut mark        &Euml;       &#203;
Ì        capital i, grave accent       &Igrave;     &#204;
Í        capital i, acute accent       &Iacute;     &#205;
Î        capital i, circumflex accent  &Icirc;      &#206;
Ï        capital i, umlaut mark        &Iuml;       &#207;
Ð        capital eth, Icelandic        &ETH;        &#208;
Ñ        capital n, tilde              &Ntilde;     &#209;
Ò        capital o, grave accent       &Ograve;     &#210;
Ó        capital o, acute accent       &Oacute;     &#211;
Ô        capital o, circumflex accent  &Ocirc;      &#212;
Õ        capital o, tilde              &Otilde;     &#213;
Ö        capital o, umlaut mark        &Ouml;       &#214;
Ø        capital o, slash              &Oslash;     &#216;
Ù        capital u, grave accent       &Ugrave;     &#217;
Ú        capital u, acute accent       &Uacute;     &#218;
Û        capital u, circumflex accent  &Ucirc;      &#219;
Ü        capital u, umlaut mark        &Uuml;       &#220;
Ý        capital y, acute accent       &Yacute;     &#221;
Þ        capital THORN, Icelandic      &THORN;      &#222;
ß        small sharp s, German         &szlig;      &#223;
à        small a, grave accent         &agrave;     &#224;
á        small a, acute accent         &aacute;     &#225;
â        small a, circumflex accent    &acirc;      &#226;
ã        small a, tilde                &atilde;     &#227;
ä        small a, umlaut mark          &auml;       &#228;
å        small a, ring                 &aring;      &#229;
æ        small ae                      &aelig;      &#230;
ç        small c, cedilla              &ccedil;     &#231;
è        small e, grave accent         &egrave;     &#232;
é        small e, acute accent         &eacute;     &#233;
ê        small e, circumflex accent    &ecirc;      &#234;
ë        small e, umlaut mark          &euml;       &#235;
ì        small i, grave accent         &igrave;     &#236;
í        small i, acute accent         &iacute;     &#237;
î        small i, circumflex accent    &icirc;      &#238;
ï        small i, umlaut mark          &iuml;       &#239;
ð        small eth, Icelandic          &eth;        &#240;
ñ        small n, tilde                &ntilde;     &#241;
ò        small o, grave accent         &ograve;     &#242;
ó        small o, acute accent         &oacute;     &#243;
ô        small o, circumflex accent    &ocirc;      &#244;
õ        small o, tilde                &otilde;     &#245;
ö        small o, umlaut mark          &ouml;       &#246;
ø        small o, slash                &oslash;     &#248;
ù        small u, grave accent         &ugrave;     &#249;
ú        small u, acute accent         &uacute;     &#250;
û        small u, circumflex accent    &ucirc;      &#251;
ü        small u, umlaut mark          &uuml;       &#252;
ý        small y, acute accent         &yacute;     &#253;
þ        small thorn, Icelandic        &thorn;      &#254;
ÿ        small y, umlaut mark          &yuml;       &#255;

Other entities supported by HTML

Display  Description                            Entity name  Entity number
Œ        capital ligature OE                    &OElig;      &#338;
œ        small ligature oe                      &oelig;      &#339;
Š        capital S with caron                   &Scaron;     &#352;
š        small s with caron                     &scaron;     &#353;
Ÿ        capital Y with diaeresis               &Yuml;       &#376;
ˆ        modifier letter circumflex accent      &circ;       &#710;
˜        small tilde                            &tilde;      &#732;
         en space                               &ensp;       &#8194;
         em space                               &emsp;       &#8195;
         thin space                             &thinsp;     &#8201;
         zero width non-joiner                  &zwnj;       &#8204;
         zero width joiner                      &zwj;        &#8205;
         left-to-right mark                     &lrm;        &#8206;
         right-to-left mark                     &rlm;        &#8207;
–        en dash                                &ndash;      &#8211;
—        em dash                                &mdash;      &#8212;
‘        left single quotation mark             &lsquo;      &#8216;
’        right single quotation mark            &rsquo;      &#8217;
‚        single low-9 quotation mark            &sbquo;      &#8218;
“        left double quotation mark             &ldquo;      &#8220;
”        right double quotation mark            &rdquo;      &#8221;
„        double low-9 quotation mark            &bdquo;      &#8222;
†        dagger                                 &dagger;     &#8224;
‡        double dagger                          &Dagger;     &#8225;
…        horizontal ellipsis                    &hellip;     &#8230;
‰        per mille                              &permil;     &#8240;
‹        single left-pointing angle quotation   &lsaquo;     &#8249;
›        single right-pointing angle quotation  &rsaquo;     &#8250;
€        euro                                   &euro;       &#8364;

The spaces, joiners, and directional marks in this table have no visible glyph.

