PHP urlencode vs Java URLEncoder.encode


 

Conclusion: urlencode encodes one more character than URLEncoder.encode, namely "*"; everything else is encoded identically.

PHP urlencode

  phpversion() >= 5.3 is compliant with RFC 3986, while phpversion() <= 5.2.7RC1 is not.

  Encodes with reference to RFC 3986

  

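Returns a string in which all non-alphanumeric characters except -_. have been replaced with a percent sign (%) followed by two hex digits, and spaces are encoded as plus signs (+).
This is the same encoding used for data posted from a WWW form, that is, the same as the application/x-www-form-urlencoded media type.
For historical reasons, it differs from the RFC 3986 encoding (see rawurlencode()) in that spaces are encoded as plus signs (+).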

 

PHP does not follow RFC 3986 completely, though: the standard says "~" does not need to be encoded, but urlencode encodes it anyway.

 

So the final list of unencoded characters is [-], [_], [.], exactly as the documentation describes.
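
The rule quoted above can be sketched in Java for comparison. This is only an illustration of the documented behaviour, not PHP's actual source; the class name PhpStyleUrlEncoder is hypothetical, and treating the input as UTF-8 bytes is an assumption.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: re-implements the documented urlencode rule, not PHP's own code.
final class PhpStyleUrlEncoder {
    private static final char[] HEX = "0123456789ABCDEF".toCharArray();

    static String encode(String input) {
        StringBuilder out = new StringBuilder();
        // Assumption: the string is encoded byte by byte as UTF-8.
        for (byte b : input.getBytes(StandardCharsets.UTF_8)) {
            int c = b & 0xFF;
            boolean alnum = (c >= 'a' && c <= 'z')
                    || (c >= 'A' && c <= 'Z')
                    || (c >= '0' && c <= '9');
            if (alnum || c == '-' || c == '_' || c == '.') {
                out.append((char) c);          // left unencoded
            } else if (c == ' ') {
                out.append('+');               // space becomes a plus sign
            } else {
                // everything else, including "~" and "*", becomes %XX
                out.append('%').append(HEX[c >> 4]).append(HEX[c & 0x0F]);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.println(encode("a b*c~d-_.")); // a+b%2Ac%7Ed-_.
    }
}
```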

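Java URLEncoder.encode

  Encodes with reference to RFC 2396

  However, because browsers such as IE encode every special character except "-", "_", ".", "*", Java adopted the same list as IE,

  so the final list of unencoded characters is [-], [_], [.], [*]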
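
A small check of this list, as a sketch: it assumes Java 10 or later for the URLEncoder.encode(String, Charset) overload, and encodeLikePhp is a hypothetical helper that additionally escapes "*" so the output matches the PHP urlencode list.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

final class UrlEncoderDemo {
    // Hypothetical helper: URLEncoder output with "*" escaped as well,
    // which is the one difference from PHP urlencode noted in the conclusion.
    static String encodeLikePhp(String input) {
        // A literal "*" is the only way "*" can appear in URLEncoder output,
        // so a plain string replace is enough here.
        return URLEncoder.encode(input, StandardCharsets.UTF_8).replace("*", "%2A");
    }

    public static void main(String[] args) {
        String sample = "a b*c~d-_.";
        // "-", "_", ".", "*" stay as they are; "~" is escaped; space becomes "+"
        System.out.println(URLEncoder.encode(sample, StandardCharsets.UTF_8)); // a+b*c%7Ed-_.
        System.out.println(encodeLikePhp(sample));                             // a+b%2Ac%7Ed-_.
    }
}
```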

  

The list of characters that are not encoded has been determined as follows:

RFC 2396 states:
-----
Data characters that are allowed in a URI but do not have a reserved purpose are called unreserved. These include upper and lower case letters, decimal digits, and a limited set of punctuation marks and symbols.

unreserved = alphanum | mark
mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

Unreserved characters can be escaped without changing the semantics of the URI, but this should not be done unless the URI is being used in a context that does not allow the unescaped character to appear.
-----

It appears that both Netscape and Internet Explorer escape all special characters from this list with the exception of "-", "_", ".", "*". While it is not clear why they are escaping the other characters, perhaps it is safest to assume that there might be contexts in which the others are unsafe if not escaped. Therefore, we will use the same list. It is also noteworthy that this is consistent with O'Reilly's "HTML: The Definitive Guide" (page 164).

As a last note, Internet Explorer does not encode the "@" character which is clearly not unreserved according to the RFC. We are being consistent with the RFC in this matter, as is Netscape.

 

History of related RFCs:

RFC 1738 section 2.2
only alphanumerics, the special characters "$-_.+!*'(),", and
reserved characters used for their reserved purposes may be used
unencoded within a URL.

RFC 2396 section 2.3
unreserved = alphanum | mark
mark = "-" | "_" | "." | "!" | "~" | "*" | "'" | "(" | ")"

RFC 2732 section 3
(3) Add "[" and "]" to the set of 'reserved' characters:

RFC 3986 section 2.3
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"

RFC 3987 section 2.2
unreserved = ALPHA / DIGIT / "-" / "." / "_" / "~"
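
To obtain output that leaves exactly the RFC 3986 unreserved set unencoded (comparable to what the PHP documentation describes for rawurlencode()), one common approach is to post-process URLEncoder output. The following is a sketch under that assumption; the class name Rfc3986Encoder is hypothetical and Java 10 or later is assumed.

```java
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: adjusts URLEncoder output to the RFC 3986 unreserved set.
final class Rfc3986Encoder {
    static String encode(String input) {
        return URLEncoder.encode(input, StandardCharsets.UTF_8)
                .replace("+", "%20")   // a literal "+" is already %2B, so "+" here always means space
                .replace("*", "%2A")   // "*" is not unreserved in RFC 3986
                .replace("%7E", "~");  // "~" is unreserved in RFC 3986
    }

    public static void main(String[] args) {
        System.out.println(encode("a b*c~d-_.")); // a%20b%2Ac~d-_.
    }
}
```

The replacements are safe because every "%" in URLEncoder output starts a two-digit escape, so "%7E" can only come from an encoded "~".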

 

