Portable UTF-8: a lightweight PHP library for Unicode handling

Posted by jopen 11 years ago | PHP development | Portable UTF-8

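Portable UTF-8 is a library that brings Unicode support to PHP applications. It is written in pure PHP and does not require mbstring, iconv, UTF-8 enabled PCRE, or any other library. The advantage is that you can bundle it with your application, and it needs no external support.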
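
To see which of the extensions named above a given PHP installation actually provides, a small feature check can help. The following is a minimal sketch: the function name `sketch_unicode_capabilities` is hypothetical and is not part of Portable UTF-8. It uses only the built-in `extension_loaded` and `preg_match` functions.

```php
<?php
// Hypothetical helper (not part of Portable UTF-8): reports which of the
// optional Unicode-related facilities this PHP build provides.
function sketch_unicode_capabilities(): array
{
    return [
        'mbstring' => extension_loaded('mbstring'),
        'iconv'    => extension_loaded('iconv'),
        // With the u modifier, "." matches one whole UTF-8 character.
        // Without UTF-8 support in PCRE, preg_match() fails and returns
        // false; the @ operator silences the accompanying warning.
        'pcre_u'   => @preg_match('/^.$/u', "\xC3\xB1") === 1,
    ];
}

foreach (sketch_unicode_capabilities() as $name => $available) {
    echo $name, ': ', $available ? 'yes' : 'no', "\n";
}
```

A library written in pure PHP can use a check like this to pick a faster native path when one exists and fall back to byte-level code otherwise.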

Replacements for native string functions:

  • utf8_ord - Returns Unicode Code Point of UTF-8 encoded character.
  • utf8_strlen - Returns number of UTF-8 characters in the string.
  • utf8_chr - Opposite of utf8_ord. Accepts a Unicode Code Point and returns the corresponding UTF-8 encoded character.
  • utf8_split - Breaks a string into an array of UTF-8 character(s).
  • utf8_chunk_split - Splits a UTF-8 encoded string into smaller chunks of specified length. For base64, use the native chunk_split.
  • utf8_substr - Accepts a UTF-8 encoded string and returns a part of it.
  • utf8_rev - UTF-8 aware string reverse.
  • utf8_strpos - Finds the position of a string in another string, and returns the offset as a count of UTF-8 characters.
  • utf8_max - Accepts an array or a string and returns the character with the maximum Code Point.
  • utf8_min - Opposite of utf8_max.
  • utf8_word_count - Counts the number of words in a UTF-8 encoded string.
  • utf8_str_shuffle - Shuffles all characters of a UTF-8 encoded string.
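
How functions like these can work without mbstring or iconv is easiest to see at the byte level. The sketch below is not the library's code: every `sketch_*` name is hypothetical, and the input is assumed to be valid UTF-8. It reads the length of each character from its lead byte, and then builds split, length, substring, reverse, ord and chr operations on top of that.

```php
<?php
// Hypothetical helpers, NOT Portable UTF-8's implementation.
// Assumption: the input strings are valid UTF-8.

// Byte length of the sequence introduced by a lead byte (0-255).
function sketch_seq_len(int $byte): int
{
    if ($byte < 0x80) { return 1; }            // 0xxxxxxx (ASCII)
    if (($byte & 0xE0) === 0xC0) { return 2; } // 110xxxxx
    if (($byte & 0xF0) === 0xE0) { return 3; } // 1110xxxx
    if (($byte & 0xF8) === 0xF0) { return 4; } // 11110xxx
    return 1; // unexpected byte: step over it as a single unit
}

// Break a string into an array of UTF-8 characters.
function sketch_split(string $str): array
{
    $chars = [];
    $i = 0;
    $n = strlen($str);
    while ($i < $n) {
        $len = sketch_seq_len(ord($str[$i]));
        $chars[] = substr($str, $i, $len);
        $i += $len;
    }
    return $chars;
}

// Character count, character-based substring, and reverse.
function sketch_strlen(string $str): int
{
    return count(sketch_split($str));
}

function sketch_substr(string $str, int $start, int $length): string
{
    return implode('', array_slice(sketch_split($str), $start, $length));
}

function sketch_rev(string $str): string
{
    return implode('', array_reverse(sketch_split($str)));
}

// Code point of the first character in $chr.
function sketch_ord(string $chr): int
{
    $b0 = ord($chr[0]);
    $len = sketch_seq_len($b0);
    if ($len === 1) { return $b0; }
    $cp = $b0 & (0xFF >> ($len + 1));          // payload bits of the lead byte
    for ($i = 1; $i < $len; $i++) {
        $cp = ($cp << 6) | (ord($chr[$i]) & 0x3F); // 6 bits per continuation byte
    }
    return $cp;
}

// UTF-8 encoding of a code point (the inverse of sketch_ord).
function sketch_chr(int $cp): string
{
    if ($cp < 0x80) { return chr($cp); }
    if ($cp < 0x800) {
        return chr(0xC0 | ($cp >> 6)) . chr(0x80 | ($cp & 0x3F));
    }
    if ($cp < 0x10000) {
        return chr(0xE0 | ($cp >> 12)) . chr(0x80 | (($cp >> 6) & 0x3F))
             . chr(0x80 | ($cp & 0x3F));
    }
    return chr(0xF0 | ($cp >> 18)) . chr(0x80 | (($cp >> 12) & 0x3F))
         . chr(0x80 | (($cp >> 6) & 0x3F)) . chr(0x80 | ($cp & 0x3F));
}

$word = "h\xC3\xA9llo";                 // "héllo": 5 characters, 6 bytes
echo strlen($word), " bytes, ", sketch_strlen($word), " characters\n";
```

Splitting first and then operating on the resulting array keeps each helper short. A production library would more likely avoid building the full array for long strings.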

UTF-8/Unicode-specific functions:

  • pcre_utf8_support - Checks if the u modifier is available that enables UTF-8 support in PCRE functions.
  • is_utf8 - Checks if a string is UTF-8 encoded.
  • utf8_url_slug - Creates a UTF-8 encoded URL Slug allowing safe Non-ASCII characters in SEO friendly URLs.
  • utf8_clean - Removes invalid byte sequences from a UTF-8 encoded string.
  • utf8_fits_inside - Checks if the character length of a string is less than or equal to a specific size. Useful for MySQL INSERT.
  • utf8_chr_size_list - Returns an array containing the number of bytes (1-4) taken by each UTF-8 encoded character.
  • utf8_max_chr_width - Takes a string and returns the maximum character width of any character in the string. Ranges from 1 to 4.
  • utf8_single_chr_html_encode - Encodes a Unicode character like Ӓ to its &#1234; encoded form.
  • utf8_html_encode - Same as utf8_single_chr_html_encode, but applies to a whole string and creates a stream of encoded sequences.
  • utf8_bom - Returns the UTF-8 Byte Order Mark (BOM) Character.
  • is_bom - Accepts a multi-byte character and tells whether it is BOM or not.
  • utf8_file_has_bom - Checks if a UTF-8 encoded file has a BOM (at the start).
  • utf8_string_has_bom - Checks if a string starts with BOM.
  • utf8_add_bom_to_string - Prepends BOM character to a string.
  • utf8_count_chars - Accepts a single string argument and returns details of characters in that string.
  • utf8_codepoints - Accepts a string and returns Code Points of all of its characters as integer (e.g. 1740) or as string (e.g. U+06CC).
  • utf8_int_to_unicode_style - Accepts an integer and converts to U+xxxx Unicode representation.
  • utf8_unicode_style_to_int - Accepts a Code Point as U+xxxx and converts to integer.
  • utf8_chr_to_unicode_style - Accepts a UTF-8 encoded character and returns Code Point as U+xxxx.
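
A few of the checks and conversions listed above can be illustrated in plain PHP. This is a rough sketch under stated assumptions, not the library's implementation: the `sketch_*` names and the `SKETCH_BOM` constant are hypothetical. The validity check follows the UTF-8 rules of rejecting stray continuation bytes, truncated sequences, overlong encodings, surrogates and values above U+10FFFF.

```php
<?php
// Hypothetical helpers, NOT Portable UTF-8's implementation.

const SKETCH_BOM = "\xEF\xBB\xBF"; // U+FEFF encoded as UTF-8

// True if $str is well-formed UTF-8.
function sketch_is_utf8(string $str): bool
{
    $n = strlen($str);
    $i = 0;
    while ($i < $n) {
        $b = ord($str[$i]);
        if ($b < 0x80) { $i++; continue; }     // ASCII
        if ($b >= 0xC2 && $b <= 0xDF) {
            $len = 2; $min = 0x80;    $cp = $b & 0x1F;
        } elseif (($b & 0xF0) === 0xE0) {
            $len = 3; $min = 0x800;   $cp = $b & 0x0F;
        } elseif ($b >= 0xF0 && $b <= 0xF4) {
            $len = 4; $min = 0x10000; $cp = $b & 0x07;
        } else {
            return false;                      // invalid lead byte
        }
        if ($i + $len > $n) { return false; }  // truncated sequence
        for ($k = 1; $k < $len; $k++) {
            $c = ord($str[$i + $k]);
            if (($c & 0xC0) !== 0x80) { return false; } // not 10xxxxxx
            $cp = ($cp << 6) | ($c & 0x3F);
        }
        // Reject overlong forms, surrogates, and out-of-range values.
        if ($cp < $min || $cp > 0x10FFFF || ($cp >= 0xD800 && $cp <= 0xDFFF)) {
            return false;
        }
        $i += $len;
    }
    return true;
}

// True if the string starts with the UTF-8 BOM.
function sketch_string_has_bom(string $str): bool
{
    return strncmp($str, SKETCH_BOM, 3) === 0;
}

// 1740 -> "U+06CC" and back.
function sketch_int_to_unicode_style(int $cp): string
{
    return sprintf('U+%04X', $cp);
}

function sketch_unicode_style_to_int(string $style): int
{
    return (int) hexdec(substr($style, 2)); // drop the "U+" prefix
}

echo sketch_int_to_unicode_style(1740), "\n"; // U+06CC
```

The 1740 and U+06CC pair is the example given in the utf8_codepoints entry; 0x06CC is 1740 in decimal.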

Project home page: http://www.baiduhome.net/lib/view/home/1371892654244
