🔡 Portable ASCII

Description

Portable ASCII is written in PHP (7+) and works without "mbstring", "iconv", or any other extra encoding PHP extension on your server.

The benefit of Portable ASCII is that it is easy to use and easy to bundle.

The project is based on ...

Index

Alternative

If you prefer a more object-oriented way to edit strings, take a look at voku/Stringy. It is a fork of "danielstjules/Stringy" that uses the "Portable ASCII" class and adds some extra methods.

// Portable ASCII
use voku\helper\ASCII;
ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii'

// voku/Stringy
use Stringy\Stringy as S;
$stringy = S::create('déjà σσς iıii');
$stringy->toTransliterate();              // 'deja sss iiii'

Install "Portable ASCII" via "composer require"

composer require voku/portable-ascii

Why Portable ASCII?

I needed ASCII character handling in several classes. Previously I had added these functions to "Portable UTF-8", but this repo is more modular and portable because it has no dependencies.

Requirements and Recommendations

Usage

Example: ASCII::to_ascii()

  echo ASCII::to_ascii('�Düsseldorf�', 'de');
  
  // will output
  // Duesseldorf

  echo ASCII::to_ascii('�Düsseldorf�', 'en');
  
  // will output
  // Dusseldorf

Portable ASCII | API

The API of the "ASCII" class consists of small static methods.

Class methods

<p id="voku-php-readme-class-methods"></p><table><tr><td><a href="#charsarraybool-replace_extra_symbols-array">charsArray</a> </td><td><a href="#charsarraywithmultilanguagevaluesbool-replace_extra_symbols-array">charsArrayWithMultiLanguageValues</a> </td><td><a href="#charsarraywithonelanguagestring-language-bool-replace_extra_symbols-bool-asorigreplacearray-array">charsArrayWithOneLanguage</a> </td><td><a href="#charsarraywithsinglelanguagevaluesbool-replace_extra_symbols-bool-asorigreplacearray-array">charsArrayWithSingleLanguageValues</a> </td></tr><tr><td><a href="#cleanstring-str-bool-normalize_whitespace-bool-keep_non_breaking_space-bool-normalize_msword-bool-remove_invisible_characters-string">clean</a> </td><td><a href="#getalllanguages-string">getAllLanguages</a> </td><td><a href="#is_asciistring-str-bool">is_ascii</a> </td><td><a href="#normalize_mswordstring-str-string">normalize_msword</a> </td></tr><tr><td><a href="#normalize_whitespacestring-str-bool-keepnonbreakingspace-bool-keepbidiunicodecontrols-bool-normalize_control_characters-string">normalize_whitespace</a> </td><td><a href="#remove_invisible_charactersstring-str-bool-url_encoded-string-replacement-bool-keep_basic_control_characters-string">remove_invisible_characters</a> </td><td><a href="#to_asciistring-str-string-language-bool-remove_unsupported_chars-bool-replace_extra_symbols-bool-use_transliterate-boolnull-replace_single_chars_only-string">to_ascii</a> </td><td><a href="#to_ascii_remapstring-str1-string-str2-string">to_ascii_remap</a> </td></tr><tr><td><a href="#to_filenamestring-str-bool-use_transliterate-string-fallback_char-string">to_filename</a> </td><td><a href="#to_slugifystring-str-string-separator-string-language-string-replacements-bool-replace_extra_symbols-bool-use_str_to_lower-bool-use_transliterate-string">to_slugify</a> </td><td><a href="#to_transliteratestring-str-stringnull-unknown-bool-strict-string">to_transliterate</a> </td></tr></table>

charsArray(bool $replace_extra_symbols): array

<a href="#voku-php-readme-class-methods"></a> Returns a replacement array for ASCII methods.

EXAMPLE: <code> $array = ASCII::charsArray(); var_dump($array['ru']['б']); // 'b' </code>

Parameters:

Return:


charsArrayWithMultiLanguageValues(bool $replace_extra_symbols): array

<a href="#voku-php-readme-class-methods"></a> Returns a replacement array for ASCII methods with a mix of multiple languages.

EXAMPLE: <code> $array = ASCII::charsArrayWithMultiLanguageValues(); var_dump($array['b']); // ['β', 'б', 'ဗ', 'ბ', 'ب'] </code>

Parameters:

Return:


charsArrayWithOneLanguage(string $language, bool $replace_extra_symbols, bool $asOrigReplaceArray): array

<a href="#voku-php-readme-class-methods"></a> Returns a replacement array for ASCII methods with one language.

For example, German will map 'ä' to 'ae', while other languages will simply return e.g. 'a'.

EXAMPLE: <code> $array = ASCII::charsArrayWithOneLanguage('ru'); $tmpKey = \array_search('yo', $array['replace']); echo $array['orig'][$tmpKey]; // 'ё' </code>

Parameters:

Return:


charsArrayWithSingleLanguageValues(bool $replace_extra_symbols, bool $asOrigReplaceArray): array

<a href="#voku-php-readme-class-methods"></a> Returns a replacement array for ASCII methods with multiple languages.

EXAMPLE: <code> $array = ASCII::charsArrayWithSingleLanguageValues(); $tmpKey = \array_search('hnaik', $array['replace']); echo $array['orig'][$tmpKey]; // '၌' </code>

Parameters:

Return:


clean(string $str, bool $normalize_whitespace, bool $keep_non_breaking_space, bool $normalize_msword, bool $remove_invisible_characters): string

<a href="#voku-php-readme-class-methods"></a> Accepts a string and removes all non-UTF-8 characters from it, plus optional extras if needed.

Parameters:

Return:


getAllLanguages(): string[]

<a href="#voku-php-readme-class-methods"></a> Get all languages from the constants "ASCII::.*LANGUAGE_CODE".

Parameters: nothing

Return:


is_ascii(string $str): bool

<a href="#voku-php-readme-class-methods"></a> Checks if a string is 7-bit ASCII.

EXAMPLE: <code> ASCII::is_ascii('白'); // false </code>

Parameters:

Return:
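To make "7-bit ASCII" concrete, here is a minimal sketch of such a check in plain PHP. The helper name `is_seven_bit_ascii()` is hypothetical and this is not necessarily how `ASCII::is_ascii()` is implemented; it only illustrates the rule that no byte may lie outside 0x00-0x7F.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, not part of Portable ASCII: a string is 7-bit ASCII
// when it contains no byte outside the range 0x00-0x7F.
function is_seven_bit_ascii(string $str): bool
{
    // No "u" modifier on purpose: the pattern is matched byte by byte,
    // so the check also works on input that is not valid UTF-8.
    return preg_match('/[^\x00-\x7F]/', $str) === 0;
}

var_dump(is_seven_bit_ascii('abc')); // bool(true)
var_dump(is_seven_bit_ascii('白'));  // bool(false)
```

An empty string contains no offending byte, so this sketch treats it as ASCII.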


normalize_msword(string $str): string

<a href="#voku-php-readme-class-methods"></a> Returns a string with smart quotes, ellipsis characters, and dashes from Windows-1252 (commonly used in Word documents) replaced by their ASCII equivalents.

EXAMPLE: <code> ASCII::normalize_msword('„Abcdef…”'); // '"Abcdef..."' </code>

Parameters:

Return:
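The idea behind this kind of normalization can be sketched with a plain lookup table. The function name `normalize_typography_sketch()` and the small character table below are assumptions for illustration only; the library's own table may cover more characters.

```php
<?php
declare(strict_types=1);

// Hypothetical helper: replace a hand-picked subset of typographic
// characters with ASCII equivalents. Not the library's actual table.
function normalize_typography_sketch(string $str): string
{
    return strtr($str, [
        "\u{201C}" => '"',   // left double quotation mark
        "\u{201D}" => '"',   // right double quotation mark
        "\u{201E}" => '"',   // double low-9 quotation mark
        "\u{2018}" => "'",   // left single quotation mark
        "\u{2019}" => "'",   // right single quotation mark
        "\u{2026}" => '...', // horizontal ellipsis
        "\u{2013}" => '-',   // en dash
        "\u{2014}" => '-',   // em dash
    ]);
}

echo normalize_typography_sketch("\u{201E}Abcdef\u{2026}\u{201D}"), PHP_EOL; // "Abcdef..."
```

`strtr()` with an array does not rescan text it has already replaced, so replacement values are never substituted a second time.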


normalize_whitespace(string $str, bool $keepNonBreakingSpace, bool $keepBidiUnicodeControls, bool $normalize_control_characters): string

<a href="#voku-php-readme-class-methods"></a> Normalize the whitespace.

EXAMPLE: <code> ASCII::normalize_whitespace("abc-\xc2\xa0-öäü-\xe2\x80\xaf-\xE2\x80\xAC", true); // "abc-\xc2\xa0-öäü- -" </code>

Parameters:

Return:


remove_invisible_characters(string $str, bool $url_encoded, string $replacement, bool $keep_basic_control_characters): string

<a href="#voku-php-readme-class-methods"></a> Remove invisible characters from a string.

For example, this prevents sandwiching null characters between ASCII characters, like Java\0script.

Copied from https://github.com/bcit-ci/CodeIgniter/blob/develop/system/core/Common.php

Parameters:

Return:
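As a rough illustration of the null-byte case above, the following sketch strips control characters with a single regular expression. The function name `strip_control_characters_sketch()` is hypothetical, and keeping tab, line feed, and carriage return is an assumption of this sketch. It ignores the `$url_encoded` and `$keep_basic_control_characters` options of the real method.

```php
<?php
declare(strict_types=1);

// Hypothetical helper, a simplified stand-in for the real method.
function strip_control_characters_sketch(string $str, string $replacement = ''): string
{
    // Control characters except tab (0x09), line feed (0x0A) and
    // carriage return (0x0D), plus DEL (0x7F).
    $pattern = '/[\x00-\x08\x0B\x0C\x0E-\x1F\x7F]+/';

    // preg_replace() returns null on error; fall back to the input then.
    return preg_replace($pattern, $replacement, $str) ?? $str;
}

echo strip_control_characters_sketch("Java\0script"), PHP_EOL; // Javascript
```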


to_ascii(string $str, string $language, bool $remove_unsupported_chars, bool $replace_extra_symbols, bool $use_transliterate, bool|null $replace_single_chars_only): string

<a href="#voku-php-readme-class-methods"></a> Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed by default. The language or locale of the source string can be supplied for language-specific transliteration in any of the following formats: en, en_GB, or en-GB. For example, passing "de" results in "äöü" mapping to "aeoeue" rather than "aou" as in other languages.

EXAMPLE: <code> ASCII::to_ascii('�Düsseldorf�', 'en'); // Dusseldorf </code>

Parameters:

Return:


to_ascii_remap(string $str1, string $str2): string[]

<a href="#voku-php-readme-class-methods"></a> WARNING: This method will return broken characters and is only for special cases.

Converts two UTF-8 encoded strings to single-byte strings suitable for functions that need the same string length after the conversion.

The function simply uses (and updates) a tailored dynamic encoding (in/out map parameter) where non-ascii characters are remapped to the range [128-255] in order of appearance.

Parameters:

Return:
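The remapping described above can be sketched as follows. This is not the library's code: the function name `remap_to_single_bytes_sketch()` and the by-reference `$map` parameter are assumptions made to show the "in order of appearance" idea on one string at a time. As the warning says, the result is not valid UTF-8.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper: remap each distinct non-ASCII character to one byte
 * in the range 128-255, in order of first appearance.
 *
 * @param array<string, string> $map shared in/out map (UTF-8 char => single byte)
 */
function remap_to_single_bytes_sketch(string $str, array &$map): string
{
    // Split into UTF-8 characters without needing mbstring.
    $chars = preg_split('//u', $str, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8.');
    }

    $out = '';
    foreach ($chars as $char) {
        if (strlen($char) === 1) {
            $out .= $char; // plain ASCII stays as it is
            continue;
        }
        if (!isset($map[$char])) {
            $next = 128 + count($map);
            if ($next > 255) {
                throw new OverflowException('More than 128 distinct non-ASCII characters.');
            }
            $map[$char] = chr($next);
        }
        $out .= $map[$char];
    }

    return $out;
}

$map = [];
$first = remap_to_single_bytes_sketch('déjà', $map);
$second = remap_to_single_bytes_sketch('déja', $map);
var_dump(strlen($first), strlen($second)); // int(4) and int(4)
```

Because both calls share `$map`, the same character always maps to the same byte, so byte-oriented comparisons of the two results stay meaningful.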


to_filename(string $str, bool $use_transliterate, string $fallback_char): string

<a href="#voku-php-readme-class-methods"></a> Convert given string to safe filename (and keep string case).

EXAMPLE: <code> ASCII::to_filename('שדגשדג.png', true); // 'shdgshdg.png' </code>

Parameters:

Return:


to_slugify(string $str, string $separator, string $language, string[] $replacements, bool $replace_extra_symbols, bool $use_str_to_lower, bool $use_transliterate): string

<a href="#voku-php-readme-class-methods"></a> Converts the string into a URL slug. This includes replacing non-ASCII characters with their closest ASCII equivalents, removing remaining non-ASCII and non-alphanumeric characters, and replacing whitespace with $separator. The separator defaults to a single dash, and the string is also converted to lowercase. The language of the source string can also be supplied for language-specific transliteration.

Parameters:

Return:
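The steps listed in the description can be sketched in plain PHP as shown below. The function name `slugify_sketch()` and its tiny German-style replacement map are hypothetical stand-ins for the library's language tables, so treat this as an outline of the pipeline rather than the actual implementation.

```php
<?php
declare(strict_types=1);

// Hypothetical helper illustrating the slug pipeline step by step.
function slugify_sketch(string $str, string $separator = '-'): string
{
    // Step 1: replace known non-ASCII characters (tiny sample map, an assumption).
    $map = [
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
        'Ä' => 'Ae', 'Ö' => 'Oe', 'Ü' => 'Ue', 'ß' => 'ss',
    ];
    $str = strtr($str, $map);

    // Step 2: convert to lowercase.
    $str = strtolower($str);

    // Step 3: drop everything that is neither a-z, 0-9 nor whitespace;
    // this also removes any non-ASCII bytes the map did not cover.
    $str = preg_replace('/[^a-z0-9\s]+/', '', $str) ?? '';

    // Step 4: join the remaining words with the separator.
    $words = preg_split('/\s+/', trim($str), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    return implode($separator, $words);
}

echo slugify_sketch('Düsseldorf  Hauptbahnhof!'), PHP_EOL; // duesseldorf-hauptbahnhof
```

Joining with `implode()` rather than a regex replacement means a separator such as `$` or `\` needs no escaping.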


to_transliterate(string $str, string|null $unknown, bool $strict): string

<a href="#voku-php-readme-class-methods"></a> Returns an ASCII version of the string. A set of non-ASCII characters are replaced with their closest ASCII counterparts, and the rest are removed unless instructed otherwise.

EXAMPLE: <code> ASCII::to_transliterate('déjà σσς iıii'); // 'deja sss iiii' </code>

Parameters:

Return:


Unit Test

  1. Composer is a prerequisite for running the tests.
composer install
  2. The tests can be executed by running this command from the root directory:
./vendor/bin/phpunit

Support

For support and donations please visit GitHub | Issues | PayPal | Patreon.

For status updates and release announcements please visit Releases | Twitter | Patreon.

For professional support please contact me.

Thanks

License and Copyright

Released under the MIT License - see LICENSE.txt for details.