Non-ASCII characters: what are they?
Steam cannot run from a folder path with non-ASCII characters: how to fix it
Steam gains more users every year, in part because game developers restrict network access for pirated copies. There are other reasons too, but this article is about client problems that users often run into. It explains how to resolve the error "Steam cannot run from a folder path with non-ASCII characters".
Cause of the error in Steam
The problem appears right after the Steam client is installed. As soon as you open the desktop icon, a system message appears. It often states both the cause and even the fix, but it is written in English, the default language of Windows, which can be confusing. There is also a Russian-language version of the message, which reads roughly: the system cannot run the program from a folder whose name contains characters that are not from the English alphabet.
This means that during installation you chose a folder with a Russian name, or that some folder along the path to Steam has one. As a rule, name the folders on that path using only ASCII characters (Latin letters, digits, and basic punctuation). Transliteration works: "Moja Papka" or "Stim". Other causes that disrupt the system and trigger the error cannot be ruled out.
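The cause described above can be checked mechanically. The following is a minimal Python sketch, assuming Python 3.9 or later; the function name non_ascii_components and the example paths are hypothetical and are not part of Steam.

```python
from pathlib import PureWindowsPath


def non_ascii_components(path: str) -> list[str]:
    """Return the components of a Windows path that contain non-ASCII characters."""
    return [part for part in PureWindowsPath(path).parts if not part.isascii()]


# Hypothetical install locations, used only for illustration.
print(non_ascii_components(r"D:\Игры\Steam"))
print(non_ascii_components(r"D:\Games\Steam"))
```

Any component the function returns is a folder that would need to be renamed.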
How to fix the error in Steam
To fix the error, first try simply renaming the folder on the path to Steam whose name does not meet the requirement. There are ways to quickly find where the Steam folder is currently located.
The path is shown in the top bar of the window. Note it down, or keep the folder open so you can see where it is located. To give a folder a new name, right-click it and choose "Rename". Then try launching Steam. If the error "Cannot run from a folder path with non-ASCII characters" appears again, move on to the next method.
Reinstalling Steam
To launch Steam without errors, reinstall the client, and do it correctly. A Windows system typically has two local drives: one for the system (C) and one for user files (D). This is convenient and keeps user data off the Windows partition. It is best to install the new copy of Steam on the non-system volume. Before installing, remove the old client.
After installation, try launching the client. The error should not appear, provided the new path to the library files contains no non-ASCII characters.
What to do if the Steam error still appears
If the methods above did not solve the launch problem, the cause is not the folder names. Most likely the system has picked up a virus that interferes with the normal operation of your PC, which is a more serious problem than the one we were trying to solve. Return to the program-removal window and review the full list of applications. Find all the games and programs you do not use and uninstall them. Also clean up your file folders: music, images, video.
Your built-in antivirus is probably of no help at this point, since it has not reacted to the virus on the computer, so you need a third-party scanner. In such cases many experienced users turn to scanning utilities from well-known vendors. We can recommend the utility from Kaspersky Lab: https://www.kaspersky.com/downloads/thank-you/free-virus-removal-tool, or the one-off tool from Dr.WEB at https://free.drweb.ru/download+cureit+free/.
Both programs are free and portable, meaning they do not need to be installed. Download the utility, run it, and accept the terms of use. After some time the scan will finish and you can review the results. If one of your games or programs is flagged, remove it without hesitation, since it is the likely source of your problems. After cleaning the PC, restart the computer and launch Steam. If the error "Steam cannot run from a folder path with non-ASCII characters" appears again, reinstall the client.
What does the message "Steam fatal error: %appname% cannot run from a folder path with non-ASCII" mean, and what should you do? If you see this notification, a few simple steps are enough. There is no need to worry: in a few minutes the desktop client will be working again.
Where the error comes from
Note first that the problem can occur only for users of the desktop version of the program.
We start by decoding the message for a reason. If you translate the notification from English, you can grasp its meaning and see the way to solve the problem.
Don't worry: ASCII is just a table of characters used in programming. It contains no Russian letters; the only letters in it are Latin. Do you see what to do when Steam shows the error "%appname% cannot run from a folder path with non-ASCII characters"?
Exactly! A Russian letter has crept into the name of the folder where the desktop client is stored. There may also be Russian characters in the names of other folders along the path. Check for them and remove the offending characters.
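A single Russian letter can be hard to spot, because some Cyrillic letters look identical to Latin ones. As a small illustration (a sketch in Python 3.9+, with the hypothetical helper name non_ascii_chars), the following lists every character whose code point lies outside the ASCII range 0 through 127:

```python
def non_ascii_chars(text: str) -> list[tuple[str, int]]:
    """Return (character, code point) pairs for every character outside ASCII (0-127)."""
    return [(ch, ord(ch)) for ch in text if ord(ch) > 127]


# The first name is spelled with a Cyrillic "е" (U+0435), which looks like a Latin "e".
print(non_ascii_chars("St\u0435am"))
print(non_ascii_chars("Steam"))
```

An empty result means the name consists of ASCII characters only.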
Solution
The Steam error "%appname% cannot run from a folder path with non-ASCII" can occur if the path to the directory where the client is installed contains even one Russian letter, in the name of any folder on that path, whether it is the final one or not.
The solution is straightforward:
A simple example of a path:
The Steam error "%appname% cannot run from a folder path with non-ASCII" will no longer bother you. We found the source of the problem and eliminated it, so you can use the desktop client to the full.
34. Non-ASCII Characters
This chapter covers the special issues relating to non-ASCII characters and how they are stored in strings and buffers.
34.1. Text Representations
Emacs has two text representations—two ways to represent text in a string or buffer. These are called unibyte and multibyte. Each string, and each buffer, uses one of these two representations. For most purposes, you can ignore the issue of representations, because Emacs converts text between them as appropriate. Occasionally in Lisp programming you will need to pay attention to the difference.
In unibyte representation, each character occupies one byte and therefore the possible character codes range from 0 to 255. Codes 0 through 127 are ASCII characters; the codes from 128 through 255 are used for one non-ASCII character set (you can choose which character set by setting the variable nonascii-insert-offset).
In multibyte representation, a character may occupy more than one byte, and as a result, the full range of Emacs character codes can be stored. The first byte of a multibyte character is always in the range 128 through 159 (octal 0200 through 0237). These values are called leading codes. The second and subsequent bytes of a multibyte character are always in the range 160 through 255 (octal 0240 through 0377); these values are trailing codes.
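The byte ranges above are enough to find character boundaries in a byte sequence. The following Python sketch (Python 3.9+) only illustrates the stated rule and is not Emacs's implementation; the function names and the sample byte values are hypothetical.

```python
def classify_byte(b: int) -> str:
    """Classify a byte by the ranges given in the text."""
    if not 0 <= b <= 255:
        raise ValueError("not a byte")
    if b <= 127:
        return "ascii"      # 0 through 127
    if b <= 159:
        return "leading"    # 128 through 159 (octal 0200 through 0237)
    return "trailing"       # 160 through 255 (octal 0240 through 0377)


def split_characters(data: bytes) -> list[bytes]:
    """Group bytes into characters: a leading code starts a character that
    continues through the trailing codes after it; other bytes stand alone."""
    chars: list[bytes] = []
    i = 0
    while i < len(data):
        j = i + 1
        if classify_byte(data[i]) == "leading":
            while j < len(data) and classify_byte(data[j]) == "trailing":
                j += 1
        chars.append(data[i:j])
        i = j
    return chars


# Hypothetical sample: "a", one two-byte character, "b".
print(split_characters(bytes([0x61, 0x81, 0xE9, 0x62])))
```

Because leading and trailing codes occupy disjoint ranges, the start of each character can be recognized from the byte value alone.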
In a buffer, the buffer-local value of the variable enable-multibyte-characters specifies the representation used. The representation for a string is determined based on the string contents when the string is constructed.
You cannot set this variable directly; instead, use the function set-buffer-multibyte to change a buffer’s representation.
The `--unibyte’ command line option does its job by setting the default value of enable-multibyte-characters to nil early in startup.
Function: multibyte-string-p string
Return t if string contains multibyte characters.
34.2. Converting Text Representations
Emacs can convert unibyte text to multibyte; it can also convert multibyte text to unibyte, though this conversion loses information. In general these conversions happen when inserting text into a buffer, or when putting text from several strings together in one string. You can also explicitly convert a string’s contents to either representation.
Emacs chooses the representation for a string based on the text that it is constructed from. The general rule is to convert unibyte text to multibyte text when combining it with other multibyte text, because the multibyte representation is more general and can hold whatever characters the unibyte text has.
When inserting text into a buffer, Emacs converts the text to the buffer’s representation, as specified by enable-multibyte-characters in that buffer. In particular, when you insert multibyte text into a unibyte buffer, Emacs converts the text to unibyte, even though this conversion cannot in general preserve all the characters that might be in the multibyte text. The other natural alternative, to convert the buffer contents to multibyte, is not acceptable because the buffer’s representation is a choice made by the user that cannot be overridden automatically.
Converting multibyte text to unibyte is simpler: it performs logical-and of each character code with 255. If nonascii-insert-offset has a reasonable value, corresponding to the beginning of some character set, this conversion is the inverse of the other: converting unibyte text to multibyte and back to unibyte reproduces the original unibyte text.
Variable: nonascii-insert-offset
This variable specifies the amount to add to a non-ASCII character when converting unibyte text to multibyte. It also applies when self-insert-command inserts a character in the unibyte non-ASCII range, 128 through 255. However, the function insert-char does not perform this conversion.
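The arithmetic behind the two conversions can be sketched as follows. This is a simplified Python illustration of the rule as stated here, not Emacs code: the offset value 2048 is an assumed example, and the sketch assumes the offset is a multiple of 256, which is what makes the round trip exact.

```python
ASSUMED_OFFSET = 2048  # hypothetical value of nonascii-insert-offset; a multiple of 256


def unibyte_to_multibyte(code: int, offset: int = ASSUMED_OFFSET) -> int:
    """Add the offset to non-ASCII unibyte codes (128-255); leave ASCII unchanged."""
    if not 0 <= code <= 255:
        raise ValueError("unibyte codes range from 0 to 255")
    return code + offset if code >= 128 else code


def multibyte_to_unibyte(code: int) -> int:
    """Logical-and of the character code with 255, as described in the text."""
    return code & 255


# Converting to multibyte and back reproduces every unibyte code.
round_trip_ok = all(
    multibyte_to_unibyte(unibyte_to_multibyte(c)) == c for c in range(256)
)
print(round_trip_ok)
```

The reverse direction is lossy: many different multibyte codes share the same low eight bits, so multibyte text converted to unibyte and back is generally not recovered.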
Function: string-make-unibyte string
This function converts the text of string to unibyte representation, if it isn’t already, and returns the result. If string is a unibyte string, it is returned unchanged.
Function: string-make-multibyte string
This function converts the text of string to multibyte representation, if it isn’t already, and returns the result. If string is a multibyte string, it is returned unchanged.
34.3. Selecting a Representation
Sometimes it is useful to examine an existing buffer or string as multibyte when it was unibyte, or vice versa.
The function set-buffer-multibyte leaves the buffer contents unchanged when viewed as a sequence of bytes. As a consequence, it can change the contents viewed as characters; a sequence of two bytes which is treated as one character in multibyte representation will count as two characters in unibyte representation.
This function sets enable-multibyte-characters to record which representation is in use. It also adjusts various data in the buffer (including overlays, text properties and markers) so that they cover the same text as they did before.
Function: string-as-unibyte string
This function returns a string with the same bytes as string but treating each byte as a character. This means that the value may have more characters than string has.
If string is unibyte already, then the value is string itself.
Function: string-as-multibyte string
This function returns a string with the same bytes as string but treating each multibyte sequence as one character. This means that the value may have fewer characters than string has.
If string is multibyte already, then the value is string itself.
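The effect on character counts can be seen by analogy in Python, where the same bytes can be counted either byte by byte or as decoded characters. This is only an analogy: it uses UTF-8 rather than the Emacs internal encoding, and the sample text is arbitrary.

```python
text = "na\u00efve"           # five characters; U+00EF takes two bytes in UTF-8
raw = text.encode("utf-8")     # the same content viewed as a sequence of bytes

chars_as_multibyte = len(raw.decode("utf-8"))  # each multibyte sequence counts once
chars_as_unibyte = len(raw)                    # each byte counts as one character
print(chars_as_multibyte, chars_as_unibyte)
```

The bytes are identical in both views; only the way they are grouped into characters differs.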
34.4. Character Codes
The unibyte and multibyte text representations use different character codes. The valid character codes for unibyte representation range from 0 to 255, the values that can fit in one byte. The valid character codes for multibyte representation range from 0 to 524287, but not all values in that range are valid. In particular, the values 128 through 255 are not legitimate in multibyte text (though they can occur in “raw bytes”; see section Explicit Encoding and Decoding). Only the ASCII codes 0 through 127 are fully legitimate in both representations.
Function: char-valid-p charcode
This returns t if charcode is valid for either one of the two text representations.
34.5. Character Sets
Emacs classifies characters into various character sets, each of which has a name which is a symbol. Each character belongs to one and only one character set.
Function: charsetp object
Return t if object is a character set name symbol, nil otherwise.
Function: charset-list
This function returns a list of all defined character set names.
Function: char-charset character
This function returns the name of the character set that character belongs to.
34.6. Characters and Bytes
In multibyte representation, each character occupies one or more bytes. Each character set has an introduction sequence, which is normally one or two bytes long. (Exception: the ASCII character set has a zero-length introduction sequence.) The introduction sequence is the beginning of the byte sequence for any character in the character set. The rest of the character’s bytes distinguish it from the other characters in the same character set. Depending on the character set, there are either one or two distinguishing bytes; the number of such bytes is called the dimension of the character set.
Function: charset-dimension charset
This function returns the dimension of charset; at present, the dimension is always 1 or 2.
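The relationship between introduction sequence, dimension, and total byte length is simple addition. The Python sketch below models it with a hypothetical Charset record; the character set names and introduction lengths are invented for illustration and are not taken from Emacs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Charset:
    name: str
    intro_length: int  # bytes in the introduction sequence (0 for ASCII)
    dimension: int     # distinguishing bytes per character: 1 or 2

    def bytes_per_char(self) -> int:
        """Total bytes per character: introduction sequence plus distinguishing bytes."""
        return self.intro_length + self.dimension


# Hypothetical character sets; only the ASCII values follow from the text.
ascii_set = Charset("ascii", intro_length=0, dimension=1)
two_dim = Charset("example-2d", intro_length=1, dimension=2)
print(ascii_set.bytes_per_char(), two_dim.bytes_per_char())
```

Read the other way, the introduction length is the total byte length of a character minus the dimension of its character set.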
This is the simplest way to determine the byte length of a character set’s introduction sequence:
34.7. Splitting Characters
The functions in this section convert between characters and the byte values used to represent them. For most purposes, there is no need to be concerned with the sequence of bytes used to represent a character, because Emacs translates automatically when necessary.
The reason this function can give correct results for both multibyte and unibyte representations is that the non-ASCII character codes used in those two representations do not overlap.
Unibyte non-ASCII characters are considered as part of the ascii character set:
34.8. Scanning for Character Sets
Sometimes it is useful to find out which character sets appear in a part of a buffer or a string. One use for this is in determining which coding systems (see section Coding Systems) are capable of representing all of the text in question.
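The same question, which encodings can represent all of a given text, can be illustrated outside Emacs. The Python sketch below tries each candidate codec in turn; it is an analogy using Python's standard codecs, and the helper name and candidate list are assumptions.

```python
def codecs_that_can_encode(text: str, candidates: list[str]) -> list[str]:
    """Return the candidate codec names able to encode every character of text."""
    usable = []
    for name in candidates:
        try:
            text.encode(name)
        except UnicodeEncodeError:
            continue  # at least one character is not representable in this codec
        usable.append(name)
    return usable


candidates = ["ascii", "latin_1", "iso8859_5", "koi8_r", "utf_8"]
print(codecs_that_can_encode("Привет", candidates))
print(codecs_that_can_encode("hello", candidates))
```

Text containing Cyrillic letters rules out the ASCII and Latin-1 candidates, while pure ASCII text is accepted by all of them.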
34.9. Translation of Characters
A translation table specifies a mapping of characters into characters. These tables are used in encoding and decoding, and for other purposes. Some coding systems specify their own particular translation tables; there are also default translation tables which apply to all other coding systems.
You can also map one whole character set into another character set with the same dimension. To do this, you specify a generic character (which designates a character set) for from (see section Splitting Characters). In this case, to should also be a generic character, for another character set of the same dimension. Then the translation table translates each character of from’s character set into the corresponding character of to’s character set.
Variable: standard-character-translation-table-for-decode
This is the default translation table for decoding, for coding systems that don’t specify any other translation table.
Variable: standard-character-translation-table-for-encode
This is the default translation table for encoding, for coding systems that don’t specify any other translation table.
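The idea of a table that maps characters into characters also exists in Python as str.maketrans and str.translate. The sketch below is an analogy rather than the Emacs mechanism, and the mapping itself is a made-up example.

```python
# Hypothetical mapping: three Cyrillic letters to their look-alike Latin letters.
table = str.maketrans({"\u0430": "a", "\u0435": "e", "\u043e": "o"})


def translate(text: str) -> str:
    """Map each character through the table; unmapped characters pass through."""
    return text.translate(table)


print(translate("\u043e\u0435\u0430x"))
```

Characters that have no entry in the table are left unchanged, which is why the trailing Latin "x" survives.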
34.10. Coding Systems
When Emacs reads or writes a file, and when Emacs sends text to a subprocess or receives text from a subprocess, it normally performs character code conversion and end-of-line conversion as specified by a particular coding system.
34.10.1. Basic Concepts of Coding Systems
Character code conversion involves conversion between the encoding used inside Emacs and some other encoding. Emacs supports many different encodings, in that it can convert to and from them. For example, it can convert text to or from encodings such as Latin 1, Latin 2, Latin 3, Latin 4, Latin 5, and several variants of ISO 2022. In some cases, Emacs supports several alternative encodings for the same characters; for example, there are three coding systems for the Cyrillic (Russian) alphabet: ISO, Alternativnyj, and KOI8.
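To see why the choice of coding system matters, the sketch below encodes one Cyrillic letter with three Python codecs. The correspondence between the coding systems named above and the Python codec names (ISO to iso8859_5, Alternativnyj to cp866, KOI8 to koi8_r) is an assumption of this example.

```python
# Assumed Python equivalents of the three Cyrillic coding systems in the text.
CYRILLIC_CODECS = ["iso8859_5", "cp866", "koi8_r"]

letter = "\u042f"  # CYRILLIC CAPITAL LETTER YA
encoded = {name: letter.encode(name) for name in CYRILLIC_CODECS}

for name, raw in encoded.items():
    # Each codec uses a different single byte, yet decodes back to the same letter.
    print(name, raw.hex(), raw.decode(name) == letter)
```

Decoding bytes with a codec other than the one that produced them yields the wrong characters, which is why the coding system has to be known or detected.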
Most coding systems specify a particular character code for conversion, but some of them leave this unspecified—to be chosen heuristically based on the data.
End of line conversion handles three different conventions used on various systems for representing end of line in files. The Unix convention is to use the linefeed character (also called newline). The DOS convention is to use the two character sequence, carriage-return linefeed, at the end of a line. The Mac convention is to use just carriage-return.
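The three conventions can be told apart and normalized with a few byte-level operations. This is a minimal Python sketch of the idea; the function names are hypothetical and the detection is deliberately naive.

```python
def detect_eol(data: bytes) -> str:
    """Guess the end-of-line convention of data: 'dos', 'mac', 'unix', or 'none'."""
    if b"\r\n" in data:
        return "dos"    # carriage-return linefeed
    if b"\r" in data:
        return "mac"    # carriage-return only
    if b"\n" in data:
        return "unix"   # linefeed only
    return "none"


def to_unix(data: bytes) -> bytes:
    """Convert DOS or Mac line endings to the Unix convention (linefeed only)."""
    # Replace the two-byte sequence first so its carriage-return is not counted twice.
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


print(detect_eol(b"one\r\ntwo\r\n"), to_unix(b"one\r\ntwo\r\n"))
```

The order of the checks matters: the DOS sequence contains both other markers, so it has to be tested first.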
The coding system raw-text is special in that it prevents character code conversion, and causes the buffer visited with that coding system to be a unibyte buffer. It does not specify the end-of-line conversion, allowing that to be determined as usual by the data, and has the usual three variants which specify the end-of-line conversion. no-conversion is equivalent to raw-text-unix: it specifies no conversion of either character codes or end-of-line.
The coding system emacs-mule specifies that the data is represented in the internal Emacs encoding. This is like raw-text in that no code conversion happens, but different in that the result is multibyte data.
The value of the mime-charset property is also defined as an alias for the coding system.
34.10.2. Encoding and I/O
The principal purpose of coding systems is for use in reading and writing files. The function insert-file-contents uses a coding system for decoding the file data, and write-region uses one to encode the buffer contents.
You can specify the coding system to use either explicitly (see section Specifying a Coding System for One Operation), or implicitly using the defaulting mechanism (see section Default Coding Systems). But these methods may not completely specify what to do. For example, they may choose a coding system such as undecided which leaves the character code conversion to be determined from the data. In these cases, the I/O operation finishes the job of choosing a coding system. Very often you will want to find out afterwards which coding system was chosen.
Variable: last-coding-system-used
I/O operations for files and subprocesses set this variable to the coding system name that was used. The explicit encoding and decoding functions (see section Explicit Encoding and Decoding) set it too.
Warning: Since receiving subprocess output sets this variable, it can change whenever Emacs waits; therefore, you should copy the value shortly after the function call which stores the value you are interested in.
The variable selection-coding-system specifies how to encode selections for the window system. See section Window System Selections.
34.10.3. Coding Systems in Lisp
Here are Lisp facilities for working with coding systems:
Function: coding-system-p object
This function returns t if object is a coding system name.
Function: detect-coding-string string highest
This function is like detect-coding-region except that it operates on the contents of string instead of bytes in the buffer.
See section Process Information, for how to examine or set the coding systems used for I/O to a subprocess.
34.10.4. User-Chosen Coding Systems
The optional argument preferred-coding-system specifies a coding system to try first. If that one can handle the text in the specified region, then it is used. If this argument is omitted, the current buffer’s value of buffer-file-coding-system is tried first.
If the region contains some multibyte characters that the preferred coding system cannot encode, this function asks the user to choose from a list of coding systems which can encode the text, and returns the user’s choice.
One other kludgy feature: if from is a string, the string is the target text, and to is ignored.
Here are two functions you can use to let the user specify a coding system, with completion. See section Completion.
34.10.5. Default Coding Systems
This section describes variables that specify the default coding system for certain files or when running certain subprograms, and the function that I/O operations use to access them.
The idea of these variables is that you set them once and for all to the defaults you want, and then do not change them again. To specify a particular coding system for a particular operation in a Lisp program, don’t change these variables; instead, override them using coding-system-for-read and coding-system-for-write (see section Specifying a Coding System for One Operation).
If val is a function symbol, the function must return a coding system or a cons cell containing two coding systems. This value is used as described above.
Warning: Coding systems such as undecided which determine the coding system from the data do not work entirely reliably with asynchronous subprocess output. This is because Emacs handles asynchronous subprocess output in batches, as it arrives. If the coding system leaves the character code conversion unspecified, or leaves the end-of-line conversion unspecified, Emacs must try to detect the proper conversion from one batch at a time, and this does not always work.
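The problem with detecting from one batch at a time can be reproduced with a naive end-of-line detector. The Python sketch below is an illustration only: the detector and the way the data is split into batches are hypothetical, not how Emacs detects conversions.

```python
def detect_eol(data: bytes) -> str:
    """Naively guess the end-of-line convention: 'dos', 'mac', 'unix', or 'none'."""
    if b"\r\n" in data:
        return "dos"
    if b"\r" in data:
        return "mac"
    if b"\n" in data:
        return "unix"
    return "none"


# Hypothetical subprocess output that arrives split in the middle of a DOS line ending.
batches = [b"first line\r", b"\nsecond line\r\n"]

per_batch = [detect_eol(batch) for batch in batches]
whole = detect_eol(b"".join(batches))
print(per_batch, whole)
```

The first batch ends with a lone carriage-return, so looking at it in isolation suggests the Mac convention even though the complete output uses the DOS one.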
Variable: default-process-coding-system
This variable specifies the coding systems to use for subprocess (and network stream) input and output, when nothing else specifies what to do.
34.10.6. Specifying a Coding System for One Operation
It also applies to any asynchronous subprocess or network stream, but in a different way: the value of coding-system-for-read when you start the subprocess or open the network stream specifies the input decoding method for that subprocess or network stream. It remains in use for that subprocess or network stream unless and until overridden.
34.10.7. Explicit Encoding and Decoding
All the operations that transfer text in and out of Emacs have the ability to use a coding system to encode or decode the text. You can also explicitly encode and decode text using the functions in this section.
The result of encoding, and the input to decoding, are not ordinary text. They are “raw bytes”: bytes that represent text in the same way that an external file would. When a buffer contains raw bytes, it is most natural to mark that buffer as using unibyte representation, using set-buffer-multibyte (see section Selecting a Representation), but this is not required. If the buffer’s contents are only temporarily raw, leave the buffer multibyte, which will be correct after you decode them.
Raw bytes sometimes contain overlong byte-sequences that look like a proper multibyte character plus extra bytes containing trailing codes. For most purposes, Emacs treats such a sequence in a buffer or string as a single character, and if you look at its character code, you get the value that corresponds to the multibyte character sequence—the extra bytes are disregarded. This behavior is not quite clean, but raw bytes are used only in limited places in Emacs, so as a practical matter problems can be avoided.
34.10.8. Terminal I/O Encoding
Emacs can decode keyboard input using a coding system, and encode terminal output. This is useful for terminals that transmit or display text using a particular encoding such as Latin-1. Emacs does not set last-coding-system-used for encoding or decoding for the terminal.
Function: keyboard-coding-system
This function returns the coding system that is in use for decoding keyboard input, or nil if no coding system is to be used.
Function: terminal-coding-system
This function returns the coding system that is in use for encoding terminal output, or nil for no encoding.
34.10.9. MS-DOS File Types
Emacs on MS-DOS and on MS-Windows recognizes certain file names as text files or binary files. By “binary file” we mean a file of literal byte values that are not necessarily meant to be characters. Emacs does no end-of-line conversion and no character code conversion for a binary file. Meanwhile, when you create a new file which is marked by its name as a “text file”, Emacs uses DOS end-of-line conversion.
Normally this variable is set by visiting a file; it is set to nil if the file was visited without any actual conversion.
Emacs when running on MS-DOS or MS-Windows checks this alist to decide which coding system to use when reading a file. For a text file, undecided-dos is used. For a binary file, no-conversion is used.
If no element in this alist matches a given file name, then default-buffer-file-type says how to treat the file.
User Option: default-buffer-file-type
This variable says how to handle files for which file-name-buffer-file-type-alist says nothing about the type.
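The lookup described above, a list of file-name patterns with a fallback default, can be sketched in Python as follows. The patterns, the default value, and the first-match-wins rule are assumptions made for illustration; they are not Emacs's actual entries or algorithm.

```python
import re

# Hypothetical entries: (file-name pattern, is_binary).
FILE_TYPE_ALIST = [
    (re.compile(r"\.(exe|zip)$", re.IGNORECASE), True),
    (re.compile(r"\.(txt|bat)$", re.IGNORECASE), False),
]
DEFAULT_IS_BINARY = False  # stand-in for default-buffer-file-type; value assumed


def coding_system_for(file_name: str) -> str:
    """Return the coding system name the text associates with the file's type."""
    for pattern, is_binary in FILE_TYPE_ALIST:
        if pattern.search(file_name):
            break  # assumed rule: the first matching entry decides
    else:
        is_binary = DEFAULT_IS_BINARY  # no entry matched, fall back to the default
    return "no-conversion" if is_binary else "undecided-dos"


print(coding_system_for("GAME.EXE"), coding_system_for("notes.txt"), coding_system_for("README"))
```

A file name that matches no pattern is handled according to the default, mirroring the role of default-buffer-file-type.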
34.11. Input Methods
Each input method has a name, which is currently a string; in the future, symbols may also be usable as input method names.
Variable: current-input-method
This variable holds the name of the input method now active in the current buffer. (It automatically becomes local in each buffer when set in any fashion.) It is nil if no input method is active in the buffer now.
The returned value is a string.
Variable: input-method-alist
This variable defines all the supported input methods. Each element defines one input method, and should have the form:
Here input-method is the input method name, a string; language-env is another string, the name of the language environment this input method is recommended for. (That serves only for documentation purposes.)
title is a string to display in the mode line while this method is active. description is a string describing this method and what it is good for.