UniWiki:Manual of Style/Text formatting: Difference between revisions

Line 186:

===PUA characters===

~~{{shortcut|MOS:PUA}}~~

[[Wikipedia:Private Use Area|Private Use Area]] (PUA) characters are in three ranges of code points (<code>U+E000</code>–<code>U+F8FF</code> in the [[Wikipedia:Plane (Unicode)#Basic Multilingual Plane|BMP]], and in [[Wikipedia:Plane (Unicode)#Private Use Area planes|planes 15 and 16]]). PUA characters should normally be avoided, but they are sometimes used when they are found in common fonts.

[[Private Use Area]] (PUA) characters are in three ranges of code points (<code>U+E000</code>–<code>U+F8FF</code> in the [[Plane (Unicode)#Basic Multilingual Plane|BMP]], and in [[Plane (Unicode)#Private Use Area planes|planes 15 and 16]]). PUA characters should normally be avoided, but they are sometimes used when they are found in common fonts~~, especially when the character itself is the topic of discussion~~.

Where PUA characters cannot be replaced with non-PUA Unicode characters, they should be converted to their (hexa)decimal code values (that is, &#...; or &#x...;). However, whenever a PUA character has a Unicode equivalent, it should instead be replaced with that equivalent (Unicodified). The Unicode may be obvious when text is copied and pasted from a document that uses the PUA for bullets or similar characters in Latin text, but similar things happen with punctuation and emoticons in documents using Japanese and other scripts, so an editor familiar with those scripts may be needed~~. In Chinese documents it's not uncommon for the PUA to be used for characters that now have full Unicode support, due to poorer support for Chinese characters when those fonts were designed~~. Such PUA characters, which are sometimes found on ~~Wikipedia~~ in references and footnotes, should not be substituted with their (hexa)decimal values, as that will lock in the illegible character. If you're moderately familiar with the script, an internet search of the surrounding text will often locate a fully Unicode version of the text which can be used to correct the ~~Wikipedia~~ article.

Where PUA characters cannot be replaced with non-PUA Unicode characters, they should be converted to their (hexa)decimal code values (that is, &#...; or &#x...;). However, whenever a PUA character has a Unicode equivalent, it should instead be replaced with that equivalent (Unicodified). The Unicode may be obvious when text is copied and pasted from a document that uses the PUA for bullets or similar characters in Latin text, but similar things happen with punctuation and emoticons in documents using Japanese and other scripts, so an editor familiar with those scripts may be needed. Such PUA characters, which are sometimes found on the UniWiki in references and footnotes, should not be substituted with their (hexa)decimal values, as that will lock in the illegible character. If you're moderately familiar with the script, an internet search of the surrounding text will often locate a fully Unicode version of the text which can be used to correct the UniWiki article.
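The fallback conversion described above can be sketched in Python. This is a minimal illustration, not a UniWiki tool: the names <code>PUA_RANGES</code>, <code>is_pua</code> and <code>pua_to_hex_refs</code> are hypothetical, and the plane 15 and 16 boundaries are taken from the Unicode Standard rather than from this page. It covers only the case where no Unicode equivalent exists; characters that do have an equivalent should still be replaced by hand.

```python
# Assumption: the three Private Use Area ranges as defined by the Unicode Standard.
PUA_RANGES = (
    (0xE000, 0xF8FF),      # BMP Private Use Area
    (0xF0000, 0xFFFFD),    # plane 15 (Supplementary Private Use Area-A)
    (0x100000, 0x10FFFD),  # plane 16 (Supplementary Private Use Area-B)
)


def is_pua(ch: str) -> bool:
    """Return True if the single character ch lies in one of the PUA ranges."""
    cp = ord(ch)
    return any(lo <= cp <= hi for lo, hi in PUA_RANGES)


def pua_to_hex_refs(text: str) -> str:
    """Replace each PUA character with its hexadecimal reference (&#x...;).

    Non-PUA characters are passed through unchanged.
    """
    return "".join(f"&#x{ord(ch):x};" if is_pua(ch) else ch for ch in text)
```

For instance, <code>pua_to_hex_refs("a\uf267b")</code> leaves the Latin letters alone and rewrites only the middle character as a hexadecimal reference.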

Because browsers do not know which fonts to use for PUA characters, it is necessary for ~~Wikipedia~~ to specify them. ~~{{tl|Unicode}} or {{tl|IPA}} formatting is sufficient in some cases. Otherwise the~~ fonts should be specified through html markup, as in the example below. Note that if a font is not specified, or if none of the fonts are installed, readers will only see a numbered box in place of the PUA character.

Because browsers do not know which fonts to use for PUA characters, it is necessary for the UniWiki to specify them. The fonts should be specified through html markup, as in the example below. Note that if a font is not specified, or if none of the fonts are installed, readers will only see a numbered box in place of the PUA character.

Tagging a (hexa)decimal code with the template {{tl|PUA}} will enable future editors to review the page, and to Unicodify the character if it is included in future expansions of Unicode. This happened, for example, at [[strident vowel]], where a non-Unicode symbol for the sound was used in the literature and added to the PUA of SIL's IPA fonts. Unicode didn't support it until several years after the Wikipedia article was written, and once the fonts were updated to support it, the PUA character in the article was replaced with its new Unicode value.

~~For example,~~

:{{xt|1=<code><nowiki>SIL added these letters at U+F267 and U+F268: <span style="font-family:Gentium Plus, Charis SIL, Doulos SIL, serif">{{PUA|&amp;#xf267;}}, {{PUA|&amp;#xf268;}}</span>.</nowiki></code>}}
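Markup of this shape could also be generated rather than typed by hand. The following Python sketch is only an illustration under that assumption: <code>pua_wikitext</code> is a hypothetical helper name, and it simply reproduces the span and <code><nowiki>{{PUA|...}}</nowiki></code> pattern of the example above for a given list of code points and fonts.

```python
def pua_wikitext(code_points, fonts):
    """Build a font-specifying span that wraps each code point in {{PUA|...}}.

    code_points: iterable of integers, e.g. [0xF267, 0xF268]
    fonts: list of font-family names, most preferred first
    """
    # Doubled braces in the f-string emit the literal {{ and }} of the template call.
    refs = ", ".join(f"{{{{PUA|&#x{cp:x};}}}}" for cp in code_points)
    font_list = ", ".join(fonts)
    return f'<span style="font-family:{font_list}">{refs}</span>'
```

Called with <code>[0xF267, 0xF268]</code> and the font list from the example, it returns the same span that appears there.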

~~which renders as:~~

~~:SIL added these letters at U+F267 and U+F268: <span style="font-family:Gentium Plus, Charis SIL, Doulos SIL, serif">{{PUA|&#xf267;}}, {{PUA|&#xf268;}}</span>.~~

~~{{crossref|See [[:Category:Articles with wanted PUA characters]] and especially [[Tengwar#Unicode]] for examples of PUA characters which cannot easily be replaced.}}~~

== See also ==

@@ Line 186: / Line 186: @@
 ===PUA characters===
-{{shortcut|MOS:PUA}}
+[[Wikipedia:Private Use Area|Private Use Area]] (PUA) characters are in three ranges of code points (<code>U+E000</code>–<code>U+F8FF</code> in the [[Wikipedia:Plane (Unicode)#Basic Multilingual Plane|BMP]], and in [[Wikipedia:Plane (Unicode)#Private Use Area planes|planes 15 and 16]]). PUA characters should normally be avoided, but they are sometimes used when they are found in common fonts.
-[[Private Use Area]] (PUA) characters are in three ranges of code points (<code>U+E000</code>–<code>U+F8FF</code> in the [[Plane (Unicode)#Basic Multilingual Plane|BMP]], and in [[Plane (Unicode)#Private Use Area planes|planes 15 and 16]]). PUA characters should normally be avoided, but they are sometimes used when they are found in common fonts, especially when the character itself is the topic of discussion.
-Where PUA characters cannot be replaced with non-PUA Unicode characters, they should be converted to their (hexa)decimal code values (that is, &#...; or &#x...;). However, whenever a PUA character has a Unicode equivalent, it should instead be replaced with that equivalent (Unicodified).  The Unicode may be obvious when text is copied and pasted from a document that uses the PUA for bullets or similar characters in Latin text, but similar things happen with punctuation and emoticons in documents using Japanese and other scripts, so an editor familiar with those scripts may be needed.  In Chinese documents it's not uncommon for the PUA to be used for characters that now have full Unicode support, due to poorer support for Chinese characters when those fonts were designed.  Such PUA characters, which are sometimes found on Wikipedia in references and footnotes, should not be substituted with their (hexa)decimal values, as that will lock in the illegible character. If you're moderately familiar with the script, an internet search of the surrounding text will often locate a fully Unicode version of the text which can be used to correct the Wikipedia article.
+Where PUA characters cannot be replaced with non-PUA Unicode characters, they should be converted to their (hexa)decimal code values (that is, &#...; or &#x...;). However, whenever a PUA character has a Unicode equivalent, it should instead be replaced with that equivalent (Unicodified).  The Unicode may be obvious when text is copied and pasted from a document that uses the PUA for bullets or similar characters in Latin text, but similar things happen with punctuation and emoticons in documents using Japanese and other scripts, so an editor familiar with those scripts may be needed.  Such PUA characters, which are sometimes found on the UniWiki in references and footnotes, should not be substituted with their (hexa)decimal values, as that will lock in the illegible character. If you're moderately familiar with the script, an internet search of the surrounding text will often locate a fully Unicode version of the text which can be used to correct the UniWiki article.
-Because browsers do not know which fonts to use for PUA characters, it is necessary for Wikipedia to specify them. {{tl|Unicode}} or {{tl|IPA}} formatting is sufficient in some cases.  Otherwise the fonts should be specified through html markup, as in the example below.  Note that if a font is not specified, or if none of the fonts are installed, readers will only see a numbered box in place of the PUA character.
+Because browsers do not know which fonts to use for PUA characters, it is necessary for the UniWiki to specify them. The fonts should be specified through html markup, as in the example below.  Note that if a font is not specified, or if none of the fonts are installed, readers will only see a numbered box in place of the PUA character.
-Tagging a (hexa)decimal code with the template {{tl|PUA}} will enable future editors to review the page, and to Unicodify the character if it is included in future expansions of Unicode. This happened, for example, at [[strident vowel]], where a non-Unicode symbol for the sound was used in the literature and added to the PUA of SIL's IPA fonts.  Unicode didn't support it until several years after the Wikipedia article was written, and once the fonts were updated to support it, the PUA character in the article was replaced with its new Unicode value.
-For example,
-:{{xt|1=<code><nowiki>SIL added these letters at U+F267 and U+F268: <span style="font-family:Gentium Plus, Charis SIL, Doulos SIL, serif">{{PUA|&amp;#xf267;}}, {{PUA|&amp;#xf268;}}</span>.</nowiki></code>}}
-which renders as:
-:SIL added these letters at U+F267 and U+F268: <span style="font-family:Gentium Plus, Charis SIL, Doulos SIL, serif">{{PUA|&#xf267;}}, {{PUA|&#xf268;}}</span>.
-{{crossref|See [[:Category:Articles with wanted PUA characters]] and especially [[Tengwar#Unicode]] for examples of PUA characters which cannot easily be replaced.}}
 == See also ==