Wikistandards:Community Portal
Check the Foreign Language Resource Center for a comprehensive listing of Standards for Content Creation and Globalization.
ISO-639-3 special codes
There is content that is for all intents and purposes not language specific.
There are specific codes for this kind of content:
- mul - Multiple languages
- und - Undetermined
- zxx - No linguistic content
What codes should I use for things like chemical formulae? Thanks, GerardM 17:18, 18 July 2006 (CEST)
- The "zxx" code seems inappropriate since it indicates that the language of the tagged resource has no written form. I think it would be used to tag certain audio clips (or perhaps video clips of sign languages?). Likewise, the "und" code seems inappropriate because it indicates that the language of the tagged resource is not known at the time of tagging but may, perhaps, be later determined and even retagged. That seems to leave "mul" as the best choice. Rodasmith 19:48, 18 July 2006 (CEST)
- "mul" follows the idea that formulae are embedded in many languages, and thus are indeed somehow multilingual.
- On the other hand, "mul" was, iirc, created for labelling containers of mixed-language content, such as in this (X)HTML fragment:
<div lang="mul">
<span lang="en">So one can say, the french</span>
<span lang="fr">Oui</span>
<span lang="en">does not always translate to either</span>
<span lang="it">si</span>,
<span lang="es">si</span>,
<span lang="en">or</span>
<span lang="de">ja</span>.
</div>
- Which makes "mul" appear inappropriate. Note that, in the above example, cascading style sheets (CSS) could be employed to render sample words of different languages distinctly, so a reader might, for example, be presented Italian text in italics on a yellow background, German on soft green, Spanish on light pink, etc.
- If "zxx" is indeed reserved for non-written or non-alphanumeric content, my suggestion is to reserve a unique code for formulae, figures, etc.
- In multilingual and translational contexts, these data, like "zxx" but possibly unlike "mul" and "und", need at times to be transcribed, but never to be translated, etc.
- In a spellchecking context, these data either need to be excluded from the spellchecking process, or need a very special 'spellcheck', depending on the individual type of formula, or data, in the container.
- Regarding typesetting and presentational matters, various special rules apply, again depending on the exact class of data.
- All this suggests, imho, using either "zxx" or a new code, and establishing an additional list of possible subtypes of homogeneous treatment, possibly resembling the possibility of adding ISO 3166 country codes to language codes. Since this is a complicated matter, to quite some extent outside the scope of linguistics, I suggest leaving subdivisions to a separate norm, which should bring together scientific needs, purely technical aspects of data processing and transcription, and linguistic and presentational matters.
- If my view is shared by others, this would need further discussion. Where and how to start it? --Purodha 15:53, 2 February 2007 (CET)
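The CSS remark in the thread above can be illustrated with a minimal sketch. It assumes the standard CSS `:lang()` selector and reuses the `lang` attributes from the (X)HTML fragment; the exact colour values are arbitrary choices for illustration and are not taken from the discussion.

```html
<style>
  /* Colour values below are illustrative assumptions only. */
  span:lang(it) { font-style: italic; background-color: yellow; }  /* Italian: italics on yellow */
  span:lang(de) { background-color: #d8f0d8; }                     /* German: soft green */
  span:lang(es) { background-color: #ffe0e8; }                     /* Spanish: light pink */
</style>

<div lang="mul">
  <span lang="it">si</span>,
  <span lang="es">si</span>,
  <span lang="de">ja</span>
</div>
```

The container keeps its "mul" tag while each inner span is matched by its own language, which is the mixed-content use of "mul" described in the thread.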
Original research
Hi all, since we are professionals, in my view, original research shouldn't be forbidden here. I mean 'original research' in the sense used by the English Wikipedia: http://en.wikipedia.org/wiki/wikipedia:No_original_research. In my opinion we should perhaps discuss that explicitly. Best, Claudi.--81.74.187.124 18:24, 18 February 2007 (CET) ...That's me - I wasn't logged in. Sorry,--Clamengh 20:22, 20 February 2007 (CET)
