
Introduction to a multi-byte character extension



In a separate mail, I posted a message titled
"A multi-byte character extension proposal",
which is 740 lines long.

I have been thinking about this problem
since I posted mails on kanji manipulation in May 1986.
I organized the kanji WG under the Jeida Common Lisp committee,
and I welcomed Dr. Motoyoshi from the Electro-technical Laboratory
as the chair of the WG.
I asked him and his group to make a proposal.
Their proposal first appeared at an academic meeting
(IPSJ SIGSYM, Jan. 1987), was then reported in the annual report
of Jeida, and was finally refined into the English version I posted
as a mail.
I believe the contents are worth distributing in the USA
so that we can receive criticism and suggestions,
as a base for a multi-byte character manipulation extension
of Common Lisp.
It is a report of our kanji WG,
intended as a guideline for multi-byte character manipulation
in Common Lisp. It is not my work alone, but reflects the
contributions of all the members of the kanji WG.
	kanji WG members:
                IKEO, J. (Fuji Xerox Co., Ltd.)
                KIMURA, K. (Toshiba Corp.)
                MURAYAMA, K. (Nippon Univac Kaisha, Ltd.)
                NAKAMURA, S. (Fujitsu, Ltd.)
                OKA, M. (Japan Radio Co., Ltd.)
                SAKAIBARA, K. (Hitachi Co., Ltd.)
                SHIOTA, E. (Nihon Symbolics Corp.)
                SUGIMURA, T. (Nippon Telegraph and Telephone Corp.)

Special thanks go to Mr. Shiota of Nihon Symbolics,
who made the first English version.

It consists of a digest, four chapters, and one appendix:
   Digest
   1. A proposal for embedding multi-byte characters
   2. Additional features for embedding multi-byte characters
   3. Common parts which we implement
   4. Proposed changes to CLtL to support multiple character sets
   Appendix Proposed Japanese character processing facilities for Common Lisp

We tried to separate the general issues of multi-byte character extension
from Japanese character manipulation.
We hope this specification will also suit the handling of other character sets.
Our proposal is based on the clear specification of the character data type
in CLtL.

Roughly speaking:
First, we give the char-code-limit constant a value larger than 256,
perhaps 65536 or larger.
Second, for implementations that do not need
a multi-byte character feature, we propose internal-thin-things.
Third, to handle multi-byte characters properly, we propose
a write-width function.
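
As a small illustration of the first point, here is a sketch that is not
taken from the proposal text. It uses only the standard char-code-limit
constant and the standard char-code and count-if-not functions. The three
function names are made up for this example, and treating "code below 256"
as "single-byte" is an assumption of the sketch, not a definition from the
proposal.

```lisp
;; Sketch only. MULTI-BYTE-CAPABLE-P, SINGLE-BYTE-CHAR-P and
;; COUNT-WIDE-CHARS are hypothetical names, not part of the proposal.

(defun multi-byte-capable-p ()
  "True when this implementation allows character codes above 255."
  (> char-code-limit 256))

(defun single-byte-char-p (ch)
  "True when the code of CH fits in one byte (assumed to mean below 256)."
  (< (char-code ch) 256))

(defun count-wide-chars (string)
  "Count the characters in STRING whose codes do not fit in one byte."
  (count-if-not #'single-byte-char-p string))
```

A program could call (multi-byte-capable-p) once at load time to decide
whether kanji strings can be represented at all, and use count-wide-chars
to see how many characters of a given string fall outside the
single-byte range.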

We did not reach a conclusion on the font and style issue,
and our proposal is not directly concerned with it.


We welcome any comments on it.
But we cannot reply quickly to ARPA.
So please do not expect a hasty answer, and excuse us for that.
We will answer the important questions,
keeping in mind our limited budget for communication.

Masayuki Ida