Composite Character Issue
-
Wednesday, January 20, 2010 5:20 AM
Hi All,
When inter-character spacing is applied to a label, the text is split into individual characters, so each code point is drawn separately and combining sequences no longer render as a single composite character. The problem is not specific to right-to-left scripts (Arabic, for example: يُساوِي); it also occurs with left-to-right scripts (Hindi, for example: ठऑक्षझॉ).
Please let us know whether there is any API that identifies which runs of characters in a given string together form a composite character.
Thanks in advance,
Rajesh Reddy
All Replies
-
Thursday, January 21, 2010 7:10 AM
Hello Rajesh,
Welcome to MSDN forum.
Have you tried the WideCharToMultiByte Function? As said in the documentation,
"The WC_COMPOSITECHECK flag causes the WideCharToMultiByte function to test for decomposed Unicode characters and attempts to compose them before converting them to the requested code page."
More info
http://msdn.microsoft.com/en-us/library/dd374130(VS.85).aspx
Regards,
Rong-Chun Zhang
MSDN Subscriber Support in Forum
If you have any feedback on our support, please contact msdnmg@microsoft.com
Please remember to mark the replies as answers if they help and unmark them if they provide no help.
Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
-
Thursday, January 21, 2010 9:16 AM
I tried the WideCharToMultiByte function, but it did not help. My exact problem is that I want to add inter-character spacing to a label. For example:
English label example:
My label is: ABCDEF
The label I need is: A B C D E F
This example is easy because it contains no composite characters.
Hindi label example:
My label is: ठऑक्षझॉ
The label I need is: ठ ऑ क्ष झॉ
Arabic label example:
My label is: يُساوِي
The label I need is: يُ س اوِ ي [This only illustrates the problem. The split may not be correct, since I don't know Arabic.]
So I want the list of composite characters in a given label. I will then use this list to display the composite characters separated by spaces.
-
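The splitting described above can be illustrated with a minimal, portable C++ sketch. It is not a Windows API and not a full Unicode grapheme-cluster implementation: the function names (`isCombiningMark`, `isDevanagariConsonant`, `splitClusters`, `spaceOut`) are hypothetical, and the hard-coded ranges cover only a few Arabic and Devanagari marks, enough for the two labels in this post. The assumed rule is that a combining mark stays with the preceding character, and a Devanagari consonant that follows a virama (U+094D) stays in the same cluster.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: true for the small set of combining marks this sketch knows.
static bool isCombiningMark(char32_t c) {
    return (c >= 0x0300 && c <= 0x036F)   // Combining Diacritical Marks
        || (c >= 0x064B && c <= 0x065F)   // Arabic vowel marks and related signs
        || c == 0x0670                    // Arabic superscript alef
        || (c >= 0x0900 && c <= 0x0903)   // Devanagari candrabindu, anusvara, visarga
        || (c >= 0x093A && c <= 0x093C)   // Devanagari vowel signs OE/OOE, nukta
        || (c >= 0x093E && c <= 0x094F)   // Devanagari vowel signs and virama
        || (c >= 0x0951 && c <= 0x0957)   // Devanagari accents and vowel signs
        || (c >= 0x0962 && c <= 0x0963);  // Devanagari vocalic L vowel signs
}

// Hypothetical helper: Devanagari consonants, including the precomposed nukta forms.
static bool isDevanagariConsonant(char32_t c) {
    return (c >= 0x0915 && c <= 0x0939) || (c >= 0x0958 && c <= 0x095F);
}

// Split text into clusters: each base character plus the marks that follow it,
// with virama + consonant kept together as a conjunct.
std::vector<std::u32string> splitClusters(const std::u32string& text) {
    const char32_t kVirama = 0x094D;
    std::vector<std::u32string> clusters;
    for (std::size_t i = 0; i < text.size(); ++i) {
        const char32_t c = text[i];
        const bool joinsPrevious = !clusters.empty() &&
            (isCombiningMark(c) ||
             (text[i - 1] == kVirama && isDevanagariConsonant(c)));
        if (joinsPrevious) {
            clusters.back().push_back(c);
        } else {
            clusters.push_back(std::u32string(1, c));
        }
    }
    return clusters;
}

// Join the clusters with a single space between them.
std::u32string spaceOut(const std::u32string& text) {
    std::u32string out;
    for (const std::u32string& cluster : splitClusters(text)) {
        if (!out.empty()) out.push_back(U' ');
        out += cluster;
    }
    return out;
}
```

With these assumptions, `spaceOut(U"ABCDEF")` gives `A B C D E F`, and the Hindi label splits into four clusters with क्ष kept whole. Production code would more likely rely on the platform's shaping or text-segmentation services than on a hand-written table like this one.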
Thursday, January 21, 2010 10:28 AM
Check the Uniscribe API: ScriptItemize function, SCRIPT_ITEM and SCRIPT_ANALYSIS structures.
-
Tuesday, January 26, 2010 5:19 AM
Hello Rajesh,
Have you made any progress on this issue with these APIs? If there is anything else we can help with, feel free to post here.
Thanks,
Rong-Chun Zhang
-
Wednesday, January 27, 2010 6:30 AM
I have used the ScriptShape function, calling ScriptItemize before it, but these functions did not solve my problem. Please let me know if there are any other APIs that return the list of composite characters in a given label. I also need full details about ScriptShape and ScriptItemize and how to use them to solve this problem. I am pasting my code below; please let me know if any changes are required. I am getting the wrong cluster list here.
wchar_t pwText1[] = L"شارع الخراج";
LPWSTR pszText;
HDC hdc = ::GetDC(NULL);
HRESULT hr;
///******************* Font code starts*************/
LOGFONTW fn;
HFONT hFont;
HFONT hOldFont;
memset(&fn, 0, sizeof(LOGFONTW));
fn.lfHeight = -15;
fn.lfCharSet = DEFAULT_CHARSET;
// lfFaceName is a WCHAR array in LOGFONTW, so copy a wide string instead of casting to char*.
wcscpy_s(fn.lfFaceName, LF_FACESIZE, L"Arial Unicode MS");
hFont = CreateFontIndirectW(&fn);
hOldFont = (HFONT)SelectObject(hdc, hFont);
///******************** Font code ends here **************/
SCRIPT_ITEM *pItems = NULL;
SCRIPT_ITEM *pItem = NULL;
SCRIPT_CACHE cache1 = NULL;
SCRIPT_FONTPROPERTIES fontProperties;
int iLen;
long x = 20, y = 30;
int *piLen;
int *pLogicalToVisual;
int iNumItems = 0;
int iNumGlyphs;
// Itemize, GetItemLen and GetVisualOrder are helper functions that are not shown here.
Itemize(TRUE, pwText1, &pItems, &iNumItems);
piLen = new int[iNumItems];
pLogicalToVisual = new int[iNumItems];
GetItemLen(pItems, iNumItems, piLen);
GetVisualOrder(pItems, iNumItems, piLen, pLogicalToVisual);
for (int i = 0; i < iNumItems; i++)
{
    int iItem = pLogicalToVisual[i];
    pItem = &pItems[iItem];
    pItem->a.fLogicalOrder = FALSE;
    // Use this item's own length; reading *piLen would always give the first item's length.
    iLen = piLen[iItem];
    pszText = pwText1 + pItem->iCharPos;
    SIZE extent;
    GetTextExtentPoint32W(hdc, pszText, iLen, &extent);
    int iMaxGlyphs = (int)(iLen * 1.5) + 16; // size suggested by the ScriptShape documentation
    // _alloca memory is released only when the enclosing function returns.
    WORD *pwGlyphs = (WORD*)_alloca(sizeof(WORD) * iMaxGlyphs);
    // pwLogClust receives one entry per character, so iMaxGlyphs entries is more than enough.
    WORD *pwLogClust = (WORD*)_alloca(sizeof(WORD) * iMaxGlyphs);
    SCRIPT_VISATTR *pAttr = (SCRIPT_VISATTR*)_alloca(sizeof(SCRIPT_VISATTR) * iMaxGlyphs);
    fontProperties.cBytes = sizeof(SCRIPT_FONTPROPERTIES);
    hr = ScriptGetFontProperties(hdc, &cache1, &fontProperties);
    hr = ScriptShape(hdc, &cache1, pszText, iLen, iMaxGlyphs, &pItem->a,
                     pwGlyphs, pwLogClust, pAttr, &iNumGlyphs);
    if (FAILED(hr))
        break; // stop on a shaping error instead of reading uninitialized output
}
// Release what this snippet allocated.
ScriptFreeCache(&cache1);
delete[] piLen;
delete[] pLogicalToVisual;
SelectObject(hdc, hOldFont);
DeleteObject(hFont);
::ReleaseDC(NULL, hdc);
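As a hedged illustration of how the cluster list could be read from the ScriptShape output: pwLogClust holds one entry per character, and characters that belong to the same cluster carry the same value. The portable sketch below groups consecutive equal values into character ranges. The name `clusterRanges` and the use of a plain `std::vector` in place of the `WORD` buffer are assumptions made for illustration, so the snippet does not depend on the Windows headers.

```cpp
#include <cstdint>
#include <utility>
#include <vector>

// Given a logical cluster array (one entry per character, as ScriptShape fills
// pwLogClust), return the [first, last) character index range of each cluster.
// Characters of one cluster are assumed to share the same value, which holds
// whether the values increase (left-to-right run) or decrease (right-to-left run).
std::vector<std::pair<int, int>> clusterRanges(const std::vector<std::uint16_t>& logClust) {
    std::vector<std::pair<int, int>> ranges;
    const int n = static_cast<int>(logClust.size());
    int start = 0;
    for (int i = 1; i <= n; ++i) {
        // Close the current cluster at the end of the run or when the value changes.
        if (i == n || logClust[i] != logClust[start]) {
            ranges.emplace_back(start, i);
            start = i;
        }
    }
    return ranges;
}
```

Each returned range is a span of characters that should not be separated when spacing is added; a range covering more than one character is either a base character with its marks or a ligature. Uniscribe's ScriptBreak function also reports an fCharStop flag per character in SCRIPT_LOGATTR, which may be a more direct way to find the same boundaries.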
-
Friday, January 29, 2010 9:41 AM
Hello Rajesh,
Michael has posted about the same issue on his blog.
http://blogs.msdn.com/michkap/archive/2010/01/26/9952907.aspx
Thanks,
Rong-Chun Zhang
-
Tuesday, February 02, 2010 3:19 AM
Hello Rajesh,
Have you made any progress on this issue with Michael's suggestion? If there is anything else we can help with, feel free to post here.
Thanks,
Rong-Chun Zhang
-
Friday, February 05, 2010 11:16 AM
We need to split Unicode text while honoring combining characters and ligatures. Given a substring, we can identify whether it is a combining character sequence using IsScriptShape or .NET's StringInfo.ParseCombiningCharacters API. However, we need an API that identifies whether a given substring as a whole forms a ligature, so that we do not split such substrings.
-
Wednesday, February 10, 2010 10:03 AM
Hello Rajesh,
Michael has posted the explanation and the answer on his blog.
http://blogs.msdn.com/michkap/archive/2010/01/26/9952907.aspx
Thanks,
Rong-Chun Zhang

