Composite Character Issue

    Question

  • Hi All,

    When we apply inter-character spacing to a label, the composite text is split so that every code point is treated as a separate character. Each code point is then drawn on its own, and the combined character is never formed. The problem is not specific to right-to-left languages (Arabic, for example: يُساوِي); it also occurs with left-to-right languages (Hindi, for example: ठऑक्षझॉ).

    Please let us know if there is an API that identifies whether a given sequence of characters forms a single composite character.

    Thanks in advance,

    Rajesh Reddy

    Wednesday, January 20, 2010 5:20 AM

All replies

  • Hello Rajesh,

    Welcome to MSDN forum.

    Have you tried the WideCharToMultiByte Function? As said in the documentation,

    "The WC_COMPOSITECHECK flag causes the WideCharToMultiByte function to test for decomposed Unicode characters and attempts to compose them before converting them to the requested code page."

    More info
    http://msdn.microsoft.com/en-us/library/dd374130(VS.85).aspx

    Regards,
    Rong-Chun Zhang
    MSDN Subscriber Support in Forum
    If you have any feedback on our support, please contact msdnmg@microsoft.com
    Please remember to mark the replies as answers if they help and unmark them if they provide no help.
    Welcome to the All-In-One Code Framework! If you have any feedback, please tell us.
    Thursday, January 21, 2010 7:10 AM
  • I tried the WideCharToMultiByte function, but it did not help. My exact problem is that I want to add inter-character spacing to a label. For example:

    English label example:

    My label is: ABCDEF
    The label I need is: A   B   C   D   E   F

    This example is easy because it contains no composite characters.

    Hindi label example:

    My label is: ठऑक्षझॉ
    The label I need is: ठ   ऑ   क्ष   झॉ

    Arabic label example:

    My label is: يُساوِي

    The label I need is: يُ   س  اوِ  ي [This only illustrates the problem. The split may not be correct, since I do not know Arabic.]


    So I want the list of composite characters in the given label. I will then use this list to display the composite characters with spaces between them.
    Thursday, January 21, 2010 9:16 AM
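    The grouping described in the post above can be sketched in portable C++ without any Windows API. This is only a rough illustration under stated assumptions: the helper names (isCombiningMark, isDevanagariConsonant, splitClusters, spaceOut) are hypothetical, the mark ranges cover just the Arabic and Devanagari blocks used in this thread, and the rule "virama followed by a consonant stays in the same cluster" is a simplification rather than the full Unicode segmentation algorithm.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Assumption: partial table. Only the Arabic and Devanagari marks needed for
// the thread's examples are listed; a complete table comes from Unicode data.
static bool isCombiningMark(char32_t c) {
    return (c >= 0x064B && c <= 0x065F) || c == 0x0670 ||  // Arabic vowel marks
           (c >= 0x0900 && c <= 0x0903) ||                 // Devanagari signs
           (c >= 0x093A && c <= 0x093C) ||                 // vowel signs, nukta
           (c >= 0x093E && c <= 0x094F) ||                 // matras and virama
           (c >= 0x0951 && c <= 0x0957) ||                 // stress and tone marks
           (c >= 0x0962 && c <= 0x0963);                   // vocalic vowel signs
}

static bool isDevanagariConsonant(char32_t c) {
    return (c >= 0x0915 && c <= 0x0939) || (c >= 0x0958 && c <= 0x095F);
}

// Hypothetical helper: groups code points into units that should stay together.
std::vector<std::u32string> splitClusters(const std::u32string& text) {
    const char32_t kVirama = 0x094D;
    std::vector<std::u32string> clusters;
    bool afterVirama = false;
    for (char32_t c : text) {
        // A mark joins the previous unit; so does a consonant right after a virama.
        bool attach = !clusters.empty() &&
                      (isCombiningMark(c) || (afterVirama && isDevanagariConsonant(c)));
        if (attach)
            clusters.back().push_back(c);
        else
            clusters.push_back(std::u32string(1, c));
        afterVirama = (c == kVirama);
    }
    return clusters;
}

// Hypothetical helper: rebuilds the label with `gap` between the units.
std::u32string spaceOut(const std::u32string& text, const std::u32string& gap) {
    std::u32string out;
    const std::vector<std::u32string> clusters = splitClusters(text);
    for (std::size_t i = 0; i < clusters.size(); ++i) {
        if (i > 0) out += gap;
        out += clusters[i];
    }
    return out;
}
```

    With these rules the Hindi label splits into four units and the Arabic label into five (the alef and the waw end up in separate units). Whether that matches what a native reader expects is not something this sketch can decide.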
  • Check the Uniscribe API: ScriptItemize function, SCRIPT_ITEM and SCRIPT_ANALYSIS structures.

    Thursday, January 21, 2010 10:28 AM
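    As a small illustration of what ScriptItemize hands back: it fills an array of SCRIPT_ITEM entries, each with an iCharPos, and appends a terminal entry whose iCharPos is the length of the text, so the length of run i is the difference between neighbouring positions. The sketch below uses a stand-in struct (Item) and a hypothetical helper (itemLengths) so that it compiles without the Windows headers. The positions in the usage note are made up, not real ScriptItemize output.

```cpp
#include <cstddef>
#include <vector>

// Stand-in for the one SCRIPT_ITEM field used here; the real struct is
// declared in <usp10.h>.
struct Item {
    int iCharPos;  // index of the first character of the run
};

// `items` holds the real entries followed by the terminal entry.
std::vector<int> itemLengths(const std::vector<Item>& items) {
    std::vector<int> lengths;
    for (std::size_t i = 0; i + 1 < items.size(); ++i)
        lengths.push_back(items[i + 1].iCharPos - items[i].iCharPos);
    return lengths;
}
```

    For example, positions 0, 4, 9 followed by a terminal 11 would give run lengths 4, 5 and 2. Each run then has to be shaped with its own length.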
  • Hello Rajesh,

    Have you made any progress on this issue with these APIs? If there is anything else we can help with, feel free to post here.

    Thanks,
    Rong-Chun Zhang
    Tuesday, January 26, 2010 5:19 AM
  • I have used the ScriptShape function, calling ScriptItemize before it, but these functions did not solve my problem. Please let me know if there are any other APIs that return the list of composite characters in a given label. I also need full details on ScriptShape and ScriptItemize and how to use them to solve this problem. My code is pasted below; please tell me if any changes are required. I am getting a wrong cluster list from it.

    wchar_t *pwText1 = L"شارع الخراج";
    LPWSTR pszText;
    HDC hdc = ::GetDC(NULL);
    HRESULT hr;

    ///******************* Font code starts *************/
    LOGFONTW fn;
    HFONT hFont;
    HFONT hOldFont;
    memset(&fn, 0, sizeof(LOGFONTW));
    fn.lfHeight = -15;
    fn.lfCharSet = DEFAULT_CHARSET;
    // lfFaceName is a wide-character array, so copy a wide string into it.
    wcscpy_s(fn.lfFaceName, LF_FACESIZE, L"Arial Unicode MS");
    hFont = CreateFontIndirectW(&fn);
    hOldFont = (HFONT)SelectObject(hdc, hFont);
    ///******************** Font code ends here **************/

    SCRIPT_ITEM *pItems = NULL;
    SCRIPT_ITEM *pItem = NULL;
    SCRIPT_CACHE cache1 = NULL;
    SCRIPT_FONTPROPERTIES fontProperties;
    int iLen;
    long x = 20, y = 30;
    int *piLen;
    int *pLogicalToVisual;
    int iNumItems = 0;
    int iNumGlyphs;

    // Split the text into runs, then get each run's length and the visual order.
    Itemize(TRUE, pwText1, &pItems, &iNumItems);
    piLen = new int[iNumItems];
    pLogicalToVisual = new int[iNumItems];
    GetItemLen(pItems, iNumItems, piLen);
    GetVisualOrder(pItems, iNumItems, piLen, pLogicalToVisual);

    for (int i = 0; i < iNumItems; i++)
    {
        SIZE textExtent;
        pItem = &pItems[pLogicalToVisual[i]];
        pItem->a.fLogicalOrder = FALSE;
        // Length of the current run, not of the first one.
        iLen = piLen[pLogicalToVisual[i]];
        pszText = pwText1 + pItem->iCharPos;
        GetTextExtentPoint32W(hdc, pszText, iLen, &textExtent);

        int iMaxGlyphs = (int)(iLen * 1.5) + 16; // size suggested by the ScriptShape documentation
        WORD *pwGlyphs = (WORD*)_alloca(sizeof(WORD) * iMaxGlyphs);
        WORD *pwLogClust = (WORD*)_alloca(sizeof(WORD) * iMaxGlyphs);
        SCRIPT_VISATTR *pAttr = (SCRIPT_VISATTR*)_alloca(sizeof(SCRIPT_VISATTR) * iMaxGlyphs);

        fontProperties.cBytes = sizeof(SCRIPT_FONTPROPERTIES);
        hr = ScriptGetFontProperties(hdc, &cache1, &fontProperties);

        // On success, pwLogClust has one entry per character of the run.
        hr = ScriptShape(hdc, &cache1, pszText, iLen, iMaxGlyphs, &pItem->a,
                         pwGlyphs, pwLogClust, pAttr, &iNumGlyphs);
    }

    Wednesday, January 27, 2010 6:30 AM
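    One way to read a cluster list out of the ScriptShape call above: pwLogClust has one entry per character of the run, and characters that belong to the same cluster carry the same value, so consecutive equal values can be grouped into character ranges. The sketch below is a portable illustration with a hypothetical helper name (clusterRanges) and made-up input arrays, not output captured from Uniscribe. Uniscribe's ScriptBreak also reports an fCharStop flag per character, which may be a more direct source for the same boundaries.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Returns (first character index, character count) for every cluster of a run.
// `logClust` mirrors the pwLogClust array filled by ScriptShape.
std::vector<std::pair<std::size_t, std::size_t>>
clusterRanges(const std::vector<std::uint16_t>& logClust) {
    std::vector<std::pair<std::size_t, std::size_t>> ranges;
    for (std::size_t i = 0; i < logClust.size(); ++i) {
        if (i == 0 || logClust[i] != logClust[i - 1])
            ranges.emplace_back(i, std::size_t{1});  // a new cluster starts here
        else
            ++ranges.back().second;                  // same cluster as before
    }
    return ranges;
}
```

    The grouping does not depend on direction: values that rise (left-to-right runs) or fall (right-to-left runs) are handled the same way, because only a change in value matters.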
  • Hello Rajesh,

    Michael has posted about the same issue on his blog.
    http://blogs.msdn.com/michkap/archive/2010/01/26/9952907.aspx

    Thanks,
    Rong-Chun Zhang
    Friday, January 29, 2010 9:41 AM
  • Hello Rajesh,

    Have you made any progress on this issue with Michael's suggestion? If there is anything else we can help with, feel free to post here.

    Thanks,
    Rong-Chun Zhang
    Tuesday, February 02, 2010 3:19 AM

  • We need to split the Unicode text while honoring combining characters and ligatures. Given a substring, we can identify whether it is a combining character sequence using IsScriptShape or .NET's StringInfo.ParseCombiningCharacters API. However, we need an API that identifies whether a given substring as a whole forms a ligature, so that we do not split such substrings.

    Friday, February 05, 2010 11:16 AM
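    No dedicated ligature API is named in this thread, but the arrays returned by ScriptShape allow a heuristic: a cluster that covers more characters than glyphs is a candidate ligature. The sketch below is a hedged, portable illustration. The names Cluster, describeClusters and isLigatureCandidate are hypothetical, and it assumes that each pwLogClust value is the lowest glyph index of its cluster, for both text directions. A combining sequence that the font renders as a single glyph would be flagged as well, so the result is a hint, not proof of a ligature.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Cluster {
    std::size_t firstChar;   // index of the first character in the run
    std::size_t charCount;   // characters in the cluster
    std::size_t glyphCount;  // glyphs in the cluster
};

// `logClust` mirrors pwLogClust, `glyphTotal` the glyph count from ScriptShape.
std::vector<Cluster> describeClusters(const std::vector<std::uint16_t>& logClust,
                                      std::size_t glyphTotal) {
    // Distinct first-glyph indices in ascending order.
    std::vector<std::uint16_t> starts(logClust);
    std::sort(starts.begin(), starts.end());
    starts.erase(std::unique(starts.begin(), starts.end()), starts.end());

    std::vector<Cluster> clusters;
    for (std::size_t i = 0; i < logClust.size(); ++i) {
        if (i > 0 && logClust[i] == logClust[i - 1]) {
            ++clusters.back().charCount;
            continue;
        }
        // The cluster's glyphs run up to the next higher first-glyph index.
        auto it = std::upper_bound(starts.begin(), starts.end(), logClust[i]);
        std::size_t next = (it == starts.end()) ? glyphTotal : *it;
        clusters.push_back({i, 1, next - logClust[i]});
    }
    return clusters;
}

bool isLigatureCandidate(const Cluster& c) {
    return c.charCount > 1 && c.glyphCount < c.charCount;
}
```

    For a made-up right-to-left run of three characters shaped into two glyphs with cluster values 1, 1, 0, the first two characters share one glyph and would be flagged, so they would not be split.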
  • Hello Rajesh,

    Michael has posted the explanation and the answer on his blog.

    http://blogs.msdn.com/michkap/archive/2010/01/26/9952907.aspx

    Thanks,
    Rong-Chun Zhang
    Wednesday, February 10, 2010 10:03 AM