Arabic - Gideros Forum

Arabic

MoKaLux Member
edited April 16 in Game & application design
Hello there,

I have started fiddling with Gideros and Arabic again.

Let me explain: I take an Arabic word with its vowels, encode each letter as a number and each vowel as a short string code, then map those codes to .png images and draw them on screen.

That's it. I still need to work on positioning and a few other things, but it should cover my needs.

PS: the older answers do not work in some cases, so I had to do it from scratch.
PS2: I won't be here for the next two weeks!

Have fun people.

Peace.
[Attached image: gideros_ar01.png]

Comments

  • MoKaLux Member
    Could you tell me whether the class could be written better, please?
    The code is a bit long because it accepts up to 15 letters, and I prefer code that is not all in one block.
    You can include spaces to build a sentence.

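    The way I call it (anywhere in my project). Each call passes 15 letter numbers, then 15 letter-form codes, then 15 vowel codes, with nil for unused slots: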
    	local arab_words = {
    		Arabic_Words.new(26, 9, 1, 0, 2, 28, 3, 0, nil, nil, nil, nil, nil, nil, nil,"d", "m", "i", " ", "md", "m", "f", ".", nil, nil, nil, nil, nil, nil, nil,"as", nil, nil, " ", "a", "s", "oun", ".", nil, nil, nil, nil, nil, nil, nil),
    		Arabic_Words.new(26, 9, 1, 0, 24, 12, 5, 8, 0, nil, nil, nil, nil, nil, nil,"d", "m", "i", " ", "md", "m", "m", "f", ".", nil, nil, nil, nil, nil, nil,"as", nil, nil, " ", "a", "s", "i", "oun", ".", nil, nil, nil, nil, nil, nil),
    		Arabic_Words.new(26, 9, 1, 0, 2, 1, 2, 0, nil, nil, nil, nil, nil, nil, nil,"d", "m", "i", " ", "md", "m", "i", ".", nil, nil, nil, nil, nil, nil, nil,"as", nil, nil, " ", "a", nil, "oun", ".", nil, nil, nil, nil, nil, nil, nil),
    		Arabic_Words.new(26, 9, 1, 0, 22, 3, 1, 2, 0, nil, nil, nil, nil, nil, nil,"d", "m", "i", " ", "md", "m", "m", "i", ".", nil, nil, nil, nil, nil, nil,"as", nil, nil, " ", "i", "a", nil, "oun", ".", nil, nil, nil, nil, nil, nil),
    		Arabic_Words.new(26, 9, 1, 0, 21, 23, 24, 0, nil, nil, nil, nil, nil, nil, nil,"d", "m", "i", " ", "md", "m", "f", ".", nil, nil, nil, nil, nil, nil, nil,"as", nil, nil, " ", "a", "a", "oun", ".", nil, nil, nil, nil, nil, nil, nil),
    	}

    The class:
    Arabic_Words = Core.class(Sprite)
     
    function Arabic_Words:init(
    			pl01, pl02, pl03, pl04, pl05, pl06, pl07, pl08, pl09, pl10, pl11, pl12, pl13, pl14, pl15,
    			pl01f, pl02f, pl03f, pl04f, pl05f, pl06f, pl07f, pl08f, pl09f, pl10f, pl11f, pl12f, pl13f, pl14f, pl15f,
    			pv01, pv02, pv03, pv04, pv05, pv06, pv07, pv08, pv09, pv10, pv11, pv12, pv13, pv14, pv15)
     
    	-- set up the letters tilesheet, 1 letter is 64*64px
    	local texture = Texture.new("arabic/arabletters_64_dmf.png", false)
     
    	-- fills the table with all the letters
    	self.arabletters_list = {}
    	for l = 1, 8 do -- number of lines in tilesheet
    		for c = 1, 15 do -- number of columns in tilesheet
    			local arabletter = TextureRegion.new(texture, (c-1) * 64, (l-1) * 64, 64, 64)
    			local bitmap = Bitmap.new(arabletter)
    			self.arabletters_list[#self.arabletters_list + 1] = bitmap
    		end
    	end
     
    	-- picks the right letter form depending on its position in the word, and fills a table
    	local xlf = {
    		self:letterForm(pl01, pl01f),
    		self:letterForm(pl02, pl02f),
    		self:letterForm(pl03, pl03f),
    		self:letterForm(pl04, pl04f),
    		self:letterForm(pl05, pl05f),
    		self:letterForm(pl06, pl06f),
    		self:letterForm(pl07, pl07f),
    		self:letterForm(pl08, pl08f),
    		self:letterForm(pl09, pl09f),
    		self:letterForm(pl10, pl10f),
    		self:letterForm(pl11, pl11f),
    		self:letterForm(pl12, pl12f),
    		self:letterForm(pl13, pl13f),
    		self:letterForm(pl14, pl14f),
    		self:letterForm(pl15, pl15f),
    	}
     
    	-- positions each letter 32px to the left of the previous one (RTL), or 8px after a blank tile
    	for lposx = 1, #xlf do
    		if lposx == 1 then
    			self:letterPosX(lposx, xlf[lposx], xlf[lposx])
    		else
    			self:letterPosX(lposx, xlf[lposx], xlf[lposx - 1])
    		end
    	end
     
    	-- assigns a vowel from the function parameters
    	local xlv = {
    		self:letterVowel(xlf[1], pv01),
    		self:letterVowel(xlf[2], pv02),
    		self:letterVowel(xlf[3], pv03),
    		self:letterVowel(xlf[4], pv04),
    		self:letterVowel(xlf[5], pv05),
    		self:letterVowel(xlf[6], pv06),
    		self:letterVowel(xlf[7], pv07),
    		self:letterVowel(xlf[8], pv08),
    		self:letterVowel(xlf[9], pv09),
    		self:letterVowel(xlf[10], pv10),
    		self:letterVowel(xlf[11], pv11),
    		self:letterVowel(xlf[12], pv12),
    		self:letterVowel(xlf[13], pv13),
    		self:letterVowel(xlf[14], pv14),
    		self:letterVowel(xlf[15], pv15),
    	}
     
    	-- finally adds each letter and its vowel to self
    	for xlfv = 1, #xlf do
    		self:addLettersChild(xlf[xlfv], xlv[xlfv])
    	end
     
    end
     
    -- FUNCTIONS
    function Arabic_Words:letterForm(pl, plf)
     
    	if pl ~= nil then
     
    		local xlf = nil
     
    		if pl == 0 then
     
    			xlf = self.arabletters_list[1]
     
    		else
     
    			if plf == "i" then
    				xlf = self.arabletters_list[pl * 4 - 2]
    			elseif plf == "d" then
    				xlf = self.arabletters_list[pl * 4 - 1]
    			elseif plf == "m" then
    				xlf = self.arabletters_list[pl * 4 - 0]
    			elseif plf == "f" then
    				xlf = self.arabletters_list[pl * 4 + 1]
    			elseif plf == "md" then
    				xlf = self.arabletters_list[pl * 4 - 1]
    			elseif plf == "mf" then
    				xlf = self.arabletters_list[pl * 4 - 0]
    			elseif plf == "fi" then
    				xlf = self.arabletters_list[pl * 4 - 2]
    			elseif plf == " " then
    				xlf = self.arabletters_list[1]
    			elseif plf == "." then
    				-- nothing here
    			else
    				-- nothing here
    			end
     
    		end
     
    		return xlf
     
    	end
     
    	return nil
     
    end
     
     
    function Arabic_Words:letterPosX(pn, pl, plm1)
     
    --	print(plm1)
     
    	if pn > 1 then
     
    		if plm1 == self.arabletters_list[1] then
     
    			pl:setX(plm1:getX() - 8)
     
    		else
     
    			pl:setX(plm1:getX() - 32)
     
    		end
     
    	end
     
    end
     
     
    function Arabic_Words:letterVowel(pl, pv)
     
    	if pl ~= nil then
     
    		local xlv = nil
     
    		if pv == "a" then
    			xlv = Bitmap.new(Texture.new("arabic/a_i.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() - 16)
    		elseif pv == "ou" then
    			xlv = Bitmap.new(Texture.new("arabic/ou.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() - 16)
    		elseif pv == "i" then
    			xlv = Bitmap.new(Texture.new("arabic/a_i.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() + 48)
    		elseif pv == "s" then
    			xlv = Bitmap.new(Texture.new("arabic/souk.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() - 16)
    		elseif pv == "oun" then
    			xlv = Bitmap.new(Texture.new("arabic/oun.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() - 16)
    		elseif pv == "ca" then
    			xlv = Bitmap.new(Texture.new("arabic/chadda.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() - 16)
    		elseif pv == "as" then
    			xlv = Bitmap.new(Texture.new("arabic/alifspec.png"))
    			xlv:setPosition(pl:getX() + 16, pl:getY() - 0)
    		elseif pv == " " then
    			xlv = Bitmap.new(Texture.new("arabic/empty.png"))
    		elseif pv == "." then
    			xlv = Bitmap.new(Texture.new("arabic/point.png"))
    			xlv:setPosition(pl:getX() + 32, pl:getY() + 24)
    		else
    			xlv = Bitmap.new(Texture.new("arabic/empty.png"))
    		end
     
    		xlv:setScale(0.25, 0.25)
     
    		return xlv
     
    	end
     
    	return nil
     
    end
     
     
    function Arabic_Words:addLettersChild(pl, plv)
     
    	self:addChild(pl)
    	self:addChild(plv)
     
    end

    [Attached image: gideros_ar02.png]
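One possible way to shorten the letterForm if/elseif chain is a lookup table. This is only a sketch in plain Lua: FORM_OFFSET and letterTileIndex are hypothetical names, and the offsets are copied from the chain in the class above. It returns the tile index rather than the Bitmap, so it can be tried outside Gideros.

```lua
-- Offsets relative to pl * 4, copied from the if/elseif chain in letterForm.
local FORM_OFFSET = {
	i = -2, d = -1, m = 0, f = 1,
	md = -1, mf = 0, fi = -2,
}

-- Returns the 1-based tile index for letter number pl and form code plf,
-- or nil when there is nothing to draw (hypothetical helper).
function letterTileIndex(pl, plf)
	if pl == nil then return nil end
	if pl == 0 or plf == " " then return 1 end -- tile 1 is the blank tile
	local offset = FORM_OFFSET[plf]
	if offset == nil then return nil end -- "." and unknown form codes
	return pl * 4 + offset
end
```

Inside the class, `self.arabletters_list[letterTileIndex(pl, plf)]` would then pick the Bitmap whenever the index is not nil. Passing the letters, forms and vowels as three tables instead of 45 positional arguments would also remove the 15-character limit, though that is a design suggestion rather than something tested here.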
  • Apollo14 Member
    edited April 16
    Hi!
    I'm also interested in how to work with Arabic symbols, thx for your information!
    I usually just use Arabic TrueType fonts from the internet:
    ARABIC_FONT @ \TTFont.new("BSinaBd.ttf",24)\
    arabicTextField = TextField.new(ARABIC_FONT, "مرحبا بالعالم")
     
    arabicTextField:setTextColor(0x2ecc71)
    arabicTextField:setPosition(50,50)
     
    stage:addChild(arabicTextField)
  • MoKaLux Member
    Hi @Apollo14, you're welcome.
    Yes, I have tried that too with other fonts, but some letters weren't joined properly. I believe there is some code left in the repo that shows that. Plus, I added vowels! I will work more on this when I come back, then I'll post the whole thing, insha'Allah.

  • hgy29 Maintainer
    I am using the following piece of code to preprocess Arabic text before sending it to a TextField. It seems to be OK according to the few Arabic-speaking customers I have, but I don't read Arabic myself, so I can't tell for sure.
    I'd like to integrate it into Gideros, so I'd be glad if an Arabic speaker/writer could check it!
    function arabicProcessing(text)
        local Achar={0x002,0x024,0x062,0x084,0x0C4,0x104,0x144,0x184,0x1C2,0x1E2,0x202,0x222,
    0x244,0x284,0x2C4,0x304,0x344,0x384,0x3C4,0x404,0x440,0x440,0x440,0x440,0x440,0x440,
    0x444,0x484,0x4C4,0x504,0x544,0x584,0x5C4,0x602,0x622,0x644}
        local last=0
        local state=0 -- 1 if the previous character was an Arabic letter, else 0
        local gtext,rtext,stext="","",""
        local rtl=false
        for p, cur in utf8.codes(text) do
            if ((last>=0x627) and (last<=0x64A)) then
                if not rtl then
                    gtext=gtext..rtext
                    rtext=""
                    rtl=true
                end
                local A=Achar[last-0x626]
                local unibase=0xFE8D+(A>>4)
                local of=0 --0=iso,1=fin,2=ini,3=med
                if ((cur>=0x627) and (cur<=0x64A)) then
                    of=2+state
                else
                    of=state
                end
                state=1
                if (of>=(A&0xF)) then of=0 end
                of=of+unibase
                rtext=rtext..stext..utf8.char(of)
                stext=""
            elseif last>=0x30 then
                if rtl then
                    gtext=gtext..utf8.reverse(rtext)
                    rtext=""
                    rtl=false
                    state=0
                end
                rtext=rtext..stext..utf8.char(last)
                stext=""
            elseif last>0 then
                stext=stext..utf8.char(last)
                state=0
            else
                state=0
            end
            last=cur
        end
        if last>0 then
            rtext=rtext..utf8.char(last)
        end
        if rtl then rtext=utf8.reverse(rtext) end
        if #rtext>0 then gtext=gtext..rtext end
        return gtext
    end
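Some Arabic letters (alef, dal, reh, waw and others) join only to the letter before them, never to the one after, which is a common source of wrongly connected letters. The sketch below shows one way to pick forms with that rule. It reuses the Achar table from arabicProcessing above, reading each entry as (offset from U+FE8D) * 16 + number of forms; that reading is inferred from the code. shapeArabic and the helper names are hypothetical. It works on code points in logical order, skips vowel marks (U+064B to U+0652) when looking for neighbours, ignores lam-alef ligatures and the letters below U+0627, and leaves the RTL reversal to the caller.

```lua
-- Table copied from arabicProcessing: one entry per code point U+0627..U+064A.
local ACHAR = {0x002,0x024,0x062,0x084,0x0C4,0x104,0x144,0x184,0x1C2,0x1E2,0x202,0x222,
	0x244,0x284,0x2C4,0x304,0x344,0x384,0x3C4,0x404,0x440,0x440,0x440,0x440,0x440,0x440,
	0x444,0x484,0x4C4,0x504,0x544,0x584,0x5C4,0x602,0x622,0x644}

-- Returns the isolated presentation form and the number of forms (0, 2 or 4).
local function formInfo(cp)
	if cp == nil or cp < 0x627 or cp > 0x64A then return nil, 0 end
	local a = ACHAR[cp - 0x626]
	return 0xFE8D + math.floor(a / 16), a % 16
end

-- Vowel marks (fathatan..sukun) do not interrupt joining.
local function isMark(cp)
	return cp ~= nil and cp >= 0x64B and cp <= 0x652
end

-- Nearest neighbour of cps[i] in direction step (-1 or 1), skipping marks.
local function neighbour(cps, i, step)
	local j = i + step
	while isMark(cps[j]) do j = j + step end
	return cps[j]
end

-- Maps logical-order code points to presentation forms:
-- base + 0 isolated, + 1 final, + 2 initial, + 3 medial.
function shapeArabic(cps)
	local out = {}
	for i, cp in ipairs(cps) do
		local base, forms = formInfo(cp)
		if forms == 0 then
			out[i] = cp -- not a shapable letter: copy unchanged
		else
			local _, prevForms = formInfo(neighbour(cps, i, -1))
			local _, nextForms = formInfo(neighbour(cps, i, 1))
			local joinsPrev = prevForms == 4 -- previous letter joins forward
			local joinsNext = forms == 4 and nextForms > 0
			local form = 0
			if joinsPrev and joinsNext then form = 3
			elseif joinsNext then form = 2
			elseif joinsPrev then form = 1 end
			out[i] = base + form
		end
	end
	return out
end
```

For example, beh + alef + beh (0x628, 0x627, 0x628) gives initial beh, final alef and isolated beh, because alef does not join to the letter after it.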

  • SinisterSoft Maintainer
    Accepted Answer
    @MoKaLux Does @hgy29 's latest arabicProcessing (above) function have the vowel changes you made? If not, could you add them and repost it here?
  • MoKaLux Member
    Hi guys, I don't have a computer here. @hgy29, before writing my code I tried yours, but some letters were not connected properly.

    I understand a little more Lua now, but I don't think I can fix your code. Still, I am willing to share everything I've got because I am in love with Gideros and you guys =)

  • MoKaLux Member
    Let's begin: in your code you have 0x... values for the Arabic letters. What format is that, I wonder? HTML?

    I am working with decimal Unicode code points for Arabic:
    e.g. 1574 ئ

    This allows us to deal with vowels as well.

    A snippet quoted from an Excel forum:

    Function atu(rng As Range) As String
    ' arab word to unicode

    Dim i As Integer
    Dim CellValue As String
    Dim arabunicode As Integer
    Dim NewValue As String

    'Get the string from the cell. Although it may not look like it
    'this is in fact unicode. It's kinda hidden from you.
    CellValue = rng.Value

    'go through the string character by character (note that
    'each character is 2 bytes - you just don't see it)
    For i = 1 To Len(CellValue)
    'get the unicode value for this character
    arabunicode = AscW(Mid$(CellValue, i, 1))
    'append this to our string - as unicode
    NewValue = NewValue & arabunicode
    Next i

    'Write our string back to the cell.
    atu = NewValue

    End Function

    See my Excel experiment, attached.
    [Attachment: mkgiArabPlatformerx_15_04.xls]
  • MoKaLux Member
    I don't think my solution is the answer to handling Arabic in Gideros, but it works for my projects.

    The only problem is that it is limited in length (15 characters in my Gideros function), but it can handle vowels, spaces, commas, colons, ...

    The function is not complete yet because I wanted to give it a try before leaving.

    When I am back I should complete it and share the whole class again with a use case. But the basics are here in the GitHub repo (some classes are missing, I believe, because they were linked classes, but those are just basic classes for buttons, easing, ...).

    The way I do it is with a texture pack of all the letters in their isolated, initial, medial and final forms. Then I add some more graphics for the vowels, and voilà.

    I hope that helps you guys. You give us so much that you deserve some payback.

    In the meantime I am going to have another look at your code to see what I can understand from it (but it's hard because you put too few comments :blush: ).

    Could you just tell me what those 0x... symbols are?

    Peace.
  • MoKaLux Member
    @hgy29 one thing I can do is use your code and tell you where it fails.

    Plus, the vowels need codes as well. For example, letter N + vowel A is the pair of decimal code points 1606 and 1614, which I write together as 16061614.

    نَ 16061614

    where ن = 1606
    and 1614 = َ
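To see which code points a UTF-8 string contains, a small decoder is enough. This is a minimal sketch in plain Lua that assumes well-formed UTF-8 input; codepoints is a hypothetical helper name. In Gideros the utf8.codes iterator used in arabicProcessing above yields the same values.

```lua
-- Returns the list of Unicode code points in a well-formed UTF-8 string.
function codepoints(s)
	local out = {}
	local i = 1
	while i <= #s do
		local b = s:byte(i)
		local cp, len
		if b < 0x80 then cp, len = b, 1            -- 0xxxxxxx
		elseif b < 0xE0 then cp, len = b % 0x20, 2 -- 110xxxxx
		elseif b < 0xF0 then cp, len = b % 0x10, 3 -- 1110xxxx
		else cp, len = b % 0x08, 4 end             -- 11110xxx
		-- each continuation byte (10xxxxxx) contributes 6 more bits
		for k = 1, len - 1 do
			cp = cp * 64 + s:byte(i + k) % 0x40
		end
		out[#out + 1] = cp
		i = i + len
	end
	return out
end

-- "\217\134\217\142" is the UTF-8 byte sequence for noon followed by fatha.
local cps = codepoints("\217\134\217\142")
print(cps[1], cps[2])
```

The two values printed are the letter and its vowel as separate code points, so there is no need to glue them into a single number.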

  • hgy29 Maintainer
    @MoKaLux 0x denotes hexadecimal notation. My code handles the joined forms of characters in the range 0x627 to 0x64A (Unicode 1575 to 1610 in decimal): it replaces the independent characters with their connected versions (Unicode 65165, i.e. 0xFE8D, and the following ones).
    I will look at your example to see if I can understand what goes wrong in my code.
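A quick check of the decimal and hexadecimal values mentioned in this exchange, as a minimal plain-Lua sketch (toHex is a hypothetical helper, not part of Gideros):

```lua
-- Formats a number as a 0x-prefixed uppercase hexadecimal string.
function toHex(n)
	return string.format("0x%X", n)
end

-- Lua accepts hexadecimal literals directly, so both notations
-- name the same numbers.
print(0x627, 0x64A)          -- first and last letter handled by arabicProcessing
print(toHex(65165))          -- first connected (presentation) form
print(tonumber("64A", 16))   -- parsing a hex string into a number
```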
  • hgy29 Maintainer
    OK, it turns out my code is far from complete. I found a comprehensive set of rules here: https://www.w3.org/TR/alreq/
  • hgy29 Maintainer
    Digging and reading further, I understood a few things: Arabic vowel glyphs are classified as 'combining diacritics', which should simply be drawn above or below the preceding character. This should already work in theory, provided that the Unicode characters are reversed before being sent to the text field, and assuming that those 'combining' characters don't take up space by themselves.
    However, it turns out that a lot of fonts are buggy: they report a non-zero advance distance for those characters. I was using arial.ttf, and it IS buggy. I switched to NotoSansArabic and it already looks better, although the diacritic mark doesn't seem to be correctly centered...

    I used the following test:
    local f=TTFont.new("NotoSansArabic-Black.ttf",100)
     
    local t=TextField.new(f,utf8.char(1614,1606))
    t:setPosition(100,100)
    stage:addChild(t)
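The test passes 1614 before 1606, i.e. the vowel before its letter, because the text is already reversed for a renderer that draws left to right. The sketch below shows the same two steps in plain Lua, without the Gideros utf8 library: reversing a list of code points and encoding it as UTF-8. reversed and encodeUtf8 are hypothetical helper names.

```lua
-- Encodes one code point (up to U+10FFFF) as a UTF-8 byte string.
local function encodeOne(cp)
	if cp < 0x80 then
		return string.char(cp)
	elseif cp < 0x800 then
		return string.char(0xC0 + math.floor(cp / 0x40), 0x80 + cp % 0x40)
	elseif cp < 0x10000 then
		return string.char(0xE0 + math.floor(cp / 0x1000),
			0x80 + math.floor(cp / 0x40) % 0x40,
			0x80 + cp % 0x40)
	else
		return string.char(0xF0 + math.floor(cp / 0x40000),
			0x80 + math.floor(cp / 0x1000) % 0x40,
			0x80 + math.floor(cp / 0x40) % 0x40,
			0x80 + cp % 0x40)
	end
end

-- Encodes a list of code points as one UTF-8 string.
function encodeUtf8(cps)
	local parts = {}
	for i, cp in ipairs(cps) do
		parts[i] = encodeOne(cp)
	end
	return table.concat(parts)
end

-- Returns a new list with the code points in reverse order.
function reversed(cps)
	local out = {}
	for i = #cps, 1, -1 do
		out[#out + 1] = cps[i]
	end
	return out
end

-- Logical order is letter (1606) then vowel (1614); the reversed list
-- is what the test above passes to utf8.char.
local visual = encodeUtf8(reversed({1606, 1614}))
```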

  • hgy29 Maintainer
    Hours later: diacritics are misplaced because FreeType doesn't handle OpenType GPOS tables, which indicate how to lay out composed glyphs. The FreeType developers suggest using the HarfBuzz library instead for complex script rendering. Benefit: it would handle Vietnamese, Thai, Khmer, etc. too, not just Arabic. Drawback: I am afraid it would enlarge the Gideros codebase significantly.
    I am looking at the extent to which HarfBuzz could be made a plugin, hooking into Gideros text rendering. The good thing is that it is MIT licensed.

  • MoKaLux Member
    edited April 17
    @hgy29 there is a link from another engine where they discuss it: https://github.com/godotengine/godot/issues/982

    https://github.com/godotengine/godot/pull/21791

    The conversation is interesting because they speak of licences too. The code is C++.

    I keep on looking!
  • MoKaLux Member
    hgy29 said:


    I used the following test:

    local f=TTFont.new("NotoSansArabic-Black.ttf",100)
     
    local t=TextField.new(f,utf8.char(1614,1606))
    t:setPosition(100,100)
    stage:addChild(t)
    This piece of code looks great. I will have to test it!

    It supports vowels and does not limit your text to 15 characters =(

    The only issue here is that users will have to convert their text to Unicode code points and then use your code.

    Or am I missing the point, and you are trying to integrate HarfBuzz to do this for us?

    I had a look at ICU and it seems to do the same thing as HarfBuzz?
    http://userguide.icu-project.org/intro

    Personally, your code should be all I need for my projects, where I pass in the Unicode code points to show some Arabic text.

    Thank you a ton @hgy29
  • hgy29 Maintainer
    @MoKaLux, you are mixing things up :)
    In order to display a text, we have to go through several steps. Roughly:

    1. Get a string with the text we want to display: Lua strings are just sequences of 8-bit bytes, so in Gideros texts are expected to be UTF-8 encoded. The editor does this automatically, but it causes trouble with RTL texts such as Arabic, so for better readability I used utf8.char() to encode the two code points 1606 and 1614 into UTF-8 in code.
    2. The text is separated into RTL and LTR chunks, a process known as the BiDi algorithm. As part of this process, RTL chunks are reversed so that they can be displayed LTR by the drawing engine. Gideros doesn't do that step at all and just assumes texts are always LTR.
    3. The text is converted to font glyphs.
    4. Glyphs are 'shaped': they are laid out relative to each other depending on language rules. Gideros doesn't do that either; again, it assumes Latin rules. This is what HarfBuzz does.
    5. Glyphs are rendered to the screen.


    My code above was doing step 2 for Arabic, and probably not accurately, but even with that, step 4 was still missing.
    ICU used to do the same as HarfBuzz, plus BiDi. They have dropped their shaping routines and integrated HarfBuzz instead, but ICU seems to be much more than just a text layout engine.
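Step 2 can be illustrated with a deliberately simplified sketch in plain Lua. It is not the Unicode Bidirectional Algorithm: it assumes an LTR paragraph, treats only U+0600 to U+06FF as RTL, and treats spaces and punctuation as LTR, so several Arabic words separated by spaces are reversed one by one but not reordered relative to each other. toVisualOrder and isArabic are hypothetical names.

```lua
-- True for code points in the basic Arabic block (simplifying assumption).
local function isArabic(cp)
	return cp >= 0x600 and cp <= 0x6FF
end

-- Splits a list of code points into runs of equal direction and
-- reverses the RTL runs, giving the order an LTR renderer expects.
function toVisualOrder(cps)
	local out = {}
	local i = 1
	while i <= #cps do
		local rtl = isArabic(cps[i])
		-- extend the run while the next code point has the same direction
		local j = i
		while j < #cps and isArabic(cps[j + 1]) == rtl do
			j = j + 1
		end
		if rtl then
			for k = j, i, -1 do out[#out + 1] = cps[k] end
		else
			for k = i, j do out[#out + 1] = cps[k] end
		end
		i = j + 1
	end
	return out
end
```

For "AB" followed by alef, beh and then "C", the Latin letters keep their positions and only the two Arabic code points swap places.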

  • MoKaLux Member
    Thank you @hgy29. I understand your points.

    When I am back I will try an example with your code and show you how I would use it.

    Take it easy!
  • hgy29 Maintainer
    So, I made a HarfBuzz plugin for Gideros, and so far it works as expected! I am going to try to implement the BiDi algorithm now...

  • MoKaLux Member
    Congrats.
  • SinisterSoft Maintainer
    How much does HarfBuzz increase the size? Will it be needed only for Arabic text?

    Will BiDi be added to the normal text system too? If so, will the existing arabicProcessing function still work?
  • hgy29 Maintainer
    @SinisterSoft,

    The HarfBuzz .dll is around 1MB in size, which is quite a lot compared to other plugins. I can't tell for other platforms yet. It actually handles most languages, not just Arabic.

    I am not sure how I will implement BiDi yet, but I'll make it optional. I am looking for a lightweight implementation. So far I have found ICU and FriBiDi, but I could write my own for trivial cases too. Using HarfBuzz and BiDi will make arabicProcessing obsolete, but since it will be a plugin there shouldn't be many concerns about backward compatibility.
  • MoKaLux Member
    hgy29 said:


    I used the following test:

    local f=TTFont.new("NotoSansArabic-Black.ttf",100)
     
    local t=TextField.new(f,utf8.char(1614,1606))
    t:setPosition(100,100)
    stage:addChild(t)
    Hello hgy29, the code works for letters and vowels, but they are not aligned (I thought they would be).
    hgy29 said:

    @MoKaLux 0x denotes hexadecimal notation. My code handles the joined forms of characters in the range 0x627 to 0x64A (Unicode 1575 to 1610 in decimal): it replaces the independent characters with their connected versions (Unicode 65165, i.e. 0xFE8D, and the following ones).

    I tried getting the letters (Unicode 65165 and the following ones) but I could not find them. In which TTF font did you find those code points?
  • hgy29 Maintainer
    Accepted Answer
    @MoKaLux, Unicode 65165 and above are in NotoSansArabic and Arial at least, the two fonts I tried. And yes, letters are misplaced in Gideros; that's why I worked on integrating HarfBuzz, whose purpose is exactly to place them correctly.

    See the result in the attached image for two texts; each is rendered three times:
    - with standard Gideros functions
    - with arabicProcessing applied first
    - with the HarfBuzz plugin (without arabicProcessing, which is no longer necessary)


    [Attached image: GF-1.png]
  • MoKaLux Member
    @hgy29 I think you're on the right path. I am working on my Arabic Lua thing so we can compare. Thank you for your precious help.