View Issue Details |
ID | Project | Category | View Status | Date Submitted | Last Update |
0001420 | VCMI | Other | public | 2013-08-26 02:49 | 2023-04-11 09:21 |
Reporter | acme_pjz |
Assigned To | Ivan |
Priority | normal | Severity | feature | Reproducibility | N/A |
Status | resolved | Resolution | fixed |
Platform | OS | OS Version |
Product Version |
Target Version | Fixed in Version |
Summary | 0001420: Read data file and map file with other text encodings |
Description | Is it possible to read data files and map files with other, configurable text encodings? In the current version the font files can already be configured, but it still doesn't work with the Heroes3 Simplified Chinese edition, because its text encoding is CP936. If the text encoding were configurable, VCMI would support more language versions of Heroes3. |
Tags | No tags attached. |
Attached Files |
H3Bitmap.7z (219,570 bytes) 2013-09-03 06:11
HZK.7z (427,903 bytes) 2013-09-03 06:11
vcmi-chinese.png (693,206 bytes) 2013-09-08 15:06
1.jpg (54,280 bytes) 2013-09-11 05:30
1-1.jpg (74,040 bytes) 2013-10-03 04:57
1-2.jpg (65,890 bytes) 2013-10-03 04:58
Notes |
(0003895) Ivan (developer) 2013-08-26 09:33 |
Hi. This is the first time I've heard of an H3 version that uses a multi-byte encoding. Can you upload the fonts and text files from your version of H3? Normally they can be found in the file Data/H3bitmap.lod, which can be opened using MMArchive (download here: http://wogarchive.ru/download.php?id=119). Unpack all files with *.txt and *.fnt extensions and upload them here. |
(0003921) acme_pjz (reporter) 2013-09-03 06:15 |
Text files uploaded. They are all in CP936 (a.k.a. GBK) encoding. I think the font files in H3bitmap.lod don't contain Chinese characters, because I found some files named HZK** in the root directory of H3, and I think they are the Simplified Chinese font files. |
(0003922) Ivan (developer) 2013-09-03 10:18 |
Thanks. Yes - these 3 files are indeed Chinese fonts, and the encoding seems to match the GBK character tables I found. I'll add support for these fonts soon - looks simple enough. For reference: these files consist of a sequence of bitmaps, 10x10, 12x12 or 24x24 pixels in size, 1 bit per pixel, with the width aligned to a full byte (a 10x10 image actually uses 2x10 bytes per character). No header or anything like that. Font files contain ONLY Chinese characters; ASCII symbols should be taken from the English fonts. Mapping of 2-byte GBK characters to image index:
index = (first - 0x80) * 0xA0 + (second - 0x40)
It may be a bit different - the file size indicates 8000+ characters, while GBK can fit 20000+ symbols. |
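The storage layout described in this note can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical (not taken from VCMI), and the bit order inside each byte (most significant bit = leftmost pixel) is an assumption that the thread does not confirm.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Bytes used by one row of a 1-bit-per-pixel glyph, padded to a whole byte.
constexpr std::size_t bytesPerRow(std::size_t widthPx)
{
    return (widthPx + 7) / 8;
}

// Bytes used by one glyph of widthPx x heightPx pixels.
constexpr std::size_t bytesPerGlyph(std::size_t widthPx, std::size_t heightPx)
{
    return bytesPerRow(widthPx) * heightPx;
}

// Byte offset of glyph number `index` in a headerless font file.
constexpr std::size_t glyphOffset(std::size_t index, std::size_t widthPx, std::size_t heightPx)
{
    return index * bytesPerGlyph(widthPx, heightPx);
}

// True if pixel (x, y) of a glyph is set.
// ASSUMPTION: the most significant bit of each byte is the leftmost pixel.
inline bool pixelSet(const std::vector<std::uint8_t> & glyph, std::size_t widthPx,
                     std::size_t x, std::size_t y)
{
    const std::uint8_t byte = glyph[y * bytesPerRow(widthPx) + x / 8];
    return ((byte >> (7 - x % 8)) & 1) != 0;
}
```

With these helpers a 10x10 glyph occupies 2 bytes per row and 20 bytes in total, which matches the "2x10 bytes per character" figure above; a 12x12 glyph takes 24 bytes and a 24x24 glyph 72 bytes.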
(0003935) acme_pjz (reporter) 2013-09-06 16:00 |
I'll test it soon. BTW, IMO the text needs to be converted to UTF-8 or UTF-16 if SDL_ttf is used. |
(0003936) Ivan (developer) 2013-09-06 16:28 |
Proper Unicode support is more difficult to implement. Right now I'm leaning towards making VCMI work with the GBK encoding so no conversion is needed. I also can't use SDL_ttf here, because the fonts you posted are bitmaps while SDL_ttf works only with TrueType vector fonts. Not really a problem, since the original H3 fonts are also bitmaps and are already supported (including different languages like Russian). |
(0003961) Ivan (developer) 2013-09-08 15:08 |
Looks to be working, will commit my changes soon. Does everything in this image look OK? http://bugs.vcmi.eu/file_download.php?file_id=1465&type=bug |
(0003967) Tow (developer) 2013-09-08 16:16 |
The text line in the message (the white line under the title text) seems to be clipped, missing several bottom pixel rows. Same issue with labels next to the Quest Log and Dismiss buttons. Other text seems fine. |
(0003982) acme_pjz (reporter) 2013-09-09 05:18 |
Same as Tow said, and the spacing between characters may be too small. I also want to point out that your mapping formula is incorrect. It looks like the H3 Simplified Chinese edition only supports the GB2312 encoding, i.e. GBK/1 and GBK/2 in the GBK standard. After some experiments I found that the correct mapping formula is:
index = (first - 0xA1) * 94 + (second - 0xA1)
Be careful: it can map a non-GB2312 character into the valid range :| For non-GB2312 characters and other editions of H3 (possibly the Traditional Chinese edition, which may use another text encoding?) an iconv call and SDL_ttf must be used (VCMI can be configured to use a TTF vector font by changing config/fonts.json, right?). A second note: while HZK10 looks fine, the characters in HZK12 and HZK24H seem to be X-Y flipped! |
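The corrected formula, together with the range check that the warning above calls for, might look like the sketch below. The function name is hypothetical, and the accepted byte range 0xA1..0xFE is an assumption based on the 94 x 94 layout implied by the formula (94 * 94 = 8836 slots, which is consistent with the "8000+ characters" estimated from the file size).

```cpp
#include <cassert>
#include <cstdint>

// Returns the glyph index for a two-byte GB2312 character, or -1 when either
// byte lies outside 0xA1..0xFE (the pair is GBK-only or not a valid pair).
inline int gb2312GlyphIndex(std::uint8_t first, std::uint8_t second)
{
    const bool firstOk = first >= 0xA1 && first <= 0xFE;
    const bool secondOk = second >= 0xA1 && second <= 0xFE;
    if (!firstOk || !secondOk)
        return -1;
    return (first - 0xA1) * 94 + (second - 0xA1);
}
```

Without the check, a pair such as 0xB0 0x43 (second byte outside the GB2312 range) would evaluate to 15 * 94 - 94 = 1316, which looks like a valid index but selects the wrong glyph.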
(0003986) Ivan (developer) 2013-09-09 11:15 |
Can you make a screenshot of the Chinese version and post it here for comparison? I compared several characters to what I see in a text editor and they looked fine.
- Missing line: already fixed.
- GBK vs GB2312: I actually ran into this issue already. After a quick search I found a discussion on Chinese support in Era, including a link to GBK fonts. Switching from GBK to GB2312 is possible, but since GBK already includes GB2312 I think supporting GBK would be more than enough.
- Unicode: I think I'll try to add some basic Unicode support (UTF-8, to be precise). This will allow using a TTF font with any encoding, BUT the encoding must be selected manually: I haven't found any reliable way to detect it. Detection is especially problematic for the very similar Win1250...Win1252 encodings, which are used in the majority of H3 versions. This will solve the "what encoding to use" problem, but in my experience the quality of TTF fonts is a bit lower than that of native bitmaps. |
(0003995) acme_pjz (reporter) 2013-09-11 05:31 |
Hi, I just uploaded a screenshot. It was taken from the Internet, but it represents the Chinese version well. |
(0004059) acme_pjz (reporter) 2013-10-02 05:38 |
I tried VCMI 0.94, but it still doesn't work with either GBK or GB2312. Do I need to put HZK10 and the other files in a specific directory? |
(0004060) Ivan (developer) 2013-10-02 16:27 |
Right now you'll need to install this "mod" - it provides some files necessary for Chinese support (fonts & config file): http://download.vcmi.eu/mods/repository/chinese%20fonts.zip Download this file and unpack it into the Mods/ directory. This mod can also be installed via the launcher. |
(0004061) acme_pjz (reporter) 2013-10-03 05:15 |
Thanks, it works. But there are a few glitches: mojibake due to incorrect character boundary detection (characters in green circles) and strings that are too wide (in cyan circles). http://bugs.vcmi.eu/file_download.php?file_id=1532&type=bug http://bugs.vcmi.eu/file_download.php?file_id=1533&type=bug Maybe you can fix them in the next version? |
(0004062) Ivan (developer) 2013-10-03 11:59 edited on: 2013-10-03 14:29 |
Sure, will fix. Note that the time between our releases is around 3 months, so you'll have to wait a bit. 1) Too-wide strings: I will see what I can do - it seems that ASCII characters need special treatment. 2) What's wrong in the green circles? Missing characters? Incorrect line breaks? Something else? I don't know Chinese, so I have no idea how it should look. |
(0004063) acme_pjz (reporter) 2013-10-04 07:34 edited on: 2013-10-04 14:20 |
1) How about falling back to the default ASCII font file for ASCII characters (which are variable width)? This is the behavior of the original H3. 2) The characters in the green circles are rendered incorrectly, which is called "乱码" in Chinese and (hopefully) mojibake in English (http://en.wikipedia.org/wiki/Mojibake). Since GBK is a 2-byte encoding and I think the line-breaking code does not take this into account, it breaks the line in the middle of a character. By the way, are there any nightly builds for Windows (or Linux)? [EDIT] Sorry, I have already found them in the forum. |
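The line-breaking problem described in point 2 can be illustrated with a small sketch that never splits a two-byte GBK character. It is not VCMI's actual line-splitting code: the names are hypothetical, and for simplicity the limit is a byte count, whereas a real implementation would measure pixel widths from the font.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// True if `byte` can start a two-byte GBK character (lead bytes are 0x81..0xFE).
inline bool isGbkLeadByte(unsigned char byte)
{
    return byte >= 0x81 && byte <= 0xFE;
}

// Splits GBK text into chunks of at most maxBytes bytes without cutting a
// two-byte character in half. maxBytes is expected to be at least 2.
inline std::vector<std::string> splitGbk(const std::string & text, std::size_t maxBytes)
{
    std::vector<std::string> lines;
    std::string current;
    for (std::size_t i = 0; i < text.size();)
    {
        // A lead byte followed by another byte forms one two-byte character.
        const bool twoBytes = isGbkLeadByte(static_cast<unsigned char>(text[i]))
                              && i + 1 < text.size();
        const std::size_t len = twoBytes ? 2 : 1;
        if (!current.empty() && current.size() + len > maxBytes)
        {
            lines.push_back(current);
            current.clear();
        }
        current.append(text, i, len);
        i += len;
    }
    if (!current.empty())
        lines.push_back(current);
    return lines;
}
```

The key point is that the loop advances by whole characters (1 or 2 bytes), so a break can only ever fall on a character boundary.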
(0004064) Ivan (developer) 2013-10-04 13:19 |
1) I thought about this at first, but the Chinese fonts have a different height. OK, I'll try to find matching fonts. Shouldn't be hard to implement. 2) Thanks. Yeah - that's probably some bug in the line-splitting code. Will fix. |
(0004065) Ivan (developer) 2013-10-04 15:14 |
>> By the way, are there any nightly builds for Windows (or Linux)? You can find nightly builds for Ubuntu here: https://launchpad.net/~vcmi/+archive/ppa For technical reasons (multiple versions of Ubuntu) the launcher is disabled, but otherwise they should work. "Nightly builds" for Windows are usually made some time before a release, so there won't be any Windows builds for around 2 months. |
(0004086) Jolly Wing (reporter) 2013-10-19 13:31 |
Hi, Ivan. I have made some small changes to the 0.9.3 source code so that VCMI supports the game data archives from the Simplified Chinese version. I think that, with some improvements, the same modification could also support game data archives from other CJK (Chinese, Japanese, Korean and similar multi-byte language) versions. When the engine reads map data and general text data, I convert the GBK-encoded string into a UTF-8 string, then render the string with the TTF_RenderUTF8_* functions of SDL_ttf. It works! The changes I made are listed below:

1. client/CMessage.cpp, in function CMessage::breakText(), line 153

    // added by [email protected], 2013-08-28
    // If the text[z] is less than 0, it is the first byte of a UTF8 Chinese word.
    #ifdef ZH_CN
    else if (text[z] < 0){
        z++;
        z++;
        lineLength += graphics->fonts[font]->getSymbolWidth(text[z]);
    }
    #endif

2. client/gui/Fonts.cpp, in function CTrueTypeFont::getStringWidth(), line 255

    // added by [email protected], 2013-08-28
    // If we are handling simplified chinese, it is a UTF8 string
    #ifdef ZH_CN
    TTF_SizeUTF8(font.get(), data.c_str(), &width, NULL);
    #else
    TTF_SizeText(font.get(), data.c_str(), &width, NULL);
    #endif

3. client/gui/Fonts.cpp, in function CTrueTypeFont::renderText(), line 279

    if (blended)
    // added by [email protected], 2013-09-28 Sat
    // If we are handling simplified chinese game data, it is a UTF8 string
    #ifdef ZH_CN
        rendered = TTF_RenderUTF8_Blended(font.get(), data.c_str(), color);
    #else
        rendered = TTF_RenderText_Blended(font.get(), data.c_str(), color);
    #endif
    else
    // added by [email protected], 2013-09-28 Sat
    // If we are handling simplified chinese game data, it is a UTF8 string
    #ifdef ZH_CN
        rendered = TTF_RenderUTF8_Solid(font.get(), data.c_str(), color);
    #else
        rendered = TTF_RenderText_Solid(font.get(), data.c_str(), color);
    #endif

4. Add lib/ConvertEncoding.cpp and lib/ConvertEncoding.h

The content of ConvertEncoding.h is:

    char * convert_enc(const char *src_enc, const char *dest_enc, const char * src_string);

The content of ConvertEncoding.cpp is:

    /*
     * ConvertEncoding.cpp, for vcmi using CJK(China/Japan/Korea) data.
     *
     * Authors: Wu Jiqing ([email protected])
     *
     * License: GNU General Public License v2.0 or later
     *
     */
    // added by jiqingwu([email protected])
    // 2013-09-27 Fri
    #include <stdio.h>
    #include <iconv.h>
    #include <string.h>

    // added by [email protected], 2013-09-27 Fri
    // Returns a pointer to a static buffer (not thread-safe), or NULL on failure.
    char * convert_enc(const char *src_enc, const char *dest_enc, const char * src_string)
    {
    #define UTF8_STR_LEN 5000
        static char out_string[UTF8_STR_LEN];
        char *sin, *sout;
        size_t in_len, out_len, ret; // iconv() expects size_t lengths
        iconv_t c_pt;

        if ((c_pt = iconv_open(dest_enc, src_enc)) == (iconv_t)-1)
        {
            printf("iconv open failed!\n");
            return NULL;
        }
        // iconv(c_pt, NULL, NULL, NULL, NULL);
        in_len = strlen(src_string) + 1; // also convert the terminating '\0'
        out_len = UTF8_STR_LEN;
        sin = (char *)src_string;
        sout = out_string;
        ret = iconv(c_pt, &sin, &in_len, &sout, &out_len);
        iconv_close(c_pt); // release the descriptor on success and on failure
        if (ret == (size_t)-1)
        {
            return NULL;
        }
        return out_string;
    }

To link ConvertEncoding.o into the library, add two lines to lib/CMakeLists.txt:

    set(lib_SRCS
        ...
        ConvertEncoding.cpp
    )
    set(lib_HEADERS
        ...
        ConvertEncoding.h
    )

5. lib/CGeneralTextHandler.cpp, include "ConvertEncoding.h":

    // added by jiqingwu([email protected])
    // 2013-09-27 Fri
    #include "ConvertEncoding.h"

and in function CLegacyConfigParser::readString(), line 112:

    // added by [email protected], 2013-09-27 Fri
    // convert gbk string to utf-8 string.
    // (For simplified Chinese game data, the string is GBK encoded)
    #ifdef ZH_CN
    char * utf8_str = convert_enc("GBK", "UTF8", ret.c_str());
    // keep the original bytes if the conversion failed
    return utf8_str ? std::string(utf8_str) : ret;
    #else
    return ret;
    #endif

6. lib/filesystem/CBinaryReader.cpp, include "ConvertEncoding.h":

    // added by <[email protected]>, 2013-09-28 Sat
    #include "../ConvertEncoding.h"

and in function CBinaryReader::readString(), line 95:

    // added by [email protected], 2013-08-22
    // If we are handling chinese data, convert gbk string to utf-8 string.
    #ifdef ZH_CN
    char * utf8_str = convert_enc("GBK", "UTF8", ret.c_str());
    // keep the original bytes if the conversion failed
    return utf8_str ? std::string(utf8_str) : ret;
    #else
    return ret;
    #endif

7. Add this line to ./CMakeLists.txt to enable support for Simplified Chinese game data:

    add_definitions(-DZH_CN)

8. cmake, make, make install

To Play

Use the data from the Chinese version of Shadow of Death. Link the 'Data', 'Maps' and 'Mp3' directories under /usr/local/share/vcmi like this (you need root privileges):

    cd /usr/local/share/vcmi
    ln -s /Data/Dir/of/ChineseGame Data
    ln -s /Maps/Dir/of/ChineseGame Maps
    ln -s /Mp3/Dir/of/ChineseGame Mp3

To show Chinese characters in the game, you need to put a TrueType font that supports Chinese into /usr/local/share/vcmi/Data:

    cp /chinese/font/path /usr/local/share/vcmi/Data

In addition, you need to edit /usr/local/share/vcmi/config/fonts.json and modify the TrueType font section like this:

    "trueType":
    {
        "BIGFONT"  : { "file" : "ChineseFont.ttf", "size" : 22, "blend" : true},
        "CALLI10R" : { "file" : "ChineseFont.ttf", "size" : 10, "blend" : true},
        "CREDITS"  : { "file" : "ChineseFont.ttf", "size" : 28, "blend" : true},
        "HISCORE"  : { "file" : "ChineseFont.ttf", "size" : 13, "blend" : true},
        "MEDFONT"  : { "file" : "ChineseFont.ttf", "size" : 16, "blend" : true},
        "SMALFONT" : { "file" : "ChineseFont.ttf", "size" : 13, "blend" : true},
        "TIMES08R" : { "file" : "ChineseFont.ttf", "size" : 11, "blend" : true},
        "TINY"     : { "file" : "ChineseFont.ttf", "size" : 11, "blend" : true},
        "VERD10B"  : { "file" : "ChineseFont.ttf", "size" : 13, "blend" : true}
    }

where ChineseFont.ttf is your TrueType font. Then we can play the game:

    $ vcmiclient
|
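Once the text has been converted to UTF-8 as in the patch above, code that walks a string byte by byte (such as the breakText() change in step 1) has to know how many bytes each character occupies. The sketch below shows one way to derive that from the lead byte; the function names are hypothetical and this is not part of the posted patch or of VCMI.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Number of bytes in the UTF-8 sequence that starts with `lead`, or 0 if
// `lead` is a continuation byte or otherwise cannot start a sequence.
inline std::size_t utf8SequenceLength(unsigned char lead)
{
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx: covers common CJK ideographs
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;
}

// Counts characters (code points) in a UTF-8 string.
// Malformed or truncated bytes are counted as one character each.
inline std::size_t utf8CharCount(const std::string & text)
{
    std::size_t count = 0;
    for (std::size_t i = 0; i < text.size(); ++count)
    {
        std::size_t len = utf8SequenceLength(static_cast<unsigned char>(text[i]));
        if (len == 0 || i + len > text.size())
            len = 1; // step over a stray byte instead of reading past the end
        i += len;
    }
    return count;
}
```

Common Chinese characters take three bytes in UTF-8, so a loop that assumes a fixed two-byte step, as a GBK-oriented loop would, needs to be adapted when the data is UTF-8.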
(0004091) Ivan (developer) 2013-10-20 10:09 |
Thanks. I'll take a look at this. I don't like the idea of using ifdefs to enable functionality, but you've tracked down every place that needs changes - that will help. |
(0007163) SXX (administrator) 2017-07-22 14:16 |
Interesting how this works now. Last year VCMI got support for UTF in file paths: https://github.com/vcmi/vcmi/pull/156 |
Issue History | |||
Date Modified | Username | Field | Change |
2013-08-26 02:49 | acme_pjz | New Issue | |
2013-08-26 09:33 | Ivan | Note Added: 0003895 | |
2013-08-26 10:06 | Ivan | Assigned To | => Ivan |
2013-08-26 10:06 | Ivan | Status | new => feedback |
2013-09-03 06:11 | acme_pjz | File Added: H3Bitmap.7z | |
2013-09-03 06:11 | acme_pjz | File Added: HZK.7z | |
2013-09-03 06:15 | acme_pjz | Note Added: 0003921 | |
2013-09-03 06:15 | acme_pjz | Status | feedback => assigned |
2013-09-03 10:18 | Ivan | Note Added: 0003922 | |
2013-09-06 16:00 | acme_pjz | Note Added: 0003935 | |
2013-09-06 16:28 | Ivan | Note Added: 0003936 | |
2013-09-08 15:06 | Ivan | File Added: vcmi-chinese.png | |
2013-09-08 15:08 | Ivan | Note Added: 0003961 | |
2013-09-08 16:16 | Tow | Note Added: 0003967 | |
2013-09-09 05:18 | acme_pjz | Note Added: 0003982 | |
2013-09-09 11:15 | Ivan | Note Added: 0003986 | |
2013-09-11 05:30 | acme_pjz | File Added: 1.jpg | |
2013-09-11 05:31 | acme_pjz | Note Added: 0003995 | |
2013-10-02 05:38 | acme_pjz | Note Added: 0004059 | |
2013-10-02 16:27 | Ivan | Note Added: 0004060 | |
2013-10-03 04:57 | acme_pjz | File Added: 1-1.jpg | |
2013-10-03 04:58 | acme_pjz | File Added: 1-2.jpg | |
2013-10-03 05:15 | acme_pjz | Note Added: 0004061 | |
2013-10-03 11:59 | Ivan | Note Added: 0004062 | |
2013-10-03 14:29 | Ivan | Note Edited: 0004062 | View Revisions |
2013-10-04 07:34 | acme_pjz | Note Added: 0004063 | |
2013-10-04 13:19 | Ivan | Note Added: 0004064 | |
2013-10-04 13:48 | acme_pjz | Note Edited: 0004063 | View Revisions |
2013-10-04 14:20 | acme_pjz | Note Edited: 0004063 | View Revisions |
2013-10-04 15:14 | Ivan | Note Added: 0004065 | |
2013-10-19 13:31 | Jolly Wing | Note Added: 0004086 | |
2013-10-20 10:09 | Ivan | Note Added: 0004091 | |
2017-07-22 14:16 | SXX | Note Added: 0007163 | |
2022-12-17 14:07 | Ivan | Assigned To | Ivan => |
2023-04-11 09:21 | Ivan | Status | assigned => resolved |
2023-04-11 09:21 | Ivan | Resolution | open => fixed |
2023-04-11 09:21 | Ivan | Assigned To | => Ivan |