[scribus] xtg

Ralf Mattes

2018-06-05 14:21:23 UTC

- From what I can tell, the xtg import plugin correctly detects the declared encoding and finds
the correct QTextCodec for that encoding, so m_codec seems to be correct. But somehow the
text that actually gets imported is garbled.
- If I import non-multibyte encoded text everything works as expected.
Cheers, RalfD

After a bit of code reading:

in file xtgscanner.cpp, line 1460:

QChar XtgScanner::nextSymbol()
{
    char ch = 0;
    if (top < input_Buffer.length())
    {
        ch = input_Buffer.at(top++);
        QByteArray ba;
        ba.append(ch);
        QString m_txt = m_codec->toUnicode(ba);

How is this supposed to work? 'ba' will contain a single _char_, which is passed to QTextCodec::toUnicode.
Isn't that guaranteed to return invalid glyphs for multibyte input, since the first byte of a multibyte character
is never a valid glyph on its own?
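
To illustrate the point, here is a minimal sketch in plain C++ (no Qt) of reading one complete UTF-8 sequence before decoding it, instead of decoding byte by byte. The helper names utf8SequenceLength and nextCodePoint are made up for this example and are not part of Scribus; the sketch only covers UTF-8 and does not reject overlong forms or surrogates. Inside the plugin, a stateful decoder such as Qt's QTextDecoder might be the more natural route, but I have not tried that.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical helper: number of bytes in the UTF-8 sequence that starts
// with `lead`, or 0 if `lead` cannot start a sequence.
std::size_t utf8SequenceLength(unsigned char lead)
{
    if (lead < 0x80) return 1;            // 0xxxxxxx: ASCII
    if ((lead & 0xE0) == 0xC0) return 2;  // 110xxxxx
    if ((lead & 0xF0) == 0xE0) return 3;  // 1110xxxx
    if ((lead & 0xF8) == 0xF0) return 4;  // 11110xxx
    return 0;                             // continuation byte or invalid lead
}

// Hypothetical helper: decode the code point starting at `pos` and advance
// `pos` past it. Returns U+FFFD for malformed or truncated input.
char32_t nextCodePoint(const std::string &buf, std::size_t &pos)
{
    assert(pos < buf.size());
    const char32_t replacement = 0xFFFD;
    const unsigned char lead = static_cast<unsigned char>(buf[pos]);
    const std::size_t len = utf8SequenceLength(lead);
    if (len == 0 || pos + len > buf.size()) {
        ++pos;                            // skip the offending byte
        return replacement;
    }
    // Payload bits of the lead byte, indexed by sequence length.
    static const unsigned char leadMask[] = {0, 0x7F, 0x1F, 0x0F, 0x07};
    char32_t cp = lead & leadMask[len];
    for (std::size_t i = 1; i < len; ++i) {
        const unsigned char c = static_cast<unsigned char>(buf[pos + i]);
        if ((c & 0xC0) != 0x80) {         // not a 10xxxxxx continuation byte
            ++pos;
            return replacement;
        }
        cp = (cp << 6) | (c & 0x3F);
    }
    pos += len;
    return cp;
}
```

The lead byte alone only says how long the sequence is; the character is known only once all continuation bytes have been read, which is why handing single bytes to a decoder cannot work for multibyte text.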

Also, in the constructor XtgScanner::XtgScanner there seems to be an attempt to skip over BOM:

if ((input_Buffer[0] == '\xFF') && (input_Buffer[1] == '\xFE'))
{
    QByteArray tmpBuf;
    for (int a = 2; a < input_Buffer.count(); a += 2)
    {
        tmpBuf.append(input_Buffer[a]);
    }
    input_Buffer = tmpBuf;
}

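Won't this fail for UTF-16-BE, whose BOM is FE FF (or, for that matter, for UTF-8 with BOM)? FF FE is the
UTF-16-LE BOM, and even there the loop keeps only every second byte, so anything above U+00FF is lost.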
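
For reference, FF FE is the UTF-16 little-endian BOM, FE FF the big-endian one, and EF BB BF the UTF-8 one. A minimal sketch of telling them apart in plain C++ (no Qt) could look like the following. The names Bom, BomInfo, detectBom and stripBom are made up for this example and do not exist in Scribus; UTF-32 BOMs are deliberately ignored, and the bytes after the BOM are left untouched for a real decoder to handle.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical types for this sketch only.
enum class Bom { None, Utf8, Utf16LE, Utf16BE };

struct BomInfo {
    Bom kind;            // which BOM was found, if any
    std::size_t length;  // number of bytes the BOM occupies
};

// Inspect the first bytes of `buf` and report which BOM they form.
BomInfo detectBom(const std::string &buf)
{
    auto byteAt = [&buf](std::size_t i) {
        return static_cast<unsigned char>(buf[i]);
    };
    if (buf.size() >= 3 && byteAt(0) == 0xEF && byteAt(1) == 0xBB && byteAt(2) == 0xBF)
        return {Bom::Utf8, 3};
    if (buf.size() >= 2 && byteAt(0) == 0xFF && byteAt(1) == 0xFE)
        return {Bom::Utf16LE, 2};
    if (buf.size() >= 2 && byteAt(0) == 0xFE && byteAt(1) == 0xFF)
        return {Bom::Utf16BE, 2};
    return {Bom::None, 0};
}

// Return the buffer without its BOM; the payload bytes are not modified.
std::string stripBom(const std::string &buf)
{
    const BomInfo info = detectBom(buf);
    assert(info.length <= buf.size());
    return buf.substr(info.length);
}
```

Unlike the loop quoted above, this only skips the BOM and keeps both bytes of every UTF-16 code unit, so the remaining buffer can still be decoded with the matching codec.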

Cheers, RalfD
