Gentoo Archives: gentoo-embedded

From: "Anthony G. Basile" <basile@××××××××××××××.edu>
To: gentoo-embedded@l.g.o
Subject: Re: [gentoo-embedded] dev-libs/link-grammar and uclibc mbsrtowcs
Date: Fri, 01 Jul 2016 21:12:38
Message-Id: 03685070-7512-be36-0dbe-f5012323b65a@opensource.dyc.edu
In Reply to: [gentoo-embedded] dev-libs/link-grammar and uclibc mbsrtowcs by "René Rhéaume"
On 6/26/16 1:35 PM, René Rhéaume wrote:
> Currently, dev-libs/link-grammar fails its test suite on uclibc. In
> tests/test-suite.log, the text "link-grammar: Error: Affix dictionary:
> QUOTES: Invalid utf8 character" is found. By looking up the "Invalid
> utf8 character" message in the link-grammar source code, I found out
> it's the call to mbsrtowcs that fails. I tried to check in the uClibc
> sources how that function can be configured, and from the documents
> inside the uClibc sources, I learned that ctype and wchar support is a mess.
>
> Because link-grammar loads from UTF-8, and that UTF-8 can be
> translated to wide character strings using bit masks and bit shifts
> (no big fat table needed), I made up my own implementation of
> mbsrtowcs for UTF-8, reading the manual pages for mbsrtowcs and
> mbrtowc and the Wikipedia article on UTF-8.
>
> But before integrating it into link-grammar or somewhere else, I would
> like a code review on it. The attached source code is MIT-licensed, so I
> can put it in any open source project I want without worrying about
> license issues, and so can you.
>

Why not make a patch against uclibc and try to get it upstream?
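
For readers following the thread, the bit-mask-and-shift decoding described
in the quoted message can be sketched roughly as below. This is not the
attached code and not anything from uClibc; the function name
utf8_decode_one and its "return 0 on error" convention are assumptions made
purely for illustration. A real mbsrtowcs replacement would also need to
handle the mbstate_t argument and the NULL-destination case.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (illustration only): decode one UTF-8 sequence
 * from s[0..n) into *cp. Returns the number of bytes consumed, or 0 if
 * the input is empty, truncated, or not valid UTF-8. */
static size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *cp)
{
    size_t len, i;
    uint32_t c, min;

    if (n == 0)
        return 0;

    if (s[0] < 0x80) {                      /* 0xxxxxxx: plain ASCII */
        *cp = s[0];
        return 1;
    } else if ((s[0] & 0xE0) == 0xC0) {     /* 110xxxxx: 2-byte sequence */
        len = 2; c = s[0] & 0x1F; min = 0x80;
    } else if ((s[0] & 0xF0) == 0xE0) {     /* 1110xxxx: 3-byte sequence */
        len = 3; c = s[0] & 0x0F; min = 0x800;
    } else if ((s[0] & 0xF8) == 0xF0) {     /* 11110xxx: 4-byte sequence */
        len = 4; c = s[0] & 0x07; min = 0x10000;
    } else {
        return 0;   /* stray continuation byte or invalid lead byte */
    }

    if (n < len)
        return 0;   /* sequence is cut off */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)          /* must look like 10xxxxxx */
            return 0;
        c = (c << 6) | (s[i] & 0x3F);       /* append 6 payload bits */
    }

    /* Reject overlong encodings, UTF-16 surrogates, and values past U+10FFFF. */
    if (c < min || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF))
        return 0;

    *cp = c;
    return len;
}
```

A caller would loop over the input, advancing by the returned byte count and
treating a return of 0 as the "Invalid utf8 character" case.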

--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197