On 6/26/16 1:35 PM, René Rhéaume wrote:
> Currently, dev-libs/link-grammar fails its test suite on uClibc. In
> tests/test-suite.log, the text "link-grammar: Error: Affix dictionary:
> QUOTES: Invalid utf8 character" is found. By looking up the "Invalid
> utf8 character" message in the link-grammar source code, I found out
> it's the call to mbsrtowcs that fails. I tried to check in the uClibc
> sources how that function can be configured, and from the documents
> inside the uClibc sources, I learned ctype and wchar support is a mess.
>
> Because link-grammar loads from UTF-8, and UTF-8 can be
> translated to wide character strings using bit masks and bit shifts
> (no big fat table needed), I wrote my own implementation of
> mbsrtowcs for UTF-8, based on the manual pages for mbsrtowcs and
> mbrtowc and the Wikipedia article on UTF-8.
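
For reference, the mask-and-shift decoding described above could look
roughly like the sketch below. This is not the attached implementation,
only an illustration of the technique for a single code point. The name
utf8_decode_one and its return convention (bytes consumed, or 0 on bad
input) are made up for this example and are not part of uClibc or
link-grammar. It assumes the result is stored in a 32-bit integer; whether
wchar_t is wide enough for that on a given target is a separate question.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decode one UTF-8 sequence from s (at most n bytes available) into *out.
 * Returns the number of bytes consumed (1 to 4), or 0 if the input is
 * empty, truncated, or not valid UTF-8.
 * utf8_decode_one is a hypothetical name used only for this sketch.
 */
size_t utf8_decode_one(const unsigned char *s, size_t n, uint32_t *out)
{
    /* Smallest code point that legitimately needs a sequence of each length. */
    static const uint32_t min_cp[5] = { 0, 0, 0x80, 0x800, 0x10000 };
    uint32_t cp;
    size_t len, i;

    assert(s != NULL && out != NULL);
    if (n == 0)
        return 0;

    /* The lead byte's high bits give the length; the rest are payload bits. */
    if (s[0] < 0x80)                { len = 1; cp = s[0]; }
    else if ((s[0] & 0xE0) == 0xC0) { len = 2; cp = s[0] & 0x1F; }
    else if ((s[0] & 0xF0) == 0xE0) { len = 3; cp = s[0] & 0x0F; }
    else if ((s[0] & 0xF8) == 0xF0) { len = 4; cp = s[0] & 0x07; }
    else
        return 0;                   /* stray continuation or invalid lead byte */

    if (n < len)
        return 0;                   /* truncated sequence */

    for (i = 1; i < len; i++) {
        if ((s[i] & 0xC0) != 0x80)
            return 0;               /* not a 10xxxxxx continuation byte */
        cp = (cp << 6) | (s[i] & 0x3F);
    }

    /* Reject overlong forms, UTF-16 surrogates, and values past U+10FFFF. */
    if (cp < min_cp[len] || (cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
        return 0;

    *out = cp;
    return len;
}
```

A string converter in the style of mbsrtowcs would call this in a loop,
advancing the source pointer by the returned length and stopping at the
terminating NUL or at the first 0 return. The validity checks at the end
are the part most easily forgotten in a hand-rolled decoder.
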
>
> But before integrating it into link-grammar or somewhere else, I would
> like a code review of it. The attached source code is MIT-licensed, so I
> can put it in any open source project I want without worrying about
> license issues, and so can you.
>
|
Why not make a patch against uClibc and try to get it upstream?
|
--
Anthony G. Basile, Ph. D.
Chair of Information Technology
D'Youville College
Buffalo, NY 14201
(716) 829-8197