José Romildo Malaquias
2012-08-18 15:16:43 UTC
Hello.
It seems that regex-pcre has a bug when dealing with UTF-8:
Prelude> :m + Text.Regex.PCRE
Prelude Text.Regex.PCRE> "país:Brasil" =~ "país:(.*)" :: (String,String,String,[String])
("","pa\237s:Brasil","",["rasil"])
Notice that the 'B' is missing from the captured group: it is "rasil" instead of "Brasil".
With regex-posix this does not happen:
Prelude> :m + Text.Regex.Posix
Prelude Text.Regex.Posix> "país:Brasil" =~ "país:(.*)" :: (String,String,String,[String])
("","pa\237s:Brasil","",["Brasil"])
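One possible explanation, offered only as a guess and not as a confirmed diagnosis of regex-pcre: the capture would start exactly one position late if the match offsets were counted in UTF-8 bytes and then applied to the String as character indices, since 'í' takes two bytes in UTF-8. The sketch below uses only base and does not call regex-pcre at all; `utf8Width` and `byteOffset` are hypothetical helper names introduced for illustration.

```haskell
import Data.Char (ord)

-- Number of bytes a character occupies when encoded as UTF-8.
utf8Width :: Char -> Int
utf8Width c
  | n < 0x80    = 1
  | n < 0x800   = 2
  | n < 0x10000 = 3
  | otherwise   = 4
  where n = ord c

-- Byte offset, in the UTF-8 encoding of s, of the character at index i.
byteOffset :: String -> Int -> Int
byteOffset s i = sum (map utf8Width (take i s))

main :: IO ()
main = do
  let subject = "país:Brasil"
      charIdx = 5                           -- 'B' counted in characters
      byteIdx = byteOffset subject charIdx  -- 'B' counted in UTF-8 bytes
  print (charIdx, byteIdx)
  -- Slicing with the character index keeps the whole group.
  putStrLn (drop charIdx subject)
  -- Slicing with the byte index, assumed here to be what goes wrong,
  -- skips one character too many.
  putStrLn (drop byteIdx subject)
```

If this guess is right, a subject without non-ASCII characters before the group should match correctly, which would be a quick way to check it.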
I hope this bug can be fixed soon.
Is there a bug tracker where I can report it? If so, where is it?
Romildo