| Commit message (Collapse) | Author | Age | Lines |
|
As noted in the Ruby docs (http://ruby-doc.org/core-1.9.3/String.html#method-i-encode),
any conversion from an encoding to the same encoding is a no-op, so convert it first to UTF-16.
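A minimal sketch of the round trip described above, not the code from this commit. The method name scrub_via_utf16 and the '?' replacement character are assumptions made for illustration, and UTF-16LE is used here (rather than plain UTF-16) only to keep a byte order mark out of the intermediate string.

```ruby
# Sketch only: replace invalid bytes in a string that is meant to be UTF-8.
# Encoding straight from UTF-8 to UTF-8 is documented as a no-op on the Ruby
# versions this message refers to, so the string is routed through UTF-16LE.
# `scrub_via_utf16` is a hypothetical name, not part of the codebase.
def scrub_via_utf16(text)
  text.encode('UTF-16LE', 'UTF-8', invalid: :replace, undef: :replace, replace: '?')
      .encode('UTF-8')
end
```

The second argument tells encode to read the bytes as UTF-8 regardless of how the string is currently tagged, so invalid sequences are caught on the way into UTF-16LE.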
|
In a database with encoding SQL_ASCII, an invalid UTF-8 filename
can be saved, but will cause an "invalid byte sequence in UTF-8"
error when the filename is prepared for display. In a database
with a UTF-8 encoding, saving the string will cause an error like
"ActiveRecord::StatementInvalid (PG::Error: ERROR: invalid byte
sequence for encoding "UTF8")".
|
Try likely conversions, but if they all fail, just replace the
characters that are invalid UTF-8.
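One way this "try, then replace" strategy could look is sketched below. The method name, the candidate list (Windows-1252 only) and the '?' replacement are assumptions made for illustration; the commit itself does not say which conversions it tries.

```ruby
# Sketch only: `filename_to_utf8` and the default candidate list are
# hypothetical, not taken from the commit.
def filename_to_utf8(raw, candidates = ['Windows-1252'])
  as_utf8 = raw.dup.force_encoding('UTF-8')
  return as_utf8 if as_utf8.valid_encoding?

  # Try each likely source encoding in turn.
  candidates.each do |name|
    begin
      return raw.dup.force_encoding(name).encode('UTF-8')
    rescue EncodingError, ArgumentError
      next # undefined byte in this encoding, or unknown encoding name
    end
  end

  # Last resort: keep the valid characters and replace the rest.
  as_utf8.chars.map { |c| c.valid_encoding? ? c : '?' }.join
end
```

Note that a candidate such as ISO-8859-1 maps every byte to a character, so it would never fall through to the replacement step; Windows-1252 leaves a few bytes undefined, which is why it can fail.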
|
This is important under ruby 1.9 in order to determine the
encoding that will be used for new strings created in the code in
the file.
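Assuming the change in question is a magic encoding comment at the top of each source file (the message does not show the exact form used), its effect can be illustrated roughly as follows. The method name is hypothetical.

```ruby
# -*- encoding : utf-8 -*-
# Sketch only: with the magic comment on the first line of the file, string
# literals written in this file are tagged as UTF-8 under Ruby 1.9.
# `new_literal_encoding` is a hypothetical helper used for illustration.
def new_literal_encoding
  ''.encoding
end
```

On Ruby 2.0 and later the default source encoding is already UTF-8, so the comment mainly matters on 1.9.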
|
From: http://ruby-doc.org/core-2.0/String.html#method-i-encode
Duck-types on whether the string responds to encode rather than
relying on RUBY_VERSION.
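A small sketch of the duck-typing check, assuming a helper along these lines; the method name and the replacement options are illustrative and not taken from the commit.

```ruby
# Sketch only: `utf8_if_encodable` is a hypothetical name.
# Instead of comparing RUBY_VERSION, ask the object whether it can encode.
def utf8_if_encodable(text)
  if text.respond_to?(:encode)
    text.encode('UTF-8', invalid: :replace, undef: :replace)
  else
    text # e.g. a Ruby 1.8 String, which has no #encode
  end
end
```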
|
This function is useful for investigating problems with
handling of emails, attachments and the related character
encoding issues. It can safely be removed later, but is
currently useful to have for debugging purposes.
|
Throughout the codebase it would be simplest and most consistent
if we could assume that all text/* attachments are represented
by UTF-8 strings. This was largely true with the TMail backend,
which ensured that all returned text parts were in UTF-8. We
have to change the replacement Mail backend to similarly attempt
to convert text parts to UTF-8. This commit introduces two
functions which are useful for this.

The normalize_string_to_utf8 function will try various
encodings, either suggested or guessed (with charlock_holmes),
to convert the passed string to UTF-8. If it can't find a
suitable encoding, it will raise an exception.

Unfortunately, the current behaviour of the site is that
uninterpretable text/* attachments are still passed around and
mangled to UTF-8 just before display. To mimic this it's also
useful to have the convert_string_to_utf8_or_binary function,
which tries to convert the string to UTF-8 with
normalize_string_to_utf8, but if that's not possible just
returns the original string. (In Ruby 1.9, the encoding will be
set to UTF-8 or ASCII-8BIT as appropriate.)
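
The relationship between the two functions might be sketched as below. This is not the code from the commit: the signatures, the optional suggested_encoding parameter and the EncodingNormalizationError class are assumptions, and the charlock_holmes guessing step is left out so the sketch has no gem dependency.

```ruby
# Sketch only. EncodingNormalizationError is a hypothetical error class.
class EncodingNormalizationError < StandardError; end

# Try the suggested encoding (if any), then UTF-8 itself. The function
# described in the commit would also try an encoding guessed by
# charlock_holmes; that step is omitted here.
def normalize_string_to_utf8(s, suggested_encoding = nil)
  [suggested_encoding, 'UTF-8'].compact.each do |name|
    begin
      candidate = s.dup.force_encoding(name)
      next unless candidate.valid_encoding?
      return candidate.encode('UTF-8')
    rescue ArgumentError, EncodingError
      next # unknown encoding name, or no conversion to UTF-8
    end
  end
  raise EncodingNormalizationError, 'no suitable encoding found'
end

# Fall back to the original bytes, tagged as binary, when normalization fails.
def convert_string_to_utf8_or_binary(s, suggested_encoding = nil)
  normalize_string_to_utf8(s, suggested_encoding)
rescue EncodingNormalizationError
  s.dup.force_encoding('ASCII-8BIT')
end
```

Callers that need to distinguish the two outcomes can check the encoding of the returned string: UTF-8 means the conversion succeeded, ASCII-8BIT means the bytes were passed through untouched.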
|