← Back to team overview

trafodion-development team mailing list archive

Re: An invalid character value encountered in TRANSLATE function

 

Hi Radu,

IMHO that brings up three questions:

   1. How can we handle error rows in a LOAD operation, can we have an
   error file instead of error messages like this.
   2. Shouldn't we log the actual value that's in error? In part we don't
   do that for security reasons, because it could print user data in the log
   messages. It would be good to have a discussion how we can have both
   security and better error messages.
   3. How to set up the right character sets for Hive and native Trafodion
   tables.


About question #3: Trafodion assumes by default that Hive tables have UTF-8
data in them, this is controlled by the HIVE_DEFAULT_CHARSET default. The
default character set for your SNA_PERSONS table, created in Trafodion is
ISO 8859-1.

If the HDFS data really contains UTF-8, then you could create the Trafodion
table like this:

*  create TABLE sna_persons (*
*    id NUMERIC not null,*
*    last_name varchar(70 bytes) character set utf8,*
*    first_name varchar(30** bytes**)** character set utf8**,*
*    middle_name varchar(30** bytes**)** character set utf8**,*
*    sex char(1),*
*    fraud_flag char(1),*
*    party_id varchar(24** bytes**)** character set utf8**,*
*    primary key (id)*
*)*
*salt using 8 partitions on (id);*


Note that you may need to increase the byte length somewhat, if necessary.

On the other hand, if the HDFS file contains ISO 8859-1 character data, you
could try this:

cqd HIVE_DEFAULT_CHARSET 'ISO88591';
load into TRAFODION.SEABASE.SNA_PERSONS select * from hive.hive.sna_persons;


I hope one of these solutions will work for you,

Hans

Hans

On Fri, Jul 17, 2015 at 8:24 AM, Radu Marias <radumarias@xxxxxxxxx> wrote:

> Hi,
>
> Using Trafodion Sandbox 1.1 with VirtualBox I get this error while loading
> data in a table. The scenario is:
>
> *>>create TABLE sna_persons (*
> *    id NUMERIC not null,*
> *    last_name varchar(70),*
> *    first_name varchar(30),*
> *    middle_name varchar(30),*
> *    sex char(1),*
> *    fraud_flag char(1),*
> *    party_id varchar(24),*
> *    primary key (id)*
> *)*
> *salt using 8 partitions on (id);*
>
> *>>create index sna_persons_i1 on sna_persons(first_name)*
> *salt like table;*
>
> *>>create index sna_persons_i2 on sna_persons(last_name)*
> *salt like table;*
>
> *>>create index sna_persons_i3 on sna_persons(party_id) salt like table;*
>
> *>>load into TRAFODION.SEABASE.SNA_PERSONS select * from
> hive.hive.sna_persons;*
> *Task: LOAD             Status: Started    Object:
> TRAFODION.SEABASE.SNA_PERSONS*
> *Task:  CLEANUP         Status: Started    Object:
> TRAFODION.SEABASE.SNA_PERSONS*
> *Task:  CLEANUP         Status: Ended      Object:
> TRAFODION.SEABASE.SNA_PERSONS*
> *Task:  DISABLE INDEXE  Status: Started    Object:
> TRAFODION.SEABASE.SNA_PERSONS*
> *Task:  DISABLE INDEXE  Status: Ended      Object:
> TRAFODION.SEABASE.SNA_PERSONS*
> *Task:  PREPARATION     Status: Started    Object:
> TRAFODION.SEABASE.SNA_PERSONS*
>
> **** ERROR[8690] An invalid character value encountered in TRANSLATE
> function.*
>
> If I skip the LAST_NAME column it works. I cannot see anything specific in
> logs, how can I determine which value is invalid? I have 1 million rows in
> table, will not be easy to isolate it ;)
>
> --
> And in the end, it's not the years in your life that count. It's the life
> in your years.
>
> --
> Mailing list: https://launchpad.net/~trafodion-development
> Post to     : trafodion-development@xxxxxxxxxxxxxxxxxxx
> Unsubscribe : https://launchpad.net/~trafodion-development
> More help   : https://help.launchpad.net/ListHelp
>
>

Follow ups

References