I have created an Avro-backed table, but when I fire the query select * from table_name; it returns neither an error nor any output. When I try select column_name from table_name; I get the error: FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
I added all the required jars to the Hive aux jars path and set it in my .bashrc:
export HIVE_AUX_JARS_PATH=/home/user1/Desktop/Manoj/Avro_Lib/avro-1.5.1.jar,/home/user1/Desktop/Manoj/Avro_Lib/AvroLibJarjar.jar,/home/user1/Desktop/Manoj/Avro_Lib/avro-mapred-1.5.1.jar,/home/user1/Desktop/Manoj/Avro_Lib/paranamer-2.3.jar,/home/user1/Desktop/Manoj/Avro_Lib/haivvreo.jar,/home/user1/Desktop/Manoj/Avro_Lib/snappy-java-1.0.4.1.jar,/home/user1/Desktop/Manoj/Avro_Lib/avro-tools-1.5.4.jar
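As a way to rule out classpath problems (a sketch, not a confirmed fix), the same jars can also be registered per session from the Hive CLI, and the LIST JARS command should confirm they were actually picked up:

```sql
-- Alternative to HIVE_AUX_JARS_PATH: register the jars per session
ADD JAR /home/user1/Desktop/Manoj/Avro_Lib/haivvreo.jar;
ADD JAR /home/user1/Desktop/Manoj/Avro_Lib/avro-1.5.1.jar;
ADD JAR /home/user1/Desktop/Manoj/Avro_Lib/avro-mapred-1.5.1.jar;
-- Show what the session actually loaded
LIST JARS;
```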
I added this aux jars path to the .bashrc on all task nodes and placed these jars on those machines.
The Avro schema I am passing:
{
  "type": "record",
  "name": "Pair",
  "namespace": "org.apache.avro.mapred",
  "fields": [
    {
      "name": "key",
      "type": "string",
      "doc": ""
    }
  ]
}
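Before suspecting the SerDe, it can help to confirm that the .avsc file is at least valid JSON declaring the expected single field. A minimal sketch using only Python's standard library (the schema text is copied from the post above):

```python
import json

# The same schema stored at EmpDetailsSchema.avsc (copied from this post).
schema_text = """
{
  "type": "record",
  "name": "Pair",
  "namespace": "org.apache.avro.mapred",
  "fields": [
    {"name": "key", "type": "string", "doc": ""}
  ]
}
"""

schema = json.loads(schema_text)

# A record schema needs "type", "name", and a "fields" list.
assert schema["type"] == "record"
field_names = [f["name"] for f in schema["fields"]]
print(field_names)  # ['key']
```

If json.loads raises an error here, Haivvreo would also fail to read the schema from HDFS.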
The process I followed to create and load the data:
[user1@bhk123456d ~]$ hive
Hive history file=/tmp/user1/hive_job_log_user1_201206071413_79116159.txt
hive> SET hive.exec.compress.output=true;
hive> SET avro.output.codec=snappy;
hive> drop table EmpDetails;
OK
Time taken: 2.312 seconds
hive> CREATE TABLE EmpDetails COMMENT 'How many datatypes can we stuff into one table' PARTITIONED BY (ds String) ROW FORMAT SERDE 'com.linkedin.haivvreo.AvroSerDe' WITH SERDEPROPERTIES ('schema.url'='hdfs://bhk123456d:51400/user/user1/development/Manoj/EmpDetailsSchema.avsc') STORED AS INPUTFORMAT 'com.linkedin.haivvreo.AvroContainerInputFormat' OUTPUTFORMAT 'com.linkedin.haivvreo.AvroContainerOutputFormat';
OK
Time taken: 0.156 seconds
hive> describe EmpDetails;
OK
key string from deserializer
ds string
Time taken: 0.2 seconds
hive> load data inpath 'hdfs://bhk123456d:51400/user/user1/development/Manoj/EmpDetails.avro' into table EmpDetails;
Loading data to table default.empdetails
OK
Time taken: 0.375 seconds
hive> select * from EmpDetails;
OK
Time taken: 0.156 seconds
hive> select key from empdetails;
Total MapReduce jobs = 1
Launching Job 1 out of 1
Number of reduce tasks is set to 0 since there's no reduce operator
java.io.IOException: com.linkedin.haivvreo.HaivvreoException: Neither schema.literal nor schema.url specified, can't determine table schema
at com.linkedin.haivvreo.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:55)
at org.apache.hadoop.hive.ql.exec.ExecDriver.addInputPath(ExecDriver.java:1219)
at org.apache.hadoop.hive.ql.exec.ExecDriver.addInputPaths(ExecDriver.java:1276)
at org.apache.hadoop.hive.ql.exec.ExecDriver.execute(ExecDriver.java:632)
at org.apache.hadoop.hive.ql.exec.MapRedTask.execute(MapRedTask.java:123)
at org.apache.hadoop.hive.ql.exec.Task.executeTask(Task.java:130)
at org.apache.hadoop.hive.ql.exec.TaskRunner.runSequential(TaskRunner.java:57)
at org.apache.hadoop.hive.ql.Driver.launchTask(Driver.java:1063)
at org.apache.hadoop.hive.ql.Driver.execute(Driver.java:900)
at org.apache.hadoop.hive.ql.Driver.run(Driver.java:748)
at org.apache.hadoop.hive.cli.CliDriver.processCmd(CliDriver.java:164)
at org.apache.hadoop.hive.cli.CliDriver.processLine(CliDriver.java:241)
at org.apache.hadoop.hive.cli.CliDriver.main(CliDriver.java:456)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.hadoop.util.RunJar.main(RunJar.java:186)
Caused by: com.linkedin.haivvreo.HaivvreoException: Neither schema.literal nor schema.url specified, can't determine table schema
at com.linkedin.haivvreo.HaivvreoUtils.determineSchemaOrThrowException(HaivvreoUtils.java:55)
at com.linkedin.haivvreo.AvroContainerOutputFormat.getHiveRecordWriter(AvroContainerOutputFormat.java:53)
... 17 more
Job Submission failed with exception 'java.io.IOException(com.linkedin.haivvreo.HaivvreoException: Neither schema.literal nor schema.url specified, can't determine table schema)'
FAILED: Execution Error, return code 1 from org.apache.hadoop.hive.ql.exec.MapRedTask
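One thing worth checking (a guess based on the transcript, not a confirmed diagnosis): the table is partitioned by ds, but the LOAD DATA statement above names no partition, so the file may not be registered under any partition, which could explain why select * returns zero rows. Loading into an explicit partition and filtering on it would look roughly like this (the ds value '2012-06-07' is a hypothetical placeholder):

```sql
-- The ds value below is a made-up example; use whatever partition your data belongs to
LOAD DATA INPATH 'hdfs://bhk123456d:51400/user/user1/development/Manoj/EmpDetails.avro'
INTO TABLE EmpDetails PARTITION (ds = '2012-06-07');

-- Filtering on an existing partition avoids pruning a non-existent one on Hive 0.7
SELECT key FROM EmpDetails WHERE ds = '2012-06-07';
```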
One solution I found is:
Why do I get a java.io.IOException: com.linkedin.haivvreo.HaivvreoException: Neither schema.literal nor schema.url specified, can't determine table schema when pruning by a partition that doesn't exist?
Hive creates a temporary empty file for non-existent partitions so that queries referencing them succeed (returning a count of zero rows). However, when doing so, it doesn't pass the correct information to the RecordWriter, leaving Haivvreo unable to construct one. This problem has been corrected in Hive 0.8. With earlier versions of Hive, either be sure to filter only on existing partitions or apply HIVE-2260.
But that applies when pruning by a partition that does not exist. Can it solve my issue? Is this problem not fixed in CDH3U3 (Hive 0.7)?
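Since the error message says neither schema.literal nor schema.url is available at job-submission time, one workaround sometimes suggested for Haivvreo is to embed the schema inline with schema.literal instead of schema.url, so no HDFS lookup is needed. A hedged sketch of the same CREATE TABLE using schema.literal (not a confirmed fix for this bug):

```sql
CREATE TABLE EmpDetails
PARTITIONED BY (ds STRING)
ROW FORMAT SERDE 'com.linkedin.haivvreo.AvroSerDe'
WITH SERDEPROPERTIES (
  'schema.literal' = '{"type":"record","name":"Pair","namespace":"org.apache.avro.mapred","fields":[{"name":"key","type":"string","doc":""}]}'
)
STORED AS
  INPUTFORMAT 'com.linkedin.haivvreo.AvroContainerInputFormat'
  OUTPUTFORMAT 'com.linkedin.haivvreo.AvroContainerOutputFormat';
```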
Please help me with a solution to this issue.
Thanks,
Manoj.
Mail : [email protected]
Mobile : +91 9658222732