Discussion:
[mongodb-user] Cannot load data from mongodb to hdfs using hive
Siti Hajar Hanani
2018-11-28 08:09:25 UTC
Hi,

I'm having trouble setting up the mongo-hadoop connector. I am trying to load
data from MongoDB into Hadoop using Hive.

At first I got an error while trying to create a table using the command below:

add jar /usr/local/hadoop/lib/mongo-hadoop-core-2.0.2.jar;
add jar /usr/local/hadoop/lib/mongo-hadoop-hive-2.0.2.jar;
add jar /usr/local/hadoop/lib/mongo-java-driver-3.2.1.jar;
CREATE TABLE testing22_to_mongo (
id String,
age String,
gender String,
race String,
custState String,
purchaseDate String,
purchaseTime String,
foodname String,
restaurant String,
foodtype String,
quantity String,
totalPrice String,
orderType String,
rating String,
servingType String,
characteristic String,
restaurantType String,
restaurantState String,
priceRange String,
paymentMethod String,
tableBooking String,
onlineBooking String,
deliveryService String
)
ROW FORMAT SERDE 'com.mongodb.hadoop.hive.BSONSerDe' STORED BY
'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES ('mongo.columns.mapping'='{"id":"_id", "age":"age",
"gender":"gender", "race":"race", "custState":"custState",
"purchaseDate":"purchaseDate", "purchaseTime":"purchaseTime",
"foodname":"foodname", "restaurant":"restaurant", "foodtype":"foodtype",
"quantity":"quantity", "totalPrice":"totalPrice", "orderType":"orderType",
"rating":"rating", "servingType":"servingType",
"characteristic":"characteristic", "restaurantType":"restaurantType",
"restaurantState":"restaurantState", "priceRange":"priceRange",
"paymentMethod":"paymentMethod", "tableBooking":"tableBooking",
"onlineBooking":"onlineBooking", "deliveryService":"deliveryService"}')
TBLPROPERTIES('mongo.uri'='mongodb://localhost:27017/database.dataset');
This gave me the following error:

[image: mongohive.PNG]

I then tried adding a few other jars:

add jar /usr/local/hadoop/lib/hive-serde-1.2.1.jar;
add jar /usr/local/Hive-JSON-Serde/json-serde/target/json-serde-1.3.9-SNAPSHOT-jar-with-dependencies.jar;

I also changed the row format to ROW FORMAT SERDE 'org.openx.data.jsonserde.JsonSerDe'.
The error went away and the table was created successfully. However, the
table is empty. When I run SELECT * FROM testing22_to_mongo; I get this
error:

[image: Screenshot from 2018-11-28 15-51-34.png]



I am using:

java version "1.8.0_191"
Java(TM) SE Runtime Environment (build 1.8.0_191-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.191-b12, mixed mode)

MongoDB shell version v4.0.4

Hive 2.3.3

I'm not sure if this is a jar compatibility issue or something else. If it
is, how do I find the compatible jar versions? I'm a newbie in all of this. I
just want to load the data from MongoDB into Hadoop. If there is any other
option or any other way to do it, I would love to try it.

Thank you.
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/35146fc6-9618-4ebc-bc61-60fd9e644f09%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Bob Cochran
2018-11-28 21:20:04 UTC
I’m not a user of Hadoop but I think you probably need to populate your table with data after you create the table. There may be a further action needed to populate the table.

Thanks

Bob
Robert Cochran
2018-11-29 00:44:34 UTC
Hi,

First -- I accidentally posted this as a reply to a quite different topic
from poster "davidmo". I apologize for the mistake.

I was curious about this, and started looking at Hadoop and MongoDB's
Hadoop Connector. Then I looked again at your email and noticed that the Java
driver version you quote -- 3.2.1 -- does not support MongoDB server
version 4.x. I suspect you need to upgrade your MongoDB Java driver version
-- try the 3.9.1 version.

As far as I can tell, the MongoDB Hadoop connector available on GitHub has not
had an active commit since 2017. That was a while ago! The connector itself
might need updating or correction in order for it to work with MongoDB
4.x.

With that said, you don't mention whether you already have a collection
containing the data of interest to you. From the code example you show --
it appears that you do have a real MongoDB collection that you can work
with and which is populated with data. So perhaps the true issue is that
the Java driver is too old to support MongoDB 4.x, and the Hadoop connector
might be too old as well.

One last suggestion. I think your source data might be too complicated when
you are just starting out. Why not start with a very simple MongoDB
collection, until you have a working Hadoop/Java driver/Hive
implementation? Like so:

MongoDB Enterprise > use hadoop
switched to db hadoop
MongoDB Enterprise > db.tc.insert( { "a" : NumberInt("1") } )
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > db.tc.insert( { "a" : NumberInt("3") } )
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > db.tc.insert( { "a" : NumberInt("4") } )
WriteResult({ "nInserted" : 1 })
MongoDB Enterprise > db.tc.find({})
{ "_id" : ObjectId("5bff32b51b4428a5aaedf37a"), "a" : 1 }
{ "_id" : ObjectId("5bff32c71b4428a5aaedf37b"), "a" : 3 }
{ "_id" : ObjectId("5bff32ce1b4428a5aaedf37c"), "a" : 4 }


So with the above, you get 3 nice simple documents to start with, and you
know exactly what they should look like. That will help you get your Hadoop
implementation going.
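
To connect the simple collection back to Hive, a minimal and untested sketch
of a table over hadoop.tc might look like the following. The table name
tc_from_mongo and the column types are assumptions made for illustration; the
host and port are copied from the mongo.uri in the original post. The
statement uses STORED BY without a separate ROW FORMAT SERDE clause, which is
the form the mongo-hadoop Hive usage notes appear to show, and EXTERNAL so
that dropping the Hive table should leave the MongoDB collection in place.

```sql
-- Assumes the mongo-hadoop-core, mongo-hadoop-hive and mongo-java-driver jars
-- have already been added with ADD JAR, as in the original post.
-- Table name and column types are illustrative, not from the thread.
-- id is expected to carry the ObjectId from _id as a string; a is the NumberInt field.
CREATE EXTERNAL TABLE tc_from_mongo (
  id STRING,
  a INT
)
STORED BY 'com.mongodb.hadoop.hive.MongoStorageHandler'
WITH SERDEPROPERTIES ('mongo.columns.mapping'='{"id":"_id", "a":"a"}')
TBLPROPERTIES ('mongo.uri'='mongodb://localhost:27017/hadoop.tc');

-- If the connector and driver versions are compatible, this should list the
-- three documents inserted above.
SELECT * FROM tc_from_mongo;
```

If this small table also fails, the problem is more likely in the jar versions
or the Hive setup than in the column mapping of the larger table.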


Thanks so much

Bob
Siti Hajar Hanani
2018-11-29 04:31:23 UTC
Hi,
Thank you so much for your reply. The collection that I have is only sample
data. I have already tried your suggestions: I changed the Java driver jar to
version 3.9.1 and used a simple document. However, I still get the same
error, and now I'm not really sure where the problem is. I tried to find a
newer version of the other jars, but I could only find v2.0.2.
Thank you.