Discussion:
hadoop cluster for querying data on mongodb
Martinus Martinus
2011-12-21 03:29:55 UTC
Hi,

I have a Hadoop cluster running and my data is stored in a MongoDB database. I
have already written Java code to query the data using the MongoDB Java
driver. Now I want to use the Hadoop cluster to run my Java code to
read data from and write data to MongoDB. Has anyone done this
before? Can you explain how to do it?

Thanks.
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To post to this group, send email to mongodb-user-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to mongodb-user+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
Eliot Horowitz
2011-12-21 06:08:34 UTC
Take a look at:
https://github.com/mongodb/mongo-hadoop

Martinus Martinus
2011-12-21 06:20:42 UTC
Hi Eliot,

I tried to build the jar file from the core folder inside it, but it
gave me a source-5 error, so I added this to the pom.xml file below
the </resources> :

<plugins>
  <plugin>
    <groupId>org.apache.maven.plugins</groupId>
    <artifactId>maven-compiler-plugin</artifactId>
    <configuration>
      <source>1.5</source>
      <target>1.5</target>
    </configuration>
  </plugin>
</plugins>

and now it can be built using mvn package. It gave me
mongo-hadoop-core-1.0-SNAPSHOT.jar in the target folder, but I still don't
know how to use this library along with MongoDB inside Eclipse.

Thanks.
Martinus Martinus
2011-12-26 04:46:12 UTC
Hi Eliot,

I tried to use the mongo-hadoop plugin with hadoop-0.20.2. Do I need to
add all of the Hadoop libraries as external libraries? Also, when I tried to run the
WordCount.java program in Eclipse, it gave me this error:

Conf: Configuration: core-default.xml, core-site.xml
11/12/26 12:42:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with processName=JobTracker, sessionId=
11/12/26 12:42:46 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
11/12/26 12:42:46 WARN mapred.JobClient: No job jar file set. User classes may not be found. See JobConf(Class) or JobConf#setJar(String).
11/12/26 12:42:58 INFO util.MongoSplitter: Calculate Splits Code ... Use Shards? false, Use Chunks? true; Collection Sharded? false
11/12/26 12:42:58 INFO util.MongoSplitter: Creation of Input Splits is enabled.
11/12/26 12:42:58 INFO util.MongoSplitter: Using Unsharded Split mode (Calculating multiple splits though)
11/12/26 12:42:58 INFO util.MongoSplitter: Calculating unsharded input splits on namespace 'test.in' with Split Key '{ "_id" : 1}' and a split size of '8'mb per
Exception in thread "main" java.lang.IllegalArgumentException: Unable to calculate input splits: ns not found
    at com.mongodb.hadoop.util.MongoSplitter.calculateUnshardedSplits(MongoSplitter.java:106)
    at com.mongodb.hadoop.util.MongoSplitter.calculateSplits(MongoSplitter.java:75)
    at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:51)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at WordCount.main(WordCount.java:76)

Would you be so kind as to tell me how to fix this problem?

Thanks.
Martinus Martinus
2011-12-26 09:02:03 UTC
Hi Eliot,

I found where the problem is: I hadn't created the "in" collection when I ran
the program, so it gave me the above error.
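
The namespace in the error, 'test.in', names the database "test" and the collection "in"; "ns not found" means that collection did not exist yet. The following is only a minimal sketch of how such a namespace breaks down, so that both parts can be checked before a job is submitted. It assumes the input location is written as a connection string of the form mongodb://host/database.collection; that form, the class name MongoNamespace, and its methods are assumptions for illustration and are not part of mongo-hadoop or the MongoDB Java driver.

```java
import java.net.URI;

public class MongoNamespace {

    /** Extracts "database.collection" from a URI such as mongodb://host/test.in (assumed form). */
    static String namespaceOf(String mongoUri) {
        URI uri = URI.create(mongoUri);
        String path = uri.getPath();                 // e.g. "/test.in"
        if (path == null || path.length() < 2) {
            throw new IllegalArgumentException("No database.collection in: " + mongoUri);
        }
        String ns = path.substring(1);               // drop the leading '/'
        int dot = ns.indexOf('.');
        if (dot <= 0 || dot == ns.length() - 1) {
            throw new IllegalArgumentException("Expected database.collection, got: " + ns);
        }
        return ns;
    }

    /** The database is everything before the first dot. */
    static String databaseOf(String ns) {
        return ns.substring(0, ns.indexOf('.'));
    }

    /** The collection is everything after the first dot (collection names may contain dots). */
    static String collectionOf(String ns) {
        return ns.substring(ns.indexOf('.') + 1);
    }

    public static void main(String[] args) {
        String ns = namespaceOf("mongodb://localhost:27017/test.in");
        System.out.println("namespace  = " + ns);
        System.out.println("database   = " + databaseOf(ns));
        System.out.println("collection = " + collectionOf(ns));
    }
}
```

With the namespace from the log, this reports database "test" and collection "in", which is the collection that has to exist (and contain documents) before the splits can be calculated.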

Thanks.

Merry Christmas.

Martinus Martinus
2012-01-02 09:44:46 UTC
Hi,

Besides Hadoop, is there a better way to run map/reduce in parallel on many machines to
query data from and write data to a MongoDB database?

Thanks and Happy New Year 2012.
Scott Hernandez
2012-01-02 12:36:32 UTC
Do you mean other than using this? https://github.com/mongodb/mongo-hadoop

It sounded like you got it working once you set the correct params.
Martinus Martinus
2012-01-02 16:37:43 UTC
Hi Scott,

Yes, I did try the example, but I'm wondering if there is other
distributed computing software like Hadoop that can be used to do
map/reduce with MongoDB and is faster.

Thanks and Happy New Year 2012.
Scott Hernandez
2012-01-02 16:53:14 UTC
Conceptually there are many, and people have even written their own
map/reduce (or batch processing) frameworks for MongoDB, but in
practice there are only a few established patterns and well-used
systems. There is the built-in map/reduce system in MongoDB, and there
are external extensions, like the Hadoop stuff. Those are about all I
know of that are actively being used in the open source community.

Here is an example of someone rolling their own, as I mentioned above:
http://sourceforge.net/p/zarkov/blog/2011/07/zarkov-is-a-lightweight-map-reduce-framework/

The (possible) advantage of taking the map/reduce jobs off MongoDB
is that you can (dynamically) scale CPU/processing and use richer
programming logic/libraries in an external system.
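
For readers new to the pattern, here is a conceptual sketch of the two phases that any of these systems run, using the same word count as the WordCount example in this thread. It runs in a single JVM on Java streams and does not use the Hadoop or MongoDB APIs; the class name MiniWordCount and the sample documents are made up for illustration. An external framework distributes the same two steps across machines.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;
import java.util.stream.Collectors;

public class MiniWordCount {

    /** Counts words across documents: a "map" step emits words, a "reduce" step sums per word. */
    static Map<String, Integer> wordCount(List<String> documents) {
        return documents.parallelStream()
                // map phase: split every document into lower-cased words
                .flatMap(doc -> Arrays.stream(doc.toLowerCase().split("\\s+")))
                .filter(word -> !word.isEmpty())
                // reduce phase: group identical words and add 1 for each occurrence
                .collect(Collectors.groupingBy(
                        word -> word,
                        TreeMap::new,
                        Collectors.summingInt(word -> 1)));
    }

    public static void main(String[] args) {
        List<String> documents = Arrays.asList("hello world", "Hello mongo");
        System.out.println(wordCount(documents)); // prints {hello=2, mongo=1, world=1}
    }
}
```

The parallel stream only spreads the work over local CPU cores. Scaling beyond one machine, and reading from or writing to MongoDB, is what the external systems mentioned above add.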
Post by Martinus Martinus
Hi Scott,
Yes, I did tried the example, but I'm wondering if there is another
distributed computing software like hadoop that can be used to do map/reduce
with MongoDB and faster.
Thanks and Happy New Year 2012.
Post by Scott Hernandez
Do you mean other than using this? https://github.com/mongodb/mongo-hadoop
It sounded like you got it working once you set the correct params.
Post by Martinus Martinus
Hi,
Is there any better way to do map/reduce in parallel on many machines to
query and put data inside mongodb database besides using hadoop?
Thanks and Happy New Year 2012.
On Mon, Dec 26, 2011 at 5:02 PM, Martinus Martinus
Post by Martinus Martinus
Hi Eliot,
I knew where the problem is : I haven't made the "in" collection when I
run the program, so it gave me above error.
Thanks.
Merry Christmas.
On Mon, Dec 26, 2011 at 12:46 PM, Martinus Martinus
Post by Martinus Martinus
Hi Eliot,
I tried to used hadoop-mongo plugin using hadoop-0.20.2 and do I need to
add external library for all of hadoop library? and when I tried to run the
Conf: Configuration: core-default.xml, core-site.xml
11/12/26 12:42:46 INFO jvm.JvmMetrics: Initializing JVM Metrics with
processName=JobTracker, sessionId=
11/12/26 12:42:46 WARN mapred.JobClient: Use GenericOptionsParser for
parsing the arguments. Applications should implement Tool for the same.
11/12/26 12:42:46 WARN mapred.JobClient: No job jar file set.  User
classes may not be found. See JobConf(Class) or
JobConf#setJar(String).
11/12/26 12:42:58 INFO util.MongoSplitter:  Calculate Splits Code ... Use
Shards? false, Use Chunks? true; Collection Sharded? false
11/12/26 12:42:58 INFO util.MongoSplitter: Creation of Input Splits is
enabled.
11/12/26 12:42:58 INFO util.MongoSplitter: Using Unsharded Split mode
(Calculating multiple splits though)
11/12/26 12:42:58 INFO util.MongoSplitter: Calculating unsharded input
splits on namespace 'test.in' with Split Key '{ "_id" : 1}' and a split size
of '8'mb per
Exception in thread "main" java.lang.IllegalArgumentException: Unable to
calculate input splits: ns not found
    at
com.mongodb.hadoop.util.MongoSplitter.calculateUnshardedSplits(MongoSplitter.java:106)
    at
com.mongodb.hadoop.util.MongoSplitter.calculateSplits(MongoSplitter.java:75)
    at
com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:51)
    at
org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:885)
    at
org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:779)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:432)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:447)
    at WordCount.main(WordCount.java:76)
Would you be so kindly to tell me how to fix this problem?
Thanks.
Martinus Martinus
2012-01-04 08:54:42 UTC
Permalink
Hi Scott,

I remember that Hadoop has a single point of failure in its NameNode. Is
there any other way to make this hadoop-mongo setup more fault tolerant?

Thanks.

Scott Hernandez
2012-01-13 23:10:51 UTC
Permalink
Not sure what you mean. Do you mean independent of the Hadoop framework?
On the MongoDB side you can use a replica set or a sharded cluster to
take care of any single points of failure (SPF).

I'm not an SPF expert for Hadoop, but I thought they'd taken care of
this a while ago.

Martinus Martinus
2012-01-16 06:19:47 UTC
Permalink
Hi Scott,

Can mongo-hadoop also write data to MongoDB in a replicated and sharded
environment? And how do I set up a connection to a replicated and sharded
MongoDB environment using the Java driver?

Thanks.

Scott Hernandez
2012-01-16 13:48:45 UTC
Permalink
In a sharded environment use one of the mongos addresses when
connecting. For a replica set you just need to use a seed list of more
than one address.
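
To make the two cases concrete, here is a minimal sketch in plain Java (no
driver needed) that only assembles the connection strings. The host names, the
class name MongoUris and its two helper methods are made up for illustration;
the test.in namespace is the one from the log earlier in this thread. As far as
I know, the mongo-hadoop connector takes a URI of this shape through its
mongo.input.uri and mongo.output.uri job settings, and the Java driver accepts
the same mongodb:// format, but please check that against the versions you run.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical helper: builds mongodb:// URIs for the two deployments discussed here. */
final class MongoUris {

    /** Sharded cluster: connect to a mongos router, never to an individual shard. */
    static String forMongos(String mongosHostPort, String db, String collection) {
        return "mongodb://" + mongosHostPort + "/" + db + "." + collection;
    }

    /**
     * Replica set: list several members (the seed list) separated by commas.
     * Requiring at least two seeds is a choice made for this sketch, so that
     * the client can still find the set when one listed member is down.
     */
    static String forReplicaSet(List<String> seeds, String db, String collection) {
        if (seeds.size() < 2) {
            throw new IllegalArgumentException("use a seed list of more than one address");
        }
        return "mongodb://" + String.join(",", seeds) + "/" + db + "." + collection;
    }

    public static void main(String[] args) {
        // Hypothetical hosts; replace them with your own mongos or replica set members.
        System.out.println(forMongos("mongos1.example.com:27017", "test", "in"));
        // prints mongodb://mongos1.example.com:27017/test.in

        List<String> seeds = Arrays.asList("rs1.example.com:27017", "rs2.example.com:27017");
        System.out.println(forReplicaSet(seeds, "test", "in"));
        // prints mongodb://rs1.example.com:27017,rs2.example.com:27017/test.in
    }
}
```

With a replica set the driver uses the seeds only to discover the current
members, so listing two or three of them is usually enough.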

On Wed, Dec 21, 2011 at 2:20 PM, Martinus Martinus
Post by Martinus Martinus
Hi Eliot,
I tried to build the jar file from the core folder inside it, but it gave
me a "source-5" error, so I added this to the pom.xml file below the
</resources>:
    <plugins>
      <plugin>
        <groupId>org.apache.maven.plugins</groupId>
        <artifactId>maven-compiler-plugin</artifactId>
        <configuration>
          <source>1.5</source>
          <target>1.5</target>
        </configuration>
      </plugin>
    </plugins>
With that it can be built using mvn package. It gave me
mongo-hadoop-core-1.0-SNAPSHOT.jar in the target folder, but I still don't
know how to use this library along with MongoDB inside Eclipse.
Thanks.
On Wed, Dec 21, 2011 at 2:08 PM, Eliot Horowitz
Post by Eliot Horowitz
https://github.com/mongodb/mongo-hadoop
On Tue, Dec 20, 2011 at 10:29 PM, Martinus Martinus
Post by Martinus Martinus
Hi,
I have a Hadoop cluster running and my data is in a MongoDB database. I
have already written Java code to query the data using the mongodb-java
driver. Now I want to use the Hadoop cluster to run my Java code to get
data from and put data into the MongoDB database. Has anyone done this
before? Can you explain to me how to do that?
Thanks.
Brendan W. McAdams
2012-01-16 13:51:48 UTC
Permalink
There is also some explanation of the different MongoHadoop settings with
respect to sharding in the README.
Post by Scott Hernandez
In a sharded environment use one of the mongos addresses when
connecting. For a replica set you just need to use a seed list of more
than one address.
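As a rough illustration of the seed-list idea above, here is a minimal sketch that only builds a standard mongodb:// connection string from several host:port entries. The class name MongoSeedUri and the host names are made up for this example, and it does not touch the Java driver at all; the resulting string would then be handed to whichever URI-based constructor your version of the driver offers.

```java
import java.util.Arrays;
import java.util.List;

/** Builds a mongodb:// connection string from a list of host:port seeds (illustrative helper, not part of any driver). */
final class MongoSeedUri {

    // Joins the seeds with commas. replicaSet may be null, for example when the seeds are mongos routers.
    static String build(List<String> seeds, String database, String replicaSet) {
        if (seeds == null || seeds.isEmpty()) {
            throw new IllegalArgumentException("at least one host:port seed is required");
        }
        StringBuilder uri = new StringBuilder("mongodb://");
        uri.append(String.join(",", seeds));
        uri.append('/').append(database);
        if (replicaSet != null && !replicaSet.isEmpty()) {
            uri.append("?replicaSet=").append(replicaSet);
        }
        return uri.toString();
    }

    public static void main(String[] args) {
        // Hypothetical host names, used for illustration only.
        System.out.println(build(Arrays.asList("db1.example.net:27017", "db2.example.net:27017"), "test", "rs0"));
        System.out.println(build(Arrays.asList("mongos1.example.net:27017"), "test", null));
    }
}
```

For a replica set you would list more than one member so the client can still find the set if one seed is down; for a sharded cluster the seeds would be mongos addresses and the replicaSet option is left out.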
On Sun, Jan 15, 2012 at 10:19 PM, Martinus Martinus
Post by Martinus Martinus
Hi Scott,
Can mongo-hadoop also write data into MongoDB in a replicated and sharded
environment? And how do I set up a connection to a replicated and sharded
MongoDB environment using the Java driver?
Thanks.
On Sat, Jan 14, 2012 at 7:10 AM, Scott Hernandez
Post by Scott Hernandez
Not sure what you mean. You mean independent of the hadoop framework?
On the mongo side you can use a replica set, or sharded cluster to
take care of any single points of failure (SPF).
I'm not an SPF expert for Hadoop, but I thought they'd taken care of
this a while ago.
On Wed, Jan 4, 2012 at 12:54 AM, Martinus Martinus
Post by Martinus Martinus
Hi Scott,
I remember that Hadoop has a single point of failure in its NameNode. Is
there any way to make this mongo-hadoop setup more fault tolerant?
Thanks.
On Tue, Jan 3, 2012 at 12:53 AM, Scott Hernandez
Post by Scott Hernandez
Conceptually there are many, and people even have written their own
map/reduce (or batch processing frameworks for mongodb), but in
practice there are only a few established patterns and well-used
systems. There is the built-in map/reduce system in MongoDB and there
are external extensions, like the hadoop stuff. That is about all I
know of that are actively being used in the open source community.
Here is an example of someone rolling their own as I mentioned above
http://sourceforge.net/p/zarkov/blog/2011/07/zarkov-is-a-lightweight-map-reduce-framework/
Martinus Martinus
2012-02-17 02:38:49 UTC
Permalink
Hi Elif,

I guess you should read this first:

http://www.mongodb.org/display/DOCS/Import+Export+Tools.

I have never tested that either. I guess the developers know better
than me. :)

Thanks.
Thanks, Martinus. I did that and now WordCount is working. But how do I
import text into the "in" collection? Right now "in" is empty, so the
"out" collection is empty too.
I am assuming that "beyond_lies_the_wub.txt" is the sample input they
provide for this example.
But how do I import that into MongoDB, since it is not in JSON, CSV or
TSV format?
thanks,
elif
Hi elif,
You need to create a collection named "in" inside your MongoDB database
from the Mongo shell, and then you can run your WordCount.java example.
Otherwise there is nothing for mongo-hadoop to map/reduce.
Thanks.
How did you create the "in" collection afterwards? I am getting the same
error and don't know how to proceed.
Are we supposed to import beyond_lies_the_wub.txt into MongoDB, or do we
need to set it up as the input?
thanks.
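One possible way to get a plain text file into a collection, sketched under stated assumptions: wrap every non-blank line in a small JSON document, one per output line, and feed that file to mongoimport. The class name TextToJsonLines is made up for this example, and the field name "x" is only an assumption; use whatever field your mapper actually reads.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Turns lines of plain text into one-field JSON documents, one per line (illustrative helper). */
final class TextToJsonLines {

    // Escapes the characters that JSON does not allow unescaped inside a string.
    static String escape(String s) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        out.append(String.format("\\u%04x", (int) c));
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }

    // Wraps each non-blank line in a document of the form {"<field>": "<line>"}.
    static List<String> toJsonLines(List<String> lines, String field) {
        List<String> docs = new ArrayList<>();
        for (String line : lines) {
            if (line.trim().isEmpty()) {
                continue; // skip blank lines so no empty documents are created
            }
            docs.add("{\"" + escape(field) + "\": \"" + escape(line) + "\"}");
        }
        return docs;
    }

    public static void main(String[] args) {
        // Sample lines stand in for the contents of the text file.
        List<String> sample = Arrays.asList("first line of \"text\"", "", "second line");
        for (String doc : toJsonLines(sample, "x")) {
            System.out.println(doc);
        }
    }
}
```

If the output is saved to a file, something like `mongoimport --db test --collection in --file wub.json` should load it; the database, collection and file names here are just the ones used in this thread plus a made-up file name.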
kg
2012-03-15 20:42:44 UTC
Permalink
Hi Martinus/Elif,
Post by Martinus Martinus
Hi Elif,
http://www.mongodb.org/display/DOCS/Import+Export+Tools.
I have never tested that either. I guess the developers know better
than me. :)
Thanks.
I created a collection 'in' in the 'test' db in mongos:
show collections
in
system.indexes

and then ran the Hadoop job:
***@ip-10-252-31-236:/home/ubuntu/mongo-hadoop$ cd /usr/lib/hadoop-0.20/
***@ip-10-252-31-236:/usr/lib/hadoop-0.20$ bin/hadoop jar WordCount.jar WordCount
Conf: Configuration: core-default.xml, core-site.xml
12/03/15 20:33:00 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
12/03/15 20:33:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/15 20:33:01 INFO util.MongoSplitter: Calculate Splits Code ... Use Shards? false, Use Chunks? true; Collection Sharded? false
12/03/15 20:33:01 INFO util.MongoSplitter: Creation of Input Splits is enabled.
12/03/15 20:33:01 INFO util.MongoSplitter: Using Unsharded Split mode (Calculating multiple splits though)
12/03/15 20:33:01 INFO util.MongoSplitter: Calculating unsharded input splits on namespace 'test.in' with Split Key '{ "_id" : 1}' and a split size of '8'mb per
12/03/15 20:33:01 INFO mapred.JobClient: Cleaning up the staging area hdfs://master:54310/app/hadoop/tmp/mapred/staging/hduser/.staging/job_201203151831_0003
Exception in thread "main" java.lang.IllegalArgumentException: Error calculating splits: { "serverUsed" : "ec2-50-112-19-33.us-west-2.compute.amazonaws.com:27017" , "$err" : "unrecognized command: splitVector" , "code" : 13390}
    at com.mongodb.hadoop.util.MongoSplitter.calculateUnshardedSplits(MongoSplitter.java:104)
    at com.mongodb.hadoop.util.MongoSplitter.calculateSplits(MongoSplitter.java:75)
    at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:51)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at WordCount.main(WordCount.java:97)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)
and I still get the same error. Any idea what I am doing wrong?

Thanks!!!
-k
On Wed, Dec 21, 2011 at 2:20 PM, Martinus Martinus <
Post by Martinus Martinus
Hi Eliot,
I have tried to built the jar file from the core folder inside
it,
but it
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
gaves me error of source-5, so then I add this in the pom.xml
file
below
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
<plugins>
<plugin>
<groupId>org.apache.maven.plugins</groupId>
<artifactId>maven-compiler-plugin</artifactId>
<configuration>
<source>1.5</source>
<target>1.5</target>
</configuration>
</plugin>
</plugins>
and it can be built using mvn package. It gaves me
mongo-hadoop-core-1.0-SNAPSHOT.jar on the target folder, but I
still
don't
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
know how to use this library along with mongodb inside eclipse.
Thanks.
Post by Eliot Horowitz
https://github.com/mongodb/mongo-hadoop
On Tue, Dec 20, 2011 at 10:29 PM, Martinus Martinus
Post by Martinus Martinus
Hi,
I have hadoop cluster running and have my data inside mongodb
database. I
Post by Martinus Martinus
already write a java code to query data on mongodb using
mongodb-java
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
Post by Eliot Horowitz
Post by Martinus Martinus
driver. And right now, I want to use hadoop cluster to run my
java
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
Post by Eliot Horowitz
code to
Post by Martinus Martinus
get and put the data from and to mongo database. Did anyone
has
done
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
Post by Eliot Horowitz
this
Post by Martinus Martinus
before? Can you explain to me how to do that?
Thanks.
--
You received this message because you are subscribed to the
Google
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
Post by Eliot Horowitz
Groups
Post by Martinus Martinus
"mongodb-user" group.
To post to this group, send email to
.
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
Post by Eliot Horowitz
Post by Martinus Martinus
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/mongodb-user?hl=en.
--
You received this message because you are subscribed to the
Google
Post by Martinus Martinus
Post by Martinus Martinus
Post by Martinus Martinus
Post by Eliot Horowitz
Groups "mongodb-user" group.
To post to this group, send email to
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/mongodb-user?hl=en.
--
You received this message because you are subscribed to the Google
Groups
"mongodb-user" group.
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/mongodb-user?hl=en.
--
You received this message because you are subscribed to the Google Groups
"mongodb-user" group.
To unsubscribe from this group, send email to
For more options, visit this group at
http://groups.google.com/group/mongodb-user?hl=en.
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To view this discussion on the web visit https://groups.google.com/d/msg/mongodb-user/-/v5OWl03uLxQJ.
To post to this group, send email to mongodb-user-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
To unsubscribe from this group, send email to mongodb-user+unsubscribe-/JYPxA39Uh5TLH3MbocFF+G/***@public.gmane.org
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.
kg
2012-03-15 21:13:36 UTC
Permalink
Hi Elif/Martinus,

I am running into the same issue as you, even though I have created the
database 'test' and the collection 'in' through mongos.

***@ip-10-252-31-236:/usr/lib/hadoop-0.20$ bin/hadoop jar WordCount.jar WordCount
Conf: Configuration: core-default.xml, core-site.xml
12/03/15 20:33:00 INFO security.UserGroupInformation: JAAS Configuration already set up for Hadoop, not re-installing.
12/03/15 20:33:00 WARN mapred.JobClient: Use GenericOptionsParser for parsing the arguments. Applications should implement Tool for the same.
12/03/15 20:33:01 INFO util.MongoSplitter: Calculate Splits Code ... Use Shards? false, Use Chunks? true; Collection Sharded? false
12/03/15 20:33:01 INFO util.MongoSplitter: Creation of Input Splits is enabled.
12/03/15 20:33:01 INFO util.MongoSplitter: Using Unsharded Split mode (Calculating multiple splits though)
12/03/15 20:33:01 INFO util.MongoSplitter: Calculating unsharded input splits on namespace 'test.in' with Split Key '{ "_id" : 1}' and a split size of '8'mb per
12/03/15 20:33:01 INFO mapred.JobClient: Cleaning up the staging area hdfs://master:54310/app/hadoop/tmp/mapred/staging/hduser/.staging/job_201203151831_0003
Exception in thread "main" java.lang.IllegalArgumentException: Error calculating splits: { "serverUsed" : "ec2-50-112-19-33.us-west-2.compute.amazonaws.com:27017" , "$err" : "unrecognized command: splitVector" , "code" : 13390}
    at com.mongodb.hadoop.util.MongoSplitter.calculateUnshardedSplits(MongoSplitter.java:104)
    at com.mongodb.hadoop.util.MongoSplitter.calculateSplits(MongoSplitter.java:75)
    at com.mongodb.hadoop.MongoInputFormat.getSplits(MongoInputFormat.java:51)
    at org.apache.hadoop.mapred.JobClient.writeNewSplits(JobClient.java:944)
    at org.apache.hadoop.mapred.JobClient.writeSplits(JobClient.java:961)
    at org.apache.hadoop.mapred.JobClient.access$500(JobClient.java:170)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:880)
    at org.apache.hadoop.mapred.JobClient$2.run(JobClient.java:833)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1157)
    at org.apache.hadoop.mapred.JobClient.submitJobInternal(JobClient.java:833)
    at org.apache.hadoop.mapreduce.Job.submit(Job.java:476)
    at org.apache.hadoop.mapreduce.Job.waitForCompletion(Job.java:506)
    at WordCount.main(WordCount.java:97)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:597)
    at org.apache.hadoop.util.RunJar.main(RunJar.java:197)

Any ideas on what I am doing wrong?
Brendan W. McAdams
2012-03-15 21:24:52 UTC
Permalink
Are you running in a sharded cluster?

If so, *mongos* does not currently expose the *splitVector* command needed
to split an unsharded collection.

For now, you'll need to either run with no splits or point Hadoop directly
at the *mongod* that is the primary for that collection.

I am working on a workaround for this issue, and a ticket is open to expose
splitVector through mongos.
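
The two options above could be sketched as follows. This is only an illustration, not the connector's own code: the property names (mongo.input.uri, mongo.input.split.create_input_splits) are what I recall from mongo-hadoop's MongoConfigUtil and should be checked against the version you built, and the host names are made up. The sketch only builds the key/value pairs, so it runs without Hadoop on the classpath.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class SplitWorkaround {
    // Assumed property names (from memory of mongo-hadoop's MongoConfigUtil);
    // verify them against the connector version in use.
    public static final String INPUT_URI = "mongo.input.uri";
    public static final String CREATE_SPLITS = "mongo.input.split.create_input_splits";

    /**
     * Builds the input settings for the 'test.in' namespace from the log above.
     * hostAndPort is whichever server Hadoop should read from;
     * createSplits=false asks the connector to read the collection as one split,
     * which should avoid the splitVector call.
     */
    public static Map<String, String> inputSettings(String hostAndPort, boolean createSplits) {
        Map<String, String> props = new LinkedHashMap<>();
        props.put(INPUT_URI, "mongodb://" + hostAndPort + "/test.in");
        props.put(CREATE_SPLITS, Boolean.toString(createSplits));
        return props;
    }

    public static void main(String[] args) {
        // Option 1: keep reading through mongos, but with no splits.
        // "mongos.example.com:27017" is a hypothetical host.
        System.out.println(inputSettings("mongos.example.com:27017", false));

        // Option 2: read from the primary mongod directly, where splitVector exists.
        // "shard-primary.example.com:27018" is a hypothetical host.
        System.out.println(inputSettings("shard-primary.example.com:27018", true));
    }
}
```

In the WordCount driver, each entry would be applied with conf.set(key, value) on the Hadoop Configuration before the job is submitted. Option 1 gives up parallel input; option 2 keeps it but bypasses mongos for reads.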