Discussion:
Degraded query performance over a large set of data
Harshad Shirsat
2015-02-06 06:26:11 UTC
Hi,

I am currently working on a project that uses MongoDB as the back end. The schema has around 30-40 fields, including those in sub-documents. The collection holds over 2.5 million documents, totalling over 30 GB. It is distributed across a sharded environment with 2 shards, each of which is a replica set. A sample document is shown below:

user:
{
    _id: UUID,
    name: string,
    addr: string,
    date: ISODate,
    dob: ISODate,
    ...
    transaction: [
        {
            createdAt: ISODate,
            firstTransaction: ISODate,
            lastTransaction: ISODate,
            amount: float,
            product: string
        }
    ],
    ... more sub-documents
}

In the above schema I have indexed fields such as date in the main document, and createdAt, firstTransaction, lastTransaction, and product in the transaction sub-document. The length of the transaction array varies from 0 to around 50. My question is: despite those indexes, queries such as getting all users who made a transaction within a day or two take over two minutes, which is very slow. If the duration is a month, it takes over 10 minutes to fetch the records. If I use multiple fields in the same query (all of them indexed), the time jumps to over 30 minutes. I have checked the configuration and it looks suitable for the scenario: the RAM summed over all the shards comes to more than 10 GB, with each shard having 7.5 GB, and the total index size is 4 GB. The servers are dedicated to MongoDB only, so there is no other load on them, and all the RAM is available to MongoDB. Still the queries take this long. For the multiple-field query I tried building a compound index, but to my surprise, instead of a performance gain the query seems to get stuck and never returns a result (I mean "never"). I don't know where I went wrong. I am new to the system, so any expert help here is welcome. Searching online, I read that range queries in MongoDB can sometimes ignore one of the bounds (either the lower or the upper), and that the min() and max() cursor functions are the workaround. They work properly (thank God! :)) for the main-document field date, but for sub-document fields such as firstTransaction or lastTransaction I get this error:

planner returned error: unable to find relevant index for max/min query

I tried using hint() but then it gave another error, which is:

planner returned error: hint provided does not work with min query

The queries I used are:

db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).min({transaction:{$elemMatch:{firstTransaction:ISODate("2015-01-01")}}}).max({transaction:{$elemMatch:{lastTransaction:ISODate("2015-01-02")}}}).explain()

db.user.find().min({transaction:{$elemMatch:{firstTransaction:ISODate("2015-01-01")}}}).max({transaction:{$elemMatch:{lastTransaction:ISODate("2015-01-02")}}}).hint("transaction.firstTransaction_1").explain()

The two queries above returned the two errors mentioned above.
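For comparison, the shape that does work for the top-level date field looks like this (a minimal sketch; the min()/max() documents mirror the {date: 1} index keys instead of using $elemMatch, and the hint pins the index explicitly):

// Sketch: min()/max() bounds expressed on the keys of the {date: 1} index.
db.user.find().min({ date: ISODate("2014-12-01") }).max({ date: ISODate("2014-12-31") }).hint({ date: 1 }).explain()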

The queries that perform slowly are:

1) db.user.find({transaction:{$elemMatch:{product:'tv'}}}).min({date:ISODate("2014-12-01")}).max({date:ISODate("2014-12-31")}).explain()

2) db.user.find({transaction:{$elemMatch:{product:'fridge', firstTransaction:{$in:[ISODate("2015-01-01"),ISODate("2015-01-02")]}}}}).explain()

3) db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).explain()
This query retrieves over half of the collection and takes over 20 minutes despite the field being indexed.

4) db.user.find().min({date:ISODate("2014-12-01")}).max({date:ISODate("2014-12-31")}).explain()
This query takes a while the first time it runs (around 2 minutes), but after warming up it executes really fast (in a matter of milliseconds :)).

So I tried warming up for queries 1 and 3, but there was no improvement.

The indexes are:

db.user.ensureIndex({date:1})

db.user.ensureIndex({"transaction.product":1})

db.user.ensureIndex({"transaction.firstTransaction":1})

db.user.ensureIndex({"transaction.lastTransaction":1})

db.user.ensureIndex({"transaction.firstTransaction":1, "transaction.product":1})

db.user.ensureIndex({date:1, "transaction.product":1})

My main concern is query 2. For the other queries I can warm up the database, but for this one I couldn't find a solution. I have tried different versions of it, such as with the min()/max() functions, with the $gt/$lt operators, and with a compound index, but nothing helps; a sketch of the range variant I tried is below. Can anyone figure out what's going wrong here? Thanks for bearing with my really long question; you can guess how desperate I am to solve this :P. Please let me know if I have missed any details.
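This is roughly the range variant of query 2 I tried, together with a compound index over the two transaction fields (a sketch from memory; the one-day window stands in for the two dates in the $in, and the index field order is just one of the orderings I tried):

// Compound multikey index over the two transaction fields used together:
db.user.ensureIndex({ "transaction.product": 1, "transaction.firstTransaction": 1 })

// Range variant of query 2: both conditions inside one $elemMatch so they
// must hold for the same transaction element.
db.user.find({
    transaction: {
        $elemMatch: {
            product: 'fridge',
            firstTransaction: { $gte: ISODate("2015-01-01"), $lt: ISODate("2015-01-02") }
        }
    }
}).explain()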
Dhananjay Makwana
2015-02-06 12:35:25 UTC
I am not an expert, but the explain() output would be helpful to look into this.
Harshad Shirsat
2015-02-06 13:11:06 UTC
Following is the output for:
query:
db.user.find({transaction:{$elemMatch:{product:"mobile",
firstTransaction:{$gte:ISODate("2015-01-01"),
$lt:ISODate("2015-01-02")}}}}).hint("transaction.firstTransaction_1_transaction.product_1").explain()
output:
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 622,
"nscannedObjects" : 350931,
"nscanned" : 352000,
"nscannedObjectsAllPlans" : 350931,
"nscannedAllPlans" : 352000,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 119503,
"nChunkSkips" : 0,
"millis" : 375693,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 547,
"nscannedObjects" : 350984,
"nscanned" : 352028,
"nscannedObjectsAllPlans" : 350984,
"nscannedAllPlans" : 352028,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 132669,
"nChunkSkips" : 0,
"millis" : 891898,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"n" : 1169,
"nChunkSkips" : 0,
"nYields" : 252172,
"nscanned" : 704028,
"nscannedAllPlans" : 704028,
"nscannedObjects" : 701915,
"nscannedObjectsAllPlans" : 701915,
"millisShardTotal" : 1267591,
"millisShardAvg" : 633795,
"numQueries" : 2,
"numShards" : 2,
"millis" : 891910
}

Query: db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).explain()
Output:
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 553072,
"nscannedObjects" : 553072,
"nscanned" : 553072,
"nscannedObjectsAllPlans" : 553072,
"nscannedAllPlans" : 553072,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 164888,
"nChunkSkips" : 0,
"millis" : 337909,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 554176,
"nscannedObjects" : 554176,
"nscanned" : 554176,
"nscannedObjectsAllPlans" : 554176,
"nscannedAllPlans" : 554176,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 107496,
"nChunkSkips" : 0,
"millis" : 327928,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor transaction.product_1",
"n" : 1107248,
"nChunkSkips" : 0,
"nYields" : 272384,
"nscanned" : 1107248,
"nscannedAllPlans" : 1107248,
"nscannedObjects" : 1107248,
"nscannedObjectsAllPlans" : 1107248,
"millisShardTotal" : 665837,
"millisShardAvg" : 332918,
"numQueries" : 2,
"numShards" : 2,
"millis" : 337952
}
Dhananjay Makwana
2015-02-06 13:29:28 UTC
As I said before, I am no expert, but here are some pointers for you:
- Why are you storing transactions as part of the user collection? If you have many of them, it is going to be a problem for inserting/updating/querying.
- What is your shard key? Is it giving you sufficient query isolation (i.e. targeted queries)? Otherwise, mongo will do a scatter-gather query, which will be slow.
- Check your working set size and see if it fits in RAM: http://docs.mongodb.org/manual/faq/diagnostics/
- Keep running "mongostat --discover" against your cluster. Note in particular the index miss and faults numbers.
- Check the "iostat" output on your shards to see whether the disk is giving you sufficient throughput.

Others on the list may be able to give you more pointers.
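A rough sketch of how those checks look from the shell (field and column names quoted from memory, so treat them as approximate):

// On each shard primary, in the mongo shell:
db.serverStatus().mem                          // resident vs. mapped memory
db.serverStatus({ workingSet: 1 }).workingSet  // 2.6-era working set estimate
db.user.stats()                                // storageSize and totalIndexSize

// From the OS shell on each shard host:
//   mongostat --discover     (watch the "faults" and "idx miss %" columns)
//   iostat -x 5              (watch await and %util on the data volume)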
Harshad Shirsat
2015-02-09 05:54:15 UTC
Thanks for your help, Dhananjay. However, I have some doubts and would appreciate it if you could clarify them.
1) My goal in keeping transactions inside the user collection was to store everything in the same document. Wasn't MongoDB designed for such scenarios? If I create a separate collection for transactions, I will have to join them at query time, and wouldn't that defeat the purpose of MongoDB?
2) My shard key is the document's _id, which is unique and indexed by default. I couldn't quite follow your suggestion regarding the shard key, so could you elaborate a bit more?
3) I checked the working set size. It does fit into memory, as I explained in my original question.

Thanks for your other suggestions. I will check them out and let you know whether they work.

Thanks.
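For reference, this is what I have been using to look at the current sharding layout (a sketch; run from a mongos shell against the user collection shown earlier):

sh.status()                        // shard key and chunk ranges per shard
db.user.getShardDistribution()     // rough document/data split across the two shards

// My understanding so far: since the shard key is _id and these queries filter on
// transaction fields, mongos sends them to both shards, which seems to match the
// explain output above listing results from test0 and test1.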
Harshad Shirsat
2015-02-10 08:35:43 UTC
Permalink
I have found with mongostat that over 2,000 page faults are occurring per
second, & I think that's the reason the queries are slow. Warming up also
doesn't help much: I checked by running the query again & again, but the
improvement was very small (a few thousand milliseconds) compared to the
overall running time (which is in minutes). Any suggestions on how I can
reduce this? There are no index misses, by the way.
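For reference, besides mongostat I am cross-checking the fault rate roughly
like this from the shell on a shard primary (the counter is cumulative &
Linux-only, so this is only a rough check):

// sample the cumulative page-fault counter twice, 10 seconds apart
var a = db.serverStatus().extra_info.page_faults;
sleep(10 * 1000);
var b = db.serverStatus().extra_info.page_faults;
print("page faults/sec: " + (b - a) / 10);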
Post by Harshad Shirsat
Thanks for your help Dhananjay. However I have some doubts; I would
appreciate it if you could clarify them.
1) My goal in including transactions as part of the user collection was to
store everything in the same document - & wasn't MongoDB created for such
scenarios? If I create a separate collection for transactions I have to
join them for querying, & wouldn't that defeat the purpose of MongoDB?
2) My shard key is the _id of the document, which is unique & indexed by
default. I couldn't really follow your suggestion regarding the shard key,
so could you elaborate a bit more?
3) I checked the working set size. It does fit into memory, as I explained
in my main question.
Thanks for your other suggestions. I will check them out & let you know
whether they work or not.
Thanks.
As I said before I am no expert but here are some pointers for you
- why are you storing transactions as part of user collection? If you have
many of them then it is going to be a problem inserting/updating/querying.
- what is your shard key? Is it giving you sufficient query isolation (aka
targeted query)? Otherwise, mongo will do a scatter-gather query which will
be slow.
- check your workingset size and see if it can fit in RAM.
http://docs.mongodb.org/manual/faq/diagnostics/
- keep running "mongostat --discover" against your cluster. Note in
particular the index miss and faults numbers.
- check "iostat" output on your shards to see if disk is giving you
sufficient throughput or not.
Others on the list may be able to give you more pointers.
db.user.find({transaction:{$elemMatch:{product:"mobile",
firstTransaction:{$gte:ISODate("2015-01-01"),
$lt:ISODate("2015-01-02")}}}}).hint("transaction.firstTransaction_1_transaction.product_1").explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 622,
"nscannedObjects" : 350931,
"nscanned" : 352000,
"nscannedObjectsAllPlans" : 350931,
"nscannedAllPlans" : 352000,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 119503,
"nChunkSkips" : 0,
"millis" : 375693,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 547,
"nscannedObjects" : 350984,
"nscanned" : 352028,
"nscannedObjectsAllPlans" : 350984,
"nscannedAllPlans" : 352028,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 132669,
"nChunkSkips" : 0,
"millis" : 891898,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"n" : 1169,
"nChunkSkips" : 0,
"nYields" : 252172,
"nscanned" : 704028,
"nscannedAllPlans" : 704028,
"nscannedObjects" : 701915,
"nscannedObjectsAllPlans" : 701915,
"millisShardTotal" : 1267591,
"millisShardAvg" : 633795,
"numQueries" : 2,
"numShards" : 2,
"millis" : 891910
}
db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 553072,
"nscannedObjects" : 553072,
"nscanned" : 553072,
"nscannedObjectsAllPlans" : 553072,
"nscannedAllPlans" : 553072,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 164888,
"nChunkSkips" : 0,
"millis" : 337909,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 554176,
"nscannedObjects" : 554176,
"nscanned" : 554176,
"nscannedObjectsAllPlans" : 554176,
"nscannedAllPlans" : 554176,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 107496,
"nChunkSkips" : 0,
"millis" : 327928,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor transaction.product_1",
"n" : 1107248,
"nChunkSkips" : 0,
"nYields" : 272384,
"nscanned" : 1107248,
"nscannedAllPlans" : 1107248,
"nscannedObjects" : 1107248,
"nscannedObjectsAllPlans" : 1107248,
"millisShardTotal" : 665837,
"millisShardAvg" : 332918,
"numQueries" : 2,
"numShards" : 2,
"millis" : 337952
}
I am not an expert but "explain()" output would be helpful to look into
this.
...
Dhananjay D. Makwana
2015-02-10 15:01:13 UTC
Permalink
I am no expert, and others on the list can certainly correct me if I am
wrong.

But this means that your working set doesn't fit in your memory, so MongoDB
(through your OS) has to page data out to make room. I don't know of a
clean fix. Some of your options could be:
- add more shards
- go for SSDs

http://docs.mongodb.org/manual/faq/diagnostics/ and
http://docs.mongodb.org/manual/administration/production-notes/
may help you figure out your options.
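A rough way to compare those sizes against the 7.5 GB of RAM per shard is
to run something like this in the shell on each shard primary (field names
are as reported by collStats; the scale argument converts them to MB):

var s = db.user.stats(1024 * 1024);   // sizes in MB
print("data MB:    " + s.size);
print("storage MB: " + s.storageSize);
print("index MB:   " + s.totalIndexSize);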
Post by Harshad Shirsat
...
-DJ
609-309-5340
Asya Kamsky
2015-02-10 19:37:31 UTC
Permalink
TL;DR this is long, but in short you need schema and indexes optimized to
your use case and you need to make sure your queries are structured
correctly.

There are a few possible problems here, mainly that your schema and your
indexes are not optimal for the types of queries that you are doing.

First is the misconception that you must embed documents into a single
record to be using MongoDB "correctly" - this is *not* the case. You want
to embed things that are *always* fetched with the document, and those
things will tend to be limited. If something is a first class entity it
should be in its own collection.

Let me give you an example: users and orders (e-commerce). Do you want
to get back every single order when you fetch a user record? No, not
really - I've been shopping at Amazon for over 15 years, and I can't
imagine how slow it would be if my orders were all part of my document.
This is an antipattern in MongoDB - any time you have an array that's going
to grow indefinitely (unbounded) it's likely that this schema won't perform
well.

You mention needing joins if you don't embed - think of the possibility of
embedding in the other direction - many things you need to know about the
user, you can embed (denormalize) into the orders (or transactions in your
case). But again, just the stuff you need *every* time you fetch a
transaction, use in queries, etc. The rest you can do a second lookup in
the user table for.

Third, indexes and queries. Getting compound indexes right is very
important, and that means putting equality fields first and range fields
after. One of the compound indexes you were trying to use, for example, is
exactly the opposite (it has date first and product second). A great
article to read is
http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ - although it
was written about a much older version of MongoDB, the general principles
still apply.
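For example, with the fields from this thread, "equality first, range
after" would look something like the following (a sketch only - given the
multikey bounds issue described below, this alone may not make the array
queries fast):

// equality field (product) first, range field (date / firstTransaction) after
db.user.ensureIndex({ "transaction.product": 1, date: 1 })
db.user.ensureIndex({ "transaction.product": 1, "transaction.firstTransaction": 1 })
// rather than, e.g., { date: 1, "transaction.product": 1 } (range first)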
Post by Harshad Shirsat
found out that there is problem with the range query in MongoDB, which
sometimes doesn't take into consideration one bound

This is not really so much that it *doesn't* take one bound into
consideration, but that it *cannot* in some cases because in a compound
indexes, there is no information about which field is an array - there is
very extensive discussion of this in
https://jira.mongodb.org/browse/SERVER-15086 which is the feature request
to enhance the index spec or the query language to allow better use of
bounds. Basically, if it used both bounds, it could miss results that
match the query.

So when querying arrays, you don't need to use $elemMatch for single field
test, but you *must* use it if you want a single element to satisfy the
condition. Unfortunately, $elemMatch only matches array elements, it will
*not* match regular fields.

Compare:

db.user.find( { "transaction.date" : {$gte: ISODate("2014-01-02"), $lt:
ISODate("2015-01-01")} } );

and

db.user.find({"transaction":{$elemMatch:{"date":{$gt:ISODate("2014-01-01"),$lt:ISODate("2015-01-01")}}}})

The first one says you want all users who have a transaction date >
2014-01-01 in their transaction array and also who have a transaction date
< 2015-01-01 in their array.

That matches *everyone* who has at least one transaction after 2014-01-01
*and* at least one transaction before 2015-01-01 - not necessarily the same
transaction. Someone whose only transactions are, say, one in 2016 and one
in 2013 would match despite having nothing in the range, which is unlikely
to be what you want, so you use the second one.

The second one says you want users with at least a single transaction
that's in the range 2014-01-01 to 2015-01-01. This is why both bounds
cannot be used by the query optimizer in your case: you are asking for the
first of these, not the second.

If the query optimizer uses bounds 2014-01-02, 2015-01-01 for this query,
it will work correctly in the second case, but it will *not* work correctly
in the first case - in fact, it would miss many documents that it should
return. And that's the core limitation in this case.

When the query optimizer doesn't know which field in a *compound* index is
an array, it cannot use both bounds for a field that doesn't have
$elemMatch because it could then miss values that should be returned. This
is what happens in your compound index "date" and product - date is top
level (not array) and product is inside the array but the optimizer doesn't
know that date cannot be an array (by the way, what does top level date
mean anyway?)

It's also happening with product, firstTransaction because even though they
are both inside the array and you are using $elemMatch (properly) to
indicate you want the same element to match all conditions, the query
optimizer doesn't know that "transaction.firstTransaction" value is not an
array as well - if it did know that then you would see optimal bounds used.

So, because the query optimizer doesn't know which of multiple fields can
be arrays (and technically more than one can) it cannot use both bounds on
range queries like you would like. I hope we will be able to enhance the
query optimizer and indexes in a way that would improve this soon, but it's
very tricky because we have no restrictions on schema and no way to know
what types each field is.

Now, min() and max() are not really part of the query subsystem (they are
not query operators); usually, if you find yourself needing them, there is
something suboptimal or incorrect in your query. I think this is also why
you are getting an error with hint() - you are either hinting an index that
doesn't exist, or one that doesn't match the query predicate (so you are
either forcing a full scan of a subset of an index, or you aren't naming an
existing index).
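As a quick check, you can list the index names and key patterns that
actually exist on the collection, so that hint() refers to one of them:

db.user.getIndexes().forEach(function(ix) {
    print(ix.name + "  ->  " + tojson(ix.key));
});
// hint() accepts either the index name or its key pattern, e.g.
// .hint("transaction.product_1") or .hint({ "transaction.product": 1 })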

Now, what is your solution? I can't really speak confidently on this
because I don't understand what top level "date" means - what is it an
indicator of and how is it used? Normally I would expect it refers to the
user and so I'm surprised when I see you using it in combination with
transaction criteria in the transaction array.

One possible solution - get rid of the suboptimal indexes, and add optimal
indexes and correctly structured queries on the current schema (after
clarifying what the top level "date" field represents). Unfortunately, it
seems that for the way you want to query some of these fields, the queries
won't be as performant as you expect.

Second possible solution and one that I would prefer - if your queries are
mostly against transactions, then they should be in their own collection
(with some user info denormalized into each if appropriate). It will then
be simple to create correct indexes on users and correct indexes on
transactions for all the queries/updates you will be doing. Since you
won't have transactions in an array you won't have the compound multi-key
index issues (at least not the one you're seeing), and as an extra bonus,
I'm guessing the queries will become more readable as well.
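To make that concrete, here is a rough sketch of what that could look like
- the field names are only an illustration based on your sample document,
not a prescription:

// one document per transaction, with the user fields you always need
// denormalized into it
db.transactions.insert({
    userId: "user-uuid",                        // _id of the user document
    userName: "some name",                      // denormalized from the user
    createdAt: ISODate("2015-01-01T10:00:00Z"),
    product: "mobile",
    amount: 100.0
})

// equality field first, range field after
db.transactions.ensureIndex({ product: 1, createdAt: 1 })

// "transactions for a product in a date range" becomes a plain range query,
// with no multikey bounds problem
db.transactions.find({
    product: "mobile",
    createdAt: { $gte: ISODate("2015-01-01"), $lt: ISODate("2015-01-02") }
})

You would then look up the full user document by userId only when you
actually need it.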

There may be a third possibility, but again, without knowing all access
patterns, it's hard to say.

Asya
P.S. you should let us know what version of MongoDB you are using as that
will let us know if there are some other bugs or features that need to be
taken into consideration.
db.user.find({transaction:{$elemMatch:{product:"mobile",
firstTransaction:{$gte:ISODate("2015-01-01"),
$lt:ISODate("2015-01-02")}}}}).hint("transaction.firstTransaction_1_transaction.product_1").explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 622,
"nscannedObjects" : 350931,
"nscanned" : 352000,
"nscannedObjectsAllPlans" : 350931,
"nscannedAllPlans" : 352000,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 119503,
"nChunkSkips" : 0,
"millis" : 375693,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 547,
"nscannedObjects" : 350984,
"nscanned" : 352028,
"nscannedObjectsAllPlans" : 350984,
"nscannedAllPlans" : 352028,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 132669,
"nChunkSkips" : 0,
"millis" : 891898,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"n" : 1169,
"nChunkSkips" : 0,
"nYields" : 252172,
"nscanned" : 704028,
"nscannedAllPlans" : 704028,
"nscannedObjects" : 701915,
"nscannedObjectsAllPlans" : 701915,
"millisShardTotal" : 1267591,
"millisShardAvg" : 633795,
"numQueries" : 2,
"numShards" : 2,
"millis" : 891910
}
db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 553072,
"nscannedObjects" : 553072,
"nscanned" : 553072,
"nscannedObjectsAllPlans" : 553072,
"nscannedAllPlans" : 553072,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 164888,
"nChunkSkips" : 0,
"millis" : 337909,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 554176,
"nscannedObjects" : 554176,
"nscanned" : 554176,
"nscannedObjectsAllPlans" : 554176,
"nscannedAllPlans" : 554176,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 107496,
"nChunkSkips" : 0,
"millis" : 327928,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor transaction.product_1",
"n" : 1107248,
"nChunkSkips" : 0,
"nYields" : 272384,
"nscanned" : 1107248,
"nscannedAllPlans" : 1107248,
"nscannedObjects" : 1107248,
"nscannedObjectsAllPlans" : 1107248,
"millisShardTotal" : 665837,
"millisShardAvg" : 332918,
"numQueries" : 2,
"numShards" : 2,
"millis" : 337952
}
I am not an expert but "explain()" output would be helpful to look into
this.
Hi,
I am currently working on a project for which I am using MongoDB as a
back end. I have around 30-40 fields in schema including that of sub
documents. I have over 2.5 million of documents in the collection which
constitute for over 30 GB. It has been distributed in sharded environment
{
_id: UUID,
name: string,
addr: string,
date: ISODate,
dob: ISODate,
.
.
.
transaction:[
{
createdAt: ISODate,
firstTransaction: ISODate,
lastTransaction: ISODate,
amount: float,
product: string
}
],
more sub documents...
}
In above schema I have indexed fields such as- date in main document,
createdAt, firstTransaction, lastTransaction, product in transaction sub
document. Sub document length varies from 0-50 or so. My question is that,
despite of indexing those fields, queries such as getting all users who
have done transaction within a day or two takes over two minutes which is
very slow. If the duration is month it takes over 10 minutes to fetch the
records. If I use multiple fields in same query (which are all indexed) the
time whoppingly increases to over 30 minutes. I have checked all the
configuration & it really is suitable for the scenario. For e.g.: RAM to
sum over all the shards constitutes to more that 10 GB, with each shard
having 7.5 GB of RAM. Index size is 4 GB. Servers are especially dedicated
to MongoDB only, so no other overhead for the server. All the RAM is
dedicated to MongoDB. Still it takes so much time. For the multiple fields
query I tried to build compound index but to my surprise instead of
performance gain, the query seems to get stuck & never returns the result
(I mean it "NEVER"). I don't know where I went wrong. I am newbie to the
system, so any expert who can help me over here is welcome. I tried the
solution online & found out that there is problem with the range query in
MongoDB, which sometimes doesn't take into consideration one bound (either
lower or upper). So for that I found out the solution, which is min(),
max() cursor functions & they work properly (thank God! :)) for main
document field date. But for sub document fields such as firstTransaction
*planner returned error: unable to find relevant index for max/min query*
*planner returned error: hint provided does not work with min query*
*db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).min({transaction:{$elemMatch:{firstTransaction:ISODate("2015-01-01")}}}).max({transaction:{$elemMatch:{lastTransaction:ISODate("2015-01-02")}}}).explain()*
*db.user.find().min({transaction:{$elemMatch:{firstTransaction:ISODate("2015-01-01")}}}).max({transaction:{$elemMatch:{lastTransaction:ISODate("2015-01-02")}}}).hint("transaction.firstTransaction_1").explain()*
Above two queries returned me the two errors mentioned above.
*1)
db.user.find({transaction:{$elemMatch:{product:'tv'}}}).min({date:ISODate("2014-12-01")}).max({date:ISODate("2014-12-31")}).explain()*
*2) db.user.find({transaction:{$elemMatch:{product:'fridge',
firstTransaction:{$in:[ISODate("2015-01-01"),ISODate("2015-01-02")]}}}}).explain()
*
*3) db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).explain() *
This query retrieves over half of the collection & takes over 20 minutes
despite of having the field indexed.
*4)
db.user.find().min({date:ISODate("2014-12-01")}).max({date:ISODate("2014-12-31")}).explain()*
This query takes a while when run at first (around 2 mins). But after
warming up it executes really fast (in matter of milliseconds :)).
So I tried warming up for the queries 1,3 but no improvement.
*db.user.ensureIndex({date:1})*
*db.user.ensureIndex({"transaction.product":1})*
*db.user.ensureIndex({"transaction.firstTransaction":1})*
*db.user.ensureIndex({"transaction.lastTransaction":1})*
*db.user.ensureIndex({**transaction.firstTransaction":1,
"transaction.product":1**})*
*db.user.ensureIndex({date:1, "transaction.product":1})*
My main concern is the 2 query. For other queries I can warm up the
database, but for this query however I couldn't find the solution. I have
tried different versions of it such as with min max functions, $gt, $lt
operators, & with compound index, but no help. Can anyone find out what's
going wrong here? Thanks to bear with my really long question, you can
guess how desperate I am to solve this :P . Please let me know if I have
miss any of the details.
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups
"mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/
msgid/mongodb-user/c34f0a0b-0915-4d6f-9c1b-27ca73bb0449%
40googlegroups.com
<https://groups.google.com/d/msgid/mongodb-user/c34f0a0b-0915-4d6f-9c1b-27ca73bb0449%40googlegroups.com?utm_medium=email&utm_source=footer>
.
For more options, visit https://groups.google.com/d/optout.
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups
"mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit
https://groups.google.com/d/msgid/mongodb-user/ea0610b1-e56b-47b9-a7f0-7f0cfc9bce7a%40googlegroups.com
<https://groups.google.com/d/msgid/mongodb-user/ea0610b1-e56b-47b9-a7f0-7f0cfc9bce7a%40googlegroups.com?utm_medium=email&utm_source=footer>
.
For more options, visit https://groups.google.com/d/optout.
--
{ "name" : "Asya Kamsky",
"place" : [ "New York", "Palo Alto", "Everywhere else" ],
"email" : "***@mongodb.com",
"blog" : "http://www.askasya.com/",
"twitter": "@asya999" }
Harshad Shirsat
2015-02-11 06:37:49 UTC
Permalink
Thanks for your elaborate answer; glad to have an opinion from an expert
such as you. Sorry for the long post, but I had many confusions :P. I had
read about how to build optimal compound indexes, exactly as you describe
here. But in my experiments with my data set, I found that the [range,
equality] index gave me a bigger performance improvement than the opposite
order (weird though). About the range bounds: I know it's tricky to solve &
the 10gen people are working hard on it. I had referred to the Jira ticket
& found some workarounds for the issue, namely min() & max(), which is what
I tried to do.

Thanks for clearing up my doubt regarding embedded documents &
denormalizing data in MongoDB. I think I should give that a shot (it might
work :)).

The 'date' field in the main document corresponds to the date when the user
was registered with the system, & the purpose of such a query is to create
a report showing who registered during a particular period. I hope that's
clear now.
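Based on your point about min()/max(), I guess for such a report a plain
range on the indexed date field would be the way to go, something like:

db.user.find({ date: { $gte: ISODate("2014-12-01"), $lt: ISODate("2015-01-01") } })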

My MongoDB version is 2.6.6 - sorry, I didn't mention it previously.

I couldn't really understand, though, what you meant by suboptimal &
optimal indexes. Were you referring to the order of compound indexes, such
as [range, equality] vs. [equality, range]? I will be glad if you clear
that up for me. When I watched the mongostat --discover output, there were
many page faults occurring for the embedded-document queries. I hope this
indication helps you figure out what is happening exactly.

Thanks,
Harshad
Post by Asya Kamsky
...
Asya Kamsky
2015-02-12 22:20:41 UTC
Permalink
found out that with [range, equality] index I get more performance improvement than opposite of it (weird though)
When you have an equality predicate and a range predicate in the same
query, it is more efficient to have the index ordered with the equality
field first and the range field after it.
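(For example - a hypothetical sketch of your product/firstTransaction
index and query in that order; not something prescribed in this thread,
just an illustration of the ordering:)
db.user.ensureIndex({ "transaction.product": 1, "transaction.firstTransaction": 1 })
db.user.find({ transaction: { $elemMatch: { product: "mobile",
    firstTransaction: { $gte: ISODate("2015-01-01"), $lt: ISODate("2015-01-02") } } } })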

If your experiments showed the opposite, it's likely that with the
[range, equality] field order in the index only the first part of the
index was being used, and some *other* reason was causing the query
inefficiency.

When I say "optimal index" I usually mean optimal assuming there are no
other factors slowing down the query. As an example, if you have to page
fault for the index and data lookups, it's pretty much irrelevant which
order your index fields are in - the performance will be terrible either
way. :)

So, get the schema to a point where it fits your query pattern better, and
then measure how well different orders of index fields perform (but make
sure to warm the data first, otherwise you'll be measuring disk speed more
than index effectiveness).
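(One way to warm a collection and its indexes - assuming the MMAPv1
storage engine that 2.6 uses by default - is the touch command, run on
each shard's primary; a minimal sketch:)
db.runCommand({ touch: "user", data: true, index: true })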

Asya
Thanks for your elaborate answer; I'm glad to have the opinion of an
expert such as you. Sorry for the long post, but I had many confusions :P.
I had read about how to build optimal compound indexes, exactly as you
describe here. But in my experiments with my data set, I found out that
with a [range, equality] index I get more performance improvement than
with the opposite (weird though). About range indexes: I know they are
tricky to solve and the 10gen people are working hard on it. I had
referred to the Jira ticket and found some workarounds for that issue,
namely min() and max(), and that's what I tried to do.
Thanks for clearing up the doubt regarding embedded documents and
denormalizing data in MongoDB. I think I should give that a shot (it might
work :)). The 'date' field in the main document corresponds to the date
when the user was registered with the system, and the purpose of such a
query is to create a report of who registered in a particular period. I
hope that is clear now.
My MongoDB version is 2.6.6; sorry, I didn't mention it previously.
I couldn't really understand, though, what you meant by suboptimal and
optimal indexes. Were you referring to the case of compound indexes, such
as [range, equality] and [equality, range]? I would be glad if you could
clear that up for me. When I watched the mongostat --discover output,
there were many page faults occurring for the embedded-document queries. I
hope this indication helps you figure out what is happening exactly.
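(For reference - another way to spot-check the same thing from the shell,
assuming Linux and the MMAPv1 engine, is the page-fault counter reported
by serverStatus on each shard:)
db.serverStatus().extra_info.page_faults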
Thanks,
Harshad
Post by Asya Kamsky
TL;DR: this is long, but in short, you need a schema and indexes optimized
for your use case, and you need to make sure your queries are structured
correctly.
There are a few possible problems here, mainly that your schema and your
indexes are not optimal for the types of queries that you are doing.
First is the misconception that you must embed documents into a single
record to be using MongoDB "correctly" - this is *not* the case. You want
to embed things that are *always* fetched with the document, and those
things will tend to be limited. If something is a first class entity it
should be in its own collection.
Let me give you an example: users and orders (e-commerce). Do you want
to get back every single order when you fetch a user record? No, not
really - I've been shopping at Amazon for over 15 years, and I can't imagine
how slow it would be if my orders were all part of my document. This is an
antipattern in MongoDB - any time you have an array that's going to grow
indefinitely (unbounded) it's likely that this schema won't perform well.
You mention needing joins if you don't embed - consider embedding in the
other direction instead: many things you need to know about the user can
be embedded (denormalized) into the orders (or transactions, in your
case). But again, only the stuff you need *every* time you fetch a
transaction, use in queries, etc. For the rest you can do a second lookup
in the user collection.
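(As an illustration of that - a hypothetical transaction stored in its own
collection, with a little user data denormalized in; the field values here
are just a sketch, not from this thread:)
db.transactions.insert({
    userId: UUID("0123456789abcdef0123456789abcdef"),  // reference back to the user document
    userName: "Jane Doe",                               // denormalized from the user
    createdAt: ISODate("2015-01-01T10:00:00Z"),
    amount: 499.99,
    product: "mobile"
})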
Third, indexes and queries. Getting compound indexes right is very
important, and that means putting equality fields first and range fields
after them. One of the compound indexes you were trying to use, for
example, is exactly the opposite (it has the range field, date, first and
the equality field, product, second). A great article to read is
http://emptysqua.re/blog/optimizing-mongodb-compound-indexes/ - although
it was written about a much older version of MongoDB, the general
principles still apply.
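(For example - a hypothetical reordering of that index, equality field
first, paired with a query shaped like your slow query 1:)
db.user.ensureIndex({ "transaction.product": 1, date: 1 })
db.user.find({ transaction: { $elemMatch: { product: "tv" } },
    date: { $gte: ISODate("2014-12-01"), $lte: ISODate("2014-12-31") } })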
found out that there is problem with the range query in MongoDB, which
sometimes doesn't take into consideration one bound
This is not really so much that it doesn't take one bound into
consideration, but that in some cases it cannot, because in a compound
index there is no information about which field is an array - there is a
very extensive discussion of this in
https://jira.mongodb.org/browse/SERVER-15086, which is the feature request
to enhance the index spec or the query language to allow better use of
bounds. Basically, if it used both bounds, it could miss results that
match the query.
So when querying arrays, you don't need to use $elemMatch for a
single-field test, but you *must* use it if you want a single array
element to satisfy all the conditions. Unfortunately, $elemMatch only
matches array elements; it will *not* match regular (non-array) fields.
Compare:
db.user.find({"transaction.date":{$gt:ISODate("2014-01-01"),$lt:ISODate("2015-01-01")}})
and
db.user.find({"transaction":{$elemMatch:{"date":{$gt:ISODate("2014-01-01"),$lt:ISODate("2015-01-01")}}}})
The first one says you want all users who have some transaction with date
> 2014-01-01 in their transaction array and also some transaction with
date < 2015-01-01 in their array - and those can be two *different*
transactions. For example, a user whose only transactions are one in 2015
and one in 2013 would match, even though no single transaction falls
inside the range. That's unlikely to be what you want, so you use the
second one.
The second one says you want users with at least one single transaction
whose date is in the range 2014-01-01 to 2015-01-01. This is why both
bounds cannot be used by the query optimizer in your case - as far as the
optimizer can tell, you could be asking for the first of these, not the
second. If the query optimizer used both bounds (2014-01-01 and
2015-01-01) for this query, it would work correctly in the second case,
but it would *not* work correctly in the first case - in fact, it would
miss many documents that it should return. And that's the core limitation
in this case.
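(To make that concrete - a hypothetical document in a scratch collection,
not from this thread, that matches the first form of the query but not the
second:)
db.elemmatch_demo.insert({ name: "sample", transaction: [
    { date: ISODate("2013-06-01"), product: "tv" },
    { date: ISODate("2015-03-01"), product: "mobile" }
] })
// matches: the two date conditions are satisfied by different array elements
db.elemmatch_demo.find({ "transaction.date": { $gt: ISODate("2014-01-01"), $lt: ISODate("2015-01-01") } })
// does not match: no single element satisfies both conditions at once
db.elemmatch_demo.find({ transaction: { $elemMatch: { date: { $gt: ISODate("2014-01-01"), $lt: ISODate("2015-01-01") } } } })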
When the query optimizer doesn't know which field in a compound index is
an array, it cannot use both bounds for a field that doesn't have
$elemMatch, because it could then miss values that should be returned.
This is what happens with your compound index on "date" and product: date
is top-level (not an array) and product is inside the array, but the
optimizer doesn't know that date cannot be an array (by the way, what does
the top-level date mean anyway?).
It's also happening with product, firstTransaction: even though they are
both inside the array and you are using $elemMatch (properly) to indicate
that you want the same element to match all conditions, the query
optimizer doesn't know that the "transaction.firstTransaction" value is
not itself an array - if it did know that, you would see optimal bounds
used.
So, because the query optimizer doesn't know which of multiple fields can
be arrays (and technically more than one can be), it cannot use both
bounds on range queries the way you would like. I hope we will be able to
enhance the query optimizer and indexes in a way that improves this soon,
but it's very tricky because we have no restrictions on schema and no way
to know what type each field is.
Now, min() and max() are not really part of the query subsystem (they are
not query operators); usually, if you find yourself needing them, there is
something suboptimal or incorrect in your query. I think this is also why
you are getting an error with hint(): you are either specifying an index
that doesn't exist, or one that doesn't match the query predicate (so you
are either forcing a full scan of an index subset or you aren't pointing
at an existing index).
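(For example - a hypothetical rewrite of your query 4 as an ordinary range
predicate on date, here covering all of December, which then needs no
min()/max() and no hint at all:)
db.user.find({ date: { $gte: ISODate("2014-12-01"), $lt: ISODate("2015-01-01") } })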
Now, what is your solution? I can't really speak confidently on this,
because I don't understand what the top-level "date" means - what is it an
indicator of, and how is it used? Normally I would expect it to refer to
the user, so I'm surprised to see you using it in combination with
transaction criteria on the transaction array.
One possible solution: get rid of the suboptimal indexes, add optimal
indexes, and use correctly structured queries on the current schema (after
clarifying what the top-level "date" field represents). Unfortunately, it
seems that for the way you want to query some of these, the queries won't
be as performant as you expect.
Second possible solution and one that I would prefer - if your queries are
mostly against transactions, then they should be in their own collection
(with some user info denormalized into each if appropriate). It will then
be simple to create correct indexes on users and correct indexes on
transactions for all the queries/updates you will be doing. Since you
won't have transactions in an array you won't have the compound multi-key
index issues (at least not the one you're seeing), and as an extra bonus,
I'm guessing the queries will become more readable as well.
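(Continuing the hypothetical transactions-collection sketch from above,
the indexes and queries become straightforward - equality field first,
range field second, and no multikey bounds problem:)
db.transactions.ensureIndex({ product: 1, createdAt: 1 })
db.transactions.find({ product: "fridge",
    createdAt: { $gte: ISODate("2015-01-01"), $lt: ISODate("2015-01-03") } })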
There may be a third possibility, but again, without knowing all access
patterns, it's hard to say.
Asya
P.S. You should let us know what version of MongoDB you are using, as that
will tell us whether there are other bugs or features that need to be
taken into consideration.
db.user.find({transaction:{$elemMatch:{product:"mobile",
firstTransaction:{$gte:ISODate("2015-01-01"),
$lt:ISODate("2015-01-02")}}}}).hint("transaction.firstTransaction_1_transaction.product_1").explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 622,
"nscannedObjects" : 350931,
"nscanned" : 352000,
"nscannedObjectsAllPlans" : 350931,
"nscannedAllPlans" : 352000,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 119503,
"nChunkSkips" : 0,
"millis" : 375693,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"isMultiKey" : true,
"n" : 547,
"nscannedObjects" : 350984,
"nscanned" : 352028,
"nscannedObjectsAllPlans" : 350984,
"nscannedAllPlans" : 352028,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 132669,
"nChunkSkips" : 0,
"millis" : 891898,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
],
"transaction.firstTransaction" : [
[
true,
ISODate("2015-01-02T00:00:00Z")
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor
transaction.product_1_transaction.firstTransaction_1",
"n" : 1169,
"nChunkSkips" : 0,
"nYields" : 252172,
"nscanned" : 704028,
"nscannedAllPlans" : 704028,
"nscannedObjects" : 701915,
"nscannedObjectsAllPlans" : 701915,
"millisShardTotal" : 1267591,
"millisShardAvg" : 633795,
"numQueries" : 2,
"numShards" : 2,
"millis" : 891910
}
db.user.find({transaction:{$elemMatch:{product:'mobile'}}}).explain()
{
"clusteredType" : "ParallelSort",
"shards" : {
"test0/mrs00.test.com:27017,mrs01.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 553072,
"nscannedObjects" : 553072,
"nscanned" : 553072,
"nscannedObjectsAllPlans" : 553072,
"nscannedAllPlans" : 553072,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 164888,
"nChunkSkips" : 0,
"millis" : 337909,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-31:27017",
"filterSet" : false
}
],
"test1/mrs10.test.com:27017,mrs11.test.com:27017" : [
{
"cursor" : "BtreeCursor
transaction.product_1",
"isMultiKey" : true,
"n" : 554176,
"nscannedObjects" : 554176,
"nscanned" : 554176,
"nscannedObjectsAllPlans" : 554176,
"nscannedAllPlans" : 554176,
"scanAndOrder" : false,
"indexOnly" : false,
"nYields" : 107496,
"nChunkSkips" : 0,
"millis" : 327928,
"indexBounds" : {
"transaction.product" : [
[
"mobile",
"mobile"
]
]
},
"server" : "ip-12-0-0-34:27017",
"filterSet" : false
}
]
},
"cursor" : "BtreeCursor transaction.product_1",
"n" : 1107248,
"nChunkSkips" : 0,
"nYields" : 272384,
"nscanned" : 1107248,
"nscannedAllPlans" : 1107248,
"nscannedObjects" : 1107248,
"nscannedObjectsAllPlans" : 1107248,
"millisShardTotal" : 665837,
"millisShardAvg" : 332918,
"numQueries" : 2,
"numShards" : 2,
"millis" : 337952
}
I am not an expert but "explain()" output would be helpful to look into
this.