Discussion:
[mongodb-user] where is the in-memory storage engine?
stephanos
2015-03-06 12:26:51 UTC
Permalink
Hey guys,

I'd love to try the new experimental in-memory storage engine. But how do I
use it?
In the docs for 'storage.engine' it just says: "Valid options include
mmapv1 and wiredTiger".

Stephan
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/3c6745fc-6ab7-4bc6-b2a1-e8e75568384c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
s.molinari
2015-03-06 18:58:42 UTC
Permalink
It hasn't been released yet.

Scott
stephanos
2015-03-06 19:01:45 UTC
Permalink
Oh, what a shame.
I read more than 10 articles that mention the 'new, experimental in-memory
storage engine' in 3.0. One would think this would be communicated more clearly.

Stephan
Tim Callaghan
2015-03-06 21:51:35 UTC
Permalink
Welcome to release announcements!
s.molinari
2015-03-06 21:58:25 UTC
Permalink
If you want, you could follow this Jira.

https://jira.mongodb.org/browse/SERVER-1153

Scott
Stephen Steneker
2015-03-06 22:17:53 UTC
Permalink
Post by stephanos
I'd love to try the new experimental in-memory storage engine. But how do
I use it?
In the docs for 'storage.engine' it just says: "Valid options include
mmapv1 and wiredTiger".
Hi Stephan,

You can try this in MongoDB 3.0 with storage.engine "inMemoryExperiment".

Note that this is currently in-memory only (there is no option to persist
to disk) and there may be rough edges (thus the "experimental" tag).

Regards,
Stephen
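
For anyone wanting to try the option mentioned above, here is a minimal sketch of how the command line might be assembled from Python. `--storageEngine`, `--dbpath` and `--port` are regular mongod options, and the engine name is the one quoted in this post. The helper name, the port and the path are illustrative assumptions, and the data directory is passed explicitly on the assumption that mongod still expects one to exist.

```python
import shlex


def mongod_command(port, dbpath, engine="inMemoryExperiment"):
    """Build (but do not run) a mongod command line for the given storage engine.

    The function name and defaults are hypothetical; only the flags themselves
    are mongod options.
    """
    return [
        "mongod",
        "--storageEngine", engine,
        "--dbpath", dbpath,
        "--port", str(port),
    ]


cmd = mongod_command(27018, "/tmp/mongo-inmem")
# Print a shell-safe version of the command for copy and paste.
print(" ".join(shlex.quote(part) for part in cmd))
```

The resulting list can be handed to `subprocess.Popen` once the data directory exists.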
Asya Kamsky
2015-03-07 17:28:42 UTC
Permalink
Hi Stephan:

Stennie already mentioned the "secret" experimental option to try an
in-memory engine (which is *not* production quality, which is why it's
not mentioned in the docs).

We *are* working on a production quality in-memory storage engine
option, and to make sure that it fulfills the needs of the users, I'd
love to know what sort of use case you have in mind for it - what
features were you planning on testing, etc.

Feel free to reply to me privately if you prefer ( asya at mongodb).

Asya
stephanos
2015-03-07 17:35:17 UTC
Permalink
@Stephen
Thanks for the 'secret' storage engine option :)

@Asya
My use case is actually very simple: easily run multiple mongo processes in
parallel during my integration/functional tests.

Stephan
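
A rough sketch of how the test setup described above might be planned: one mongod command line per parallel instance, each with its own port and throwaway data directory. Nothing is started here. The helper names and the `mongo-test-` prefix are assumptions for illustration, and the engine name is the experimental one mentioned earlier in the thread.

```python
import socket
import tempfile


def free_ports(count):
    """Return `count` distinct TCP ports that were unused at the time of the call.

    All sockets are held open until every port is chosen, so the ports cannot
    repeat. Another process could still grab one before mongod starts.
    """
    socks = []
    try:
        for _ in range(count):
            s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            s.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
            socks.append(s)
        return [s.getsockname()[1] for s in socks]
    finally:
        for s in socks:
            s.close()


def plan_instances(count, engine="inMemoryExperiment"):
    """Return one mongod command line per test instance (nothing is launched)."""
    plans = []
    for port in free_ports(count):
        dbpath = tempfile.mkdtemp(prefix="mongo-test-")  # hypothetical prefix
        plans.append([
            "mongod",
            "--storageEngine", engine,
            "--dbpath", dbpath,
            "--port", str(port),
        ])
    return plans


plans = plan_instances(3)
for plan in plans:
    print(" ".join(plan))
```

Each entry could be passed to `subprocess.Popen` in a test fixture and terminated, with its directory removed, during teardown.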
Asya Kamsky
2015-03-07 18:30:18 UTC
Permalink
Ah, so it's the same reason our own driver developers wanted it :)

You can use the experimental engine then since it's not a production
system - please do report if you experience any crashes or errors.

Asya
Bryan Conklin
2015-03-13 19:44:45 UTC
Permalink
Hi guys, I wanted to throw my use case out there to see if what is coming
might apply.

We're currently storing extremely verbose log data (about 18 billion logs
per day, averaging ~650 bytes each, so more than 10TB per day) so that our
customers can access the logs that are specific to them and query it for
the subset of data they're interested in using any of the usual mongo query
methods, including regex matching. Basically we want our customers to be
able to access these logs for any use case for server access logs you can
think of. They query our API, we query the database and return the
information so they can use it for whatever business needs they may have.
What we are thinking about doing is storing the hot data (most recent hour
or so) of these logs in a distributed, in-memory database for quicker
access to reads and aggregation operations.
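
As a quick check of the volume figures above, and of what "the most recent hour" would mean for memory sizing, the arithmetic can be worked through as below. It uses only the two numbers given in the post and decimal units. Indexes and per-document storage overhead are not included, so real memory needs would likely be higher.

```python
LOGS_PER_DAY = 18_000_000_000   # "about 18 billion logs per day"
AVG_BYTES_PER_LOG = 650         # "~650 bytes" on average

bytes_per_day = LOGS_PER_DAY * AVG_BYTES_PER_LOG
tb_per_day = bytes_per_day / 10**12          # 11.7 TB, consistent with "more than 10TB"

bytes_per_hour = bytes_per_day / 24
gb_per_hour = bytes_per_hour / 10**9         # 487.5 GB of raw log data per hour

print(f"{tb_per_day:.1f} TB/day, {gb_per_hour:.1f} GB/hour")
```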

A second use case we have in mind is using an in-memory database for
creating pre-aggregations of the data where we store counters and perform
upsert/increment operations on these counters as the logs mentioned above
flow through the pipeline. The reason we need an in-memory database for
this is mostly because our data set is so large that attempting to do it
all with upsert/increments with our current setup is not fast enough. The
idea here is the counters would be grouped by certain fields including
timestamp ranges and once a reasonable amount of time has passed, these
aggregations would flush to a longer term, cold storage where they would
live indefinitely, only really being altered if some issue made a log come
through really late. So this use case is more about trying to speed up
write operations while the first use case was more about trying to speed up
reads.
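
The pre-aggregation idea can be sketched as follows: each log is mapped to a counter document keyed by a few fields plus a time bucket, and the counter is bumped with an upsert. The field names (`customer`, `status`, `bytes`, `ts`), the one-minute bucket and the helper name are all hypothetical. Only the `$inc` update operator and the upsert behaviour are standard MongoDB.

```python
from datetime import datetime, timezone


def counter_update(log, bucket_seconds=60):
    """Return (filter, update) documents for an upsert that bumps one counter.

    The filter identifies the counter (grouping fields plus the start of the
    time bucket); the update increments it.
    """
    ts = int(log["ts"].timestamp())
    bucket_start = ts - ts % bucket_seconds  # round down to the bucket boundary
    filter_doc = {
        "customer": log["customer"],
        "status": log["status"],
        "bucket": datetime.fromtimestamp(bucket_start, tz=timezone.utc),
    }
    update_doc = {"$inc": {"count": 1, "bytes": log["bytes"]}}
    return filter_doc, update_doc


log = {
    "customer": "acme",
    "status": 200,
    "bytes": 650,
    "ts": datetime(2015, 3, 13, 19, 44, 45, tzinfo=timezone.utc),
}
flt, upd = counter_update(log)
print(flt, upd)
```

With PyMongo, the pair could be applied as `collection.update_one(flt, upd, upsert=True)`. A late log simply increments the older bucket, which fits the "only altered if a log comes through really late" case.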

Aside from being distributed, we would also need the in-memory solution to
be fairly reliable. You mentioned that it doesn't persist yet (not that I
plan on using it in production before it is ready), but once it does I'd be
interested to see real world benchmarks as far as impacts on speed when
persisting and not. I'd also want to know more about how it handles
persistence and how reliable it is, whether I would need to be able to
replay from a durable source if there were a failure in order to not lose
data, if replica sets would be interchangeable with and as effective as
persisting, etc...

Also, it would be nice if there were some compression options as well ;)

I'm really looking forward to the in-memory options and looking forward to
hearing more about them.
Asya Kamsky
2015-03-13 21:04:04 UTC
Permalink
Have you tested WiredTiger for your use case? It's got better write
throughput than MMAP for many use cases, so it might help significantly
(plus it's got compression so your TBs of data may compress to much
smaller files).

As far as an in-memory database goes, you asked about the ability to persist
data - what about using replication? The idea here is that if your
server fails, then things fail over to the newly elected primary which
has all the data in memory already...

For pure in-memory, it should really be data that you can always
regenerate if you need to, otherwise you need either disk or replica
node persistence, right?

Asya
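
For reference, a minimal sketch of a mongod command line that selects WiredTiger with block compression, in the spirit of the suggestion above. `--storageEngine` and `--wiredTigerCollectionBlockCompressor` are mongod 3.0 options. The helper name, path and port are illustrative assumptions, and the list of accepted compressors reflects what I understand 3.0 to support.

```python
def wiredtiger_command(dbpath, compressor="zlib", port=27017):
    """Build (but do not run) a mongod command line using WiredTiger compression."""
    allowed = ("none", "snappy", "zlib")  # assumed set of values for MongoDB 3.0
    if compressor not in allowed:
        raise ValueError(f"compressor must be one of {allowed}")
    return [
        "mongod",
        "--storageEngine", "wiredTiger",
        "--wiredTigerCollectionBlockCompressor", compressor,
        "--dbpath", dbpath,
        "--port", str(port),
    ]


wt_cmd = wiredtiger_command("/data/wt-test")
print(" ".join(wt_cmd))
```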
Bryan Conklin
2015-03-13 22:30:36 UTC
Permalink
We are actually currently using TokuMX for storage, which you may be
familiar with. We decided to use it about a year ago for its compression
and document-level locks. But we're still experiencing some lag when it
comes to reads that have to traverse a lot of data and our high volume of
writes that require update operations.

So what you're saying about persisting is that it really is just a
different approach from replica sets and doesn't offer any benefits aside
from fewer servers required?

I also thought of one other question after I made the last post. You may
have noticed my approach requires a separation between hot data in memory and
cold data on disk. Are there any plans to smoothly make the transition from
one to the other within mongo? I was working on the assumption that we'd
have to write a custom process to read from one and write to the other, but
it would be awesome if at some point when data satisfied a requirement for
being switched from hot to cold, we could issue a command or a trigger
would fire that would just migrate it within mongo from the in-memory
storage engine to an on-disk storage engine.
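
The "custom process" mentioned above might look roughly like the sketch below. It is not a MongoDB feature: plain Python lists stand in for the hot and cold collections, and the `ts` field, the one-hour threshold and the function names are assumptions. The point it illustrates is the ordering, where expired documents are written to cold storage before they are dropped from hot storage.

```python
from datetime import datetime, timedelta, timezone


def split_hot_cold(docs, now, max_age=timedelta(hours=1)):
    """Partition docs into (still hot, expired) by comparing 'ts' to a cutoff."""
    cutoff = now - max_age
    hot = [d for d in docs if d["ts"] >= cutoff]
    expired = [d for d in docs if d["ts"] < cutoff]
    return hot, expired


def migrate(hot_store, cold_store, now, max_age=timedelta(hours=1)):
    """Move expired docs from hot_store to cold_store; return how many moved."""
    keep, expired = split_hot_cold(hot_store, now, max_age)
    cold_store.extend(expired)  # write to cold storage first...
    hot_store[:] = keep         # ...and only then remove from hot storage
    return len(expired)


now = datetime(2015, 3, 13, 22, 30, tzinfo=timezone.utc)
hot_store = [
    {"_id": 1, "ts": now - timedelta(minutes=10)},
    {"_id": 2, "ts": now - timedelta(hours=2)},
    {"_id": 3, "ts": now - timedelta(hours=5)},
]
cold_store = []
moved = migrate(hot_store, cold_store, now)
print(moved, [d["_id"] for d in hot_store], [d["_id"] for d in cold_store])
```

Against real collections the same two steps would be an insert into the on-disk collection followed by a delete from the in-memory one, and a failure between the steps would leave duplicates rather than lost data.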
Post by Asya Kamsky
Have you tested WiredTiger for your use case? Its got better write
throughput for many use cases over MMAP so it might help significantly
(plus it's got compression so your TBs of data may compress to much
smaller files).
As far as in-memory database goes, you asked about ability to persist
data - what about using replication? The idea here is that if your
server fails, then things fail over to the newly elected primary which
has all the data in memory already...
For pure in-memory, it should really be data that you can always
regenerate if you need to, otherwise you need either disk or replica
node persistence, right?
Asya
Post by Bryan Conklin
Hi guys, I wanted to throw my use case out there to see if what is
coming
Post by Bryan Conklin
might apply.
We're currently storing extremely verbose log data (about 18billion logs
per
Post by Bryan Conklin
day, each avg ~650 bytes, so more than 10TB per day) so that our
customers
Post by Bryan Conklin
can access the logs that are specific to them and query it for the
subset of
Post by Bryan Conklin
data they're interested in using any of the usual mongo query methods,
including regex matching. Basically we want our customers to be able to
access these logs for any use case for server access logs you can think of.
They query our API, we query the database and return the information so they
can use it for whatever business needs they may have. What we are thinking
about doing is storing the hot data (most recent hour or so) of these logs
in a distributed, in-memory database for quicker access to reads and
aggregation operations.
A second use case we have in mind is using an in-memory database for
creating pre-aggregations of the data where we store counters and perform
upsert/increment operations on these counters as the logs mentioned above
flow through the pipeline. The reason we need an in-memory database for this
is mostly because our data set is so large that attempting to do it all with
upsert/increments with our current setup is not fast enough. The idea here
is the counters would be grouped by certain fields including timestamp
ranges and once a reasonable amount of time has passed, these aggregations
would flush to a longer term, cold storage where they would live
indefinitely, only really being altered if some issue made a log come
through really late. So this use case is more about trying to speed up write
operations while the first use case was more about trying to speed up reads.
Aside from being distributed, we would also need the in-memory solution to
be fairly reliable. You mentioned that it doesn't persist yet (not that I
plan on using it in production before it is ready), but once it does I'd be
interested to see real world benchmarks as far as impacts on speed when
persisting and not. I'd also want to know more about how it handles
persistence and how reliable it is, whether I would need to be able to
replay from a durable source if there were a failure in order to not lose
data, if replica sets would be interchangeable with and as effective as
persisting, etc...
Also, it would be nice if there were some compression options as well ;)
I'm really looking forward to the in-memory options and looking forward to
hearing more about them.
Post by Asya Kamsky
Ah, so it's the same reason our own driver developers wanted it :)
You can use the experimental engine then since it's not a production
system - please do report if you experience any crashes or errors.
Asya
Post by stephanos
@Stephen
Thanks for the 'secret' storage engine option :)
@Asya
My use case is actually very simple: easily run multiple mongo processes in
parallel during my integration/functional tests.
Stephan
Post by Asya Kamsky
Stennie already mentioned the "secret" experimental option to try an
in-memory engine (which is *not* production quality, which is why it's
not mentioned in the docs).
We *are* working on a production quality in-memory storage engine
option, and to make sure that it fulfills the needs of the users, I'd
love to know what sort of use case you have in mind for it - what
features were you planning on testing, etc.
Feel free to reply to me privately if you prefer ( asya at mongodb).
Asya
Post by stephanos
Hey guys,
I'd love to try the new experimental in-memory storage engine. But how do I
use it?
In the docs for 'storage.engine' it just says: "Valid options include
mmapv1 and wiredTiger".
Stephan
s.molinari
2015-03-14 08:19:10 UTC
Permalink
I think Asya's point is more about the question "Do you need to make sure
the data is actually stored and safe (as in persisted)?" and it sounds like
you do. From what I understand, the in-memory database being discussed here
is more like using Mongo as a caching system: if the server dies, your
data dies with it. If you go with a fully in-memory system (like Redis or
Aerospike), you'd need, if anything, a whole lot more hardware, because
you'd need to fit your 10TB of data (or 4-6TB with compression) all in RAM,
whereas with MongoDB you could have, say, a 1:10 ratio between the
working set in RAM and the data stored on disk. So you'd only need 1/10th the
RAM/hardware. (These numbers are all just guesses, but you get the point.)
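
To put rough numbers on that comparison, here is a back-of-the-envelope sketch in Python. The 18 billion logs per day at ~650 bytes each come from Bryan's post earlier in the thread, and the 1:10 working-set ratio is just the guess above, not a measured value. The retention windows and the function names are assumptions made purely for illustration.

```python
# Back-of-the-envelope sizing; every input is a rough figure from this thread.
TB = 10**12  # decimal terabyte, in bytes

LOGS_PER_DAY = 18_000_000_000   # ~18 billion logs per day (from the thread)
AVG_LOG_BYTES = 650             # ~650 bytes per log on average (from the thread)
WORKING_SET_RATIO = 1 / 10      # guessed RAM:disk ratio, not a benchmark result


def raw_bytes_per_day() -> int:
    """Uncompressed log volume written per day, in bytes."""
    return LOGS_PER_DAY * AVG_LOG_BYTES


def ram_fully_in_memory(days_retained: float) -> float:
    """RAM in TB if every retained byte has to live in memory."""
    return raw_bytes_per_day() * days_retained / TB


def ram_working_set(days_retained: float) -> float:
    """RAM in TB if only the guessed working-set fraction must fit in memory."""
    return ram_fully_in_memory(days_retained) * WORKING_SET_RATIO


if __name__ == "__main__":
    print(f"raw data per day:         {raw_bytes_per_day() / TB:.2f} TB")
    print(f"one day, all in RAM:      {ram_fully_in_memory(1):.2f} TB")
    print(f"one day, 1:10 working set: {ram_working_set(1):.2f} TB")
    print(f"hot hour only, all in RAM: {ram_fully_in_memory(1 / 24):.2f} TB")
```

With those inputs the raw volume works out to about 11.7 TB per day, so a 1:10 working set would be roughly 1.17 TB of RAM per day of retained data, and keeping only the most recent hour fully in memory would take a bit under 0.5 TB before compression and indexes.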

I'd venture to say, if you are having issues with Toku, then you'd probably
have the same issues with MongoDB with WT too. So the issue isn't really
the database per se, but rather your use of it. I would imagine you have a
sharded system for this kind of activity? If not, that is probably the next
step.

Scott
Asya Kamsky
2015-03-14 20:35:29 UTC
Permalink
The goal for different storage engines is similar to what you can
achieve now with sharding, tagging and having different capacity
hardware for each shard.

So for example, if you can figure out how to shard your collection so
that the shard key can reflect whether data is hot or not (and
remember that shard key values are immutable) then you could use
tag-aware sharding and have the data automatically migrated to cold
shards from hot shards when the tags are updated appropriately.
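
As a rough illustration of that idea, the sketch below models tag ranges over a timestamp shard key in plain Python. It does not talk to MongoDB at all: the `TagRange` type, the `build_ranges` and `tag_for` helpers, the "hot"/"cold" tag names, and the one-hour window are all hypothetical. In a real cluster the ranges would be maintained with the sharding tag helpers (for example `sh.addShardTag` and `sh.addTagRange` in the mongo shell) and the balancer would do the actual migration.

```python
from datetime import datetime, timedelta
from typing import List, NamedTuple


class TagRange(NamedTuple):
    lower: datetime  # inclusive lower bound on the shard key
    upper: datetime  # exclusive upper bound on the shard key
    tag: str         # tag of the shards that should own this range


def build_ranges(now: datetime, hot_window: timedelta) -> List[TagRange]:
    """Recompute the tag ranges so that only the last `hot_window` is hot."""
    boundary = now - hot_window
    return [
        TagRange(datetime.min, boundary, "cold"),
        TagRange(boundary, datetime.max, "hot"),
    ]


def tag_for(shard_key: datetime, ranges: List[TagRange]) -> str:
    """Return the tag whose range contains the given shard key value."""
    for r in ranges:
        if r.lower <= shard_key < r.upper:
            return r.tag
    raise ValueError("shard key is not covered by any tag range")


if __name__ == "__main__":
    # The document's shard key value is written once and never changes.
    log_ts = datetime(2015, 3, 14, 12, 0)
    window = timedelta(hours=1)

    # Shortly after the write, the key falls inside the hot range.
    print(tag_for(log_ts, build_ranges(datetime(2015, 3, 14, 12, 30), window)))
    # Later, the ranges have moved on and the same key now falls in the cold range.
    print(tag_for(log_ts, build_ranges(datetime(2015, 3, 14, 14, 0), window)))
```

The point of the sketch is that the document itself is never updated: its shard key stays the same, and it ends up on a cold shard only because the tag ranges were moved.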

However, I would still encourage you to test out WiredTiger. TokuMX
is very heavily optimized for write throughput so it's possible that
you are seeing slower reads than you need because of that. WiredTiger
may have a different performance profile - maybe writes are still fast
enough for your use case and reads get faster (so you won't need a
caching layer).

So in the future, when there are many storage engines, you should be
able to have a hot shard which is in-memory (with durability provided
only via replica nodes, and maybe one DR secondary that's
write-optimized) and cold shards which are some other storage engine,
maybe with larger, cheaper disks, etc.

All this requires is storage engines which are interoperable (i.e.
different nodes in the same replica set and/or shards can run
different storage engines). So any MongoDB storage engine qualifies
(including TokuMxSe) but not TokuMX since it uses different
replication and migration mechanisms.

Asya
Post by Bryan Conklin
We are actually currently using TokuMX for storage, which you may be
familiar with but we decided to use it about a year ago for its compression
and document level locks. But we're still experiencing some lag when it
comes to reads that have to traverse a lot of data and our high volume of
writes that require update operations.
So what you're saying about persisting is that it really is just a different
approach from replica sets and doesn't offer any benefits aside from fewer
servers required?
I also thought of one other question after I made the last post. You may
have noticed my approach requires a separation from hot data in-memory and
cold data on disk. Are there any plans to smoothly make the transition from
one to the other within mongo? I was working on the assumption that we'd
have to write a custom process to read from one and write to the other, but
it would be awesome if at some point when data satisfied a requirement for
being switched from hot to cold, we could issue a command or a trigger would
fire that would just migrate it within mongo from the in-memory storage
engine to an on-disk storage engine.
Post by Asya Kamsky
Have you tested WiredTiger for your use case? It's got better write
throughput for many use cases over MMAP so it might help significantly
(plus it's got compression so your TBs of data may compress to much
smaller files).
As far as in-memory database goes, you asked about ability to persist
data - what about using replication? The idea here is that if your
server fails, then things fail over to the newly elected primary which
has all the data in memory already...
For pure in-memory, it should really be data that you can always
regenerate if you need to, otherwise you need either disk or replica
node persistence, right?
Asya
Post by Bryan Conklin
Hi guys, I wanted to throw my use case out there to see if what is coming
might apply.
We're currently storing extremely verbose log data (about 18 billion logs per
day, each avg ~650 bytes, so more than 10TB per day) so that our customers
can access the logs that are specific to them and query it for the subset of
data they're interested in using any of the usual mongo query methods,
including regex matching. Basically we want our customers to be able to
access these logs for any use case for server access logs you can think of.
They query our API, we query the database and return the information so they
can use it for whatever business needs they may have. What we are thinking
about doing is storing the hot data (most recent hour or so) of these logs
in a distributed, in-memory database for quicker access to reads and
aggregation operations.
A second use case we have in mind is using an in-memory database for
creating pre-aggregations of the data where we store counters and perform
upsert/increment operations on these counters as the logs mentioned above
flow through the pipeline. The reason we need an in-memory database for this
is mostly because our data set is so large that attempting to do it all with
upsert/increments with our current setup is not fast enough. The idea here
is the counters would be grouped by certain fields including timestamp
ranges and once a reasonable amount of time has passed, these aggregations
would flush to a longer term, cold storage where they would live
indefinitely, only really being altered if some issue made a log come
through really late. So this use case is more about trying to speed up write
operations while the first use case was more about trying to speed up reads.
Aside from being distributed, we would also need the in-memory solution to
be fairly reliable. You mentioned that it doesn't persist yet (not that I
plan on using it in production before it is ready), but once it does I'd be
interested to see real world benchmarks as far as impacts on speed when
persisting and not. I'd also want to know more about how it handles
persistence and how reliable it is, whether I would need to be able to
replay from a durable source if there were a failure in order to not lose
data, if replica sets would be interchangeable with and as effective as
persisting, etc...
Also, it would be nice if there were some compression options as well ;)
I'm really looking forward to the in-memory options and looking forward to
hearing more about them.
Post by Asya Kamsky
Ah, so it's the same reason our own driver developers wanted it :)
You can use the experimental engine then since it's not a production
system - please do report if you experience any crashes or errors.
Asya
Post by stephanos
@Stephen
Thanks for the 'secret' storage engine option :)
@Asya
My use case is actually very simple: easily run multiple mongo processes in
parallel during my integration/functional tests.
Stephan
Post by Asya Kamsky
Stennie already mentioned the "secret" experimental option to try an
in-memory engine (which is *not* production quality, which is why it's
not mentioned in the docs).
We *are* working on a production quality in-memory storage engine
option, and to make sure that it fulfills the needs of the users, I'd
love to know what sort of use case you have in mind for it - what
features were you planning on testing, etc.
Feel free to reply to me privately if you prefer ( asya at mongodb).
Asya
Post by stephanos
Hey guys,
I'd love to try the new experimental in-memory storage engine. But how do I
use it?
In the docs for 'storage.engine' it just says: "Valid options include
mmapv1 and wiredTiger".
Stephan