Hi guys, I wanted to throw my use case out there to see if what is coming
might apply.
We're currently storing extremely verbose log data (about 18 billion logs
per day, averaging ~650 bytes each, so more than 10 TB per day) so that our
customers can access the logs that are specific to them and query them for
the subset of data they're interested in, using any of the usual mongo query
methods, including regex matching. Basically, we want our customers to be
able to use these logs for any server access log use case you can think of.
They query our API, we query the database and return the information so
they can use it for whatever business needs they may have.
What we are thinking about doing is storing the hot data (most recent hour
or so) of these logs in a distributed, in-memory database for faster reads
and aggregation operations.
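For a rough sense of scale, here is a back-of-the-envelope sizing of that hot window in Python. It uses only the figures above (18 billion logs per day, ~650 bytes each); the one-hour window is an assumption, and it counts raw log bytes only, so indexes, replication, and storage overhead would all come on top.

```python
# Back-of-the-envelope sizing from the figures quoted in this post.
LOGS_PER_DAY = 18_000_000_000   # "about 18 billion logs per day"
AVG_LOG_BYTES = 650             # "each avg ~650 bytes"
HOT_WINDOW_HOURS = 1            # assumption: "most recent hour or so"

bytes_per_day = LOGS_PER_DAY * AVG_LOG_BYTES
tb_per_day = bytes_per_day / 1e12               # decimal terabytes
logs_per_second = LOGS_PER_DAY / 86_400         # average ingest rate
hot_window_gb = bytes_per_day * HOT_WINDOW_HOURS / 24 / 1e9

if __name__ == "__main__":
    print(f"daily volume : {tb_per_day:.1f} TB")
    print(f"ingest rate  : {logs_per_second:,.0f} logs/s")
    print(f"hot window   : {hot_window_gb:.1f} GB of raw log data")
```

So an hour of hot data is on the order of several hundred GB before any overhead, which is why it would have to be spread across several nodes.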
A second use case we have in mind is using an in-memory database for
creating pre-aggregations of the data where we store counters and perform
upsert/increment operations on these counters as the logs mentioned above
flow through the pipeline. The reason we need an in-memory database for
this is mostly because our data set is so large that attempting to do it
all with upsert/increments with our current setup is not fast enough. The
idea here is the counters would be grouped by certain fields including
timestamp ranges and once a reasonable amount of time has passed, these
aggregations would flush to a longer term, cold storage where they would
live indefinitely, only really being altered if some issue made a log come
through really late. So this use case is more about trying to speed up
write operations while the first use case was more about trying to speed up
reads.
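To make the counter idea concrete, here is a minimal sketch in plain Python, not our actual pipeline. The field names (`customer`, `status`, `ts`), the 60-second bucket, and the class and function names are all hypothetical. In MongoDB terms, each `record` call stands in for an upsert with `$inc`; here a dict plays the in-memory store.

```python
from collections import defaultdict

BUCKET_SECONDS = 60  # hypothetical bucket width for the timestamp ranges


class PreAggregator:
    """Keeps hot counters in memory, keyed by (customer, status, bucket)."""

    def __init__(self, bucket_seconds=BUCKET_SECONDS):
        self.bucket_seconds = bucket_seconds
        self.counters = defaultdict(int)

    def record(self, log):
        # Round the timestamp down to the start of its bucket, then
        # increment the matching counter (created on first use, like an upsert).
        bucket = log["ts"] - log["ts"] % self.bucket_seconds
        self.counters[(log["customer"], log["status"], bucket)] += 1

    def flush(self, now, max_age_seconds):
        # Remove and return every counter whose bucket ended at least
        # max_age_seconds before `now`; newer buckets stay hot.
        cutoff = now - max_age_seconds
        old_keys = [k for k in self.counters
                    if k[2] + self.bucket_seconds <= cutoff]
        return {k: self.counters.pop(k) for k in old_keys}


def merge_into(cold_store, flushed):
    # Add flushed counts to cold storage. Adding (rather than overwriting)
    # means a log that arrives late still ends up in the right total.
    for key, count in flushed.items():
        cold_store[key] = cold_store.get(key, 0) + count
```

A late log simply creates a fresh hot counter for an old bucket; the next flush moves it out, and `merge_into` adds it to the value already in cold storage.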
Aside from being distributed, we would also need the in-memory solution to
be fairly reliable. You mentioned that it doesn't persist yet (not that I
plan on using it in production before it is ready), but once it does I'd be
interested to see real-world benchmarks of the speed impact with and
without persistence. I'd also want to know more about how it handles
persistence and how reliable it is: whether I would need to replay from a
durable source after a failure in order to not lose data, whether replica
sets would be interchangeable with and as effective as persisting, etc.
Also, it would be nice if there were some compression options as well ;)
I'm really looking forward to the in-memory options and looking forward to
hearing more about them.
Post by Asya Kamsky
Ah, so it's the same reason our own driver developers wanted it :)
You can use the experimental engine then since it's not a production
system - please do report if you experience any crashes or errors.
Asya
Post by stephanos
@Stephen
Thanks for the 'secret' storage engine option :)
@Asya
My use case is actually very simple: easily run multiple mongo processes in
parallel during my integration/functional tests.
Stephan
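As a rough sketch of that parallel-test setup: each mongod needs its own port and data directory. `--port`, `--dbpath`, and `--storageEngine` are standard mongod options; the helper name, the base port, and the temp directory layout below are hypothetical. The experimental engine's name is not given in this thread, so the engine is left as a parameter that defaults to wiredTiger. The helper only builds the command lines and does not start anything.

```python
import tempfile
from pathlib import Path


def mongod_commands(count, base_port=27100, engine="wiredTiger", root=None):
    """Build one mongod command line per test instance (hypothetical helper)."""
    # Each instance gets its own subdirectory under a fresh temp directory.
    root = Path(root or tempfile.mkdtemp(prefix="mongod-test-"))
    commands = []
    for i in range(count):
        dbpath = root / f"node{i}"
        dbpath.mkdir(parents=True, exist_ok=True)
        commands.append([
            "mongod",
            "--port", str(base_port + i),     # distinct port per instance
            "--dbpath", str(dbpath),          # distinct data directory
            "--storageEngine", engine,        # engine name supplied by caller
        ])
    return commands
```

Each returned list can be handed to `subprocess.Popen` from the test harness, and the processes terminated when the tests finish.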
Post by Asya Kamsky
Stennie already mentioned the "secret" experimental option to try an
in-memory engine (which is *not* production quality, which is why it's
not mentioned in the docs).
We *are* working on a production quality in-memory storage engine
option, and to make sure that it fulfills the needs of the users, I'd
love to know what sort of use case you have in mind for it - what
features were you planning on testing, etc.
Feel free to reply to me privately if you prefer ( asya at mongodb).
Asya
Post by stephanos
Hey guys,
I'd love to try the new experimental in-memory storage engine. But how do I
use it?
In the docs for 'storage.engine' it just says: "Valid options include mmapv1
and wiredTiger".
Stephan
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.
For other MongoDB technical support options, see: http://www.mongodb.org/about/support/.
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at http://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/3c6745fc-6ab7-4bc6-b2a1-e8e75568384c%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
--
{ "name" : "Asya Kamsky",
"place" : [ "New York", "Palo Alto", "Everywhere else" ],
"blog" : "http://www.askasya.com/",