j***@findmeon.com
2015-10-10 16:28:28 UTC
We're giving MongoDB a detailed look as the 'manager' for our archived
data. We operate a web spider and need to keep page snapshots for a while
(for potential re-indexing under new rulesets). The access pattern is:
write once, never update, rarely read, highly likely to delete. In testing
different compression strategies, the best results came from bucketing
many documents together.
Naturally, WiredTiger caught my eye. The block_compressor seems to do
exactly what we need. Great!
Running some tests on our sample data, it's not working as well as I'd
hoped: I'm only seeing about 50% compression.
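For reference, I'm estimating that ratio by comparing the uncompressed
BSON size against the on-disk size from collstats. A minimal sketch in
Python with pymongo (the database and collection names are just my test
setup):

    from pymongo import MongoClient

    db = MongoClient().archive  # assumes a local mongod running WiredTiger

    stats = db.command("collstats", "snapshots")
    # "size" = uncompressed BSON bytes, "storageSize" = bytes on disk
    ratio = stats["size"] / stats["storageSize"]
    print("compression ratio: %.2fx" % ratio)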
I think this might be caused by two things:
1. the zlib compression level
2. the size of the blocks that WiredTiger manages
Does anyone know what these are set to, and whether they're controllable?
I found various information online saying WiredTiger uses zlib level 6
(but with no source cited), and that the WiredTiger block size was
configurable -- but that referred to a direct WiredTiger install,
pre-MongoDB. For what it's worth, a sketch of what I've been trying is
below.
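In MongoDB 3.0+ you can pass per-collection WiredTiger options through
createCollection; the configString is handed to WT_SESSION::create as-is.
I don't believe the zlib level is exposed this way, but the leaf page size
is, and bigger pages should give zlib more context per block. The 128KB
value here is just an experiment, not a recommendation:

    from pymongo import MongoClient

    db = MongoClient().archive

    # Per-collection storage engine options; configString goes straight
    # through to WiredTiger's table-create config.
    db.create_collection(
        "snapshots",
        storageEngine={
            "wiredTiger": {
                "configString": "block_compressor=zlib,leaf_page_max=128KB"
            }
        },
    )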
Right now, it looks like we'd get better compression on our dataset by
doing document-level compression ourselves at zlib level 9, along the
lines of the sketch below.
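By document-level compression I mean compressing each snapshot body in the
client before insert. A rough sketch (field names and the pymongo setup
are illustrative, not our actual schema):

    import zlib

    from bson.binary import Binary
    from pymongo import MongoClient

    coll = MongoClient().archive.snapshots

    def save_snapshot(url, html_bytes):
        # Store the page body pre-compressed at zlib level 9 (maximum).
        coll.insert_one({
            "url": url,
            "body": Binary(zlib.compress(html_bytes, 9)),
        })

    def load_snapshot(url):
        doc = coll.find_one({"url": url})
        return zlib.decompress(doc["body"])

The trade-off is that the server can no longer query inside the body
field, which is fine for our write-once/rarely-read pattern.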