Discussion:
Huge difference between size and storageSize in db.collection.stats()
Abhinav Dhasmana
2012-10-31 04:57:37 UTC
Hi

I have a collection with one object per user, and I do multiple inserts and
deletes within each object. I moved to new hardware 7 days ago and the
db was complaining about space. When I ran db.collection.stats(), I saw this:

{
"ns" : "****.newsfeeds",
"count" : 8792053,
"size" : 327911847456,
"avgObjSize" : 37296.391122301015,
"storageSize" : 927976117552,
"numExtents" : 453,
"nindexes" : 2,
"lastExtentSize" : 2146426864,
"paddingFactor" : 1.001000006715619,
"systemFlags" : 1,
"userFlags" : 0,
"totalIndexSize" : 674650816,
"indexSizes" : {
"_id_" : 303525824,
"user_id_1" : 371124992
},
"ok" : 1
}

Looking at this, I cannot understand why there is such a huge difference
between size and storageSize.

"size" : 327911847456,
"storageSize" : 927976117552,

Thanks
Abhinav
--
You received this message because you are subscribed to the Google
Groups "mongodb-user" group.
To post to this group, send email to mongodb-user-/***@public.gmane.org
To unsubscribe from this group, send email to
mongodb-user+unsubscribe-/***@public.gmane.org
See also the IRC channel -- freenode.net#mongodb
Andre de Frere
2012-11-01 02:15:37 UTC
Hi Abhinav,

MongoDB does not release the space in empty extents when it is no longer
using them. It keeps a list of empty extents so that it can reuse them later
on. This means that when data is deleted from MongoDB, the size of the data
files on disk does not actually shrink.

The output for "size" shows the data size for the collection, while
"storageSize" includes all the empty extents.
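
As a rough illustration of that difference, the two figures from the stats() output earlier in this thread can be compared directly. This is only a minimal sketch in plain JavaScript (Node.js); the helper name allocationOverhead is made up for the example and is not part of MongoDB.

```javascript
// Values copied from the db.collection.stats() output posted above.
const stats = {
  size: 327911847456,        // bytes occupied by the documents
  storageSize: 927976117552  // bytes allocated in extents, including empty ones
};

// Hypothetical helper: how much allocated space is not holding documents.
function allocationOverhead(s) {
  return {
    unusedBytes: s.storageSize - s.size,
    ratio: s.storageSize / s.size
  };
}

const overhead = allocationOverhead(stats);
console.log("allocated but unused bytes:", overhead.unusedBytes);
console.log("storageSize / size:", overhead.ratio.toFixed(2));
```

For the numbers in this thread, that works out to roughly 600 GB allocated but not holding documents, or about 2.83 times the data size.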

The only way to remove the empty extents is to repair the database. This
is a very resource intensive operation, and can use as much as twice the
storage space while the repair is taking place. This is because the repair
will copy all of the database records to a new file before removing the old
file. Be aware that the repair is a blocking operation.
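
Before attempting a repair, it may help to estimate whether the disk has enough headroom. The sketch below is plain JavaScript (Node.js) under stated assumptions: the repair rewrites the live data and indexes into new files before the old ones are deleted, and a 2 GiB safety margin is added on top. The helper names and the margin are assumptions for illustration, not a MongoDB API, and the figures cover one collection only, whereas a repair works on a whole database.

```javascript
// Assumed safety margin on top of the rewritten data and indexes.
const TWO_GIB = 2 * 1024 * 1024 * 1024;

// Hypothetical helper: free bytes a repair might need for this collection.
function repairSpaceNeeded(s) {
  return s.size + s.totalIndexSize + TWO_GIB;
}

// Hypothetical helper: does the given amount of free space cover it?
function canRepair(s, freeBytes) {
  return freeBytes >= repairSpaceNeeded(s);
}

// Figures from the stats() output at the top of the thread.
const collStats = { size: 327911847456, totalIndexSize: 674650816 };

console.log("bytes needed:", repairSpaceNeeded(collStats));
console.log("fits in 250 GB free?", canRepair(collStats, 250e9));
console.log("fits in 400 GB free?", canRepair(collStats, 400e9));
```

Treat the result as a lower bound; in the worst case the repair can need as much as twice the current storage space.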

Regards,
André
Abhinav Dhasmana
2012-11-01 11:20:50 UTC
Thanks Andre for the detailed explanation.

I understand. So does that mean that I cannot use MongoDB for my scenario?
It is not possible to run a database repair every week.

Another approach I found is to pre-populate the data:
https://groups.google.com/forum/?fromgroups=#!topic/mongodb-user/jvLlZdQs0og

Abhinav
Andre de Frere
2012-11-05 23:46:43 UTC
Hi Abhinav,

Reusing the empty extents should not prevent you from using MongoDB in your
scenario. These empty extents should be reused by new data.

Preallocating your data would be one way of reducing how much each
extent grows. You might also want to test options like "smallfiles", but be
aware that this may have performance implications depending on your use
case. You can find more information on smallfiles (and other startup
options) in the documentation:
http://docs.mongodb.org/manual/reference/configuration-options/#smallfiles

Regards,
André
d***@public.gmane.org
2013-08-26 14:20:49 UTC
I have a similar problem.
My data is 27 GB while the disk usage is 100+ GB. I ran db.repairDatabase()
and expected that the disk file size would be about 300 GB, but even after
the repair operation the storage size is 80 GB.
I can't understand why 27 GB of data occupies 80 GB on disk (even with
pre-allocation).
My use case is similar: 10% deletes, inserts and updates, and 90% reads.
Does the storage size also include the index size?
Tx
Dror
Asya Kamsky
2013-08-29 08:20:35 UTC
Please provide the output of db.stats() for all databases in that mongod. Yes,
data and indexes both use disk space.
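
To make the point about indexes concrete: in the collection-level stats() output, storageSize covers document storage and totalIndexSize is reported separately, so the two need to be added to get a collection's footprint. The following is a minimal sketch in plain JavaScript (Node.js) using the figures from the first post in this thread; the helper names are made up for illustration.

```javascript
// Figures from the collection stats() output at the top of the thread.
const newsfeeds = {
  storageSize: 927976117552,  // bytes allocated for documents
  totalIndexSize: 674650816,  // bytes used by all indexes combined
  indexSizes: { _id_: 303525824, user_id_1: 371124992 }
};

// Hypothetical helper: add up the per-index sizes.
function sumIndexSizes(s) {
  return Object.values(s.indexSizes).reduce((total, bytes) => total + bytes, 0);
}

// Hypothetical helper: document storage plus index storage.
function collectionFootprint(s) {
  return s.storageSize + s.totalIndexSize;
}

console.log("sum of indexSizes:", sumIndexSizes(newsfeeds));
console.log("documents + indexes:", collectionFootprint(newsfeeds));
```

Here the per-index sizes add up to the reported totalIndexSize, and the indexes account for well under 1 GB of the total, so they are not what explains the gap in that collection.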