Discussion:
"Document exceeds allowed max BSON size." using GridFS.
Amanda
2016-07-13 14:02:54 UTC
Hello,

I am attempting to insert files of different types (.err, .txt, .log, .csv)
using GridFS. The files range in size from 5MB to 100MB, and I am
running a Ruby script to populate MongoDB. Files smaller than 16MB
are inserted correctly, but larger files are not inserted and
generate this error:

[2016-07-13T09:35:11.734995 #20802] DEBUG -- : MONGODB | localhost:27017 |
test2.insert | FAILED | Document exceeds allowed max BSON size. The max is
16777216. | 0.183450212s

I installed the Ruby Driver using the MongoDB documentation
website: https://docs.mongodb.com/ecosystem/tutorial/ruby-driver-tutorial/#ruby-driver-tutorial

To install Ruby Driver I ran:
gem install mongo


I have:


*Ruby version: 2.3.0*
*MongoDB shell version: 3.2.7 *
*Ubuntu 14.04.4 LTS (GNU/Linux 4.2.0-36-generic x86_64)*

Thank you for your help,
Amanda
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/f1c3d205-5bfe-4384-bfa6-ebf90ded2be8%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Emily Stolfo
2016-07-13 15:45:59 UTC
Hi Amanda

Would you mind showing me the lines of code using the Ruby driver leading
up to this error?
Thanks

Emily
Amanda
2016-07-13 18:18:33 UTC
Hi Emily,

The script that I am running to populate MongoDB inserts files as follows.

- First, I get an array of the contents of a folder and then call the
'populate' function:

    Dir.chdir "/path/to/folder/"
    # Recursively list every entry under the folder
    arr = Dir.glob("**/*")

    # Create a new instance of the populate class
    pop = PopulateMongo.new
    pop.populate(arr)



- Here I create the connection to MongoDB and create an instance of a
*GridfsLoader* class that lets me perform GridFS actions (I attached the
gridfs_loader.rb file). Then I check that each entry is a file, get the
pertinent metadata, and insert it. My code also has to handle inserting
tar files and their contents; that is the other half of the code.

    def populate(arr = nil)
      # Create a connection to MongoDB and use test2 database
      @grid = GridfsLoader.new
      @client = GridfsLoader.mongo_client("mongodb://localhost:27017")
      (arr || []).each do |f|
        path = f.to_s
        # Skip directories and anything else that is not a regular file
        next unless File.file?(path)

        # Every regular file, including a .tgz archive itself, goes into GridFS
        meta = getMetadata(path, false)
        @grid.import_grid_file(path, File.basename(path), File.extname(path), meta)

        # For .tgz archives, also insert the archive's contents
        next unless File.extname(path) == ".tgz"
        tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open(path))
        tar_extract.rewind
        populateTar(tar_extract, path)
        tar_extract.close
      end
    end


Also, here is what I included at the top of my script:
    require './gridfs_loader'
    require 'rubygems/package'
    require 'zlib'
    require 'mongo'


*** I attached the gridfs_loader.rb class in this post ***

Thank you for your help!
Amanda
Emily Stolfo
2016-07-14 10:20:23 UTC
Hi Amanda

I see from your script that you're using Grid::File objects and working with
file chunks directly, but I would recommend instead using the "stream" API
described here:
<https://docs.mongodb.com/ecosystem/tutorial/ruby-driver-tutorial/#working-with-write-streams>

That interface implements the common MongoDB driver GridFS API described
here:
<https://github.com/mongodb/specifications/blob/master/source/gridfs/gridfs-spec.rst>
If you stick with those methods, you'll have no problem inserting files
larger than 16MB.
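
For reference, a minimal sketch of a stream-style upload. The helper name `upload_file` is hypothetical, and the database name test2 in the comment is taken from the log line earlier in the thread. The helper only relies on the bucket responding to `upload_from_stream(filename, io, opts)`, so treat it as an illustration under those assumptions rather than a complete loader.

```ruby
# Minimal sketch: upload one file through the GridFS stream API.
# `bucket` is assumed to be a Mongo::Grid::FSBucket, obtained for example with:
#   require 'mongo'
#   client = Mongo::Client.new('mongodb://localhost:27017', database: 'test2')
#   bucket = client.database.fs
# `upload_file` is a hypothetical helper name, not part of the driver.
def upload_file(bucket, path, metadata = nil)
  opts = {}
  opts[:metadata] = metadata unless metadata.nil?

  # Binary mode keeps the bytes unchanged; the block form closes the file
  # even if the upload raises.
  File.open(path, 'rb') do |io|
    # The return value of upload_from_stream (the file id) is returned as-is.
    bucket.upload_from_stream(File.basename(path), io, opts)
  end
end
```

With a real bucket, `upload_file(bucket, "/path/to/folder/big.log", "source" => "import")` would be a typical call; the path and metadata here are placeholders.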

If you continue to get errors when using the stream API, please let me know.

Thanks

Emily
Amanda
2016-07-14 19:49:59 UTC
Hi Emily,

I am still having the same problem even when using the "stream" API.

Here is my new gridfs_loader.rb script: (My main code stayed the same as in
my last post of this conversation thread.)

    # GridFS_Loader.rb

    require 'rubygems'
    require 'mongo'
    Mongo::Logger.logger.level = ::Logger::DEBUG

    class GridfsLoader
      # Performs the detailed work to create a new MongoDB connection
      def self.create_connection(mongo_url=nil, db_name=nil)
        mongo_url ||= "mongodb://localhost:27017"
        db_name ||= "test2"
        STDERR.puts "creating connection #{mongo_url} #{db_name}"
        db_client = Mongo::Client.new(mongo_url)
        db_client.use(db_name)
      end

      # Creates and/or returns a MongoDB connection cached in the class
      def self.mongo_client(mongo_url=nil, db_name=nil)
        @@db_client ||= create_connection(mongo_url, db_name)
      end

      # Sets up the object instance with a MongoDB connection
      def initialize(mongo_url=nil, db_name=nil, db = nil)
        # Pass the caller's URL and database name through to the cached client
        @@db_client = self.class.mongo_client(mongo_url, db_name)
      end

      # Reads the contents of the file and inserts it into GridFS along
      # with some optional metadata. The :_id of the file is returned.
      def import_grid_file(file_path, name=nil, contentType=nil, metadata=nil)
        fs_bucket = @@db_client.database.fs()

        description = {}
        description[:filename] = name if !name.nil?
        description[:contentType] = contentType if !contentType.nil?
        description[:metadata] = metadata if !metadata.nil?

        # The block form closes the file even if the upload raises and
        # returns the value of upload_from_stream
        File.open(file_path, 'r') do |os_file|
          fs_bucket.upload_from_stream(name, os_file, description)
        end
      end
    end

I did start to notice that pretty much all of the failures occurred within
the *fs.chunks* collection.

Thanks,
Amanda
Emily Stolfo
2016-07-15 11:43:39 UTC
Hi Amanda

I was able to reproduce what you're seeing, but the error doesn't affect
the files being uploaded. The driver first tries to send all chunks of a
file to the server in one insert request, but when it finds that together
they exceed the maximum BSON size, it breaks them up into multiple insert
requests. The error message you're seeing is part of the internal
implementation of the Bulk Write API
<https://docs.mongodb.com/ecosystem/tutorial/ruby-driver-tutorial/#bulk-operations>,
which the GridFS upload API uses.
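
To illustrate the behaviour described above, here is a rough sketch of a try-then-split strategy in plain Ruby. It is not the driver's code: `send_batch`, `insert_chunks`, `chunk_sizes_for` and `BatchTooLarge` are hypothetical names, halving the batch is just one possible way to split it, and only payload bytes are counted (real chunk documents carry some extra BSON overhead). The 255 KiB chunk size is the GridFS default.

```ruby
MAX_BSON_SIZE = 16 * 1024 * 1024 # 16777216, the limit quoted in the log line
CHUNK_SIZE    = 255 * 1024       # GridFS default chunk size in bytes

# Hypothetical error standing in for the driver's "max BSON size" failure
class BatchTooLarge < StandardError; end

# Stand-in for one insert request: rejects a batch whose total payload
# exceeds the limit, otherwise records it as sent.
def send_batch(chunk_sizes, sent)
  raise BatchTooLarge if chunk_sizes.inject(0, :+) > MAX_BSON_SIZE
  sent << chunk_sizes
end

# Try the whole batch first; if it is rejected, halve it and retry each half.
def insert_chunks(chunk_sizes, sent = [])
  send_batch(chunk_sizes, sent)
  sent
rescue BatchTooLarge
  raise if chunk_sizes.size < 2 # a single oversized chunk cannot be split
  mid = chunk_sizes.size / 2
  insert_chunks(chunk_sizes[0...mid], sent)
  insert_chunks(chunk_sizes[mid..-1], sent)
  sent
end

# Payload size of every chunk for a file of the given length
def chunk_sizes_for(file_bytes)
  full, rest = file_bytes.divmod(CHUNK_SIZE)
  sizes = Array.new(full, CHUNK_SIZE)
  sizes << rest if rest > 0
  sizes
end

batches = insert_chunks(chunk_sizes_for(100 * 1024 * 1024))
puts "100MB file sent in #{batches.size} insert requests"
```

In this sketch every rejected attempt corresponds to one oversized request, which would fit with FAILED lines appearing in the debug log even though the upload itself completes.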

Have you found that the files aren't being successfully uploaded? In my
testing, this error message has no effect on the ability to upload files
larger than 16MB. The driver adapts and splits the inserts into requests
small enough to execute successfully. I understand that this error is
confusing, so we can adapt the Bulk Write implementation to not show it.

In the meantime, please let me know if you are finding that files aren't
being successfully uploaded.
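
One way to check a large batch without inspecting each file by hand is to compare the size on disk with the length stored for that filename. The sketch below is only an illustration: `unverified_uploads` is a hypothetical helper, and the lookup shown in the comment assumes the bucket's `find` returns the fs.files document, whose `length` field holds the stored size.

```ruby
# Returns the paths whose size on disk does not match the stored length.
# `stored_length` is any callable mapping a filename to the stored length
# in bytes, or nil when no such file is stored. With the Ruby driver it
# could be built like this (an assumption, adjust to your setup):
#   lookup = lambda do |name|
#     doc = bucket.find(filename: name).first
#     doc && doc['length']
#   end
def unverified_uploads(paths, stored_length)
  paths.reject do |path|
    stored_length.call(File.basename(path)) == File.size(path)
  end
end
```

An empty result means every listed file has a stored entry of matching length. Because the lookup is by filename only, files sharing a basename in different folders would need a more specific key.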

Thanks
Emily
Amanda
2016-07-15 13:24:12 UTC
Hi Emily,

I ran a test with a smaller amount of data, and the data is being loaded
successfully despite the error. Thank you so much for all of your help
identifying the problem!

I would never have noticed that this was what was happening. Because I was
working with a very large number of files and kept seeing the "Failed"
insert error, I was not able to check all of them.

Thanks again,
Amanda
Post by Emily Stolfo
Hi Amanda
I was able to reproduce what you're seeing but the error doesn't have an
effect on the files being uploaded. The driver first tries to send all
chunks of a file to the server in one insert request, but when it finds
that they together exceed the maximum BSON size, it breaks them up into
multiple insert requests. This error message you're seeing is part of the
internal implementation of the Bulk Write API
<https://docs.mongodb.com/ecosystem/tutorial/ruby-driver-tutorial/#bulk-operations>,
used by the GridFS upload API.
Have you found that the files aren't being successfully uploaded? In my
testing, this error message has no effect on the ability to upload files
larger than 16MB. The driver adapts and splits up the inserts into small
enough requests that are successfully executed so I understand that this
error is confusing. We can adapt the Bulk Write implementation to not show
this error.
In the meantime, please let me know if you are finding that files aren't
being successfully uploaded.
Thanks
Emily
Post by Amanda
Hi Emily,
I am still having the same problem even when using the "stream" API.
Here is my new gridfs_loader.rb script: (My main code stayed the same as
in my last post of this conversation thread.)
# GridFS_Loader.rb
require 'rubygems'
require 'mongo'
Mongo::Logger.logger.level = ::Logger::DEBUG
class GridfsLoader
  attr_reader :fs_bucket
  # performs the detailed work to create a new MongoDB connection
  def self.create_connection(mongo_url=nil, db_name=nil)
    mongo_url ||= "mongodb://localhost:27017"
    db_name ||= "test2"
    STDERR.puts "creating connection #{mongo_url} #{db_name}"
    db_client = Mongo::Client.new(mongo_url)
    db_client.use(db_name)
  end
  # creates and/or returns a MongoDB connection cached in the class
  def self.mongo_client(mongo_url=nil, db_name=nil)
    @@db_client ||= create_connection(mongo_url, db_name)
  end
  # sets up the object instance with a MongoDB connection and creates the
  # fs_bucket object
  def initialize(mongo_url=nil, db_name=nil, db = nil)
    # pass the arguments through instead of resetting them to nil
    @@db_client = self.class.mongo_client(mongo_url, db_name)
    @fs_bucket = @@db_client.database.fs
  end
  # reads the contents of the file and inserts into GridFS along
  # with some optional metadata. The :_id of the file is returned.
  def import_grid_file(file_path, name=nil, contentType=nil, metadata=nil)
    description = {}
    description[:filename] = name if !name.nil?
    description[:contentType] = contentType if !contentType.nil?
    description[:metadata] = metadata if !metadata.nil?
    # the block form closes the file and returns the id from the upload
    File.open(file_path, 'r') do |os_file|
      fs_bucket.upload_from_stream(name, os_file, description)
    end
  end
end
I did start to notice that pretty much all of the failures occurred
within the *fs.chunks* collection.
Thanks,
Amanda
Post by Emily Stolfo
Hi Amanda
I see from your script you're using Grid::File objects and working with
file chunks directly, but I would recommend instead using the "stream" API
described here
<https://docs.mongodb.com/ecosystem/tutorial/ruby-driver-tutorial/#working-with-write-streams>.
That interface implements the common MongoDB driver GridFS API described
here
<https://github.com/mongodb/specifications/blob/master/source/gridfs/gridfs-spec.rst>.
If you stick with those methods, you'll have no problem inserting files
larger than 16MB.
If you continue to get errors when using the stream API, please let me know.
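As a rough sketch of the stream approach: the helper below is hypothetical (upload_file is not part of the mongo gem) and assumes only that the bucket responds to upload_from_stream(filename, io, opts), as the driver's FSBucket does. The path and metadata in the commented usage are made-up placeholders.

```ruby
# Hypothetical helper, not part of the mongo gem: streams one file into a
# GridFS bucket and returns whatever id the bucket reports.
# `bucket` is assumed to respond to upload_from_stream(filename, io, opts).
def upload_file(bucket, path, metadata: nil)
  opts = {}
  opts[:metadata] = metadata unless metadata.nil?
  # The block form of File.open closes the handle even if the upload raises.
  File.open(path, 'rb') do |io|
    bucket.upload_from_stream(File.basename(path), io, opts)
  end
end

# With the real driver this might be wired up as follows (left commented out
# because it needs the mongo gem and a running server):
#
#   require 'mongo'
#   client = Mongo::Client.new('mongodb://localhost:27017', database: 'test2')
#   id = upload_file(client.database.fs, '/path/to/big.log',
#                    metadata: { source: 'example' })
```

Passing the open file handle lets the bucket read and chunk the data itself, so the calling code never has to build chunk documents by hand.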
Thanks
Emily
Post by Amanda
Hi Emily,
The script that I am running to populate MongoDB has the following parts:
- First, I get an array of the contents of a folder and then call the
populate method:
Dir.chdir "/path/to/folder/"
arr = Array.new
arr = Dir.glob("**/*")
# Create a new instance of the populate class
pop = PopulateMongo.new
pop.populate(arr)
- Here I create the connection to MongoDB and create an instance of a
*GridfsLoader class* that allows me to perform GridFS actions (I attached
the gridfs_loader.rb file). Then I check that each entry is a file, get the
pertinent metadata, and insert it. My code also has to handle inserting
tar files and their contents; that is the other half of the code.
*** I attached the gridfs_loader.rb class in this post ***
def populate(arr = nil)
  # Create a connection to MongoDB and use test2 database
  @grid = GridfsLoader.new
  @client = GridfsLoader.mongo_client("mongodb://localhost:27017")
  arr.each do |f|
    if File.file?(f.to_s) == true
      if File.extname(f.to_s) == ".tgz"
        meta = getMetadata(f.to_s, false)
        @grid.import_grid_file(f.to_s, File.basename(f.to_s),
                               File.extname(f.to_s), meta)
        tar_extract = Gem::Package::TarReader.new(Zlib::GzipReader.open(f))
        tar_extract.rewind
        populateTar(tar_extract, f.to_s)
        tar_extract.close
      else
        meta = getMetadata(f.to_s, false)
        @grid.import_grid_file(f.to_s, File.basename(f.to_s),
                               File.extname(f.to_s), meta)
      end
    end
  end
end
require './gridfs_loader'
require 'rubygems/package'
require 'zlib'
require 'mongo'
Thank you for your help!
Amanda