Discussion:
Regular expression limitations in MongoDB
Cory Spencer
2012-04-30 04:37:25 UTC
Hi all -

We've been developing a project using MongoDB as a database and I've
run into an issue that I can't figure out. Some of our queries build
a large regular expression (we're attempting to match a list of
biological gene symbols against documents stored in mongo). However,
when the regular expression gets to be too large, MongoDB returns no
results, even though the expression *should* match documents in the
database.
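
For readers following along, here is a rough sketch of how such a pattern might be assembled from a list of symbols. The helper names (escapeRegex, buildSymbolPattern), the sample symbols, and the choice to anchor the pattern with ^ and $ are all assumptions for illustration; the thread does not show the actual query-building code.

```javascript
// Hypothetical helper: escape regex metacharacters so each symbol matches literally.
function escapeRegex(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Hypothetical helper: join the escaped symbols into one alternation.
// Anchoring with ^ and $ is an assumption (whole-field match), not from the thread.
function buildSymbolPattern(symbols) {
  return "^(?:" + symbols.map(escapeRegex).join("|") + ")$";
}

// Example data only; these symbols are not taken from the original post.
var symbols = ["BRCA1", "TP53", "HLA-DRB1"];
var pattern = buildSymbolPattern(symbols);

// In the mongo shell the pattern could then be passed to $regex, e.g.:
//   db.example.find({ field : { $regex : pattern } })
```

The pattern string grows with every symbol added, which is how a long symbol list can end up past the length at which the query stops returning results.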

Here's an example that should work in the mongo interactive shell:

First I'll insert a document into the database consisting of 50,000
"x" characters:

> String.prototype.repeat = function(n) { return new Array(n + 1).join( this ) }
function (n) {
    return (new Array(n + 1)).join(this);
}
> db.example.insert({ field : "x".repeat(50000) })
Searching for it using a regular expression consisting of less than or
equal to 32764 characters returns the document:

> db.example.find({ field : { $regex : "x".repeat(32764) }})
"...." }

However, once you exceed the 32764 character length, the regular
expression returns no results:

> db.example.find({ field : { $regex : "x".repeat(32765) }})
>

Why is this, and is there a workaround?

Thank you!

Cory
Eliot Horowitz
2012-04-30 04:52:05 UTC
That looks like the default PCRE max size.
Can you open a ticket at jira.mongodb.org to see if it's possible to
increase the size?
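
Until the limit changes, one possible workaround is to split the alternatives into several shorter patterns and combine them with $or. The sketch below is untested against a server. It assumes the limit applies to each $regex pattern separately and uses the 32764-character figure observed in this thread as the cutoff. The helper names (chunkPatterns, buildChunkedQuery) are made up for illustration, and the alternatives are assumed to be already escaped.

```javascript
// Assumed cutoff, taken from the observation in this thread that 32764 characters worked.
var MAX_PATTERN_LENGTH = 32764;

// Hypothetical helper: group alternatives into "a|b|c" patterns of at most maxLen characters.
// A single alternative longer than maxLen is still emitted on its own, unchanged.
function chunkPatterns(alternatives, maxLen) {
  var patterns = [];
  var current = [];
  var length = 0; // length of current.join("|")
  alternatives.forEach(function (alt) {
    var added = (current.length === 0 ? 0 : 1) + alt.length; // +1 for the "|" separator
    if (current.length > 0 && length + added > maxLen) {
      patterns.push(current.join("|"));
      current = [];
      length = 0;
      added = alt.length;
    }
    current.push(alt);
    length += added;
  });
  if (current.length > 0) {
    patterns.push(current.join("|"));
  }
  return patterns;
}

// Hypothetical helper: build a query document with one $regex clause per chunk.
// Note: an empty list of alternatives yields an empty $or array, which is not a valid query.
function buildChunkedQuery(fieldName, alternatives, maxLen) {
  var clauses = chunkPatterns(alternatives, maxLen).map(function (p) {
    var clause = {};
    clause[fieldName] = { $regex: p };
    return clause;
  });
  return { $or: clauses };
}

// Usage in the mongo shell might look like:
//   db.example.find(buildChunkedQuery("field", escapedSymbols, MAX_PATTERN_LENGTH))
```

If the symbols only need to match whole field values exactly, an $in query over the plain strings may avoid the regular expression altogether.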
--
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
For more options, visit this group at http://groups.google.com/group/mongodb-user?hl=en.