Discussion:
Migrating test data from excel to MongoDB
(too old to reply)
Samuel Chow
2018-12-04 18:07:30 UTC
Permalink
I am currently migrating our current test data that is stored in excel
files to MongoDB.

I was planning on using a document database to store the data in but
realized I'm not entirely sure how to store the data for it to be
efficiently read.

The idea is that you have unique keys that help you find the data you are
looking for quicker.

The data columns in the excel are:

- Date
- Model of device
- Version of software
- Type of test
- Bunch of test data from the test run.

Would a nested document be used for data could be read quickly.

i.e. [Type of test] -> [Date] -> [Model of device] -> [Version of software]
-> [Bunch of test data]

Am I even using the right database type?
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/6dae057e-178d-4b5f-85a2-f87ddc1098ea%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Robert Cochran
2018-12-05 16:27:51 UTC
Permalink
Hi!

Your question can be answered best if you take a free MongoDB University
class. Any of the classes in the M101 series would be very helpful to you,
and will expand your knowledge of MongoDB. I highly recommend you take one
of these classes.

Your question also made me wonder about MongoDB's accuracy in performing
imports of comma separated values (csv) files. I wondered because Excel has
an option for exporting workbooks to csv format, but as far as I know, not
to json format.

When you export an excel workbook to a csv file, you would then import your
file and insert it into a MongoDB database and collection using the
mongoimport utility. Unfortunately this utility is very difficult to use.
The tool has a lot of problems.

In MongoDB version 3.6.9, I created a test csv file that looks like this:

2018-12-05,device1,3.3.0495,load records,"[passed, qc passed]"

And used the following mongoimport command:

mongoimport --db chow --collection devtests6 --type csv --columnsHaveTypes
--fields
"testdate.date(2006-01-02),devname.string(),swv.string(),testtype.string(),results.string()"
--file chow4.csv
db.devtests6.find({}).pretty()
{

"_id" : ObjectId("5c07f2463826f29dcbe196e2"),

"testdate" : ISODate("2018-12-05T00:00:00Z"),

"devname" : "device1",

"swv" : "3.3.0495",

"testtype" : "load records",

"results" : "[passed, qc passed]"

}


And what I've learned through trial and error with using mongoimport is
that it is unable to parse and create an ISODate() value with the time
portion of the field specified. Imports fail on an error if you try to
import a date value as 2018-12-05T053441Z, for example. Also, mongoimport
cannot specify array values properly.

I used the db.collection.insert({}) command to create a new collection,
db.devtests7.insert( { "testdate" : ISODate(), "devname" : "device2",
"swv" : "3.6.9431", "testtype" : "dump records", "results" : [ "passed", "qc
failed" ] } )
db.devtests7.find({}).pretty()
{

"_id" : ObjectId("5c07f30e8da2a37bda2d5582"),

"testdate" : ISODate("2018-12-05T15:47:26.606Z"),

"devname" : "device2",

"swv" : "3.6.9431",

"testtype" : "dump records",

"results" : [

"passed",

"qc failed"

]

}


...and it is pretty clear that db.collection.insert() does a much better
job of parsing input data. It probably helps that I used json formatting as
well.

It is certainly possible to check whether Excel can export a workbook to
json format. One could also programmatically take a csv file as input and
reformat it to json, and then insert the json-ified documents to MongoDB.
It is some work, but I think using db.collection.insert({}) with json-ified
documents is much more accurate than csv imports with mongoimport.

This link explains the details of mongoimport. <>
db.devtests7.find({}).pretty() { "_id" :
ObjectId("5c07f30e8da2a37bda2d5582"), "testdate" :
ISODate("2018-12-05T15:47:26.606Z"), "devname" : "device2", "swv" :
"3.6.9431", "testtype" : "dump records", "results" : [ "passed", "qc
failed" ] }>

I do all my work with MongoDB on a Ubuntu 18.04.1 LTS machine, rather than
on Windows 10. I'm unclear on whether mongoimport works better on Windows
10 vs. Linux.

By the way, when you post a question about MongoDB, always list your exact
MongoDB server and shell versions, and post one or two sample documents so
that others can help you more efficiently.

In conclusion, I highly recommend you take an M101-type class from MongoDB
University.

Thanks so much

Bob
I am currently migrating our current test data that is stored in excel
files to MongoDB.
I was planning on using a document database to store the data in but
realized I'm not entirely sure how to store the data for it to be
efficiently read.
The idea is that you have unique keys that help you find the data you are
looking for quicker.
- Date
- Model of device
- Version of software
- Type of test
- Bunch of test data from the test run.
Would a nested document be used for data could be read quickly.
i.e. [Type of test] -> [Date] -> [Model of device] -> [Version of
software] -> [Bunch of test data]
Am I even using the right database type?
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/ae8d5a9e-6122-415a-95d8-7deb6a79f373%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
'Kevin Adistambha' via mongodb-user
2018-12-06 01:18:00 UTC
Permalink
Hi Bob, Samuel,

mongoimport can actually parse a date time value, but by default you would
need to specify the format of the date using the Go language time
formatting scheme <https://golang.org/src/time/format.go>. The post Go by
example: Time formatting <https://gobyexample.com/time-formatting-parsing>
shows an example parsing. You can specify the datetime format using
Microsoft SQL server format or Oracle’s TO_DATE format as well, but I’ll
provide an example using the default Go-based values.

Here’s an example csv file I have, note the Date.date(2006...) field. This
tells mongoimport that the column is called Date, and it contains a date
with the specified format:

$ cat dates.csv
Name.string(),Date.date(2006-01-02T15:04:05)
Abc,2018-01-01T01:02:03
Def,2018-01-02T13:14:15

importing this using mongoimport:

$ mongoimport -d test -c test --type=csv --headerline --columnsHaveTypes dates.csv
db.test.find()
{
"_id": ObjectId("5c0874dbc78fbe08a093e4c6"),
"Name": "Abc",
"Date": ISODate("2018-01-01T01:02:03Z")
}
{
"_id": ObjectId("5c0874dbc78fbe08a093e4c7"),
"Name": "Def",
"Date": ISODate("2018-01-02T13:14:15Z")
}

As you have pointed out, arrays are not really supported since the csv
format don’t really have a facility to embed an array due to its tabular
nature. Having said that, TOOLS-1954
<https://jira.mongodb.org/browse/TOOLS-1954> is a ticket for exactly this
request.

Best regards,
Kevin
​
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/932e461b-315d-4358-9a78-a2ec0912f6d5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Robert Cochran
2018-12-06 01:45:45 UTC
Permalink
Hi Kevin!

Based on your post, I prepared a csv file with this content:

testdate.date(2006-01-02T15:05:05),devname.string(),swv.string(),testtype.
string(),results.string()

2018-12-05T16:45:59,device1,34.43.05495,load records,"[passed, qc passed]"

...and I ran mongoimport like so:

$ mongoimport -d chow -c devtests8 --type csv --headerline --columnsHaveTypes
chow7.csv
db.devtests8.find({}).pretty()
{

"_id" : ObjectId("5c087ca93826f29dcbe19781"),

"testdate" : ISODate("2018-12-05T16:00:59Z"),

"devname" : "device1",

"swv" : "34.43.05495",

"testtype" : "load records",

"results" : "[passed, qc passed]"

}


Look at the time result in the document. Notice how the time is incorrectly
parsed?

I'll get on to JIRA and upvote the TOOLS-1954 ticket.

Thanks so much

Bob
Hi Bob, Samuel,
mongoimport can actually parse a date time value, but by default you
would need to specify the format of the date using the Go language time
formatting scheme <https://golang.org/src/time/format.go>. The post Go by
example: Time formatting <https://gobyexample.com/time-formatting-parsing>
shows an example parsing. You can specify the datetime format using
Microsoft SQL server format or Oracle’s TO_DATE format as well, but I’ll
provide an example using the default Go-based values.
Here’s an example csv file I have, note the Date.date(2006...) field.
This tells mongoimport that the column is called Date, and it contains a
$ cat dates.csv
Name.string(),Date.date(2006-01-02T15:04:05)
Abc,2018-01-01T01:02:03
Def,2018-01-02T13:14:15
$ mongoimport -d test -c test --type=csv --headerline --columnsHaveTypes dates.csv
db.test.find()
{
"_id": ObjectId("5c0874dbc78fbe08a093e4c6"),
"Name": "Abc",
"Date": ISODate("2018-01-01T01:02:03Z")
}
{
"_id": ObjectId("5c0874dbc78fbe08a093e4c7"),
"Name": "Def",
"Date": ISODate("2018-01-02T13:14:15Z")
}
As you have pointed out, arrays are not really supported since the csv
format don’t really have a facility to embed an array due to its tabular
nature. Having said that, TOOLS-1954
<https://jira.mongodb.org/browse/TOOLS-1954> is a ticket for exactly this
request.
Best regards,
Kevin
​
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/27b03450-4e43-48e1-b137-6cc356d5d372%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Robert Cochran
2018-12-06 02:01:15 UTC
Permalink
Hi Kevin,

I apologize. I see from looking at the links you provided that the time
pattern has to be exactly like this:

testdate.date(2006-01-02T15:04:05)...

...I corrected the error in my previous pattern and re-imported the csv
file and that fixed the time parsing.

Thanks so much

Bob
Post by Robert Cochran
Hi Kevin!
testdate.date(2006-01-02T15:05:05),devname.string(),swv.string(),testtype.
string(),results.string()
2018-12-05T16:45:59,device1,34.43.05495,load records,"[passed, qc passed]"
$ mongoimport -d chow -c devtests8 --type csv --headerline --columnsHaveTypes
chow7.csv
db.devtests8.find({}).pretty()
{
"_id" : ObjectId("5c087ca93826f29dcbe19781"),
"testdate" : ISODate("2018-12-05T16:00:59Z"),
"devname" : "device1",
"swv" : "34.43.05495",
"testtype" : "load records",
"results" : "[passed, qc passed]"
}
Look at the time result in the document. Notice how the time is
incorrectly parsed?
I'll get on to JIRA and upvote the TOOLS-1954 ticket.
Thanks so much
Bob
Hi Bob, Samuel,
mongoimport can actually parse a date time value, but by default you
would need to specify the format of the date using the Go language time
formatting scheme <https://golang.org/src/time/format.go>. The post Go
by example: Time formatting
<https://gobyexample.com/time-formatting-parsing> shows an example
parsing. You can specify the datetime format using Microsoft SQL server
format or Oracle’s TO_DATE format as well, but I’ll provide an example
using the default Go-based values.
Here’s an example csv file I have, note the Date.date(2006...) field.
This tells mongoimport that the column is called Date, and it contains a
$ cat dates.csv
Name.string(),Date.date(2006-01-02T15:04:05)
Abc,2018-01-01T01:02:03
Def,2018-01-02T13:14:15
$ mongoimport -d test -c test --type=csv --headerline --columnsHaveTypes dates.csv
db.test.find()
{
"_id": ObjectId("5c0874dbc78fbe08a093e4c6"),
"Name": "Abc",
"Date": ISODate("2018-01-01T01:02:03Z")
}
{
"_id": ObjectId("5c0874dbc78fbe08a093e4c7"),
"Name": "Def",
"Date": ISODate("2018-01-02T13:14:15Z")
}
As you have pointed out, arrays are not really supported since the csv
format don’t really have a facility to embed an array due to its tabular
nature. Having said that, TOOLS-1954
<https://jira.mongodb.org/browse/TOOLS-1954> is a ticket for exactly
this request.
Best regards,
Kevin
​
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/6a37c4cb-e79c-4888-a47e-d804c7a82937%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Samuel Chow
2018-12-07 21:36:48 UTC
Permalink
Thanks guys, this was very helpful! Will make sure to look into the MongoDB
University class.
Post by Samuel Chow
I am currently migrating our current test data that is stored in excel
files to MongoDB.
I was planning on using a document database to store the data in but
realized I'm not entirely sure how to store the data for it to be
efficiently read.
The idea is that you have unique keys that help you find the data you are
looking for quicker.
- Date
- Model of device
- Version of software
- Type of test
- Bunch of test data from the test run.
Would a nested document be used for data could be read quickly.
i.e. [Type of test] -> [Date] -> [Model of device] -> [Version of
software] -> [Bunch of test data]
Am I even using the right database type?
--
You received this message because you are subscribed to the Google Groups "mongodb-user"
group.

For other MongoDB technical support options, see: https://docs.mongodb.com/manual/support/
---
You received this message because you are subscribed to the Google Groups "mongodb-user" group.
To unsubscribe from this group and stop receiving emails from it, send an email to mongodb-user+***@googlegroups.com.
To post to this group, send email to mongodb-***@googlegroups.com.
Visit this group at https://groups.google.com/group/mongodb-user.
To view this discussion on the web visit https://groups.google.com/d/msgid/mongodb-user/74ff2c45-3a99-42dd-b2f8-5e409a37a20a%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
Continue reading on narkive:
Loading...