MongoDB Performance Based on Document Size

By : Ty.
Source: Stackoverflow.com
Question!

I've been playing around with the samus MongoDB driver, particularly its benchmark tests. From the output, it appears that the size of the documents can have a drastic effect on how long operations on those collections take.

Is there some documentation available that recommends what balance to strive for or some more "real" numbers around what document size will do to query times? Is this poor performance more a result of the driver and any serialization overhead? Has anyone else noticed this?


Answers

I cannot find a link right now, but the format of the database is such that it should not matter whether a document is large or small. For access via an index, there is certainly no difference. For a table scan, uninteresting documents (or uninteresting parts of documents) can be skipped quickly thanks to the BSON format, in which every document starts with its own length. If anything, the overhead of the BSON format affects tiny documents more than large ones.
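To make the skipping argument concrete, here is a minimal Node.js sketch of hopping over BSON documents by their length prefixes. It relies only on the BSON layout (a little-endian int32 total size first, a 0x00 terminator last); the helper names encodeInt32Doc and documentOffsets are made up for illustration, and this is not a claim about how the server lays out records on disk.

```javascript
// Build a minimal BSON document { <name>: <int32 value> } by hand.
// encodeInt32Doc is a hypothetical helper, not part of any driver.
function encodeInt32Doc(name, value) {
  const nameBytes = Buffer.from(name + "\0", "utf8"); // element name as a cstring
  // length prefix + type byte + name + int32 payload + terminator
  const size = 4 + 1 + nameBytes.length + 4 + 1;
  const buf = Buffer.alloc(size);            // zero-filled
  let offset = buf.writeInt32LE(size, 0);    // total document size
  offset = buf.writeUInt8(0x10, offset);     // element type 0x10 = int32
  offset += nameBytes.copy(buf, offset);     // element name
  offset = buf.writeInt32LE(value, offset);  // element value
  buf.writeUInt8(0x00, offset);              // document terminator
  return buf;
}

// Return the start offset of every document in a buffer of back-to-back
// BSON documents, reading nothing but the 4-byte length prefixes.
function documentOffsets(buf) {
  const offsets = [];
  let pos = 0;
  while (pos + 4 <= buf.length) {
    const size = buf.readInt32LE(pos);
    if (size < 5 || pos + size > buf.length) {
      throw new Error("corrupt length prefix at offset " + pos);
    }
    offsets.push(pos);
    pos += size; // skip the whole document without decoding its fields
  }
  return offsets;
}

const stream = Buffer.concat([
  encodeInt32Doc("a", 1),
  encodeInt32Doc("counter", 2),
  encodeInt32Doc("b", 3),
]);

console.log(documentOffsets(stream));        // start offsets of the three documents
console.log(encodeInt32Doc("a", 1).length);  // bytes used to store a 4-byte integer
```

The document { a: 1 } takes 12 bytes for 4 bytes of payload, which is the kind of fixed overhead that weighs more on tiny documents than on large ones.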

So I would assume that the performance drop you see is largely due to the serialization costs of loading those documents (of course it takes more time to write a large document to disk than a small document, but it should be about the same for multiple small documents of the same aggregate size).

In your benchmark, can you normalize the numbers to be based on the same amount of data (in bytes, not in document count)?
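As a rough sketch of that normalization, assuming you have the document count, average document size, and elapsed time for each run: the function name normalizeByBytes, the field names, and the sample numbers below are all invented for illustration and are not taken from the benchmark in the question.

```javascript
// Convert per-run timings into time per document and time per MiB of data.
// All field names here are hypothetical; adapt them to your benchmark output.
function normalizeByBytes(results) {
  return results.map(({ label, docCount, avgDocBytes, elapsedMs }) => {
    const totalMiB = (docCount * avgDocBytes) / (1024 * 1024);
    return {
      label,
      msPerDoc: elapsedMs / docCount, // what a per-document benchmark reports
      msPerMiB: elapsedMs / totalMiB, // the same run, normalized by data volume
    };
  });
}

// Made-up sample figures, only to show the shape of the comparison.
const sample = [
  { label: "small", docCount: 4096, avgDocBytes: 256, elapsedMs: 100 },
  { label: "large", docCount: 4096, avgDocBytes: 262144, elapsedMs: 20000 },
];

for (const row of normalizeByBytes(sample)) {
  console.log(row.label, row.msPerDoc.toFixed(4), row.msPerMiB.toFixed(2));
}
```

With these invented numbers, the large documents look 200 times slower per document but are actually faster per MiB, which is why the two views can tell different stories.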

By : Thilo


You can turn on profiling with db.setProfilingLevel(2) and query db.system.profile for details on the executed queries.

Although this may distort the test results a little, it will give you insight into the query times on the server, eliminating any influence the driver or network may have on the results. If these query times show the same pattern as your test, then the document size does influence query times. If query times are roughly the same regardless of document size, then it's serialization overhead you're looking at.
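One possible way to do that comparison is to pull the profile entries into an array (for example with db.system.profile.find().toArray() in the shell) and average the server-side times per collection. The sketch below assumes each entry has an ns (namespace) field and a millis field; field names may differ between server versions, and the function name and collection names are invented for the example.

```javascript
// Average server-side execution time per namespace from profiler entries.
// Assumes entries shaped like { ns: "db.collection", millis: <number> }.
function averageServerMillis(entries) {
  const byNs = new Map();
  for (const { ns, millis } of entries) {
    const agg = byNs.get(ns) || { count: 0, totalMillis: 0 };
    agg.count += 1;
    agg.totalMillis += millis;
    byNs.set(ns, agg);
  }
  const averages = {};
  for (const [ns, { count, totalMillis }] of byNs) {
    averages[ns] = totalMillis / count;
  }
  return averages;
}

// Hypothetical entries standing in for db.system.profile.find().toArray().
const entries = [
  { ns: "bench.small", millis: 2 },
  { ns: "bench.small", millis: 4 },
  { ns: "bench.large", millis: 3 },
  { ns: "bench.large", millis: 5 },
];

console.log(averageServerMillis(entries));
```

If the averages for the small and large collections come out close while the client-side timings differ widely, that points at the driver and serialization rather than the server.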



But is it a good benchmark? I don't think so. Read http://stackoverflow.com/questions/2460063/2465039#2465039 .

I think the exception thrown when the index should have been created is still being swallowed. The FindOne() test on medium documents returns 363 both with and without the "creation" of the index.

By : TTT

