MongoDB performing poorly with a better configuration

Question!

We have two MongoDB servers, one for testing (QA) and one for production. Each has a collection named images with ~700M documents.

{
    _id
    MovieId
    ...
}

We have indexes on _id and MovieId.

We run queries of the following form:

db.images.find({MovieId:1234})

QA Config:

256GB of RAM with RAID disk

Prod Config:

700GB of RAM with SSD mirror

mongod configuration (/etc/mongod.conf)

QA:

storage:
   dbPath: "/data/mongodb_data"
   journal:
     enabled: false
   engine: wiredTiger
   wiredTiger:
     engineConfig:
        cacheSizeGB: 256
setParameter:
   wiredTigerConcurrentReadTransactions: 256

Prod:

storage:
   dbPath: "/data/mongodb_data"
   directoryPerDB: true
   journal:
     enabled: false
   engine: wiredTiger
   wiredTiger:
     engineConfig:
        cacheSizeGB: 600
setParameter:
   wiredTigerConcurrentReadTransactions: 256

Given its better hardware and configuration, the prod server should perform better than the QA server. Surprisingly, it runs much slower.

I checked the current operations (using db.currentOp()) on both servers under the same load. Many queries on the prod server take 10-20 seconds, whereas no query on the QA server takes more than 1 second.

The queries are issued from MapReduce jobs.

I need help in identifying the problem.

[Edit]: MongoDB version 3.0.11



Answers

The limits on the number of open files and on max user processes differed between our staging and production servers. I checked them with the command ulimit -a.

Staging:

 open files                      (-n) 32768
 pipe size            (512 bytes, -p) 8
 POSIX message queues     (bytes, -q) 819200
 real-time priority              (-r) 0
 stack size              (kbytes, -s) 10240
 cpu time               (seconds, -t) unlimited
 max user processes              (-u) 32768

Prod:

 open files                      (-n) 16384
 pipe size            (512 bytes, -p) 8
 POSIX message queues     (bytes, -q) 819200
 real-time priority              (-r) 0
 stack size              (kbytes, -s) 10240
 cpu time               (seconds, -t) unlimited
 max user processes              (-u) 16384

After I changed the two settings on prod, it started giving better performance. Thanks @Gaurav for advising me on this.
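
One caveat when comparing limits this way: ulimit -a reports the limits of the shell it runs in, which can differ from those of a mongod started by an init script or service manager. On Linux, the limits applied to a running process can be read from /proc/<pid>/limits. The following is a minimal sketch under that assumption. The helper name soft_limit is made up for illustration, and the pgrep lookup simply picks the oldest process named mongod.

```shell
#!/bin/sh
# Hypothetical helper (not part of MongoDB): print the soft limit of one
# resource for a running process, read from /proc/<pid>/limits (Linux only).
# usage: soft_limit <pid> "<row name, e.g. Max open files>"
soft_limit() {
    awk -v name="$2" '
        index($0, name) == 1 {
            # Drop the row name; the first remaining field is the soft limit.
            rest = substr($0, length(name) + 1)
            split(rest, fields, " ")
            print fields[1]
        }' "/proc/$1/limits"
}

# Assumption: the server process is named mongod; -o picks the oldest match.
pid=$(pgrep -o -x mongod 2>/dev/null || true)
if [ -n "$pid" ]; then
    echo "open files:    $(soft_limit "$pid" 'Max open files')"
    echo "max processes: $(soft_limit "$pid" 'Max processes')"
fi
```

Running this on both servers gives the values that mongod is actually subject to. Depending on the kernel, reading another user's /proc entry may require root.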



You can debug your MongoDB queries in several ways.

Start by checking index usage with the command below (note that $indexStats requires MongoDB 3.2 or later, so it is not available on 3.0.11):
db.images.aggregate( [ { $indexStats: { } } ] )

If this doesn't give you any useful information, enable the profiler and inspect the operations it records for the slow queries:
db.setProfilingLevel(2)
db.system.profile.find().pretty()

db.system.profile will give you a complete profile of your queries.



C++11 solution:

#include <type_traits>

template<typename From, typename To>
To map(From e) {
    return static_cast<To>(
        static_cast<typename std::underlying_type<To>::type>(
        static_cast<typename std::underlying_type<From>::type>(e)));
}

This casting cascade is very explicit and supports enum classes.

For older C++ versions and for enums without class, static_cast<Enum2>(e) would suffice.

Edit:

With template specialization, you can use map without specifying any types explicitly:

enum class Enum1: int {A, B, C, D};
enum class Enum2: char {A1, B1, C1, D1};

template<typename T>
struct target_enum {
};

template<>
struct target_enum<Enum1> {
    typedef Enum2 type;
};

template<typename From>
typename target_enum<From>::type map(From e) {
    typedef typename target_enum<From>::type To;
    return static_cast<To>(
        static_cast<typename std::underlying_type<To>::type>(
        static_cast<typename std::underlying_type<From>::type>(e)));
}

You can then call map(Enum1::A). This also works with plain, non-class enums.

By : flyx

