mongodb

How to set up and secure MongoDB on Ubuntu 16.04 and verify with Studio 3T

September 6, 2017 by Simon

Here is how you can install MongoDB 3.6.x on Ubuntu 16.04 and secure it. I use Studio 3T to verify the setup.

Before you install MongoDB, ensure you have set up and secured your server and installed an SSL certificate. Click the links here to set up a Digital Ocean, Vultr or AWS server. You can also read my old guide to installing MongoDB on Ubuntu 14.04 and using Studio 3T ( https://studio3t.com/ ).

Install MongoDB 3.6

Official guide here

Import the MongoDB public GPG key from the keyserver

sudo apt-key adv --keyserver hkp://keyserver.ubuntu.com:80 --recv 2930ADAE8CAF5059EE73BB4B58712A2291FA4AD5
Executing: /tmp/tmp.T7UNroLh1A/gpg.1.sh --keyserver
hkp://keyserver.ubuntu.com:80
--recv
2930ADAE8CAF5059EE73BB4B58712A2291FA4AD5
gpg: requesting key 91FA4AD5 from hkp server keyserver.ubuntu.com
gpg: key 91FA4AD5: public key "MongoDB 3.6 Release Signing Key <[email protected]>" imported
gpg: Total number processed: 1
gpg:               imported: 1  (RSA: 1)

echo "deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.6 multiverse" | sudo tee /etc/apt/sources.list.d/mongodb-org-3.6.list
deb [ arch=amd64,arm64 ] https://repo.mongodb.org/apt/ubuntu xenial/mongodb-org/3.6 multiverse

Update

sudo apt-get update

Install the latest stable version of MongoDB

sudo apt-get install -y mongodb-org

mongod.conf options

FYI: Secure MongoDB: https://docs.mongodb.com/manual/security/ and Security Checklist https://docs.mongodb.com/manual/administration/security-checklist/

MongoDB Hardening Info: https://docs.mongodb.com/manual/core/security-hardening/

Start MongoDB

sudo mongod --port 27017 --dbpath /mongodb_data/ --bind_ip 127.0.0.1 --config /etc/mongod.conf

You should see startup activity

sudo mongod --port 27017 --dbpath /mongodb_data/
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] MongoDB starting : pid=10413 port=27017 dbpath=/mongodb_data/ 64-bit host=yourservername
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] db version v3.6.2
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] git version: 489d177dbd0f0420a8ca04d39fd78d0a2c539420
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] OpenSSL version: OpenSSL 1.0.2g  1 Mar 2016
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] allocator: tcmalloc
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] modules: none
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] build environment:
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten]     distmod: ubuntu1604
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten]     distarch: x86_64
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten]     target_arch: x86_64
2018-01-16T22:46:22.924+1100 I CONTROL  [initandlisten] options: { net: { port: 27017 }, storage: { dbPath: "/mongodb_data/" } }
2018-01-16T22:46:22.925+1100 I STORAGE  [initandlisten]
2018-01-16T22:46:22.925+1100 I STORAGE  [initandlisten] wiredtiger_open config: create,cache_size=1463M,session_max=20000,eviction=(threads_min=4,threads_max=4),config_base=false,statistics=(fast),log=(enabled=true,archive=true,path=journal,compressor=snappy),file_manager=(close_idle_time=100000),statistics_log=(wait=0),verbose=(recovery_progress),
2018-01-16T22:46:23.012+1100 I CONTROL  [initandlisten]
2018-01-16T22:46:23.014+1100 I STORAGE  [initandlisten] createCollection: admin.system.version with provided UUID: *******
2018-01-16T22:46:23.026+1100 I COMMAND  [initandlisten] setting featureCompatibilityVersion to 3.6
2018-01-16T22:46:23.033+1100 I STORAGE  [initandlisten] createCollection: local.startup_log with generated UUID: d4271258-e51f-407d-af15-975f61e66eaf
2018-01-16T22:46:23.049+1100 I FTDC     [initandlisten] Initializing full-time diagnostic data capture with directory '/mongodb_data/diagnostic.data'
2018-01-16T22:46:23.050+1100 I NETWORK  [initandlisten] waiting for connections on port 27017

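In a new terminal window, open a mongo shell. On a fresh install no users exist yet, so connect without the authentication options first (mongo --port 27017) and use the full command below once you have created a user.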

mongo --port 27017 --authenticationDatabase 'admin' --username username --password password
>...
>

Create a user

> use admin
switched to db admin
> db.createUser(
...   {
...     user: "yourdbuser",
...     pwd: "**************************************************************",
...     roles: [ { role: "userAdminAnyDatabase", db: "admin" } ]
...   }
... )
Successfully added user: {
        "user" : "yourdbuser",
        "roles" : [
                {
                        "role" : "userAdminAnyDatabase",
                        "db" : "admin"
                }
        ]
}
>

Create a test db

show dbs
use testdb
s = { Name : "Test Value" }
{ "Name" : "Test Value" }
db.testdb.insert( s );
WriteResult({ "nInserted" : 1 })
db.testdb.find()
{ "_id" : ObjectId("5a5deba7b563981038f32051"), "Name" : "Test Value" }

Run MongoDB at Startup

Note: I tried setting up a service but it failed, so I added the following command to /etc/rc.local instead.

/usr/bin/mongod --quiet --port 27017 --dbpath /mongodb_data/ --bind_ip 127.0.0.1,##.##.##.## --config /etc/mongod.conf

Todo: Service Setup.

Ignore the following….

Create a service

sudo nano /etc/systemd/system/mongodb.service

I added

[Unit]
Description=High-performance, schema-free document-oriented database
After=network.target

[Service]
User=mongodb
ExecStart=/usr/bin/mongod --quiet --port 27017 --dbpath /mongodb_data/ --bind_ip 127.0.0.1,##.##.##.## --config /etc/mongod.conf

[Install]
WantedBy=multi-user.target

TIP: Bind to local 127.0.0.1 and also your external IP (if you have hardened the server and set up IP whitelists for the port).

Make /etc/init.d/mongod executable

sudo chmod +x /etc/init.d/mongod

Firewall

Configure your firewall (tip: whitelist your local development IP to allow your IP access etc)

sudo ufw allow from 123.123.123.123/32 to any port 27017
Rule added

sudo ufw reload
Firewall reloaded

sudo ufw allow out 27017
Rule added
Rule added (v6)

Reload the firewall and show the status. If your host also has an external firewall, allow port 27017 (or your custom port) there too. MongoDB only uses TCP, so a UDP rule is not needed. http://icanhazip.com/ will display your public IP.

sudo ufw reload
sudo ufw status
Status: active

To                         Action      From
--                         ------      ----
...
27017                      ALLOW       123.123.123.123
...
27017                      ALLOW OUT   Anywhere
...
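
If you whitelist more than one address, the rules can be generated rather than typed by hand. This is only a minimal sketch: ufw_allow_cmd is a hypothetical helper name, the addresses are examples, and the function prints each command (a dry run) instead of changing the firewall. It adds proto tcp on the assumption that only TCP access to MongoDB is wanted.

```shell
#!/bin/sh
# Hypothetical helper: print (not run) a ufw rule that lets one
# IPv4 address reach the MongoDB port over TCP.
ufw_allow_cmd() {
    ip="$1"
    port="${2:-27017}"   # fall back to the default MongoDB port
    echo "sudo ufw allow from ${ip}/32 to any port ${port} proto tcp"
}

# Print one rule per whitelisted address (example addresses only).
for ip in 123.123.123.123 123.123.123.124; do
    ufw_allow_cmd "$ip" 27017
done
```

Review the printed commands, then run them yourself and finish with sudo ufw reload.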

Display the MongoDB version

mongod -version
db version v3.6.2
git version: 489d177dbd0f0420a8ca04d39fd78d0a2c539420
OpenSSL version: OpenSSL 1.0.2g 1 Mar 2016
allocator: tcmalloc
modules: none
build environment:
 distmod: ubuntu1604
 distarch: x86_64
 target_arch: x86_64

Make MongoDB configuration changes

sudo nano /etc/mongod.conf

– Consider saving your database data somewhere else (mongod.conf)

storage:
  dbPath: /folder_to_store_mongodb_data/

– Consider redirecting your log file (mongod.conf)

systemLog:
  destination: file
  logAppend: true
  path: "/folder_to_store_mongodb_logs/mongo.log"

– Consider changing the default MongoDB port (mongod.conf)

net:
  port: 27123

– Allow MongoDB to talk locally and globally if need be, and optionally enable IPv6 (bind IPs in mongod.conf)

TIP: Ensure you bind your localhost address (127.0.0.1) AND your public IP (e.g. 123.123.123.123), as you will need to bind to the public IP too if you want to connect to MongoDB externally. I did not bind my external IP and was blocked for a few days.

net:
  ipv6: true
  bindIp: 127.0.0.1,123.123.123.123

If you allow external access, consider whitelisting your IP and disabling the localhost authentication bypass – more here (mongod.conf)

setParameter:
   enableLocalhostAuthBypass: false

Official configuration documentation can be found here.

At this stage, MongoDB is open to the world: anyone who connects to your server with no username or password can see your databases.

I created an admin user with Studio 3T for MongoDB in the IntelliShell.

I typed the following to create a user (I added the root role afterwards).

use admin
db.createUser({user: "yourusername", pwd: "yourpassword", roles:[{role: "userAdminAnyDatabase", db: "admin"},{role: "root", db: "admin"}]})

Verify that the user was created

show users
{ 
    "_id" : "admin.yourusername", 
    "user" : "yourusername", 
    "db" : "admin", 
    "roles" : [
        {
            "role" : "root", 
            "db" : "admin"
        }, 
        {
            "role" : "userAdminAnyDatabase", 
            "db" : "admin"
        }
    ]
}

Now you can use these credentials to log in to the database.
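
One way to pass these credentials around is as a connection string. The sketch below is an illustration only: mongo_uri is a hypothetical helper, the values are the placeholders used above, and it assumes the user name and password contain no characters that need percent-encoding.

```shell
#!/bin/sh
# Hypothetical helper: build a MongoDB connection string that
# authenticates against the admin database.
# Assumes the user name and password contain no characters that
# need percent-encoding (such as @, :, / or %).
mongo_uri() {
    user="$1"
    pass="$2"
    host="$3"
    port="${4:-27017}"   # fall back to the default MongoDB port
    echo "mongodb://${user}:${pass}@${host}:${port}/?authSource=admin"
}

mongo_uri yourusername yourpassword 127.0.0.1 27017
```

The printed string should be usable with the mongo shell (mongo "<uri>"). Keep in mind that a password typed on the command line ends up in your shell history.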

Once you can see the new credentials are working, the next step is to block anonymous and empty connections.

Add the following to mongod.conf

security:
 authorization: enabled
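
All of the fragments above live in the same /etc/mongod.conf, and a top-level section such as net: should appear only once, so the port and bindIp settings need to share one net: block. The sketch below writes a consolidated example to a scratch file (not to /etc) and checks it for repeated sections. The file name is an assumption, and the paths, port and IP are the placeholder values from this guide.

```shell
#!/bin/sh
# Write a consolidated example config to a scratch file for review.
# The paths, port and IP are the placeholder values used in this guide.
conf_file="${TMPDIR:-/tmp}/mongod.conf.example"

cat > "$conf_file" <<'EOF'
storage:
  dbPath: /folder_to_store_mongodb_data/
systemLog:
  destination: file
  logAppend: true
  path: "/folder_to_store_mongodb_logs/mongo.log"
net:
  port: 27123
  ipv6: true
  bindIp: 127.0.0.1,123.123.123.123
security:
  authorization: enabled
setParameter:
  enableLocalhostAuthBypass: false
EOF

# Each top-level section should appear only once; list any duplicates.
duplicates=$(grep -E '^[A-Za-z]+:' "$conf_file" | sort | uniq -d)
if [ -n "$duplicates" ]; then
    echo "duplicate sections: $duplicates"
else
    echo "ok: $conf_file has no duplicate sections"
fi
```

Compare the scratch file with your real /etc/mongod.conf before copying anything across.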

Restart MongoDB

sudo systemctl stop mongodb
sudo systemctl start mongodb

Or, if you are not running MongoDB as a service, start it manually:

/usr/bin/mongod --config /etc/mongod.conf

Now when you connect to your database with no login details you will see no databases.

Show the status of MongoDB

sudo systemctl start mongodb
user@server:~# sudo systemctl status mongodb
● mongodb.service - High-performance, schema-free document-oriented database
   Loaded: loaded (/etc/systemd/system/mongodb.service; enabled; vendor preset: enabled)
   Active: active (running) since Fri 2017-08-18 18:00:38 AEST; 4 days ago
 Main PID: 6946 (mongod)
    Tasks: 19
   Memory: 76.3M
      CPU: 57min 16.367s
   CGroup: /system.slice/mongodb.service
           └─6946 /usr/bin/mongod --quiet --config /etc/mongod.conf

View the last 20 lines of the MongoDB log file.

tail -n 20 /var/log/mongodb/mongod.log

(or replace the path with your log location)

tail -n 20 /yourmongodb_logs/mongod.log

Viewing MongoDB files

ls -al
total 384
drwxr-xr-x 4 root root 4096 Sep 5 18:12 .
drwxr-xr-x 30 root root 4096 Aug 5 18:48 ..
-rw-r--r-- 1 root root 32768 Sep 5 18:12 collection-0--5805649544981213952.wt
-rw-r--r-- 1 root root 36864 Sep 5 18:12 collection-0-6117837641988028070.wt
-rw-r--r-- 1 root root 32768 Sep 5 18:12 collection-2-6117837641988028070.wt
drwxr-xr-x 2 root root 4096 Sep 5 18:12 diagnostic.data
-rw-r--r-- 1 root root 16384 Sep 5 18:47 index-1--5805649544981213952.wt
-rw-r--r-- 1 root root 36864 Sep 5 18:12 index-1-6117837641988028070.wt
-rw-r--r-- 1 root root 32768 Sep 5 18:12 index-3-6117837641988028070.wt
-rw-r--r-- 1 root root 32768 Sep 5 18:47 index-4-6117837641988028070.wt
drwxr-xr-x 2 root root 4096 Sep 5 18:03 journal
-rw-r--r-- 1 root root 32768 Sep 5 18:12 _mdb_catalog.wt
-rw-r--r-- 1 root root 0 Sep 5 18:12 mongod.lock
-rw-r--r-- 1 root root 36864 Sep 5 18:12 sizeStorer.wt
-rw-r--r-- 1 root root 95 Sep 5 18:18 storage.bson
-rw-r--r-- 1 root root 49 Sep 5 18:18 WiredTiger
-rw-r--r-- 1 root root 4096 Sep 5 18:12 WiredTigerLAS.wt
-rw-r--r-- 1 root root 21 Sep 5 18:18 WiredTiger.lock
-rw-r--r-- 1 root root 996 Sep 5 18:12 WiredTiger.turtle
-rw-r--r-- 1 root root 61440 Sep 5 18:12 WiredTiger.wt

MongoDB Users and Roles

Application scalability on a budget (my journey)

August 12, 2016 by Simon Fearby

If you have read my other guides on https://www.fearby.com you may tell I like the self-managed Ubuntu servers you can buy from Digital Ocean for as low as $5 a month. Vultr has servers as low as $2.50 a month. Digital Ocean is a great place to start up your own server in the cloud, install some software and deploy some web apps or backends (API/databases/content) for mobile apps or services. If you need more memory, processor cores or hard drive storage, simply shut down your Digital Ocean server, click a few options to increase your server resources and you are good to go (this is called “scaling up”). Don’t forget to cache content to limit usage.

This scalability guide is a work in progress (watch this space). My aim is to get 2000 concurrent users a second serving geo queries (like Pokémon Go) for under $80 a month (1x server and 1x MongoDB cluster). Currently serving 600~1200/sec.

Buying a Domain

Buy a domain name from Namecheap here.

Estimating Costs

If you don’t estimate costs you are planning to fail.

"By failing to prepare you are preparing to fail." - Benjamin Franklin

Estimate the minimum number of users you need to remain viable and then the expected maximum number of users you need to handle. What will this cost?

Planning for success

Anyone who has researched application scalability has come across articles on apps that have launched and crashed under load at launch. Even governments can spend tens of millions on developing a scalable solution, plan for years and fail dismally on launch (check out the Australian Census disaster). The Australian government contracted IBM to develop a solution to receive up to 15 million census submissions between the 28th of July and the 5th of September. IBM designed a system and a third party performance test planned for up to 400 submissions a second, but the maximum submissions received on census night before the system crashed was only 154 submissions a second. Predicting application usage can be hard. In the case of the Australian census, the bulk of people logged on to submit census data on the recommended night of the 9th of August 2016.

Sticking to a budget

This guide is not for people with deep pockets wanting to deploy a service to 15 million people but for solo app developers or small non-funded startups on a serious budget. If you want a very reliable scalable solution or service provider you may want to skip this article and check out services by the following vendors.

Firebase
Azure (good guides by Troy Hunt: here, here and here).
Amazon Web Services
Google Cloud
NGINX Plus

The above vendors have what seems like an infinite array of products and services that can form part of your solution but beware, the more products you use the more complex it will be and the higher the costs. A popular app can be an expensive app. That’s why I like Digital Ocean, as you don’t need a degree to predict and plan your server’s average usage and buy extra resource credits if you go over predicted limits. With Digital Ocean you buy a virtual server and you get known memory, storage and data transfer limits.

Let’s go over topics that you will need to consider when designing or building a scalable app on a budget.

Application Design

Your application needs will ultimately decide the technology and servers you require.

A simple business app that shares events, products and contacts would require a basic server and MySQL database.
A turn-based multiplayer app for a few hundred people would require more server resources and endpoints (NGINX, NodeJS and an optimized MySQL database would be ok).
A larger augmented reality app for thousands of people would require a mix of databases and servers to separate the workload (an NGINX web server and NodeJS powered API talking to a MySQL database to handle logins and a single server NoSQL database for the bulk of the shared data).
An augmented reality app with tens of thousands of users would require more again (an NGINX web server, NodeJS powered API talking to a MySQL database to handle logins and a NoSQL cluster for the bulk of the shared data).
A business critical multi-user application with real-time chat – are you sure you are on a budget? This will require a full solution from Azure, Firebase or Amazon Web Services.

A native app, hybrid app or full web app can drastically change how your application works ( learn the difference here ).

Location, location, location.

You want your server and resources to be as close to your customers as possible. This is one rule that cannot be broken. If you need to spend more money to buy a server in a location closer to your customers, do it.

My Setup

I have a Digital Ocean server with 2 cores and 2GB of RAM in Singapore that I use to test and develop apps. That one server has MySQL, NGINX, NodeJS, PHP and many scripts running on it in the background. I also have a MongoDB cluster (3 servers) running on AWS in Sydney via MongoDB.com. I looked into CouchDB via Cloudant but needed the GeoJSON features with fair dedicated pricing. I am considering moving my Ubuntu server off Digital Ocean (in Singapore) and onto an AWS server (in Sydney). I am using promise-based NodeJS calls where possible to avoid blocking on calls to the operating system, database or web. Update: I moved to Vultr (article here).

Here is a benchmark for HTTP and HTTPS requests from rural NSW to Sydney, Australia, then Melbourne, then Adelaide, then Perth, then Singapore to a Node server behind NGINX that calls back to Sydney, Australia to run a GeoQuery on a large database and returns the result to the customer via Singapore.

SSL adds processing overhead and latency.

Here is a breakdown of the hops from my desktop in Regional NSW making a network call to my Digital Ocean Server in Singapore (with private parts redacted or masked).

traceroute to destination-server-redacted.com (###.###.###.##), 64 hops max, 52 byte packets
 1  192-168-1-1 (192.168.1.1)  11.034 ms  6.180 ms  2.169 ms
 2  xx.xx.xx.xxx.isp.com.au (xx.xx.xx.xxx)  32.396 ms  37.118 ms  33.749 ms
 3  xxx-xxx-xxx-xxx (xxx.xxx.xxx.xxx)  40.676 ms  63.648 ms  28.446 ms
 4  syd-gls-har-wgw1-be-100 (203.221.3.7)  38.736 ms  38.549 ms  29.584 ms
 5  203-219-107-198.static.tpgi.com.au (203.219.107.198)  27.980 ms  38.568 ms  43.879 ms
 6  tengige0-3-0-19.chw-edge901.sydney.telstra.net (139.130.209.229)  30.304 ms  35.090 ms  43.836 ms
 7  bundle-ether13.chw-core10.sydney.telstra.net (203.50.11.98)  29.477 ms  28.705 ms  40.764 ms
 8  bundle-ether8.exi-core10.melbourne.telstra.net (203.50.11.125)  41.885 ms  50.211 ms  45.917 ms
 9  bundle-ether5.way-core4.adelaide.telstra.net (203.50.11.92)  66.795 ms  59.570 ms  59.084 ms
10  bundle-ether5.pie-core1.perth.telstra.net (203.50.11.17)  90.671 ms  91.315 ms  89.123 ms
11  203.50.9.2 (203.50.9.2) 80.295 ms  82.578 ms  85.224 ms
12  i-0-0-1-0.skdi-core01.bx.telstraglobal.net (Singapore) (202.84.143.2)  132.445 ms  129.205 ms  147.320 ms
13  i-0-1-0-0.istt04.bi.telstraglobal.net (202.84.243.2)  156.488 ms
    202.84.244.42 (202.84.244.42)  161.982 ms
    i-0-0-0-4.istt04.bi.telstraglobal.net (202.84.243.110)  160.952 ms
14  unknown.telstraglobal.net (202.127.73.138)  155.392 ms  152.938 ms  197.915 ms
15  * * *
16  destination-server-redacted.com (xx.xx.xx.xxx)  177.883 ms  158.938 ms  153.433 ms

That is about 160 ms for a round trip to the server. This is on a good day when the Netflix Effect is not killing links across Australia.

Here is the route for a call from the Digital Ocean server in Singapore (the server above) to the MongoDB cluster on Amazon Web Services in Sydney.

traceroute to redactedname-shard-00-00-nvjmn.mongodb.net (##.##.##.##), 30 hops max, 60 byte packets
 1  ###.###.###.### (###.###.###.###)  0.475 ms ###.###.###.### (###.###.###.###)  0.494 ms ###.###.###.### (###.###.###.###)  0.405 ms
 2  138.197.250.212 (138.197.250.212)  0.367 ms 138.197.250.216 (138.197.250.216)  0.392 ms  0.377 ms
 3  unknown.telstraglobal.net (202.127.73.137)  1.460 ms 138.197.250.201 (138.197.250.201)  0.283 ms unknown.telstraglobal.net (202.127.73.137)  1.456 ms
 4  i-0-2-0-10.istt-core02.bi.telstraglobal.net (202.84.225.222)  1.338 ms i-0-4-0-0.istt-core02.bi.telstraglobal.net (202.84.225.233)  3.817 ms unknown.telstraglobal.net (202.127.73.137)  1.443 ms
 5  i-0-2-0-9.istt-core02.bi.telstraglobal.net (202.84.225.218)  1.270 ms i-0-1-0-0.pthw-core01.bx.telstraglobal.net (202.84.141.157)  50.869 ms i-0-0-0-0.pthw-core01.bx.telstraglobal.net (202.84.141.153)  49.789 ms
 6  i-0-1-0-5.sydp-core03.bi.telstraglobal.net (202.84.143.145)  107.395 ms  108.350 ms  105.924 ms
 7  i-0-1-0-5.sydp-core03.bi.telstraglobal.net (202.84.143.145)  105.911 ms 21459.tauc01.cu.telstraglobal.net (134.159.124.85)  108.258 ms  107.337 ms
 8  21459.tauc01.cu.telstraglobal.net (134.159.124.85)  107.330 ms unknown.telstraglobal.net (134.159.124.86)  101.459 ms  102.337 ms
 9  * unknown.telstraglobal.net (134.159.124.86)  102.324 ms  102.314 ms
10  * * *
11  54.240.192.107 (54.240.192.107)  103.016 ms  103.892 ms  105.157 ms
12  * * 54.240.192.107 (54.240.192.107)  103.843 ms
13  * * *
14  * * *
15  * * *
16  * * *
17  * * *
18  * * *
19  * * *
20  * * *
21  * * *
22  * * *
23  * * *
24  * * *
25  * * *
26  * * *
27  * * *
28  * * *
29  * * *
30  * * *

It appears Telstra Global or AWS block the tracing of the network path closer to the destination, so I will ping to see how long the trip takes.

bytes from ec2-##-##-##-##.ap-southeast-2.compute.amazonaws.com (##.##.##.##): icmp_seq=1 ttl=50 time=103 ms

It is obvious the longest part of the response to the client is not the GeoQuery on the MongoDB cluster or processing in NodeJS but the travel time for the packet and securing the packet.
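
To put rough numbers on that, the two measurements above can simply be added. This is a sketch under stated assumptions: one round trip on each leg per request, connections already open (no TCP or SSL handshake), and the variable names are mine.

```shell
#!/bin/sh
# Rough network floor for one API request, using the round-trip
# times measured above (milliseconds, rounded).
client_to_api_ms=160   # desktop in regional NSW -> server in Singapore
api_to_db_ms=103       # server in Singapore -> MongoDB cluster in Sydney

network_floor_ms=$((client_to_api_ms + api_to_db_ms))
echo "network floor per request: ${network_floor_ms} ms"
```

That is roughly a quarter of a second spent on the wire before NodeJS or MongoDB do any work.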

My server locations are not optimal. I cannot move the AWS MongoDB to Singapore because MongoDB doesn’t have servers in Singapore, and Digital Ocean doesn’t have servers in Sydney. I should move my services on Digital Ocean to Sydney but for now, let’s see how far this Digital Ocean server in Singapore and MongoDB cluster in Sydney can go. I wish I had known about Vultr, as they are like Digital Ocean but have a location in Sydney.

Security

Secure (SSL) communication is now mandatory for Apple and Android apps talking over the internet, so we can’t eliminate that to speed up the connection, but we can move the server. I am using more modern SSL ciphers with my SSL certificate, so they may slow down the process too. Here is a speed test of my server’s ciphers. I use stronger security, so I expect a small CPU hit.

fyi: I have a few guides on adding a commercial SSL certificate to a Digital Ocean VM, updating OpenSSL on a Digital Ocean VM, configuring NGINX SSL, and limiting SSH connection rates to prevent brute force attacks.

Server Limitations and Benchmarking

If you are running your website on a shared server (e.g. a cPanel domain) you may encounter resource limit warnings, as web hosts and some providers want to charge you more for moderate to heavy use.

Resource Limit Is Reached 508
The website is temporarily unable to service your request as it exceeded resource limit. Please try again later.

I have never received a resource limit reached warning with Digital Ocean.

Most hosts (AWS, Digital Ocean, Azure etc.) have limits on your server, and when you exceed a magical limit they restrict your server or start charging excess fees (they are not running a charity). AWS and Azure have different terminology for CPU credits, and you really need to predict your application’s CPU usage to factor in scalability and monthly costs. Servers and databases generally have limited IOPS (input/output operations per second), and lower tier plans offer lower IOPS. MongoDB Atlas lower tiers have < 120 IOPS, middle tiers have 240~2,400 IOPS and higher tiers have 3,000~20,000 IOPS.

Know your bottlenecks

The siege HTTP stress testing tool is good. The command below will throw 400 concurrent HTTP connections at your website from your local machine for one minute.

#!/bin/bash
siege -t1m -c400 'http://your.server.com/page'

The results seem a bit low: 47.3 trans/sec. No failed transactions though 🙂

** SIEGE 3.0.5
** Preparing 400 concurrent users for battle.
The server is now under siege...
Lifting the server siege.. done.

Transactions: 2803 hits
Availability: 100.00 %
Elapsed time: 59.26 secs
Data transferred: 79.71 MB
Response time: 7.87 secs
Transaction rate: 47.30 trans/sec
Throughput: 1.35 MB/sec
Concurrency: 372.02
Successful transactions: 2803
Failed transactions: 0
Longest transaction: 8.56
Shortest transaction: 2.37
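
The numbers in the report can be sanity checked against each other using Little's law: average concurrency should be close to the transaction rate multiplied by the average response time. The sketch below is my own cross-check (the variable names are assumptions) and only uses values from the siege report above.

```shell
#!/bin/sh
# Cross-check the siege report: average concurrency should be close to
# transaction rate multiplied by average response time.
rate=47.30            # Transaction rate (trans/sec)
response_time=7.87    # Response time (secs)

expected_concurrency=$(awk -v r="$rate" -v t="$response_time" \
    'BEGIN { printf "%.2f", r * t }')
echo "expected concurrency: $expected_concurrency (siege reported 372.02)"
```

The estimate lands within a fraction of a percent of what siege reported, which suggests the low transaction rate comes from the long response time rather than from dropped requests.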

Sites like http://loader.io/ allow you to hit your web server or web page with many hits a second from outside of your server. Below I threw 50 concurrent users at a node API endpoint that was hitting a geo query on my MongoDB cluster.

The server can easily handle 50 concurrent users a second. Latency is an issue though.

I can see the two secondary MongoDB servers being queried 🙂

Node has decided to only use one CPU under this light load.

I tried 100 concurrent users over 30 seconds. CPU activity was about 40% of one core.

I tried again with a 100-200 concurrent user limit (passed). CPU activity was about 50% using two cores.

I tried again with a 200-400 concurrent user limit over 1 minute (passed). CPU activity was about 80% using two cores.

It is nice to know my promise-based NodeJS code can handle 400 concurrent users requesting a large GeoJSON dataset without timeouts. The result is about the same as siege (47.6 trans/sec). The issue now is the delay in the data getting back to the user.

I checked the MongoDB cluster and I was only reaching 0.17 IOPS (maximum 100) and 16% CPU usage so the database cluster is not the bottleneck here.

Out of curiosity, I ran a 400 connection benchmark to the node server over HTTP instead of HTTPS and the results were near identical (400 concurrent connections with 8000ms delay).

I really need to move my servers closer together to avoid the delays in responding. Serving 47.6 geo queries a second (4,112,640 a day) with a large payload is ok but it is not good enough for my application yet.
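
The daily figure is just the per-second rate multiplied by the 86,400 seconds in a day. A minimal sketch, assuming the rate is sustained around the clock (per_day is a hypothetical helper name):

```shell
#!/bin/sh
# Convert a sustained per-second rate into a per-day total.
per_day() {
    awk -v rate="$1" 'BEGIN { printf "%.0f\n", rate * 86400 }'
}

per_day 47.6    # prints 4112640
```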

Limiting Access

I may limit access to my API based on geo lookups ( http://ipinfo.io is a good site that allows you to programmatically limit access to your app services) and auth tokens but this will slow down uncached requests.

Scale Up

I can always add more cores or memory to my server in minutes but that requires a shutdown. 400 concurrent users do max my CPU and raise the memory to above 80% so adding more cores and memory would be beneficial.

Digital Ocean does allow me to permanently or temporarily raise the resources of the virtual machine. To obtain 2 more cores (4 in total) and 4x the memory (8GB) I would need to jump to the $80/month plan and adjust the NGINX and Node configuration to use the extra cores and RAM.

If my app is profitable I can certainly reinvest.

Scale Out

With MongoDB clusters, I can easily scale out ( shard ) a cluster and gain extra throughput if I need it, but with only 0.17% of my existing cluster’s IOPS limit being used I should focus on moving servers closer together.

NGINX does have commercial level products that handle scalability but these cost thousands. I could scale out manually by setting up Node proxies that point to multiple servers and pass the incoming calls on to them. This may be more beneficial as Digital Ocean servers start at $5 a month, but it would add a whole lot of complexity.

Cache Solutions

Nginx Caching
OpCache if you are using PHP.
Node-cache – In memory caching.
Redis – In memory caching.

Monitoring

Monitoring your server and resources is essential in detecting memory leaks and spikes in activity. HTOP is a great monitoring tool from the command line in Linux.

http://pm2.keymetrics.io/ is a good node package monitoring app but it does go a bit crazy with processes on your box.

Communication

It is a good idea to inform users of server status and issues with delayed queries, and when things go down, inform people early. Update: Article here on self-service status pages.

The Future

UPDATE: 17th August 2016

I set up an Amazon Web Services EC2 server ( read AWS setup guide here ) with only 1 CPU and 1GB RAM and have easily achieved 700 concurrent connections. That’s 41,869 geo queries served a minute.

Creating an AWS EC2 Ubuntu 14.04 server with NGINX, Node and MySQL and phpMyAdmin

The MongoDB Cluster CPU was 25% usage with 200 query opcounters on each secondary server.

I think I will optimize the AWS OS ‘swappiness’ and performance stats and aim for 2000 queries.

This is how many hits I can get with the CPU remaining under 95% (794 geo serves a second). AMAZING.

Another recent benchmark:

UPDATE: 3rd Jan 2017

I decided to ditch the cluster of three AWS servers running MongoDB and instead set up a single MongoDB instance on an Amazon t2.medium server (2 CPU/4GB RAM) for about $50 a month. I can always upgrade to the AWS MongoDB cluster later if I need it.

Ok, I just threw 2000 concurrent users at the new single-server AWS MongoDB instance and it was able to handle the delivery: no dropped connections, but the average response time was 4,027 ms. This is not ideal, but it is 2000 users a second, and that is after the API handles the IP banned-list check, user account validity, a last-5-minutes query limit check (from MySQL), payload validation on every field and then the MongoDB geo query.

The two cores on the server were hitting about 95% usage. The benchmark here uses the same dataset as above, but the API now runs a whole series of payload, user limiting and logging checks.

Benchmarking with 1000 maintained users a second, the average response time is a much lower 1,022 ms. Honestly, if I had 1000-2000 user queries a second I would upgrade the server or add a series of lower spec AWS t2.micro servers and create my own cluster.
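
A rough way to turn these figures into completed requests per second is to divide the number of maintained users by the average response time. This is only an estimate and rests on an assumption of mine: that each simulated user waits for a response before sending the next request. throughput is a hypothetical helper name.

```shell
#!/bin/sh
# Estimate completed requests per second from the number of maintained
# concurrent users and the average response time in milliseconds.
throughput() {
    awk -v users="$1" -v ms="$2" 'BEGIN { printf "%.0f\n", users / (ms / 1000) }'
}

throughput 2000 4027   # prints 497
throughput 1000 1022   # prints 978
```

Under that assumption the server completes more requests per second with 1000 maintained users than with 2000, so the higher load mostly adds waiting time.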

Security

Cheap may not be good (hosting or DIY). Do check your website often on https://www.shodan.io to see if it exposes open software or is known to hackers.

v1.7 added self-service status page info and Vultr info
