njmon for Linux version 83 Manual Page
If you install as the root user with the supplied "ninstall" shell script, it also installs the manual page for njmon and nimon (the same page covers both).
Read it with the regular UNIX/Linux command: man njmon
njmon(nimon)                                                      njmon(nimon)

NAME
       njmon and nimon are the same program. The name of the binary file
       determines the output format. This manual page covers both:
       - njmon: Save performance statistics in JSON format.
       - nimon: Save the same statistics in InfluxDB Line Protocol format.

BRIEFLY
       This performance statistics agent outputs in two formats:
       - JSON format for time-series databases, including InfluxDB and
         others. One JSON snapshot per line, i.e. linefeed-separated JSON
         records.
       - InfluxDB Line Protocol format, which can be injected directly into
         InfluxDB. One measure per line.

       The default output is stdout, which allows a shell pipe to other
       commands. njmon can also connect directly to a central daemon or
       InfluxDB to save the data on a server.

       The ninstall shell script puts njmon and nimon into /usr/local/bin
       and installs the njmon and nimon manual pages. It needs to be run by
       the root user.

       The nmeasure programs allow adding your own data to InfluxDB.

   Version
       Version: 83

SYNOPSIS
       Output the njmon version and stop - also works for nimon:
            njmon -!
       This includes the njmon version, njmon source code details, the OS
       distro and OS version on which it was compiled, and the compile date.

       Output help information and stop - also works for nimon:
            njmon -h or -?

       Take the command line options from a file - also works for nimon:
            njmon -a file

       Collect performance statistics to a file - also works for nimon but
       the file format will be different:
            njmon -s seconds -c count [-m dir] -f [other options]

       Collect performance statistics for a central njmond.py daemon service
       on a different server:
            njmon -s seconds -c count -i hostname -p port [other options]

       Collect performance statistics and send to an InfluxDB database. The
       database administrator sets the user name and password, which are
       optional:
            nimon -s seconds -c count -i hostname -p port -x database
                  [-y user] [-z passwd] [other options]

       Other options - see below:
            [-a file] [-A hostname] [-b] [-B] [-d] [-D] [-F] [-I] [-J] [-k]
            [-K file] [-m directory] [-M] [-n] [-O org] [-P] [-q tags] [-r]
            [-R] [-t percent] [-T token] [-v] [-w] [-W]

       Other options for njmon only:
            [-e]

       Other options for nimon only:
            [-A hostname] [-H]

ALL OPTIONS
       -a file
              Command line arguments are to be found in the file. This hides
              passwords from the command line and "ps" output. Use the same
              arguments, all on the first line, space separated. Only have
              the -a option on the actual command line. Don't include the
              command name or the -a option in the file.

       -A hostname
              Use this hostname instead of the real hostname. Apparently,
              some crazy system administrators have multiple servers with
              the same hostname, so this is a workaround.

       -b     Switch off adding the pid to the process names. For example:
              "ksh_76927" becomes "ksh". This allows all the ksh process
              stats to be added together for a total number.

       -B     Switch on BTRFS statistics collecting. BTRFS = Better
              filesystem found in the /proc/diskstats file. This file system
              is preferred by some Linux distributions including SAP. If you
              use BTRFS then add this switch, otherwise ignore it.

       -c count
              Number of statistics snapshots (default forever). See the -s
              option for the time in seconds between snapshots.

       -d     Switch on debug tracing (output may no longer be JSON or
              InfluxDB Line Protocol format). This is only useful for
              debugging the njmon code where the output format has a
              problem. Use this with the -f option to save the debug output
              to the .err file. This can be useful to determine why some
              features are getting turned off, like GPFS, or socket issues.

       -D     By default njmon and nimon use the lsblk command to determine
              the regular disks in the /proc/diskstats file, thus ignoring
              the stats for disk partitions, LVM resources and multiple
              paths.
              This stops the duplication of disk stats due to the various
              layers. The -D option switches off this behaviour, so that all
              stats in /proc/diskstats are reported.

       -e     NJMON mode: Switch on "elastic" (also called ElasticSearch)
              mode, where sub-sections are arrays rather than JSON
              structures, for example: disks, networks & filesystems. njmon
              mode only.

       -f     Output the statistics to the following files (instead of
              stdout):
              NJMON mode data:
                   hostname_<year><month><day>_<hour><minutes>.json
              NIMON mode data:
                   hostname_<year><month><day>_<hour><minutes>.influxlp
              Error:
                   hostname_<year><month><day>_<hour><minutes>.err
              Note: problems can occur if using this option and the -i
              option, as the data can't go both to a file and to a socket.

       -ff    In NJMON mode: If you add a second "f" like "-ff" or "-f -f"
              then each snapshot has its own file. The file name format is
              hostname_<year><month><day>_<hour><minutes>_<6 digit
              sequence_number>.json. The sequence number starts at 000000.
              There is only a single error file.
              In NIMON mode: If you add a second "f" like "-ff" or "-f -f"
              then the .influxlp output file has a nanoseconds-since-epoch
              timestamp at the end of every measure. This should allow it to
              be inserted into InfluxDB using the "influx" command.

       -F     Switch off filesystem stats (autofs and tmpfs can cause
              issues).

       -h     Output the help information.

       -H     NIMON mode: This option makes nimon send to InfluxDB the full
              hostname (as the host tag) instead of the short hostname. This
              is important if many servers have the same short name in
              different domains. For example, myserver.achme.com normally
              has host=myserver. With the -H option the Fully Qualified
              Domain Name (FQDN) is used: host=myserver.achme.com.

       -i IP-address or hostname
              NJMON mode: The IP address or hostname of the njmon central
              njmond.py daemon.
              NIMON mode: The IP address or hostname of the InfluxDB server.

       -I     Normally, the name of the njmon or nimon command determines
              the output format i.e.
              JSON format for njmon or InfluxDB Line Protocol format for
              nimon. The -I option forces output to be InfluxDB Line
              Protocol format regardless of the command name. So these two
              are equivalent:
                   nimon
                   njmon -I

       -J     Normally, the name of the njmon or nimon command determines
              the output format i.e. JSON format for njmon or InfluxDB Line
              Protocol format for nimon. The -J option forces output to be
              JSON format regardless of the command name. So these two are
              equivalent:
                   njmon
                   nimon -J

              Note: the ninstall script on Linux installs the njmon binary,
              for example njmon_Ubuntu_X86_64_v83, as /usr/local/bin/njmon.
              Then an ln command is used to link /usr/local/bin/njmon to
              /usr/local/bin/nimon:
                   ln /usr/local/bin/njmon /usr/local/bin/nimon
              We then have one binary with two file names in the
              /usr/local/bin directory.

       -k     The njmon command uses a PID file /tmp/njmon.pid and nimon
              uses /tmp/nimon.pid. If the PID file is not found, the command
              continues running. If the file is found, the process PID is
              read from it and, if that process is still running, this
              command exits. If no such process is found running, this
              command continues running. This allows you to try starting the
              command, say, once an hour from crontab. This will restart the
              command only if the previous one stopped. The command removes
              the PID file when it stops normally.
              If the PID file is found but can't be accessed, the command
              prints a warning and stops. The file could be owned by a
              different user or have no read permission.

       -K pidfile
              This is the same as the -k option but the user decides the
              directory and filename for the file containing the Process ID
              (PID). It also lets you run multiple command processes, using
              different PID files. Some users prefer to use a file in, for
              example, /var/logs but that requires root user access to start
              njmon. Others prefer to avoid /tmp.
              Alternatively, if you need to use a different filename then
              set the shell environment variable NJIMON_PID_FILE or
              NIMON_PID_FILE before starting njmon or nimon.
              If the PID file is found but can't be accessed, the command
              prints a warning and stops. The file could be owned by a
              different user or have no read permission.

       -m directory
              The program will change to the directory before outputting any
              statistics or creating a file.

       -M     Filesystems are listed by mount point (like AIX njmon) and not
              by filesystem name.

       -n     No PID is printed out when the command is started. The PID can
              be useful in scripts to stop (kill) the njmon command later.

       -O org For nimon mode, sets the Organisation for InfluxDB 2+.
              Organisation is used to separate data, users and dashboards.
              If not set, the organisation is "default". This could be a
              company name, if hosting more than one, or the name of a
              department, group of users or workload.

       -p port
              NJMON mode: port number of the central njmond.py host. For
              example: 8181.
              NIMON mode: port number used by InfluxDB. If not set, the
              InfluxDB default port 8086 is used.

       -P     Add individual process stats. If the system has hundreds or
              more processes, this option can drastically increase njmon CPU
              time, drastically increase the number of stats and make files
              much larger. This has implications for database sizes.

       -q tags
              Add extra stats tags for the NIMON mode measures. For example,
              the server owner, contact details, department, key
              applications, project, storage details, data centre and
              location. This information can aid filtering (selecting
              servers), diagnostics and escalation processes. Limitations:
              250 characters, no spaces, comma separated. For example:
                   nimon ... -q dept=WebUK,dc=London2,app1=DB2,app2=WAS14,project=B42,disk=FlashSystems

       -r     Random start pause. This stops cron making every program send
              data in sync, which happens when the virtual machines have
              exactly the same time by using Network Time Protocol (NTP).

       -R     Reduced stats - skip logical CPU stats for SMT threads.
       -s seconds
              Seconds between snapshots of data (default 60 seconds). njmon
              attempts to keep exactly to this number of seconds, even
              allowing for delays in scheduling, getting on the processor
              and the runtime of njmon itself.

       -t percent
              Set the ignore-process CPU use percent threshold (default
              0.01%). This is used to ignore processes using very little CPU
              time and so reduce data sizes. On Linux there can be a few
              hundred processes doing nothing or using very little CPU time.

       -T token
              For nimon mode, sets the Security Token and switches on
              InfluxDB 2+ mode. This uses a different REST API, for InfluxDB
              2.0 and higher. See also the -O org option.

       -v     Debugging aid: show the data sent to InfluxDB and its response
              on stderr. NIMON mode only. Currently, InfluxDB security
              certificates are not implemented.

       -w     Switch on Telegraf or Prometheus output mode. This removes the
              HTTP REST API POST and Content-Length details from the output
              stream. NIMON mode only.

       -W     Switch off warning messages in the error output stream. Use
              this once you are confident that the warnings are ignorable.
              This includes badly behaved file systems, like those that are
              mounted but require root user access to read the stats while
              njmon runs as a regular user.

       -x database
              NIMON mode: the name of the InfluxDB database into which the
              statistics are placed. If this option is not set, then a
              database/bucket name of "njmon" is used. With InfluxDB 2+ the
              database is called a bucket but means roughly the same thing:
              the place where the data goes.

       -y user
              The user name for accessing the InfluxDB database. Only
              needed if you have switched on authentication for the API in
              the InfluxDB configuration. If it is not needed, InfluxDB
              silently ignores this value.

       -z password
              The user password for accessing the InfluxDB database. Only
              needed if you have switched on authentication for the API in
              the InfluxDB configuration. If it is not needed, InfluxDB
              silently ignores this value.
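The PID file behaviour described for the -k and -K options can be sketched in shell. This only illustrates the documented logic and is not code taken from njmon; the function name pid_file_check and its return codes are hypothetical.

```shell
# Illustrative sketch of the PID file check described for -k / -K.
# NOT njmon source code: the function name and return codes are made up.
#   returns 0: safe to start (no PID file, or the recorded process has gone)
#   returns 1: a previous instance is still running, so exit quietly
#   returns 2: the PID file exists but cannot be read, so warn and stop
pid_file_check() {
    pidfile="$1"
    # No PID file: nothing is recorded as running.
    [ -e "$pidfile" ] || return 0
    if [ ! -r "$pidfile" ]; then
        echo "warning: cannot read $pidfile" >&2
        return 2
    fi
    pid=$(cat "$pidfile")
    # kill -0 sends no signal; it only tests whether this user could signal
    # the process, so a process owned by another user looks "not running".
    if [ -n "$pid" ] && kill -0 "$pid" 2>/dev/null; then
        return 1
    fi
    return 0
}
```

A wrapper script could call pid_file_check /tmp/njmon.pid and start its collector only when the function returns 0.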
Notes on Spectrum Scale (GPFS)
       njmon and nimon automatically collect GPFS stats if the files
       /usr/lpp/mmfs/bin/mmksh and /usr/lpp/mmfs/bin/mmpmon are found on
       your system. However, if GPFS is present but switched off, the
       command can get confused. Setting a shell variable before running
       njmon or nimon disables GPFS stats:
            export NOGPFS=1
       You can also compile out GPFS support with -D NOGPFS

NJMON EXAMPLES
       These examples use njmon but also work for nimon. Assuming njmon
       and/or nimon is found at /usr/local/bin and this directory is in your
       $PATH

       1 Collect JSON stats every 5 mins for 24 hours and save the file in
         /home/perf.
            njmon -s 300 -c 288 -m /home/perf -f

       2 Pipe the JSON stats to a data handler program (perhaps in Python),
         all day at 30 second intervals.
            njmon -s 30 -c 2880 | myprog

       3 Use the defaults (-s 60 -c forever) and save the JSON to a file.
            njmon > my_server_today.json

       4 Send JSON data to the central njmond.py daemon.
            njmon -s 30 -c 2880 -i admin.acme.com -p 8181

       5 Send JSON data to the central njmond.py daemon with process (-P)
         stats.
            njmon -s 60 -c 1440 -P -i admin.acme.com -p 8181

       6 Crontab entry to save JSON data to a file (starting 2 minutes past
         midnight) in the /home/perf directory.
            2 0 * * * /usr/local/bin/njmon -s 60 -c 1440 -f -m /home/perf >/dev/null

       7 Crontab entry to start daily at midnight and send JSON data to the
         central server.
            0 0 * * * /usr/local/bin/njmon -s 30 -c 2880 -i admin.acme.com -p 8181 >/dev/null

       8 Crontab entry to send JSON data to the central server, tried every
         hour of the day (if njmon is found to be running already, this
         invocation stops quietly).
            0 * * * * /usr/local/bin/njmon -k -s 30 -i admin.acme.com -p 8181 >/dev/null

       9 Crontab entry that uses ssh to start njmon and send JSON data back
         via a socket to a local InfluxDB "injector" program. This ssh
         command is run on the InfluxDB server and remotely starts njmon on
         the "endpoint" server.
            0 0 * * * ssh root@endpoint /usr/local/bin/njmon -s 30 -c 2880 -i admin.acme.com -p 8181 >/dev/null

NIMON EXAMPLES
       Assuming nimon is found at /usr/local/bin and in your $PATH

       1 Collect stats every 3 seconds for 4 snapshots and place the
         InfluxDB Line Protocol format data file in /home/perf. This is
         useful to find the measure and statistics names. The output file
         will be <hostname>_<date>_<time>.influxlp. Without the -m flag and
         directory, the file is created in the current working directory.
            nimon -s 3 -c 4 -f -m /home/perf

       2 Run the nimon command capturing stats every 30 seconds for one day,
         where influxbox is the hostname of the server running InfluxDB,
         8086 is the port number (this is the default port), njmon is the
         InfluxDB database name and the login is user "Nigel" with password
         "passw0rd". The number of seconds in a day is 30 x 2880.
            nimon -s 30 -c 2880 -i influxbox -p 8086 -x njmon -y Nigel -z passw0rd

       3 The same as example 2 but using the IP address and adding process
         statistics (-P). Not using user and password. Using the default
         InfluxDB database name "njmon" and the default InfluxDB port 8086.
            nimon -s 30 -c 2880 -P -i 9.137.62.3

       4 The same as example 2 but checking if nimon is still running (-k).
         If yes, quietly stop this new nimon process.
            nimon -k -s 30 -c 2880 -i influxbox -p 8086 -x njmon

       5 Run nimon and send the data to Telegraf (-w and with port 9090) and
         then to Prometheus (the Prometheus details are in the Telegraf
         settings).
            nimon -s 30 -c 2880 -w -i influxbox -p 9090

       6 Crontab entry to run for 1 day - start 1 minute after midnight,
         save local data every 60 seconds.
            1 0 * * * /usr/local/bin/nimon -s 60 -c 1440 -i influxbox -p 8086 -x nimon

       7 Crontab entry to check hourly and restart nimon if it is not
         running (-k). Not using user and password, and using the defaults
         for database (njmon) and InfluxDB port (8086).
            0 * * * * /usr/local/bin/nimon -k -s 60 -i influxbox

       8 Crontab entry as above but with the output going to Telegraf (with
         port 9090) and then to Prometheus. Collecting stats every 60
         seconds.
            0 * * * * /usr/local/bin/nimon -k -s 60 -w -i influxbox -p 9090

       9 For InfluxDB 2.0 or higher and the new REST API. Note the token is
         shortened in this example. Normally it is ~50 characters and can be
         found within the InfluxDB 2+ CLI or GUI.
            /usr/local/bin/nimon -k -s 30 -i influxbox -O IBM -T HyksUKH762-98...aB==

AUTHOR
       Developer Nigel Griffiths (nigelargriffiths@hotmail.com)

COPYRIGHT
       License GPLv3+: GNU GPL version 3

SEE ALSO
       nmeasure
              Add your own data to the InfluxDB database. For example, stats
              from a shell script or other commands.

       ninstall
              Simple shell script to install njmon, nimon and the manual
              pages.

Linux                                  1                          njmon(nimon)
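The -c values used in the examples follow from dividing the intended run time by the -s interval. A minimal shell sketch of that arithmetic; the helper name snapshot_count is hypothetical and not part of njmon:

```shell
# snapshot_count DURATION_SECONDS INTERVAL_SECONDS
# Hypothetical helper (not part of njmon): prints the -c value needed to
# cover DURATION_SECONDS when one snapshot is taken every INTERVAL_SECONDS.
snapshot_count() {
    echo $(( $1 / $2 ))
}

day=$(( 24 * 60 * 60 ))     # 86400 seconds in a day
snapshot_count "$day" 300   # prints 288  (5 minute interval)
snapshot_count "$day" 60    # prints 1440 (1 minute interval)
snapshot_count "$day" 30    # prints 2880 (30 second interval)
```

The printed numbers are the counts that appear with -s 300, -s 60 and -s 30 in the one-day examples.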