NOC 19.1

In accordance with our Release Policy, we are proud to present release 19.1.

The 19.1 release contains 272 bugfixes, optimisations and improvements.

Highlights

Usability

NOC Theme

19.1 introduces a genuine NOC theme intended to replace the venerable ExtJS gray. The new flat theme is based on the Triton theme and uses NOC-branded colors. The NOC theme can be activated via config on a per-installation basis. We expect to make it the default several releases later.

Collection Sharing

Collections are a vital part of NOC. We gratefully appreciate any contributions. To make the contribution process easier, we have added a Share button directly to the JSON preview. Enable collection sharing in the config and create collection Merge Requests directly from the NOC interface with a single click.

New fm.alarm

The alarm console was thoroughly reworked. Current filter settings are stored in the URL and can be shared with other users. Additional filters on services and subscribers were also added.

New runcommands

The Run Commands interface was simplified. The left panel is now hidden and the working area was enlarged. The list of objects can be modified directly from the commands panel. A configurable command logging option was added to the mrt service.

Alarm acknowledgement

Alarms can be acknowledged by a user to show that the alarm has been seen and is now under investigation.

Integration

We continue to move towards better integration with external systems. Our first priority is to clean up and document the API used by external systems to communicate with NOC.

NBI

A new nbi service has been introduced. The nbi service hosts the Northbound Interface API, which gives upper-level systems access to NOC's data.

The objectmetrics API <api-nbi-objectmetrics> for requesting metrics has been introduced.

DataStream

The DataStream service <services-datastream> got a lot of improvements:

  • alarm datastream <api-datastream-alarm> for realtime alarm status streaming
  • managedobject datastream <api-datastream-managedobject> got an asset part containing hardware inventory data

API Key ACL

API Key <reference-apikey> got an additional ACL, which allows restricting source addresses for particular keys.
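
The effect of a source-address ACL can be pictured with a small sketch. The function name is_allowed, the representation of the ACL as a list of prefixes and the "empty ACL means no restriction" rule are assumptions made for illustration; they are not taken from NOC's data model.

```python
import ipaddress


def is_allowed(source_ip, acl_prefixes):
    """Return True if source_ip falls into any prefix of the key's ACL.

    Assumption: an empty ACL means the key is not restricted by address.
    """
    if not acl_prefixes:
        return True
    addr = ipaddress.ip_address(source_ip)
    # An address never matches a network of the other IP version
    return any(addr in ipaddress.ip_network(prefix) for prefix in acl_prefixes)
```

A request from 10.0.0.5 would pass an ACL of ["10.0.0.0/24"], while a request from 192.168.1.1 would be rejected.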

Threshold Profiles

Threshold processing became more flexible. Instead of four fixed levels (low error, low warning, high warning and high error), an arbitrary number of levels can be configured via Threshold Profiles. Arbitrary actions can be set for each threshold violation, including:

  • raising of alarm
  • sending of notification
  • calling handlers

The threshold closing condition can differ from the opening one, allowing hysteresis to suppress unnecessary flapping.
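
How separate opening and closing conditions suppress flapping can be shown with a minimal sketch. The class HysteresisThreshold and its parameters open_at and close_at are hypothetical names, not part of NOC's Threshold Profile model.

```python
class HysteresisThreshold:
    """Opens when value >= open_at, closes only when value < close_at."""

    def __init__(self, open_at, close_at):
        if close_at > open_at:
            raise ValueError("close_at must not exceed open_at")
        self.open_at = open_at
        self.close_at = close_at
        self.active = False

    def update(self, value):
        """Feed a new measurement; return "open", "close" or None."""
        if not self.active and value >= self.open_at:
            self.active = True
            return "open"
        if self.active and value < self.close_at:
            self.active = False
            return "close"
        return None


threshold = HysteresisThreshold(open_at=90, close_at=80)
# The value oscillates around 90 but the threshold opens and closes only once
events = [threshold.update(v) for v in (85, 91, 88, 92, 79)]
```

With a single level at 90 the same series would open, close and open again; the gap between 80 and 90 absorbs the oscillation.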

Syslog archiving

Starting from 19.1, NOC can be used as a long-term syslog archive solution. ManagedObjectProfile got an additional Syslog Archive Policy setting. When enabled, the syslogcollector <service-syslogcollector> service mirrors all received syslog messages to the long-term analytic ClickHouse database. ClickHouse supports replication, enforces transparent compression and has very modest IOPS requirements, making it ideal for high-load storage.

Collected messages can be queried both through the BI interface and with direct SQL queries.

STP Topology metrics

STP topology change metrics are supported out of the box. Device dashboards can show topology changes on graphs, and further analytics can be applied. In combination with BI analytics, network operators get a valuable tool to investigate short-term traffic disruption problems in large networks.

New platform detection policy

Behavior on new platform detection became configurable. The previous behavior was to create the platform automatically, which could cause headaches in particular cases. Now the following options can be configured in the Managed Object Profile:

  • Create - preserve previous behavior and create new platform automatically (default)
  • Alarm - raise umbrella alarm and stop discovery

Firmware Policy

Behavior on firmware policy violation also became configurable. ManagedObjectProfile allows configuring the following options:

  • Ignore - do nothing (default)
  • Ignore&Stop - Stop discovery
  • Raise Alarm - Raise umbrella alarm
  • Raise&Stop - Raise umbrella alarm and stop discovery

New Profiles

19.1 adds support for TV optical-to-RF converters widely used in cable TV networks. Two profiles have been introduced:

  • IRE-Polus.Taros
  • Vector.Lambda

In addition, an NSN.TIMOS <profile-NSM.TIMOS> profile became available.

Performance, Scalability and optimisations

Caps Profile

caps discovery <discovery-box-caps> used to collect all known capabilities for a platform. Sometimes this is not the desired behavior, so Caps Profiles were introduced. A Caps Profile allows enabling or disabling the checking of a particular group of capabilities. A group of capabilities can be explicitly enabled, disabled, or enabled only if required for the configured topology discovery.

High-precision timers

19.1 contains a time.perf_counter backport to Python 2.7. perf_counter uses CPU counters to measure time intervals. It is about 2x faster than time.time and allows finer granularity in time interval measurements (time.time changes only ~64 times per second). This greatly increases the precision of span interval measurements and of ping's RTT metrics.
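
Interval measurement with perf_counter follows a simple pattern, sketched here against the standard library time.perf_counter of Python 3 rather than NOC's backport. The helper name measure is an illustrative assumption.

```python
import time


def measure(func, *args):
    """Return (result, elapsed_seconds) measured with a high-resolution clock."""
    start = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - start


# Time a short computation; the interval is far below the ~1/64 s
# step that a coarse wall clock could resolve
result, elapsed = measure(sum, range(100000))
```

Only the difference between two perf_counter readings is meaningful; the absolute value has no defined reference point.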

Pymongo connection pool tuning

Our investigations showed that the current pymongo connection pool implementation has a design flaw that leads to a pool connection poisoning problem under NOC's common workload: once opened, a mongo connection from discovery is never closed, leaving lots of connections open after spikes of load. We have implemented our own connection pool and submitted a pull request to the pymongo project (see LIFO connection pool policy).
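
Why a LIFO policy helps can be illustrated with a toy pool. This is a sketch of the general idea only, not NOC's pool and not the code submitted to pymongo; LifoPool, FakeConnection and max_idle_seconds are hypothetical names. Because the most recently returned connection is reused first, connections opened during a spike sink to the bottom, stay idle and can be closed.

```python
import time
from collections import deque


class FakeConnection:
    """Stand-in for a real database connection."""

    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True


class LifoPool:
    """Toy pool: the most recently returned connection is reused first."""

    def __init__(self, factory, max_idle_seconds):
        self.factory = factory
        self.max_idle_seconds = max_idle_seconds
        self.idle = deque()  # (connection, returned_at), newest on the right

    def checkout(self):
        if self.idle:
            conn, _ = self.idle.pop()  # LIFO: take the newest idle connection
            return conn
        return self.factory()

    def checkin(self, conn, now=None):
        self.idle.append((conn, time.monotonic() if now is None else now))

    def reap(self, now=None):
        """Close connections that stayed idle too long; return how many."""
        now = time.monotonic() if now is None else now
        closed = 0
        while self.idle and now - self.idle[0][1] > self.max_idle_seconds:
            conn, _ = self.idle.popleft()  # oldest entries sit on the left
            conn.close()
            closed += 1
        return closed


pool = LifoPool(FakeConnection, max_idle_seconds=10)
# Load spike: three connections are in use at once, then all are returned
a, b, c = pool.checkout(), pool.checkout(), pool.checkout()
for conn in (a, b, c):
    pool.checkin(conn, now=0)
# Steady low load: only the newest connection keeps being reused
for tick in range(1, 20):
    pool.checkin(pool.checkout(), now=tick)
reaped = pool.reap(now=19)
```

With a FIFO policy the same steady load would rotate through all three connections, refresh their idle timers and keep every one of them open.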

ClickHouse table cleanup policy

ClickHouse table retention policy may be configured on a per-table basis. Partition dropping is automated and may be called manually or from cron.

Redis cache backend

Our investigations showed that memcached is prone to randomly forgetting keys while enough memory is available. This leads to random loss of discovery job state, which in turn resets the state of measured SNMP counters, loses random metrics and leaves empty gaps in Grafana dashboards. The problem is hard to diagnose, and the only cure is to restart the memcached process. The problem lies deep in memcached's internal architecture and is unlikely to be fixed.

So we have introduced support for a Redis cache backend. We will decide whether to make it the default cache backend after a testing period.

SO_REUSEPORT & SO_FREEBIND for collectors

The syslogcollector <service-syslogcollector> and trapcollector <service-trapcollector> services support the SO_REUSEPORT and SO_FREEBIND options for listeners.

SO_REUSEPORT allows several collector processes to share a single port using in-kernel load balancing, greatly improving collector throughput.

SO_FREEBIND allows binding to a non-existing address, opening support for floating virtual addresses for collectors (VRRP, CARP, etc.) and adding the necessary level of redundancy.
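
At the socket level the two options look roughly like the sketch below. It is Linux-oriented and is not the collectors' actual code: open_udp_listener and its parameters are hypothetical. Note that Linux exposes the free-bind option as IP_FREEBIND at the IP level; the numeric fallback 15 is the value from <linux/in.h> and is used only if the Python build lacks the constant.

```python
import socket

# Linux value from <linux/in.h>, used when the socket module has no constant
IP_FREEBIND = getattr(socket, "IP_FREEBIND", 15)


def open_udp_listener(address, port, reuseport=True, freebind=False):
    """Create a UDP socket the way a collector-like process might."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    if reuseport:
        # Several processes may bind the same address and port; the kernel
        # distributes incoming datagrams between them
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEPORT, 1)
    if freebind:
        # Allow binding to an address not (yet) configured on this host,
        # e.g. a VRRP or CARP virtual address held by the other node
        sock.setsockopt(socket.IPPROTO_IP, IP_FREEBIND, 1)
    sock.bind((address, port))
    return sock


# Two listeners share one port, as two collector processes would
first = open_udp_listener("127.0.0.1", 0)
port = first.getsockname()[1]
second = open_udp_listener("127.0.0.1", port)
```

Every socket sharing the port has to set SO_REUSEPORT before calling bind, otherwise the second bind fails with "address already in use".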

In combination with the new Syslog Archive <release-19.1-syslog-archive> and ClickHouse table cleanup policy <release-19.1-clickhouse-cleanup> features, NOC can be turned into a high-performance syslog archiving solution.

GridVCS

GridVCS is NOC's high-performance redundant version control system used to store device configuration history. The 19.1 release introduces several improvements to the GridVCS subsystem:

  • Built-in compression - though Mongo's WiredTiger uses transparent compression on the storage level, explicit compression on the GridVCS level reduces both disk usage and database server traffic.
  • Previous releases used mercurial's mdiff to calculate config deltas. 19.1 uses the BSDIFF4 format by default. During our tests BSDIFF4 showed better results in speed and delta size.
  • The ./noc gridvcs <man-gridvcs> command got an additional compress subcommand, which applies both compression and BSDIFF4 deltas to already collected data. While it can take some time for large storages, it can free up significant disk space.
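
The gain from explicit compression is easy to see on config-like text, which is highly repetitive. The sketch below uses the standard library zlib on a made-up configuration; it illustrates the principle only and does not reproduce GridVCS's storage format or its BSDIFF4 delta handling.

```python
import zlib

# A made-up, repetitive device configuration (48 similar interface stanzas)
config = "\n".join(
    "interface GigabitEthernet0/%d\n description uplink\n no shutdown" % i
    for i in range(48)
).encode("utf-8")

compressed = zlib.compress(config)
restored = zlib.decompress(compressed)
# Fraction of the original size that has to be stored and transferred
ratio = len(compressed) / len(config)
```

Since the data is compressed before it reaches the database, both the stored size and the traffic to the database server shrink by this ratio.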

API improvements

profile.py

SA profiles <profiles> used to live in the __init__.py file. Our code style advises keeping __init__.py empty for various reasons. Some features, like profile loading from custom, will not work with __init__.py anyway.

So starting with 19.1 it is recommended to place the profile's code into the profile.py file. Loading from __init__.py is still supported, but it is a good time to plan the migration of custom profiles.

OIDRule: High-order scale functions

Metric scales can be defined as high-order functions, i.e. functions returning other functions. This greatly increases the flexibility of the scaling subsystem and allows external configuration of scaling processing.
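
The high-order pattern itself is plain Python, as the following sketch shows. The function names scale and apply are hypothetical and do not reproduce the OIDRule scale syntax.

```python
def scale(factor):
    """Return a function that multiplies a raw value by factor."""

    def apply(value):
        return value * factor

    return apply


# The outer call takes the configuration, the inner one takes the measurement
octets_to_bits = scale(8)
bits = octets_to_bits(1000)
```

The outer function is evaluated once, when the rule is configured, and the returned function is then called for every collected value.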

IPAM seen propagation

Workflow's seen signal can be configured to propagate up to the parent prefixes. Address and Prefix profiles got a new Seen propagation policy setting, which determines whether the parent prefix is notified when a child element is seen by discovery.

A common usage pattern is to propagate seen to aggregate prefixes to get notified when an aggregate comes into use.
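
The propagation can be pictured as a walk from the most specific prefix up through its parents. The sketch below is a simplified model: propagate_seen and the mapping of each prefix to a boolean "propagate to parent" flag are assumptions for illustration, and NOC's actual policy values are not modelled.

```python
import ipaddress


def propagate_seen(address, prefixes):
    """Return prefixes notified of address, most specific first.

    prefixes maps a prefix string to a flag telling whether the seen
    signal should travel further up to the parent prefix.
    """
    addr = ipaddress.ip_address(address)
    chain = sorted(
        (p for p in prefixes if addr in ipaddress.ip_network(p)),
        key=lambda p: ipaddress.ip_network(p).prefixlen,
        reverse=True,
    )
    notified = []
    for prefix in chain:
        notified.append(prefix)
        if not prefixes[prefix]:
            break  # this prefix does not pass the signal to its parent
    return notified


full = propagate_seen(
    "10.1.2.3",
    {"10.0.0.0/8": False, "10.1.0.0/16": True, "10.1.2.0/24": True},
)
stopped = propagate_seen(
    "10.1.2.3",
    {"10.0.0.0/8": False, "10.1.0.0/16": False, "10.1.2.0/24": True},
)
```

In the first call the signal reaches the 10.0.0.0/8 aggregate; in the second it stops at 10.1.0.0/16.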

Phone workflow

The phone module got full-blown workflow support. Each phone number and phone range has its own state, which can be changed manually or via external signals.

Breaking Changes

Migration

New features

MR Title
MR1515 Add estimate param to job command.
MR1525 Collection sharing
MR1498 DataStream: asset part of ManagedObject
MR1516 APIKey ACL
MR1518 Add export/import to ./noc beef command.
MR1514 Configurable behavior on new platforms and firmware policy violations
MR1512 new fm-alarm
MR1508 IRE-Polus.Taros profile
MR1507 Summary glyph display order
MR1501 Add Errors Out and Discards In for ddash
MR1595 Add periodic diagnostic to alarm diagnostic.
MR1460 ThresholdProfile: Flexible thresholds configuration
MR1497 Alarm acknowledge/unacknowledge
MR1491 network stp topology changes on graph
MR1476 GridVCS: bsdiff4 patches and zlib compression
MR1432 Add initial support for NSN.TIMOS profile
MR1475 High-precision timers
MR1458 Add Network | STP | Topology Changes metric.
MR1455 CapsProfile
MR1396 redis cache backend
MR1404 #794: IPAM seen propagation policy
MR1384 card: project card
MR1390 #942: Remove Root container
MR1352 #694 ClickHouse table cleaning policy
MR1363 Vector.Lambda profile
MR1283 NOC theme
MR1336 OIDRule: High-order scale functions
MR1338 #539 Syslog archiving
MR1255 nbi service
MR1345 #497 syslogcollector/trapcollector: SO_REUSEPORT and IP_FREEBIND support
MR1252 datastream: Alarm datastream
MR1226 #636 Phone Workflow integration
MR1113 Profiles should be moved to profile.py

Improvements

MR Title
MR1534 Set default loglevel on command to info.
MR1535 Update RU translation.
MR1527 FM Alarms localization
MR1529 Add full_name to PlatformApplication query fields.
MR1522 Update/report interface status3
MR1510 Update DLink.DxS profile
MR1556 Update Rotek.BT profile (get_version)
MR1539 Update settings by snmp requests for Dlink.DxS
MR1500 Update Juniper.JUNOS profile
MR1503 Speedup NetworkSegment Service Summary count.
MR1502 Update Report for Interfaces Status
MR1490 Generic.get_chassis_id disable Multicast MAC address check.
MR1494 SKS.SKS and BDCOM.IOS config volatile.
MR1488 Add platform to Linksys.SPS2xx profile.
MR1451 Unified loader interface
MR1485 Add caps profile to managedobject profile ETL loader.
MR1484 Add to Linksys.SPS24xx platform OID
MR1434 ./noc dnszone import: Parse complex $TTL directives
MR1452 Move methods from SegmentTopology to BaseTopology
MR1449 inv.networksegment: Bulk fields calculation
MR1454 Add to_python method to ClickHouse model.
MR1466 Add to Huawei.VRP profile get Serial Number attributes.
MR1453 ResourceGroup: TreeCombo
MR1461 Add config_volatile to Orion.NOS and SKS.SKS
MR1447 Increase query interval for core.pm.utils function.
MR1417 Extendable Generic.get_chassis_id script
MR1441 Add pattern more to Huawei.MA5600T profile.
MR1440 Optimize reportalarmdetail and reportobjectdetail.
MR1439 Update/eltex mes execute snmp
MR1437 Delete aggregateinterface bi model
MR1420 Add dynamically loader BI models.
MR1418 RepoPreview MVVC
MR1427 Migrate Alstec.24xx.get_metrics to new model.
MR1414 networkx 2.2 and improved spring layout implementation
MR1413 dns.dnsserver: Remove sync field
MR1400 requests 2.20.0
MR1392 Diverged permissions
MR1382 #961 Process All addresses and Loopback address syslog/trap source types
MR1408 Add Generic.get_vlans and get_switchport scripts.
MR1409 Add get_lldp_snmp capabilities for Cisco.IOS
MR1410 Change Iface Name OID for get_ifindexes Plante.WCDG profile
MR1374 migrate inv map to leafletjs
MR1381 #971 trapcollector: Gentler handling of BER decoding errors
MR1371 dnszone: Ignore addresses with missed FQDNs
MR1369 Add theme variable to login page render.
MR1368 Add "Up/10M" to reportcolumndatasource for report object detail.
MR1391 CODEOWNERS file
MR1353 #788 Try to determine VRF's for DHCP address discovery
MR1361 DataStream: Load from custom
MR1251 Customized PyMongo connection pool
MR1397 Juniper.junos
MR1398 auto logout remove msg
MR1385 Dead code cleanup
MR1284 runcommands refactoring
MR1375 Cleanup pyrule from classifier trigger.
MR1341 theme body padding for form
MR1362 Add convert ifname for MA4000
MR1349 Cleanup AlliedTelesis profiles.
MR1346 snmp: Try to negotiate broken error_index
MR1344 Add Interface packets dashboard in MO dash.
MR1318 Migrate ReportProfileCheck report to ReportStat Backend.
MR1228 Move numpy import to parse_table_header in lib/text.
MR1316 Additional LLDP constants and caps conversion functions
MR1324 Add TZ parameter to NBI query.
MR1126 #260 add password widget
MR1322 Add get_lldp_neighbors and get_capabilities for Qtech2500 profile
MR1264 Add clean to events command.
MR1307 Update Alcatel.OS62xx profile
MR1285 Hp.1910
MR1190 Update Rotek.RTBSv1 profile
MR1297 Add Rotek.RTBSv1.get_metrics script.
MR1296 add get_config script for Dlink.DVG profile
MR1291 Extend job command.
MR1276 Add clean_id_bson to alarm datastream.
MR1274 threadpool: Cleanup worker result just after setting future
MR1286 Add late_alarm metric to selfmon fm collector.
MR1249 Profile.cli_retries_super_password parameter
MR1250 perm: response layout
MR1229 ldap: Additional check of username format
MR1214 Add telemetry to MRT service.
MR1244 Add physical iface count metrics to selfmon.
MR1216 Add vv (very verbose parameter) to test command.

Bugfixes

MR Title
MR1487 Use ch_escape function on syslogcollector.
MR1478 Fix Report Unknown Model Summary.
MR1477 Fix Generic.get_capabilities snmp_v1
MR1474 Fix load metric priority. Profile first, Generic second.
MR1473 Fix Radio and SLA graph template for CH use.
MR1481 Fix displaying platform in some Cisco Stackable switches
MR1479 Fix Rotek RTBSv1 Tx Power metric
MR1438 Fix Huawei.VRP.get_mac_address_table script
MR1422 Fix MikroTik.RouterOS.get_interface_status_ex script
MR1462 Fix heavy cpu load on show vlan command
MR1469 Fix Huawei.VRP.get_version SerialNumber rogue chart.
MR1467 Fix DLink.DxS profile
MR1463 Fix Extreme.XOS.get_interfaces script
MR1465 Fix PrefixBookmark import loop.
MR1464 Fix selfmon FM metric name.
MR1457 Fix getting single oid from multiple metrics.
MR1444 Fix Iskratel.MSAN profile
MR1450 Fix Orion.NOS.get_lldp_neighbors script
MR1433 Fix Cisco.IOSXR profile
MR1436 Fix Cisco.NXOS.get_arp script
MR1448 Fix c.id in card.base.f_object_location.
MR1445 login button width fixed
MR1459 Lambda fix metrics
MR1468 Huawei.VRP.get_version strip serial number.
MR1435 InfiNet fix init.py pattern_prompt
MR1426 inv.map fix performance
MR1443 Fix Object.get_coordinate_zoom method.
MR1428 Fix Huawei.MA5600T profile
MR1430 Fix Alstec.24xx metric name.
MR1289 Fix Juniper.JUNOS.get_lldp_neighbors Parameter 'remote_port' required.
MR1423 Fix managedobject and object card for delete Root.
MR1429 Fix avs Object.get_address_text method
MR1424 Fix getting container path in Alarm Web and Card.
MR1425 Fix typo in ManagedObject console UI.
MR1483 Fix Raisecom.ROS.get_lldp_neighbors script
MR1395 Fix container field type when remove Root.
MR1401 ip.ipam: Fix prefix style
MR1411 Fix Add Objects to Maintenance from SA !582
MR1386 fix error "Отсутствуют адреса линка" ("Link addresses are missing") in dns.reportmissedp2p
MR1405 Fix Discovery Problem Detail report trace.
MR1394 Fix get_lldp_neighbors by SNMP
MR1407 Fix Plantet.WGSD Profile
MR1403 #976 Fix closing of already closed session
MR1406 Fix avs environments graph tmpl 148
MR1402 jsloader fixed
MR1399 Fix Ubiquiti profile and Generic.get_interfaces(get_bulk)
MR1389 Fix Report Discovery Poison
MR1378 Fix theme variable in desktop.html template.
MR1379 Fix etl managedobject resourcegroup
MR1367 Fix prompt in Rotek.RTBS.v1 profile.
MR1366 Fix workflow CH dictionary.
MR1365 Fix selfmon FM collector.
MR1364 Fix update operation for superuser on secret field.
MR1376 noc/noc#952 Fix metric path for Environment metric scope.
MR1310 #964 Fix SA sessions leaking
MR1357 Natex_fix_sn
MR1355 Cisco_fix_snmp
MR1370 Increase ManagedObject cache version for syslog archive field.
MR1356 Fix Interface name Eltex.MES
MR1354 Fix Interface name QSW2500
MR1335 Fix get_interfaces, add reth aenet
MR1343 Fix profilecheckdetail.
MR1342 Fix secret field.
MR1351 InfiNet-fix-get_version
MR1350 Fix get_interfaces for Telindus profile
MR1348 Fix stacked packets graph.
MR1360 Fix Interface name ROS
MR1326 Fix ch_state ch datasource.
MR1332 Fix Span Card view from ClickHouse data.
MR1331 Fix Huawei.MA5600T.get_cpe.
MR1328 Fix Cisco.IOS.get_lldp_neighbors regex
MR1327 Fix get_interfaces for Rotek.RTBSv1, add rule for platform RT-BS24
MR1325 Fix CLIPS engine in slots.
MR1320 Fix SNMP Trap OID Resolver
MR1323 Fix get_interfaces for QSW2500 (dowwn -> down)
MR1269 Fix Juniper.JUNOSe.get_interfaces script
MR1278 Fix Huawei.MA5600T.get_cpe ValueError.
MR1314 Fix Generic.get_chassis_id script
MR1306 Fix AlliedTelesis.AT8000S.get_interfaces script
MR1313 Fix Cisco.IOS.get_version for ME series
MR1262 Fix Raisecom.RCIOS password prompt matching
MR1238 Fix Juniper.JUNOS profile
MR1279 Fixes empty range list in discoveryid.
MR1305 Fix Rotek.RTBS profiles.
MR1304 Fix some attributes for Span in MRT service
MR1303 Fix selfmon escalator metrics.
MR1300 fm.eventclassificationrule: Fix creating from event
MR1295 Fix ./noc mib lookup
MR1298 Fix custom metrics path in Generic.get_metrics.
MR1290 Fix custom metrics.
MR1225 noc/noc#954 Fix Cisco.IOS.get_inventory script
MR1275 Fix InfiNet.WANFlexX.get_lldp_neighbors script
MR1281 Delete quit() in script
MR1280 Fix get_config
MR1277 Fix Zhone.Bitstorm.get_interfaces script
MR1254 Fix InfiNet.WANFlexX.get_interfaces script
MR1272 Fix vendor name in SAE script credentials.
MR1246 Fix Huawei.VRP pager
MR1268 Fix scheme migrations
MR1245 Fix Huawei.VRP3 prompt match
MR1259 fix_error_web
MR1258 Fix managed_object_platform migration.
MR1260 Fix pm.util.get_objects_metrics if object_profile metrics empty.
MR1253 Fix path in radius(services)
MR1203 Fix prompt pattern in Eltex.DSLAM profile
MR1247 Fix consul resolver index handling
MR1239 #911 consul: Fix faulty state caused by changes in consul timeout behavior
MR1237 #956 fix web scripts
MR1221 Fix Generic.get_lldp_neighbors script
MR1243 Fix now shift for selfmon task late.
MR1231 noc/noc#946 Fix ManagedObject web console.
MR1235 Fix futurize in SLA probe.
MR1234 Fix Huawei.MA5600T.get_cpe.
MR1220 Fix Generic.get_interfaces script
MR1204 Fix Raisecom.ROS.get_interfaces script
MR1215 Fix platform field in Platform Card.
MR1210 ManagedObject datastream: Fix links property. capabilities property
MR1212 Fix save empty metrics threshold in ManagedObjectProfile UI.
MR1211 Fix interface validation errors in Huawei.VRP, Siklu.EH, Zhone.Bitstorm.
MR1317 sa.managedobjectprofile: Fix text
MR1340 noc/noc#966
MR1294 selfmon typo in mo
MR1105 #856 Rack view fix
MR1208 #947 Fix MAC ranges optimization